Feature Engineering

Machine learning and domain expertise are often viewed as two competing approaches to solving problems in any field, including agriculture. Our strategy is to take the best of both worlds. Machine learning algorithms excel at finding hidden dependencies in data, but there are also well-established theories that explain natural processes and have been proven in practice. We take advantage of this by incorporating domain knowledge into our models, usually during feature selection and feature engineering.

One example of feature engineering based on domain expertise comes from our selection of stress-tolerant maize hybrids in the Syngenta Crop Challenge in Analytics 2019. Drought is a phenomenon associated with a prolonged period without precipitation, accompanied by high temperatures and low air humidity. However, there is no strict mathematical definition of drought, and many different indicators have been used in practice.

Based on daily values of maximum and minimum temperature, precipitation and vapor pressure in several stages of the growing season, we designed features that indicate drought and heat stress in a specific environment. They included the maximum number of consecutive dry days, a low precipitation sum in a certain growth stage, and the vapor pressure deficit. In this problem, the engineered features helped us determine which drought-resistant hybrids should be planted to maximise the yield and minimise the risk of low income for the farmer, but they can be used in other problems as well. Other examples include engineering meteorological features for predicting airborne pollen concentrations, elevation features to better predict the flow of nutrients in the field, and features describing animal behaviour for predicting disease and mammary inflammation.
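
To make the drought features above more concrete, here is a minimal Python sketch of how they could be computed for a single growth stage. It is an illustration only, not the code used in the challenge. The function names, the 1 mm dry-day threshold, the 20 mm low-precipitation threshold and the use of kPa for vapor pressure are all assumptions. The vapor pressure deficit is estimated with the widely used FAO-56 (Tetens-type) saturation vapor pressure formula, which is one common choice and may differ from the exact formulation we applied.

```python
import math


def max_consecutive_dry_days(precip_mm, dry_threshold_mm=1.0):
    """Longest run of days with precipitation below the threshold.

    The 1 mm threshold is an assumed value, not one taken from the article.
    """
    longest = current = 0
    for p in precip_mm:
        if p < dry_threshold_mm:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def saturation_vapor_pressure_kpa(temp_c):
    """Saturation vapor pressure in kPa at temp_c (deg C), FAO-56 formula."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def mean_vapor_pressure_deficit_kpa(tmax_c, tmin_c, vapor_pressure_kpa):
    """Mean daily vapor pressure deficit (kPa) over the given days.

    Daily saturation pressure is the average of the values at Tmax and Tmin;
    negative deficits are clipped to zero.
    """
    if not tmax_c:
        raise ValueError("at least one day of data is required")
    deficits = []
    for tmax, tmin, ea in zip(tmax_c, tmin_c, vapor_pressure_kpa):
        es = (saturation_vapor_pressure_kpa(tmax)
              + saturation_vapor_pressure_kpa(tmin)) / 2.0
        deficits.append(max(es - ea, 0.0))
    return sum(deficits) / len(deficits)


def stage_features(tmax_c, tmin_c, precip_mm, vapor_pressure_kpa,
                   low_precip_threshold_mm=20.0):
    """Hypothetical drought and heat-stress features for one growth stage."""
    precip_sum = sum(precip_mm)
    return {
        "max_consecutive_dry_days": max_consecutive_dry_days(precip_mm),
        "precip_sum_mm": precip_sum,
        # Flag is 1 when the stage total falls below the assumed threshold.
        "low_precip": int(precip_sum < low_precip_threshold_mm),
        "mean_vpd_kpa": mean_vapor_pressure_deficit_kpa(
            tmax_c, tmin_c, vapor_pressure_kpa),
    }


# Made-up daily values for a one-week stage, for illustration only.
example = stage_features(
    tmax_c=[31, 33, 34, 30, 32, 35, 29],
    tmin_c=[18, 19, 21, 17, 18, 20, 16],
    precip_mm=[0, 0, 5, 0, 0, 0, 2],
    vapor_pressure_kpa=[1.6, 1.5, 1.7, 1.6, 1.4, 1.3, 1.8],
)
```

Computing one such dictionary per growth stage and per environment yields a small table of interpretable features that can be fed to a model alongside the raw weather data.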