  • Data cleaning and harmonization: The source datasets were collected under different protocols, so they were first converted to a common format. Erroneous data points and outliers were removed, and some variables were log-scaled.
  • Database loading: The cleaned data were loaded into a database under a consistent format
  • Covariate preparation: Cloud-free Sentinel-2, Landsat and MODIS satellite images were prepared for Africa, along with other climatic and terrain variables. Covariates were prepared at 30 m and 250 m resolution.
  • Model preparation: Model fine-tuning and feature selection were carried out for each soil property
  • Model running: An ensemble of five regression algorithms was used to predict soil properties: Random Forest, Gradient Boosting, Cubist, neural networks, and generalised linear models with Lasso or Elastic Net regularization
  • Predictions at coarse (250 m) and fine (30 m) resolution: These were created independently and then merged by ensembling
  • Depth aggregation: Properties are predicted at 0 cm, 20 cm and 50 cm, then aggregated to two standard depth intervals, 0-20 cm and 20-50 cm
  • Quality control: Predictions are reviewed by soil science experts, and improvements are made based on their feedback
  • Repeat: Based on expert feedback, the entire process is adjusted and rerun. During the generation of the iSDAsoil maps, this process was run at least five times
  • We have predicted 20 soil properties across Africa at 30 m resolution and at two depths (0-20 cm and 20-50 cm)
  • At 30 m resolution, this equates to ~24 billion predicted locations across Africa per soil property
  • Soil property predictions were made using ensemble machine learning, incorporating high-resolution satellite information
  • Fertiliser recommendations
  • Yield forecasting
  • Crop suitability mapping
  • Carbon monitoring
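
The modelling steps listed above (log scaling, ensembling of five models, and aggregation of point depths to depth intervals) can be illustrated for a single location with a minimal Python sketch. It is not the iSDAsoil implementation. The list does not say how the ensemble combines its members or how the depths are aggregated, so the weighted mean, the `log1p`/`expm1` transform pair, and the averaging of the two endpoint depths are all assumptions, and every function name here is hypothetical.

```python
import math


def log_scale(value):
    """Forward transform for a skewed soil property (assumed: log(1 + x))."""
    return math.log1p(value)


def back_transform(value):
    """Inverse of log_scale, returning to the original units."""
    return math.expm1(value)


def ensemble_mean(predictions, weights=None):
    """Combine per-model predictions with a weighted mean.

    Assumption: a plain weighted average stands in for whatever
    combination rule the real ensemble uses.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need one weight per prediction")
    return sum(p * w for p, w in zip(predictions, weights)) / sum(weights)


def aggregate_depths(point_values):
    """Turn values at 0, 20 and 50 cm into 0-20 cm and 20-50 cm intervals.

    Assumption: each interval is the mean of its two endpoint depths.
    """
    return {
        "0-20cm": 0.5 * (point_values[0] + point_values[20]),
        "20-50cm": 0.5 * (point_values[20] + point_values[50]),
    }


def predict_location(per_model_log_predictions):
    """Map {depth_cm: [one log-scale prediction per model]} to interval values."""
    point_values = {
        depth: back_transform(ensemble_mean(preds))
        for depth, preds in per_model_log_predictions.items()
    }
    return aggregate_depths(point_values)
```

Calling `predict_location` with a dictionary keyed by `0`, `20` and `50`, each holding five log-scale model outputs, returns one value per standard depth interval in the original units. Averaging on the log scale before back-transforming is itself a design choice made for this sketch, and it gives a different result from averaging in the original units.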



