Recent Project Notebooks

The example notebooks presented below will change, and we will add new ones as they arise in our work. These are generally our longer content posts, which include open data and code (available on OSF). Hit the green buttons to get the whole data story. Please feel free to reuse any of the data, images, or code that you find useful here, and give us a shout if you find anything dodgy. We would truly appreciate a citation once in a while. Everything on this page is licensed as CC BY 4.0.

Excess soil acidity remediation decisions with MIR screening (2024)

To effectively remediate acid soils over the long term, both active and exchangeable acidities should be neutralized to a level that can be tolerated by the crop combination/rotation that is being grown. The recommendation process for excess soil acidity (xHp) remediation used in this notebook involves evaluating whether to take immediate remediation action, perform measurements to assess xHp levels, or take no action. The framework leverages a multilevel model to predict xHp levels of soil samples based on their MIR spectral properties. The model provides prior distributions of xHp levels, which are then updated with MIR soil test data to form a posterior distribution. Three actions (remediation without measurement, doing nothing, and measuring first) are assessed using this posterior distribution. Download the code from OSF here.
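As a rough illustration of the decision logic described above (not the notebook's actual code, which is on OSF), the sketch below compares the three actions by expected cost. It makes several assumptions that do not come from the notebook: a simple conjugate normal update stands in for the multilevel model, a follow-up measurement is assumed to reveal the true xHp level, and all function names, thresholds, and cost figures are hypothetical.

```python
import math


def normal_posterior(prior_mean, prior_sd, obs, obs_sd):
    """Conjugate normal update: combine a prior with one noisy observation.

    Assumption: a stand-in for the notebook's multilevel model.
    """
    prior_prec = 1.0 / prior_sd ** 2
    obs_prec = 1.0 / obs_sd ** 2
    post_var = 1.0 / (prior_prec + obs_prec)
    post_mean = post_var * (prior_prec * prior_mean + obs_prec * obs)
    return post_mean, math.sqrt(post_var)


def prob_exceeds(mean, sd, threshold):
    """P(X > threshold) for X ~ Normal(mean, sd)."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def expected_costs(p_excess, cost_remediate, loss_untreated, cost_measure):
    """Expected cost of each action (all cost values are hypothetical).

    'measure_first' assumes the measurement reveals the true state, after
    which the cheaper of remediating or accepting the loss is chosen.
    """
    return {
        "remediate": cost_remediate,
        "do_nothing": p_excess * loss_untreated,
        "measure_first": cost_measure
        + p_excess * min(cost_remediate, loss_untreated),
    }


def best_action(costs):
    """Return the action with the lowest expected cost."""
    return min(costs, key=costs.get)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    mean, sd = normal_posterior(prior_mean=1.0, prior_sd=1.0, obs=2.0, obs_sd=0.5)
    p = prob_exceeds(mean, sd, threshold=1.5)
    costs = expected_costs(p, cost_remediate=100.0, loss_untreated=300.0,
                           cost_measure=10.0)
    print(best_action(costs), costs)
```

With these made-up costs, measuring first tends to win when the posterior is uncertain, while a posterior concentrated well above or below the threshold favors acting (or not acting) without measuring.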

30 minute read

Uganda wildfires: A Bayesian approach for forecasting VIIRS-SNPP fires on a discrete global grid (2024)

This notebook develops a predictive Bayesian repeated measures approach on a Discrete Global Grid. The approach utilizes openly available fire detection time series from the Visible Infrared Imaging Radiometer Suite (VIIRS-SNPP) satellite for Uganda (2012 – 2023). The goal is to uncover spatial patterns and temporal trends that are often overlooked. By doing so, the associated models can forecast potential hotspots and assess the likelihood of future fires for the upcoming year. They can also be used for fire risk mapping. You can find all the data and code at our Uganda OSF repository.

45 minute read

Predictive mapping of building densities from GeoSurvey data in Uganda (2024)

An accurate record and mapping of buildings is important for a range of applications, from population estimation and urban development to humanitarian response planning and environmental science. Apart from their numerous other uses, accurate maps of building densities are specifically useful for understanding human impacts on the natural environment. This notebook develops a Bayesian method for mapping and monitoring building densities using Uganda’s 2020 GeoSurvey data and raster features. It addresses challenges in predictive mapping of buildings, such as model prediction uncertainty and baseline establishment. The approach is designed to be open and reproducible. You can find all the data and code at our Uganda OSF repository.

45 minute read

Bayesian multilevel regression and irrigation survey poststratification on a discrete grid in Uganda (2023)

This notebook examines the use of Multilevel Regression and Poststratification (MRP) in situations when adjustments are needed to correct for over- or under-representation of certain places in a specific Region of Interest. MRP combines multilevel modeling with poststratification adjustments, enhancing predictions in smaller geographical areas and improving the reproducibility, comprehensiveness, and interpretation of land survey results. We use a Bayesian multilevel regression approach because it provides a fuller picture of uncertainty. Unlike more traditional point estimates, the posterior distributions that are generated provide a range of plausible values for new observations, highlighting the uncertainty inherent in any predictions. Download the code from OSF here.
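The poststratification step can be sketched in a few lines. This is a minimal illustration, not the notebook's code: it assumes the multilevel model has already produced an estimate for each poststratification cell (for example, a grid cell or land cover stratum), and that the number of population units in each cell is known. The function names and the dictionary layout are hypothetical.

```python
def poststratify(cell_estimates, cell_counts):
    """Population-weighted average of per-cell model estimates.

    cell_estimates: dict mapping cell id -> model estimate for that cell
    cell_counts:    dict mapping cell id -> number of population units
    """
    total = sum(cell_counts.values())
    if total <= 0:
        raise ValueError("cell_counts must sum to a positive number")
    weighted = sum(cell_estimates[cell] * n for cell, n in cell_counts.items())
    return weighted / total


def poststratify_draws(draws, cell_counts):
    """Apply poststratification to each posterior draw.

    draws: list of dicts, one per posterior draw, each mapping
           cell id -> estimate. The returned list approximates the
           posterior distribution of the poststratified quantity.
    """
    return [poststratify(draw, cell_counts) for draw in draws]


if __name__ == "__main__":
    # Hypothetical cells and counts, for illustration only.
    counts = {"cell_a": 300, "cell_b": 100}
    draws = [
        {"cell_a": 0.20, "cell_b": 0.60},
        {"cell_a": 0.25, "cell_b": 0.55},
    ]
    print(poststratify_draws(draws, counts))
```

Applying the weighting to each posterior draw, rather than to a single point estimate, carries the model's uncertainty through to the area-level result.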

45 minute read

Spatially balanced sampling and small area predictions on a discrete grid (2023)

This notebook demonstrates setting up a representative and reproducible sampling frame, which provides complete coverage of Uganda’s croplands. Croplands are the primary Region of Interest and the target for various land management interventions by the Government of Uganda and private sector entities. Based on recent high-resolution remote sensing, croplands occupy approximately 112,404 square kilometers of Uganda’s overall land area. Selecting an appropriate sampling frame for ground observations, measurements, and/or experiments is a critical planning step because it determines both the main and recurrent costs of any mapping and/or monitoring activities. Download the code from OSF here.

30 minute read

Stacked spatial predictions of smallholder irrigation in Uganda (2023)

Irrigation is crucial in Africa, a continent grappling with climatic challenges and a growing population. It enhances food security by mitigating drought risks and improving crop yields, thus playing a key role in economic growth and poverty reduction. Sustainable irrigation practices are essential for ecological sustainability, though careful management is needed to avoid environmental degradation. In Uganda, research gaps include identifying suitable irrigation areas, assessing environmental impacts, analyzing economic benefits, forecasting climate change impacts, and evaluating policy effectiveness. Addressing these will significantly improve irrigation practices, ensuring food security, economic development, and environmental sustainability. This notebook provides R code for exploring irrigation in Uganda. Download the code from OSF here.

25 minute read

Spatial predictions of cereal grain ionomes from Ethiopia and Malawi (2023 update)

This notebook focuses on understanding the ionomic (elemental) composition and spatial distribution of mineral nutrients in cereal grains from Ethiopia and Malawi. We combined machine-learning techniques to create high-resolution maps of cereal crop ionomes across Ethiopia and Malawi. The workflow described in the notebook combines satellite imagery with various datasets, including land cover, soil properties, climate data, and topographic information, to establish relationships between these factors and the ionomic nutrient composition of cereal grains. The integration of these diverse data sources with different machine learning algorithms (MLAs) and stacked generalization provides accurate predictions of cereal crop ionomes at national and food system scales. These spatial predictions are useful for:

  • Identifying nutrient deficiencies: By understanding the elemental composition of crops in different regions, researchers and policymakers can identify areas with nutrient deficiencies hindering crop growth and yields. Targeted interventions can be developed to address these deficiencies and improve crop productivity.
  • Enhancing food security: Mapping crop ionomes can contribute to efforts to enhance food security in Africa by providing data on the nutrient content of crops. This information can be used to develop strategies to improve the nutritional quality of food produced in the region.
  • Guiding agricultural practices: Spatial predictions of crop ionomes can help farmers make informed decisions on the best management practices for their crops, such as selecting appropriate fertilizers and adjusting application rates to optimize nutrient availability and uptake.
  • Environmental monitoring: Understanding the distribution of elements in crops can also help monitor the potential environmental impacts of agricultural activities, such as the accumulation of toxic elements in soil and water resources.

30 minute read

AfSIS SOPs for collecting soil and crop samples (2023)

This notebook provides an update of the standard AfSIS field sampling operating procedures and workflows for collecting soil and crop samples. The notebook is maintained on GitHub (here).

10 minute read

Land cover classification with multilabel GeoSurvey data from Malawi (2022)

Quantifying the geographical extent, location, and spatial dynamics of croplands, rural and urban settlements, and different types of vegetation cover provides essential information for monitoring and managing human-dominated ecosystems and landscapes. Large portions of Africa remain a virtual “terra incognita” in this context. The main reason for monitoring land cover is to assess where in a particular country or region of interest significant impacts of humans on ecosystem services can be expected within different land cover classes. The main goal of this notebook is to provide improved starter code for predictive land cover mapping with multilabel data. This notebook is maintained on GitHub (here), and you can fork and alter it from there for your reference and use.

15 minute read …

Labeling staple cropping systems with association rules and multilabel classification (2022)

The main objective of this notebook is to introduce code for labeling, exploration, and (spatial) discovery of different mixed staple food cropping systems in Rwanda and Tanzania. The data we use include georeferenced observations of the occurrence of: cattle, sheep, goats, poultry, maize, wheat, sorghum, rice, bean, cowpea, pigeon pea, soybean, groundnut, Irish potato, sweet potato, cassava, yam, banana/plantain, sugarcane, and sunflower. Note that most of these crops (except for sorghum, cowpea, and yam) are exotic to Africa. This markdown notebook is maintained on GitHub (here), and you can fork and alter it from there for your reference and use.

30 minute read …

Rating landscape soil aggregate stability from laser diffraction particle size data (2022)

The main goal of this notebook is to illustrate a workflow for assessing soil aggregate stability of landscapes. It is about using models to estimate experimental dispersion treatment effects and the uncertainty in those estimates. It does not go into the details of laser diffraction particle size analysis (LDPSA) data collection or the associated laboratory analysis procedures. Instead, it focuses on the steps that are needed to generate potentially useful statistical inferences for populations of LDPSA measurements in landscapes, which take both the compositional data (CoDa) structure of the measurements and the ordinal nature of the laboratory dispersal treatments into account. The notebook can be downloaded (here).

20 minute read …

Meta-analyses of agricultural lime application effects on crop yields from field trial data (2022)

Soil acidity affects soil nutrient availability and rhizobial and other microbial activity, and is generally thought to reduce crop root growth and yield. For example, in Rwanda virtually all of the currently surveyed cropland soils are acid (pH < 7 in water) and ~64% are strongly acid (with pH < 5.5). This notebook provides the workflow for analyzing a large number of on-farm lime response trials that were conducted by the One-Acre Fund in Rwanda between 2016 and 2020 as part of their annual monitoring and evaluation program. The markdown notebook is available on GitHub (here).

20 minute read …

Spectral workflows for diagnosing reserve soil acidity and soil test-based lime requirements (2022)

This notebook focuses on chemometric processes and the associated machine learning workflows that are needed to generate useful predictions from a population of mid-infrared spectral signatures (features) relative to their corresponding reference measurements (labels). While the focus is on reserve soil acidity (Hp), the “stacked” ensemble model approaches that are developed can be usefully applied to other important soil properties. The markdown notebook is available on GitHub (here).

20 minute read …

Machine learning workflows for predictive soil mapping (2021)

This notebook provides practical guidelines for “predictive soil mapping” (PSM) using machine learning (ML) with robust uncertainty assessments. Our main intent here is to provide a reproducible, generalized mapping framework and some of the associated ML computing workflows in R. The actual data and workflows should also be readily transferable to other computing environments where needed. We use legacy soil data from Rwanda here, which will be updated with new data as they become available. The notebook is openly maintained on GitHub (here). The additional predicted soil property maps can be downloaded from the RwaSIS OSF repository (here).

20 minute read …

CoDa workflows for exploring mineral nutrient compositions of cereal grains (2021)

The objective of this notebook is to provide starter code for exploratory data analyses (EDA) and reporting of the mineral nutrient compositions of the main cereal grains (maize, pearl millet, rice, and sorghum) that are grown in Malawi. The markdown notebook is maintained on GitHub (here).

25 minute read …

Chemometric workflows for predicting Zinc levels in maize grain (2021)

The two main questions posed are: 1.) can Zn concentrations in soils be used to predict Zn concentrations/deficiencies in maize grain? While there should be strong bio-kinetic links between soils and plants, the differential uptake and bioavailability of soil Zn in food crop tissues is complex and is not well quantified or predicted; 2.) can we use MIR spectra to predict Zn concentrations in maize grain reliably? If so, this would advance and enable localized Zn deficiency diagnostics and risk assessments that could potentially be carried out routinely, at low cost, across large geographical regions and populations of interest. The notebook can be downloaded from GitHub (here).

25 minute read …

Predicting spatial yield potentials from survey data (2021)

This notebook develops an example of a spatial crop yield potential model. The approach consists of a combination of a standard production function that considers yields relative to inputs and a novel “Site index” (SI), which describes the quality of the production environment for the purpose of growing a particular crop. The multilevel model version of this can be used to gauge the potential productivity of croplands and to provide a comparative frame of reference for evaluating management options. The notebook can be downloaded from GitHub (here).

20 minute read …