Introduction

Soil acidity affects soil nutrient availability, rhizobial and other microbial activity and is generally thought to reduce crop root growth and yield. For example, in Rwanda virtually all of the currently surveyed cropland soils are acid (pH < 7 in water) and ~64% are strongly acid (with pH < 5.5). Also see the associated notebooks about Rwanda’s Rwanda cropland soil pH and its maize & bean distributions. Moreover, commonly used nitrogen fertilizers that supply ammonium directly or that react in the soil to produce ammonium, such as various ammonium nitrates, urea, manure and compost, will decrease soil pH over time (also consult the advice provided by e.g., PennState Extension about this).

The application of Agricultural lime (aglime) can be used to mitigate the effects of soil acidity on nutrient availability, reduced microbial activity, and the occurrence of Aluminium and/or Manganese toxicities. The rocks and minerals from which agricultural lime is derived, primarily Limestone, but also Chalk, is primarily composed of calcium (CaO) and/or magnesium (MgO) oxides, which are readily measured by e.g., X-ray fluorescence. Limestone is typically pulverized and/or chemically altered to produce different aglime products of varying efficacy in increasing soil pH and crop yields. Similar to chemical fertilizers, the use of aglime also generates significant greenhouse gas emissions in its production, to-farm-gate transport and through its field application, which should be minimized.

A Meta-analysis is an analytical approach that uses statistical techniques to combine the results from different studies to obtain quantitative estimates of the overall effects of particular interventions on defined outcomes. Generally, the aim is to derive estimates of a common outcome variable, based on the relevant sources of variation across studies. The scope of meta-analyses is defined by the type of population, the types of interventions, their comparators, and the types of outcomes that are of interest. The acronym PICO (population, interventions, comparators and outcomes) helps to serve as a reminder of these (e.g., see the Cochrane Handbook). We will follow the Cochrane guidance here and will go through some of the relevant analysis steps including data setup, exploratory data analysis, data modeling and model interpretation.

The main question is: “What is the increase in crop yields with aglime applications?” The relevant PICO eligibility definitions for addressing this question are:

Rwanda was selected for this illustration because of the high prevalence of strongly acid soils within its croplands, as well as the availability of a relatively large, recent, multi-season, on-farm trial dataset that is suitable for this purpose.

Data set up

To actually run this notebook, you will need to load the packages indicated in the chunk directly below to download, assemble and analyze the lime trial data that are used in this workflow.

# Package names
packages <- c("osfr", "dplyr", "stringi", "quantreg", "DT", "metafor", "metaviz")

# Install packages
installed_packages <- packages %in% rownames(installed.packages())
if (any(installed_packages == FALSE)) {
  install.packages(packages[!installed_packages])
}

# Load packages
invisible(lapply(packages, library, character.only = TRUE))

The next chunk downloads trial data from the RwaSIS OSF repository. The raw data are also openly available from One Acre Fund (OAF) on email request, if you would like to trace our initial data eligibility screening and wrangling efforts.

# Create a data folder in your current working directory
dir.create("RW_lime", showWarnings=F)
setwd("./RW_lime")
# dir.create("Results")

# Trial data download from OSF at: https://osf.io/7mt6k/
# osf_retrieve_file("7mt6k") %>% osf_download(conflicts = "overwrite") ## does not work currently

lime <- read.table("1AF_lime.csv", header=T, sep=",")
lime$season <- paste(lime$year, lime$season, sep="")
lime$site <- stri_trans_totitle(lime$site) 
## 'data.frame':    12686 obs. of  13 variables:
##  $ tid     : int  1 1 1 1 2 2 2 2 3 3 ...
##  $ country : chr  "KEN" "KEN" "KEN" "KEN" ...
##  $ site    : chr  "Harambee" "Harambee" "Harambee" "Harambee" ...
##  $ year    : int  2016 2016 2016 2016 2016 2016 2016 2016 2016 2016 ...
##  $ season  : chr  "2016a" "2016a" "2016a" "2016a" ...
##  $ followup: chr  "n" "n" "n" "n" ...
##  $ exp_name: chr  "maize lime type and application rates (multiplot)" "maize lime type and application rates (multiplot)" "maize lime type and application rates (multiplot)" "maize lime type and application rates (multiplot)" ...
##  $ trt_desc: chr  "standard oaf maize monocrop" "homa lime 100kgs spot application" "calciprill lime 50kgs spot application" "homa lime 200kgs spot application" ...
##  $ crop    : chr  "maize" "maize" "maize" "maize" ...
##  $ lime    : chr  "n" "y" "y" "y" ...
##  $ cce_amt : num  0 107 124 214 0 ...
##  $ fert    : chr  "y" "y" "y" "y" ...
##  $ yield   : int  4561 3297 3149 4908 5574 5737 6182 5465 5844 5537 ...

PICO eligibility

The data we will be using in this notebook were generated by the One Acre Fund (OAF) in Rwanda, Kenya and Tanzania as part of their intervention testing and M&E program. They consist of 3714 individual (multi-location, on-farm) trials with 12686 crop yield (kg ha-1) measurements in various crop × lime × fertilizer treatment combinations. The trials were conducted in 55, Level-4 administrative units over 11 cropping seasons between 2014-2020. Note that the collective label crop=="bean"refers to Phaseolus vulgaris L. including the “common”, “climbing” or “bush” bean cultivars that were used in the trials. We shall only be focusing on an A/B comparison between lime treated (lime=="y" & fert=="y") and the corresponding comparator yields (lime=="n" & fert=="y") that were recommended locally by OAF field extension staff. The next chunk calculates the yield outcomes (delta) for the lime intervention and the corresponding comparator yields.

trt <- subset(lime, lime=="y" & fert=="y", select=c(country, season, site, year, tid, crop, cce_amt, yield))
con <- subset(lime, lime=="n" & fert=="y", select=c(tid, yield))
trial <- merge(trt, con, by="tid")
colnames(trial) <- c("trial_id","country", "season", "site","year","crop","lime_amt",
                     "treated_yield","control_yield")
trial$delta <- trial$treated_yield - trial$control_yield

We can plot the delta values out to obtain a feel for the distribution of the aglime yield effect comparisons for the 1303 OAF aglime trials that we consider to be eligible under our PICO definition for Rwanda.

par(pty="s", mar=c(4,4,1,1))
plot(I(treated_yield/1000)~I(control_yield/1000), rwa, xlim = c(0,15), ylim = c(0,15),
     xlab = "Fertilizer only comparator yield (t/ha)", ylab = "Aglime intervention yield (t/ha)",
     cex.lab = 1.3)

abline(c(0,1), col="grey", lwd = 2)
curve(rqd$coefficients[2]*x+rqd$coefficients[1], add=T, from=0, to=15, col="dodger blue", lwd=2)
curve(rqd$coefficients[4]*x+rqd$coefficients[3], add=T, from=0, to=15, col="red", lwd=2)
curve(rqd$coefficients[6]*x+rqd$coefficients[5], add=T, from=0, to=15, col="dodger blue", lwd=2)

plot(ecdf(rwa$delta/1000), main="", xlab="Aglime yield effect (t/ha)", ylab="ECDF", xlim=c(-6,6),
     verticals=T, lty=1, lwd=2, cex.lab=1.3, col="tomato", do.points=F)
abline(0.5,0, lty=1, col="grey")
**Figure 1:** Plots showing the relationships between aglime intervention and fertilizer only comparator crop yields for PICO eligible liming trials conducted in Rwanda.**Figure 1:** Plots showing the relationships between aglime intervention and fertilizer only comparator crop yields for PICO eligible liming trials conducted in Rwanda.

Figure 1: Plots showing the relationships between aglime intervention and fertilizer only comparator crop yields for PICO eligible liming trials conducted in Rwanda.

Note that Figure 1 does not show a substantial aglime intervention effect. To explore this comparison further, we aggregate the delta yield values by season, site and crop type to set up a PICO table for meta-analyses. Also note that this could be done in a number of different ways that could include additional moderator variables and/or grouping factors.

lm <- aggregate(trial$lime_amt, list(trial$country, trial$season, trial$site, trial$crop), FUN = mean)
pn <- aggregate(trial$delta, list(trial$country, trial$season, trial$site, trial$crop), FUN = length)
pm <- aggregate(trial$delta, list(trial$country, trial$season, trial$site, trial$crop), FUN = mean)
ps <- aggregate(trial$delta, list(trial$country, trial$season, trial$site, trial$crop), FUN = sd)

pico <- cbind(pn, round(lm[,5], digits = 0), round(pm[,5], digits = 0), round(ps[,5], digits = 0))
colnames(pico) <- c("country","season","site","crop","n","lime","yi","vi")
pico$vi <- (pico$vi / sqrt(pico$n))^2
pico$vi <- round(pico$vi, digits = 0)
pico$lime <- round(pico$lime/1000, digits = 2)
pico <- pico[order(pico$country, pico$site), ]

datatable(pico, rownames = FALSE, select = "multiple")

PICO table legend

Meta-analyses

We shall use the metafor package in this section, which provides a robust and widely-tested approach for fitting random-effects and meta-regression models (see Viechtbauer, 2010, which describes the functionality of the package). You may also want to watch the excellent, recent tutorial by Wolfgang Viechtbauer for additional inspiration about meta-analyses.

The next chunks fit initial random effects models to the Rwanda trial data, followed the simplest (least formatted or annotated) summary forest plots for those models. There are a lot of options in this context (see the customization guidance provided on the metafor website about this), but we will leave those for you to explore.

# Rwanda maize random-effects model
rm0 <- rma(yi, vi, subset=(country=="RWA" & crop=="maize"), data = pico)

# Rwanda beans random-effects model
rb0 <- rma(yi, vi, subset=(country=="RWA" & crop=="beans"), data = pico)
labs <- paste(pico$country, pico$season, pico$site, pico$crop, sep = ", ")
op <- par(cex=0.9, font=4)
par(mar=c(4,4,1,1))

# Maize
forest(rm0, slab = labs, header="Country, Season, Site, Crop", 
       xlab = "Maize yield effect (kg/ha)", digits = 0, addpred=TRUE)

# Beans
forest(rb0, slab = labs, header="Country, Season, Site, Crop", 
       xlab = "Bean yield effect (kg/ha)", digits = 0, addpred=TRUE)
**Figure 2:** Forest plots showing the mean difference maize (top) and bean (bottom) yields in response to lime treatments in Rwanda based on random effects models. Note that the dashed line around the RE model polygon (diamond) at the bottom of each plot indicates the 95% prediction interval for the combined outcomes, which in both cases includes 0.**Figure 2:** Forest plots showing the mean difference maize (top) and bean (bottom) yields in response to lime treatments in Rwanda based on random effects models. Note that the dashed line around the RE model polygon (diamond) at the bottom of each plot indicates the 95% prediction interval for the combined outcomes, which in both cases includes 0.

Figure 2: Forest plots showing the mean difference maize (top) and bean (bottom) yields in response to lime treatments in Rwanda based on random effects models. Note that the dashed line around the RE model polygon (diamond) at the bottom of each plot indicates the 95% prediction interval for the combined outcomes, which in both cases includes 0.

We can also look at the moderator effects of the average liming rates (t/ha) that were applied at the different sites across the Rwanda trials with mixed-effects models.

# combined model with lime application rate effect estimates 
ry1 <- rma(yi, vi, mods = ~ factor(crop)+lime, subset=(country=="RWA"),  data = pico, digits = 1)
ry1
## 
## Mixed-Effects Model (k = 92; tau^2 estimator: REML)
## 
## tau^2 (estimated amount of residual heterogeneity):     24847.8 (SE = 5410.2)
## tau (square root of estimated tau^2 value):             157.6
## I^2 (residual heterogeneity / unaccounted variability): 80.72%
## H^2 (unaccounted variability / sampling variability):   5.19
## R^2 (amount of heterogeneity accounted for):            24.69%
## 
## Test for Residual Heterogeneity:
## QE(df = 89) = 370.3, p-val < .01
## 
## Test of Moderators (coefficients 2:3):
## QM(df = 2) = 15.2, p-val < .01
## 
## Model Results:
## 
##                    estimate     se  zval  pval   ci.lb  ci.ub 
## intrcpt              -242.6  115.9  -2.1  0.04  -469.8  -15.5    * 
## factor(crop)maize      15.0   43.4   0.3  0.73   -70.1  100.2      
## lime                  351.6   90.3   3.9  <.01   174.6  528.7  *** 
## 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Below is an example of a bubble plot showing the observed yield differences of the individual trial sites plotted against a quantitative moderator variable i.e., in this case the average lime application rate yield effects that are grouped by season, site and crop type in Rwanda. The size of the points is proportional to the precision (inverse variance) weights that individual studies received in the analysis, with larger points receiving more weight. Based on the mixed-effects meta-regression models above, the predicted average yield response as a function of the lime application rate is also shown in the plot, with the corresponding 95% confidence interval. Note the heterogeneity between trials that received between 1 - 1.5 (t ha-1 CCE) lime applications.

**Figure 3:** Lime application rate effects on mean yield differences in Rwanda based on a site-level mixed-effects model.

Figure 3: Lime application rate effects on mean yield differences in Rwanda based on a site-level mixed-effects model.

Main conclusion and takeaways

Based on the presented analyses, we currently do not recommend investments in wide-spread aglime applications in the maize and bean cropping systems of Rwanda. The data, though extensive, practical and very useful, are insufficient to justify recommending that these types of interventions should be applied at a national scale for either maize or bean crops. More spatially targeted lime applications may prove useful in the future, but we could not demonstrate a sufficiently large yield effect-size based on the currently available data. If you are going to do such presumeably expensive trials, you should try to think them through to do them right.

The main takeaways from this notebook are the following: