Introduction
Soil acidity affects soil nutrient availability, rhizobial and other
microbial activity and is generally thought to reduce crop root growth
and yield. For example, in Rwanda virtually all of the currently
surveyed cropland soils are acid (pH < 7 in water) and ~64% are
strongly acid (with pH < 5.5). Also see the associated notebooks
about Rwanda’s Rwanda
cropland soil pH and its maize
& bean distributions. Moreover, commonly used nitrogen
fertilizers that supply ammonium directly or that react in the soil to
produce ammonium, such as various ammonium nitrates, urea, manure and
compost, will decrease soil pH over time (also consult the advice
provided by e.g., PennState
Extension about this).
The application of Agricultural lime
(aglime) can be used to mitigate the effects of soil acidity on
nutrient availability, reduced microbial activity, and the occurrence of
Aluminium
and/or Manganese
toxicities. The rocks and minerals from which agricultural lime is
derived, primarily Limestone, but also
Chalk, is primarily
composed of calcium (CaO) and/or magnesium (MgO) oxides, which are
readily measured by e.g., X-ray
fluorescence. Limestone is typically pulverized and/or chemically
altered to produce different aglime products of varying efficacy in
increasing soil pH and crop yields. Similar to chemical fertilizers, the
use of aglime also generates significant greenhouse gas emissions in its
production, to-farm-gate transport and through its field application,
which should be minimized.
A Meta-analysis is
an analytical approach that uses statistical techniques to combine the
results from different studies to obtain quantitative estimates of the
overall effects of particular interventions on defined outcomes.
Generally, the aim is to derive estimates of a common outcome variable,
based on the relevant sources of variation across studies. The scope of
meta-analyses is defined by the type of population, the types of
interventions, their comparators, and the types of outcomes that are of
interest. The acronym PICO (population, interventions, comparators and
outcomes) helps to serve as a reminder of these (e.g., see the Cochrane Handbook). We
will follow the Cochrane guidance here and will go through some of the
relevant analysis steps including data setup, exploratory
data analysis, data modeling and
model interpretation.
The main question is: “What is the increase in crop yields
with aglime applications?” The relevant PICO eligibility
definitions for addressing this question are:
- P - on-farm food crop yield trials conducted (in Rwanda, or
elsewhere)
- I - aglime application + recommended fertilizer application
- C - recommended fertilizer application only (the current best
practice)
- O - effects on maize and bean yields
Rwanda was selected for this illustration because of the high
prevalence of strongly acid soils within its croplands, as well as the
availability of a relatively large, recent, multi-season, on-farm trial
dataset that is suitable for this purpose.
Data set up
To actually run this notebook, you will need to load the packages
indicated in the chunk directly below to download, assemble and analyze
the lime trial data that are used in this workflow.
# Package names
packages <- c("osfr", "dplyr", "stringi", "quantreg", "DT", "metafor", "metaviz")
# Install packages
installed_packages <- packages %in% rownames(installed.packages())
if (any(installed_packages == FALSE)) {
install.packages(packages[!installed_packages])
}
# Load packages
invisible(lapply(packages, library, character.only = TRUE))
The next chunk downloads trial data from the RwaSIS OSF repository.
The raw data are also openly available from One Acre Fund (OAF) on
email request, if you would like to trace our initial data eligibility
screening and wrangling efforts.
# Create a data folder in your current working directory
dir.create("RW_lime", showWarnings=F)
setwd("./RW_lime")
# dir.create("Results")
# Trial data download from OSF at: https://osf.io/7mt6k/
# osf_retrieve_file("7mt6k") %>% osf_download(conflicts = "overwrite") ## does not work currently
lime <- read.table("1AF_lime.csv", header=T, sep=",")
lime$season <- paste(lime$year, lime$season, sep="")
lime$site <- stri_trans_totitle(lime$site)
## 'data.frame': 12686 obs. of 13 variables:
## $ tid : int 1 1 1 1 2 2 2 2 3 3 ...
## $ country : chr "KEN" "KEN" "KEN" "KEN" ...
## $ site : chr "Harambee" "Harambee" "Harambee" "Harambee" ...
## $ year : int 2016 2016 2016 2016 2016 2016 2016 2016 2016 2016 ...
## $ season : chr "2016a" "2016a" "2016a" "2016a" ...
## $ followup: chr "n" "n" "n" "n" ...
## $ exp_name: chr "maize lime type and application rates (multiplot)" "maize lime type and application rates (multiplot)" "maize lime type and application rates (multiplot)" "maize lime type and application rates (multiplot)" ...
## $ trt_desc: chr "standard oaf maize monocrop" "homa lime 100kgs spot application" "calciprill lime 50kgs spot application" "homa lime 200kgs spot application" ...
## $ crop : chr "maize" "maize" "maize" "maize" ...
## $ lime : chr "n" "y" "y" "y" ...
## $ cce_amt : num 0 107 124 214 0 ...
## $ fert : chr "y" "y" "y" "y" ...
## $ yield : int 4561 3297 3149 4908 5574 5737 6182 5465 5844 5537 ...
PICO eligibility
The data we will be using in this notebook were generated by the One Acre Fund (OAF) in Rwanda, Kenya
and Tanzania as part of their intervention testing and M&E program.
They consist of 3714 individual (multi-location, on-farm) trials with
12686 crop yield (kg ha-1) measurements in various
crop
× lime
× fertilizer
treatment combinations. The trials were conducted in 55, Level-4
administrative units over 11 cropping seasons between 2014-2020. Note
that the collective label crop=="bean"
refers to Phaseolus
vulgaris L. including the “common”, “climbing” or “bush” bean
cultivars that were used in the trials. We shall only be focusing on an
A/B
comparison between lime treated
(lime=="y" & fert=="y"
) and the corresponding
comparator yields (lime=="n" & fert=="y"
) that were
recommended locally by OAF field extension staff. The next chunk
calculates the yield outcomes (delta
) for the lime
intervention and the corresponding comparator yields.
trt <- subset(lime, lime=="y" & fert=="y", select=c(country, season, site, year, tid, crop, cce_amt, yield))
con <- subset(lime, lime=="n" & fert=="y", select=c(tid, yield))
trial <- merge(trt, con, by="tid")
colnames(trial) <- c("trial_id","country", "season", "site","year","crop","lime_amt",
"treated_yield","control_yield")
trial$delta <- trial$treated_yield - trial$control_yield
We can plot the delta
values out to obtain a feel for
the distribution of the aglime yield effect comparisons for the 1303 OAF
aglime trials that we consider to be eligible under our PICO definition
for Rwanda.
par(pty="s", mar=c(4,4,1,1))
plot(I(treated_yield/1000)~I(control_yield/1000), rwa, xlim = c(0,15), ylim = c(0,15),
xlab = "Fertilizer only comparator yield (t/ha)", ylab = "Aglime intervention yield (t/ha)",
cex.lab = 1.3)
abline(c(0,1), col="grey", lwd = 2)
curve(rqd$coefficients[2]*x+rqd$coefficients[1], add=T, from=0, to=15, col="dodger blue", lwd=2)
curve(rqd$coefficients[4]*x+rqd$coefficients[3], add=T, from=0, to=15, col="red", lwd=2)
curve(rqd$coefficients[6]*x+rqd$coefficients[5], add=T, from=0, to=15, col="dodger blue", lwd=2)
plot(ecdf(rwa$delta/1000), main="", xlab="Aglime yield effect (t/ha)", ylab="ECDF", xlim=c(-6,6),
verticals=T, lty=1, lwd=2, cex.lab=1.3, col="tomato", do.points=F)
abline(0.5,0, lty=1, col="grey")
Note that Figure 1 does not show a substantial aglime intervention
effect. To explore this comparison further, we aggregate the
delta
yield values by season
,
site
and crop
type to set up a PICO table for
meta-analyses. Also note that this could be done in a number of
different ways that could include additional moderator variables and/or
grouping factors.
lm <- aggregate(trial$lime_amt, list(trial$country, trial$season, trial$site, trial$crop), FUN = mean)
pn <- aggregate(trial$delta, list(trial$country, trial$season, trial$site, trial$crop), FUN = length)
pm <- aggregate(trial$delta, list(trial$country, trial$season, trial$site, trial$crop), FUN = mean)
ps <- aggregate(trial$delta, list(trial$country, trial$season, trial$site, trial$crop), FUN = sd)
pico <- cbind(pn, round(lm[,5], digits = 0), round(pm[,5], digits = 0), round(ps[,5], digits = 0))
colnames(pico) <- c("country","season","site","crop","n","lime","yi","vi")
pico$vi <- (pico$vi / sqrt(pico$n))^2
pico$vi <- round(pico$vi, digits = 0)
pico$lime <- round(pico$lime/1000, digits = 2)
pico <- pico[order(pico$country, pico$site), ]
datatable(pico, rownames = FALSE, select = "multiple")
PICO table legend
country
- ISO country code (KEN, RWA or TZA)
season
- year and season (a or b)
site
- Site name (22 sites in Rwanda, 42 sites in total
including Kenya and Tanzania)
crop
- Crop type (maize or beans)
n
- total number of trials per season
,
site
& crop
type
lime
- average lime application rates (t
ha-1 CCE) per season
, site
, &
crop
type
yi
- mean yield response (treated - control, in kg
ha-1) per season
, site
&
crop
type
vi
- variance of yield response (SE2 of the
mean yield response) per season
, site
&
crop
type
Meta-analyses
We shall use the metafor
package in this section, which provides a robust and widely-tested
approach for fitting random-effects and meta-regression models (see Viechtbauer, 2010,
which describes the functionality of the package). You may also want to
watch the excellent, recent tutorial by Wolfgang
Viechtbauer for additional inspiration about meta-analyses.
The next chunks fit initial random effects models to the Rwanda trial
data, followed the simplest (least formatted or annotated) summary
forest plots for those models. There are a lot of options in this
context (see the customization guidance provided on the metafor
website about this), but we will leave those for you to explore.
# Rwanda maize random-effects model
rm0 <- rma(yi, vi, subset=(country=="RWA" & crop=="maize"), data = pico)
# Rwanda beans random-effects model
rb0 <- rma(yi, vi, subset=(country=="RWA" & crop=="beans"), data = pico)
labs <- paste(pico$country, pico$season, pico$site, pico$crop, sep = ", ")
op <- par(cex=0.9, font=4)
par(mar=c(4,4,1,1))
# Maize
forest(rm0, slab = labs, header="Country, Season, Site, Crop",
xlab = "Maize yield effect (kg/ha)", digits = 0, addpred=TRUE)
# Beans
forest(rb0, slab = labs, header="Country, Season, Site, Crop",
xlab = "Bean yield effect (kg/ha)", digits = 0, addpred=TRUE)
We can also look at the moderator effects of the average liming rates
(t/ha) that were applied at the different sites across the Rwanda trials
with mixed-effects models.
# combined model with lime application rate effect estimates
ry1 <- rma(yi, vi, mods = ~ factor(crop)+lime, subset=(country=="RWA"), data = pico, digits = 1)
ry1
##
## Mixed-Effects Model (k = 92; tau^2 estimator: REML)
##
## tau^2 (estimated amount of residual heterogeneity): 24847.8 (SE = 5410.2)
## tau (square root of estimated tau^2 value): 157.6
## I^2 (residual heterogeneity / unaccounted variability): 80.72%
## H^2 (unaccounted variability / sampling variability): 5.19
## R^2 (amount of heterogeneity accounted for): 24.69%
##
## Test for Residual Heterogeneity:
## QE(df = 89) = 370.3, p-val < .01
##
## Test of Moderators (coefficients 2:3):
## QM(df = 2) = 15.2, p-val < .01
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## intrcpt -242.6 115.9 -2.1 0.04 -469.8 -15.5 *
## factor(crop)maize 15.0 43.4 0.3 0.73 -70.1 100.2
## lime 351.6 90.3 3.9 <.01 174.6 528.7 ***
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Below is an example of a bubble plot showing the observed yield
differences of the individual trial sites plotted against a quantitative
moderator variable i.e., in this case the average lime application rate
yield effects that are grouped by season
, site
and crop
type in Rwanda. The size of the points is
proportional to the precision (inverse variance) weights that individual
studies received in the analysis, with larger points receiving more
weight. Based on the mixed-effects meta-regression models above, the
predicted average yield response as a function of the lime application
rate is also shown in the plot, with the corresponding 95% confidence
interval. Note the heterogeneity between trials that received between 1
- 1.5 (t ha-1 CCE) lime applications.
Main conclusion and takeaways
Based on the presented analyses, we currently do not recommend
investments in wide-spread aglime applications in the maize and bean
cropping systems of Rwanda. The data, though extensive, practical and
very useful, are insufficient to justify recommending that these types
of interventions should be applied at a national scale for either maize
or bean crops. More spatially targeted lime applications may prove
useful in the future, but we could not demonstrate a sufficiently large
yield effect-size based on the currently available data. If you are
going to do such presumeably expensive trials, you should try to think
them through to do them right.
The main takeaways from this notebook are the
following:
In the OAF trials presented, the application of lime showed only
small positive intervention effects on crop yields (~350 kg
ha-1 per metric ton ha-1 of lime added), which
were observed over 9 cropping seasons relative to their respective
comparator yields. The fairly low average lime application rates (see
Figure 2) in these trials may not have been sufficient to overcome the
exchangeable (reserve) acidity constraints nor to effectively alter soil
pH. However, testing for specific lime dose-response relationships would
require additional and more systematically collected intervention and
comparator data. See the additional guidance about soil spectral test
based liming recommendations at this
link. Just download the html file to display it in your
browser.
This study had several limitations. First, there was substantial
methodological heterogeneity among selected field trials in terms of the
lime intervention characteristics in terms of lime dosage, application
methods and followup times. Additional sources of heterogeneity included
differences in statistical heterogeneity among sites, which was high for
the crop yield outcomes for maize and beans. Therefore, we could only
assess the effectiveness of the lime application treatments as a
whole.
As is demonstrated, site-level varying effects are likely to
dominate production function estimates and inference. These warrant more
systematic investigations. The heterogeneity of the trials that were
reported here is high (I2 > 80%, e.g., see
model ry1
results and Figures 2 & 3), indicating that
additional subgroup and/or moderator analyses are/would be needed to
more consistently differentiate between both interventions and
comparators in the underlying trial designs. One initial way of
correcting for that uncertainty would be to include the georeference of
each individual on-farm trial. Amendments to and reanalyses of the data
that are presented here could subsequently be accomplished quickly. This
would also assist in developing spatially explicit models for lime
responses in the future.
The initial question: “What is the increase in crop yields
with aglime applications?”, can be answered using the approach and
workflow that was used in this notebook. Additional crops, lime
application treatments and georeferenced trials could be readily amended
to the PICO framework and the associated analyses that were presented
here.
The workflow runs fast (< 2 min) on a normal, 8 core, 16 GB
memory computer and could be further automated for the purpose of EDAs,
outlier screening, monitoring and data modeling as additional and/or new
trial data become available in Rwanda or elsewhere. We will post
additional material for calculating full Bayesian posterior
predictions in a future notebook. These take much longer to
run.
Addendum: Recommended field trial protocol adjustments
Should you be interested in more than unreplicated case studies,
ensure that individual experimental designs can actually be harmonized
for more efficient and reliable learning. This bears careful thinking
about the relevant PICO factors prior to initiating any additional
liming projects or trials. It is generally best to simulate the data
before initiating activities on the ground. The workflow presented here
generates the needed prior information for such a process.
Once those factors have been thought through, more efficient and
less costly agronomic response trial designs could be generated for
future deployment. Conducting field trials is a quite expensive
operational undertaking, where the quality of the recorded data is also
an important consideration. Sometimes less is more this context. We
encourage you to explore leaner, more optimized (e.g., d-optimized,
latin-hypercube, or other) trial designs than the ones that were used to
generate the data that were used here. This would save money, improve
data quality, and might also improve statistical inference and
learning.
Put all raw survey and experimental designs on KoBoToolbox. Just
go to KoBoToolbox, set up the
forms in Excel (or anything else) and deploy the forms for your surveys
and experiments using ODK … both
are free and easy to use! Then systematically georeference, time stamp,
photograph and monitor all individual field trials and/or survey
locations.
East Africa is certainly not Nebraska, South Africa or Brazil.
Polycropping or intercropping with local staple food
varieties occurs virtually everywhere in Rwanda’s smallholder production
cropping systems and elsewhere in East Africa (see: Mixed Farming Systems). Our
recommendation is to start conducting agricultural input trials in ways
that actually reflect those facts rather than on the basis of largely
unrealistic monocropping practices.
Repeat soil acidity (pH & exchangeable acidity) measurements
over time. You don’t really need spectrometers to do that, though it
might help … also with other related soil properties. Good pH probes sell
for ~50 U$ on Amazon.
Systematically document the crop varieties that are used in
trials. This would help in assessing additional genotype × environment
interactions (GE)
that plant breeders need to assess, recommend, and develop acid tolerant
crop varieties in different places.
Consistently specify lime provenance data. Aglime sources in
Rwanda (e.g., see Sirikare
et. al., 2012) and elsewhere are quite variable in terms of their
calcium carbonate equivalents (CCE), Ca/Mg ratios, fineness and
estimated extractable limestone source reserves (also see Nduwumuremyi et al.,
2013).
Put all of the data and analysis code on Github and/or OSF …
unless there are actual IRB
and/or privacy concerns. Make the data and the code truly FAIR so that others
can attempt to help more easily!
Any questions, comments or corrections related to this notebook are
most welcome via AFSIS. The
notebook is maintained on Github,
and you can fork and modify it from there as you see fit.