The “data revolution” is changing the world. There is a vast appetite for better understanding, and improving, ground-level realities across the globe. In the field of global development, there is growing demand for evidence-based policy and for rigorously measured, cost-effective solutions at scale. This demand has increased the need for high-quality, dynamic data across wide geographic areas, and for innovative ways of measuring economic activity and development outcomes.
One of the hallmarks of our work is measuring what we do - and using data to test whether what we are planning to do actually works. Our Beta programs are actively pressure-testing specific interventions to determine whether they work as intended and can scale to potentially millions of people. For these programs especially, we need reliable data on key indicators - collected cost-effectively, efficiently, and in a timely manner. This can be a tall order when locations are remote, connectivity is poor, and government data is outdated or nonexistent.
A New Partnership for Nimble, Scalable Data
Enter Premise, a data and analytics platform that measures economic, political, and social developments in close to real time through a global network of contributors. The company uses both discovery and revisit surveys in statistically representative samples to generate microdata on food and commodity prices, wages, and other indicators that we need all the time to make intelligent decisions about our interventions.
For a data-driven, evidence-based organization that seeks to maximize cost-effectiveness, access to reliable, localized, real-time data is one of the key challenges that can make or break our work.
When Do We Need Data?
Really, all the time. We rely on data throughout three key stages in our programs' lifecycle:
- Assessing an intervention’s feasibility and expected impact at scale;
- Developing operational models as projects scale up; and
- Monitoring service delivery as an intervention is operating at scale and in a steady state.
The type of data needed evolves throughout this process. In stage one, a crucial task is scoping the landscape and assessing whether the necessary conditions exist for the program to be scaled to new contexts. As the program expands to serve additional beneficiaries, we must collect data to verify that impact remains robust and to monitor unintended consequences. Once programs reach scale, there is a need for ongoing monitoring of service delivery. The data needed at scale is focused not on measuring impact but on assessing usage and ensuring quality program implementation in line with the tested design.
Each stage often requires real-time, hyper-local, and observable data to dynamically test hypotheses and monitor on-the-ground conditions. This data feeds our cutting-edge research, enables smarter programmatic decision-making, informs cost-effectiveness analysis, and improves accountability.
What Data Do We Need?
Here are some types of data that we regularly need:
- Mapping existing services or infrastructure, for example, up-to-date lists and locations of all health clinics in an area;
- Wage information for interventions that target incomes of the poor;
- Seasonal hunger patterns and food consumption of poor households;
- Service delivery monitoring, such as whether chlorine is delivered to dispensers at the water source or whether schools have deworming pills on deworming day.
There are a number of sources for these data, but each has significant drawbacks:
- Administrative data. This is data collected by governments or other institutions. It's often out of date and of uncertain quality and accuracy. It is almost never ‘real time,’ and it can be difficult to access.
- Data collected by teams of professional surveyors and monitors, either hired directly or through survey firms. This data tends to be expensive and slow to collect, requires a lot of staffing, and may not be as flexible as we need it to be.
- Citizen report cards and other crowdsourced information. This type of data is very difficult to quality-check, has spotty coverage, can be inherently biased, and is hard and expensive to maintain well over time.
That said, we may always need teams of trained enumerators to, for instance, measure anthropometric outcomes of a particular program on a sample of children under five. We have small Monitoring and Evaluation teams in both Eastern Africa and in India that are doing fabulous work. However, there is a gap in the data toolbox for a nimble, low-cost, dynamic, scalable, and real-time strategy - one that we are seeking to address with an Evidence Action/Premise partnership.
An Example: No Lean Season
Here is one example of a Beta project for which data is critical: No Lean Season. No Lean Season grows seasonal income for ultra-poor subsistence farmers during the annual ‘lean season,’ a time between planting and harvesting of crops when subsistence households experience reduced job opportunities, and concomitant hunger and famine.
No Lean Season is based on a rigorous evaluation in Bangladesh that found that poor households that were given cash for a bus ticket worth $8.50 traveled to where seasonal work was available. Acquiring a seasonal job outside of the village increased household consumption by 500 to 700 calories per person, per day during the lean season. Evidence Action is currently pressure-testing No Lean Season in Bangladesh with a view to scaling up there in 2016 and growing to other locations that experience seasonal famine in 2016/2017.
We need lots of data for No Lean Season that has been hard to come by cost-effectively and quickly:
- Wage data at top labor destination locations for the top three to five low-skill occupations, e.g. rickshaw pulling and brick making;
- Employment opportunities data prior to the intervention, such as expected demand for labor and current wages;
- Scoping data for new locations: the degree of existing migration (and, where it exists, where migrants go and when), the households impacted by seasonal hunger, and local labor market conditions.
A Partnership for Data
We are excited to be working with Premise’s adaptable survey approach and its elastic network of data collectors. Premise remotely recruits, vets, trains, and mobilizes flexible, distributed networks of contributors, which we will draw on for No Lean Season and a number of other projects in three countries. Our economists and data scientists on the Beta team also appreciate the kind of scalable operation and the smart, comprehensive quantitative sampling Premise delivers. We are also keenly interested in the data verification systems Premise deploys, which provide quality control via a mix of automated (outlier detection; machine learning; computer vision) and manual (directed sampling using oDesk) checks.
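To give a feel for what automated outlier detection on crowdsourced submissions can look like, here is a minimal sketch in Python. It is not Premise’s actual method, which we do not describe here. It assumes a simple rule based on the median absolute deviation (MAD), and the function name `flag_outliers`, the 3.5 threshold, and the wage figures are all made up for illustration.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return one boolean per value, True where the value looks like an outlier.

    Illustrative rule only: the modified z-score, which measures distance
    from the median in units of the median absolute deviation (MAD).
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # MAD is zero when more than half of the values equal the median;
        # in that case, flag anything that differs from the median.
        return [v != med for v in values]
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Hypothetical daily wage reports for one occupation in one location.
reports = [300, 310, 295, 305, 320, 3000, 290]
flags = flag_outliers(reports)
suspect = [r for r, is_outlier in zip(reports, flags) if is_outlier]
print(suspect)  # the 3000 report is flagged for review
```

A median-based rule is used in this sketch because a single extreme report barely moves the median, whereas it can badly distort a mean. Flagged reports would be candidates for a manual check rather than automatic deletion.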
That said, this is an experiment - one we will be writing about as it unfolds and as we learn more from working together. If you are interested in supporting this work, you can donate here (choose Beta as the program to benefit). The price tag for three countries / projects is a very reasonable $150K USD total, discounted by both our organizations to cover only direct costs of data collection.
Regardless, we will keep you updated on how we are faring with a new, disruptive way of getting real-time development data.
Photo credit: Premise