
Rooting policymaking in evidence and data is crucial to effective governance and decisionmaking. Policymakers need to know whether the qualifications set for public benefit programs are reaching the families they intend to serve, whether rental assistance programs are preventing evictions and homelessness, and whether their communities are providing the conditions for upward mobility. Evidence-based policymaking, in turn, requires accurate, timely information. When policymakers root their choices in evidence, government action has greater impact.
Evidence is crucial in guiding public policy decisions. Without accurate evidence, decisionmakers are left to rely on assumptions and heuristics, and policy built on assumptions increases the risk of ineffective or even harmful outcomes in place of the intended goal.
Motivated by this, the Urban Institute’s Upward Mobility Initiative (UMI) adopted six best practices to ensure the accurate construction of more than two dozen Mobility Metrics across 10 years for more than 3,500 counties and cities. These practices should serve as a model for data practitioners to strengthen evidence creation and provide accurate, timely information.
The six practices
- Transparency. Sunlight is the best disinfectant, but the calculations behind too many analyses never see the light of day. All code, and all data when permitted, for calculating the Mobility Metrics are publicly available on GitHub for anyone to review or use. Transparency builds trust and gives others the opportunity to find mistakes or make judgments about the assumptions behind calculations. How important is this? Mistakes in a high-profile analysis of debt to gross domestic product ratios were discovered only after a graduate student at the University of Massachusetts Amherst spent months waiting for the authors to share their erroneous spreadsheet.
- Code-first analysis. Baking cookies using a recipe is much more difficult if half of the steps are missing from the recipe. Point-and-click tools like Microsoft Excel hide errors in opaque cell references and easily corrupt data. Analytic code that captures every step of an analysis is tougher to learn but can be used to create a complete recipe that documents the steps from start to finish. All Mobility Metrics are constructed using reproducible R and Stata code. Collaborators on the project were required to submit programs that included all the necessary steps for creating the metrics, from import of the raw data (when possible) to export of the final metric.
- Zero-trust code reviews. Most people trust their coworkers, but sometimes it is useful to pretend otherwise. All Mobility Metrics go through zero-trust code reviews, where the burden of proving the accuracy of all calculations falls on the analyst, and the reviewer is tasked with recreating every step the analyst used to calculate the metric. Reviewers also serve as an outsider's eyes, flagging steps or instructions that may be clear to the programmer but are described vaguely.
- Test-driven development. Modern manufacturing often includes a proactive quality assurance plan with predefined tests that must be passed before a product can leave the factory. Data analysis also benefits from proactive tests that are defined before the analysis is built. This process is called test-driven development. The Mobility Metrics pass predefined tests that define the possible range of values (e.g., income can't be negative), the geographies that should be represented, and logical conditions that should never be violated, like the upper bound of a confidence interval falling below the estimated value.
- Benchmarking. Most pieces of evidence fit into a web of related evidence rather than standing on their own, so it's important to identify, in advance, published calculations that are similar to our own. For example, the National Low Income Housing Coalition (NLIHC) produces a report that includes the number of affordable and available rental units for a selection of metropolitan statistical areas (MSAs), a statistic very similar to the housing affordability metric calculated for the UMI data dashboard. To benchmark our results, we identified counties whose boundaries closely match their surrounding MSA and compared the UMI data with those from the NLIHC. The comparison confirmed that our results show similar patterns in housing affordability and served as a useful check on our calculation.
- Quantifying uncertainty. All quantitative evidence contains uncertainty, and the level of uncertainty varies with the sources behind the evidence. Survey data from sources like the American Community Survey are uncertain because a sample is used to represent a population, and a growing number of households don't respond. Administrative data, even from gold standard sources like unemployment insurance taxes and credit bureau records, don't always align with the population of interest. Most of the Mobility Metrics include data quality indicators that quantify uncertainty using predefined tests and report it on a three-point scale from most certain to least certain.
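To make the test-driven development practice above concrete, here is a minimal sketch of what predefined tests might look like. The actual Mobility Metrics tests are written in R and Stata; this illustration uses Python, and the column names (`county`, `median_income`, `ci_upper`) are hypothetical.

```python
def validate_metric(rows, expected_geographies):
    """Run predefined checks on metric rows and return a list of failures.

    Checks three kinds of conditions defined before the analysis is built:
    value ranges, geographic coverage, and logical consistency.
    """
    failures = []
    seen = set()
    for row in rows:
        seen.add(row["county"])
        # Range test: income can never be negative.
        if row["median_income"] < 0:
            failures.append(f"{row['county']}: negative income {row['median_income']}")
        # Logical test: a confidence interval's upper bound must never
        # fall below the estimate itself.
        if row["ci_upper"] < row["median_income"]:
            failures.append(f"{row['county']}: CI upper bound below estimate")
    # Coverage test: every expected geography must be represented.
    for geo in expected_geographies - seen:
        failures.append(f"missing geography: {geo}")
    return failures
```

Because the tests exist before the metric is calculated, a failing check blocks release rather than being discovered after publication, mirroring the factory quality gate described above.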
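The benchmarking practice above amounts to comparing our estimates against similar published figures and flagging large gaps. A minimal sketch, assuming a simple relative-difference rule (the 15 percent tolerance is an illustrative assumption, not the project's actual threshold):

```python
def benchmark(our_values, published_values, tolerance=0.15):
    """Compare our estimates with published figures for the same geographies.

    Returns a dict mapping each shared geography to True if the relative
    difference is within tolerance, False otherwise. Geographies without
    a published comparison point are skipped.
    """
    results = {}
    for geo, ours in our_values.items():
        if geo not in published_values:
            continue
        theirs = published_values[geo]
        relative_diff = abs(ours - theirs) / theirs
        results[geo] = relative_diff <= tolerance
    return results
```

A flagged geography doesn't automatically mean our calculation is wrong, but it tells us where to look first, which is exactly how the NLIHC comparison was used as a check.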
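Finally, the three-point quality indicator described above can be sketched as a function of how large a margin of error is relative to its estimate. The coefficient-of-variation cutoffs below are hypothetical, chosen only to illustrate the idea; they are not the Mobility Metrics' actual rules.

```python
def quality_flag(estimate, margin_of_error):
    """Map an estimate's margin of error onto a three-point quality scale:
    1 = most certain, 3 = least certain.

    Uses the coefficient of variation (MOE relative to the estimate);
    the 0.1 and 0.3 cutoffs are illustrative assumptions.
    """
    if estimate == 0:
        return 3  # can't assess relative uncertainty; treat as least certain
    cv = abs(margin_of_error / estimate)
    if cv < 0.1:
        return 1
    if cv < 0.3:
        return 2
    return 3
```

Publishing a flag like this alongside each metric lets dashboard users see at a glance which numbers they can lean on heavily and which deserve caution.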
Evidence is invaluable to decisionmaking
Adopting these best practices requires time, expertise, and resources, especially at first. But the benefits pay for themselves in the long run by increasing the availability of timely, accurate evidence. Accurate data not only improve the value of analysis, but in the policy world, they also reduce the risk of misguided and even harmful decisionmaking that affects real people. We encourage data practitioners to adopt these practices while producing new evidence. Looking through the Upward Mobility GitHub is a good starting point for building a trustworthy, effective evidence-building process.