It has long been the case that “open source” plus “Federal government” equals oxymoron, and for good reason. The US government might run on numbers, but those numbers — and the algorithms and models used to analyze them — are a mystery to those of us who work outside the monolithic agencies of Washington, D.C.
Federal agencies are required by law to justify their proposals before codifying them into active laws. And for good reason: changes in tax policy, for example, can affect the entire economy. However, “justification” typically comes in the form of estimates based on numbers crunched deep within the sunless bowels of places such as the Office of Tax Analysis. There, nameless bureaucrats calculate outcomes using undisclosed statistical models based on assumptions that the rest of us, including the lawmakers writing these laws, do not get to see.
Thanks to the Open Source Policy Center, however, the citizens of this great nation may now get a glimpse into the data, and the methods used to derive and analyze that data, that drive public policy and the creation of new laws. In short, this DC-based nonprofit organization seeks to let a little open source sunshine into the black box of government data modeling. Part of the American Enterprise Institute, the OSPC launched in April 2016 with the mission of making public policy analysis transparent, or at least a bit more accessible.
The OSPC calls itself “a laboratory for predicting the effects of public policy,” meaning the government programs and laws of the land that rule us all. This can often feel like a one-way street (they decide, we abide), but projects like OSPC’s TaxBrain are intended to put some of the power in the hands of the people, or at least the number nerds.
TaxBrain uses open source economic simulation models to estimate the effects of proposed changes to federal tax policies. Users can churn their chosen numbers through the (also open source) web application, or download the code to run locally.
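To make the idea of simulating a tax change concrete, here is a minimal Python sketch of the general technique: apply a baseline and a reform bracket schedule to a weighted sample of filers and compare total revenue. It is not TaxBrain's code or API. The bracket schedules, the sample filers, and every name in it (`Filer`, `tax_owed`, `revenue_change`) are invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical bracket schedules: (lower threshold, marginal rate).
# The numbers are invented for illustration and are not actual tax law.
BASELINE = [(0, 0.10), (10_000, 0.20), (50_000, 0.30)]
REFORM = [(0, 0.10), (10_000, 0.15), (50_000, 0.35)]


@dataclass
class Filer:
    income: float  # taxable income in dollars
    weight: float  # number of real filers this sample record stands for


def tax_owed(income, brackets):
    """Apply a progressive bracket schedule to a single income."""
    owed = 0.0
    for i, (lower, rate) in enumerate(brackets):
        # The top bracket has no upper bound.
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if income > lower:
            owed += (min(income, upper) - lower) * rate
    return owed


def total_revenue(filers, brackets):
    """Weighted sum of tax owed across all sample records."""
    return sum(f.weight * tax_owed(f.income, brackets) for f in filers)


def revenue_change(filers, baseline, reform):
    """Revenue under the reform minus revenue under the baseline."""
    return total_revenue(filers, reform) - total_revenue(filers, baseline)


# Hypothetical three-record sample standing in for a population of filers.
SAMPLE = [Filer(8_000, 100), Filer(30_000, 50), Filer(120_000, 10)]

if __name__ == "__main__":
    print(revenue_change(SAMPLE, BASELINE, REFORM))
```

In this toy sample the reform lowers total revenue, because the middle-bracket cut outweighs the top-bracket increase. When both the schedules and the sample are published, anyone can rerun the comparison and check that claim.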
All projects that the OSPC incubates are open source, with complete source code available on GitHub. These include TaxBrain, the first and currently most widely adopted OSPC offering, and the Cost of Capital Calculator, both available as supported downloads from the OSPC website. A dozen or so more initiatives are in varying stages of completion and available for inspection at the organization’s code repository.
Black Box Economics
Matt Jensen is the founder and managing director of the Open Source Policy Center, and he recently spoke about his group’s initiatives at AnacondaCon 2017, held earlier this month in Austin. According to Jensen, most major policy decisions currently made in the United States are based on proprietary data analysis and algorithms — processes held in great secrecy by their respective federal agencies. His AnacondaCon presentation focused on how TaxBrain leverages Anaconda to make tax policy analyses open and available to anyone, anywhere.
An extremely compelling use case for TaxBrain, according to Jensen — who initially spent years working as a tax policy analyst himself — is the ability to access modeling and impact estimates as part of the process of creating new policies, rather than after.
The current system, he explained, requires policymakers to craft their reforms first and then send them for outside testing and analysis. Legislators come up with ideas, consult with attorneys to write them up as a formal bill, and then take it to whatever entity is in charge of estimating the potential effects and outcomes. Proposed tax legislation goes, for example, to the Joint Committee on Taxation, which runs the proposed policy through its proprietary modeling resources and eventually — which can be as much as a year later — returns an analysis.
This scenario has many problems. One is that the Joint Committee on Taxation is, by design, a narrow pipeline: with exactly one modeling resource available, the agency can analyze only one bill at a time. A much more significant issue is that, however fast or slow it goes, the entire process takes place inside a “black box.”
“You don’t get the methodology, you don’t get the list of assumptions they used, and you definitely don’t get the code,” said Jensen. “This is what our policymakers have to rely upon when they’re making tax policy.”
One of the major benefits of open sourcing tax policy data modeling, according to Jensen, lies in finding bugs. Lots of bugs.
“Being open source naturally helps us find the bugs in our system because there are so many more eyeballs on a project. And finding them is crucial because if we don’t, then we are going to make policy decisions based on flawed analyses that affect millions and millions of people,” he said. Meanwhile, he added, “I know there are bugs in the proprietary [government] modeling outfits that don’t get found. Because nobody’s looking for them.” And, apparently, no one is interested in starting.
The federal models are closed source, according to Jensen, simply because that is how they started out. “These shops in the 1970s, they didn’t know what it meant to do open source modeling, and they don’t want to start now because, I think, there are bugs in their code and it’s messy and one main guy has been working on it for a long long time and he doesn’t want to expose it to everyone,” he explained.
Furthermore, Jensen said, many federal agency shops are using proprietary languages and seem to think that because they’re using, say SAS, they can’t make their code open source: “There’s a lot of misinformation.”
Jensen was not intending to slag his GS colleagues with this statement. The government often employs good modelers, he said. However, he added, “There is a universe of cutting-edge computer scientists and economists at universities across the country who could be lending their expertise and resources to help make our government better.”
Evidently, D.C. insiders feel the same way. The Obama administration’s tax policy staff used OSPC tools while generating new legislation and, Jensen said, the current administration’s tax staff has been using them as well.
The upshot: “Regardless of ideology, we want our government to be transparent, to produce reproducible results, to be accountable. After all, they work for us,” said Jensen.
If transparency is not going to come from within, the best way to get it will be to shine a light from the outside. We want our policy makers to ask better questions, based on the availability of the freshest and most accurate data, to produce the best and most relevant regulations. Those are all the things that open source data science allows. We would be able to produce better policy if they were baked into the federal system of analysis.
This does not mean that Jensen’s Open Source Policy Center is vying to become Washington’s next one-stop number-crunching shop. The OSPC, said Jensen, merely wishes to put worthy tools into whichever non-governmental hands would wield them most effectively.
“Our impact lies in incubating the development of these tools, and leaving the research to the empirical economists and statisticians — or the average citizen.”
Feature image: Freebirds World Burrito, Austin, Texas.