Starting with the Big Mac index
"While we take care to identify our sources, we have not often published the data behind them. Sometimes, this is for good reason: some data are proprietary or otherwise not ours to publish. Often, we have simply not made the time to do it. This is a shame: releasing data can give our readers extra confidence in our work, and allows researchers and other journalists to check — and to build upon — our work. So we’re looking to change this, and publish more of our data on GitHub.
Peeling back the curtain: How the Economist is opening the data behind our reporting (The Economist)
Why now?
Years ago, “data” generally meant a table in Excel, or possibly even a line or bar chart to trace in a graphics program. Today, data often take the form of large CSV files, and we frequently do analysis, transformation, and plotting in R or Python to produce our stories. We assemble more data ourselves, by compiling publicly available datasets or scraping data from websites, than we used to. We are also making more use of statistical modelling. All this means we have a lot more data that we can share — and a lot more data worth sharing."