What’s New in Public Data: Highlights from the 2016 APDU Conference

As a board member of the Association of Public Data Users (APDU), I am in the enviable position of getting to develop panel ideas for the annual conference and moderate them. It’s a chance to engage on issues crucial to my job (the latest in data products) and essential to my civic self (data journalism). Having attended the conference for the past few years, I know that the annual event is always a stimulating sequence of well-considered speakers, panels, and announcements, most of which are highly relevant to the work we do here at PolicyMap.

Last year’s panel that I created, on using publicly available data alongside proprietary, third-party data, was so thought-provoking and fun that I elected to chair two panels this year. One made a beeline for what I needed to know for my job: a survey of the latest data products available. The other was a topic that holds great interest for me as an avid consumer of the news media: the increasing reliance on data by journalists.

“Highlights of New Public Data Products” afforded the APDU community a chance to learn about three new or updated data products: the Census Bureau’s American Community Survey (ACS) redesign, the Centers for Medicare & Medicaid Services (CMS) Data Navigator, and the Longitudinal Employer-Household Dynamics (LEHD) Job-to-Job Flows.

With Jeff Sisson from the ACS, we learned about the redesign of the ACS and how the ACS team plans to adjust its table offerings based on consumer feedback. Jeff also talked about the development of new 1-year estimates for areas with smaller population thresholds than are currently covered (greater than 20,000 instead of greater than 65,000!). He also discussed 5-year variance replicate estimates for advanced users who want to generate margins of error (MOEs) that match an internally set threshold. Finally, Jeff discussed the eagerly awaited Center for Enterprise Dissemination Services and Consumer Innovation (CEDSCI) service that will replace the many current Census delivery mechanisms, including DataFerrett and American FactFinder. Jeff anticipates CEDSCI going live in late 2016 or early 2017, and we can’t wait to use it!

With Chris Powers from CMS, we got the inside scoop on data.cms.gov, a comprehensive source for datasets from CMS, CDC, NIH, FDA, and other agencies. The CMS dashboard on Medicare chronic conditions was of particular interest, given our relatively recent decision to include these data in PolicyMap’s Health tab under Health Status > Chronic Conditions > Medicare Population.

Finally, we heard from the dynamic Erika McEntarfer of LEHD. Erika talked about why the Census Bureau decided to develop the new Job-to-Job Flows data. She explained that because more than half of workers in the US left the labor market or moved to different industries after the housing boom, the LEHD team saw real value in learning where those workers went and what they’re doing now. Erika also noted general public interest in the topic, as evidenced by an article from FiveThirtyEight’s Ben Casselman about workers leaving their jobs to advance their careers.

Speaking of Ben Casselman, the second panel I chaired at the conference, “Data’s Essential Role in Effective Journalism,” covered the relatively new approach to journalism that hinges on the use of data. Casselman talked about the pitfalls of relying too heavily on data, as well as the necessity of using data to understand political rhetoric.

Sarah Cohen of the New York Times discussed the frustration of data embargoes imposed by public data providers, given the limited timeframe they leave for crafting a compelling story, and she talked more generally about her experiences running a team of data journalists at the Times. She explained how the use of data in journalism, rather than reliance on expert opinion, has taken reporters out of trenchcoats and midnight meetings and into the realm of data analysis and visualization.

D’Vera Cohn of Pew Research Center talked about the importance of presenting data “journalistically” and making data accessible to users and consumers by making it personal and interactive. She discussed the imperative to provide context and definitions, even when tweeting about data. Finally, she had some good advice for data providers on dealing effectively with journalists, such as preparing talking points of no more than 25 words to share with reporters who come calling. Data journalism seemed to have a unique appeal among APDU members: it interested data consumers and data providers alike.

Other highlights of the conference included hearing from Micah Altman of MIT Libraries and Cavan Capps about preserving privacy and security while optimizing data availability through decentralized data hubs. Robert Avery discussed the progress and timeline of the National Mortgage Database, and we learned of the project’s importance in helping to avoid a future housing crisis. We also got an update from Steve Pierson on the perilous funding outlook for government statistical programs in his Washington Briefing.

It was an honor to have a seat at the table with our distinguished panelists, and the conference provided a refreshing few days of taking our data expertise to the next level and engaging with those at the forefront of data production and analysis.