“Relying on surveys is not a sustainable model anymore.”
This was the message of Chief Statistician of the United States Nancy Potok, and it was a running theme throughout the 2017 Association of Public Data Users (APDU) Conference, which wrapped up last week. (The actual theme was “Communicating Data”.) In her keynote, she talked about how surveys are getting more expensive, budgets are shrinking, respondents are getting less responsive, and data users want better data.
It wasn’t so much a gloomy assessment of the present (though some surely saw it that way) as a call to find innovative new ways of creating data.
Her speech, and other aspects of the conference, focused on using administrative data to improve and replace survey data. Administrative data is data originally collected for a different purpose, but which can be used for statistical purposes. An example of administrative data would be HUD data on home vacancies, which comes from the postal service. The postal service isn’t interested in vacancy data, but they are interested in keeping track of which houses they don’t need to deliver mail to. That information ends up being useful data. (A version of this data is available to PolicyMap users, from Valassis Lists.)
What’s Wrong with Survey Data?
The reason people are losing some faith in survey data is that congressional gridlock has left the Census Bureau budget at a substantially lower level than it’s been in previous decades leading up to the decennial census. The bureau is scrambling to find ways to cut costs without reducing the quality of its products, and one of those ways may be through the use of administrative data.
Two speakers gave very different views of the current state of Census Bureau funding. Lobbyist Howard Feinberg of Insights Association gave a very grim outlook, saying that unless Congress manages to pass a new budget (and not just another unchanged continuing resolution), we’re likely to see further cuts, possibly to the ACS 5-year estimates (that prediction prompted a lot of silent gasps). George Washington University professor Andrew Reamer pointed out that getting rid of ACS 5-year estimates would disrupt the regulatory process around hundreds of other federal government programs. All of which is to say, there is uncertainty around the future activities of the Census Bureau.
However, Lisa Blumerman, Associate Director of Decennial Census Programs at the Census Bureau, had a much more positive view of the situation, detailing how the preparations for the 2020 Census are on track. She made a special point to mention, on numerous occasions, the support the bureau has gotten from the new administration. As has been mentioned frequently, the new Secretary of Commerce, Wilbur Ross, worked at the Census Bureau during the 1960 Census.
Data from Commercial Firms
So all that said, what would an increased use of administrative data look like? The first panel discussion was on the role of commercial firms in public data, an interesting topic because private entities have TONS of data. An economist from Zillow, Aaron Terrazas, talked about their role in sharing data; as you might imagine, Zillow has some home sale data at their disposal, and it sounds like they’re willing to share much of it with researchers.
So how do we get data from private firms? Stefaan Verhulst of NYU’s GovLab talked about the need for partnerships and data collaboratives. Sometimes, researchers can offer insights back to the firms about their data that might be helpful. Sometimes they might be able to perform validation on the data that could directly benefit the firm. But sometimes, it just comes down to finding the one person at the company who has the time to get you the data you’re looking for (that they’re willing to share).
How can this help government data? Michael Dalton from the Bureau of Labor Statistics talked about how he was able to use data from CareerBuilder to help validate the BLS’s Job Openings and Labor Turnover Survey (JOLTS), which makes plenty of sense. One can imagine that aggregated data from a payroll provider could be invaluable. But if the company sees the value in the data, they may no longer make it publicly available. And one of the goals of using this data is to save money, not spend more.
Public Administrative Data
It seems that at every conference we attend, the Census Bureau has a session on their attempts to increase survey response rates (sometimes it’s as simple as changing the font on the envelope). In a slight twist on that, this conference had a session on the Census’s attempts to increase trust in the Census and alleviate concerns about privacy.
A central problem the Census faces is non-responses. Historically, if a household doesn’t respond to the Census, someone is sent to knock on their door. If no one answers, they might talk to a neighbor, and find out how many people live in the non-responder’s house. It turns out, some people find this creepy!
As an alternative, the Census could use administrative data, such as the aforementioned HUD/USPS vacancy data, or Social Security address data, to get information about non-responders. Is that less creepy? The Census did their own surveys (as you’d expect) and conducted some focus groups, and found that by and large, the public already expects that different government entities share data, and so people didn’t have a problem with this. Unfortunately, it also meant they might be less apt to share data with the Census, because they think it might get shared with, say, the Department of Homeland Security, or the IRS. (It doesn’t.) As usual, the specific wording often plays a big role in people’s attitudes towards what the Census does.
But it’s not always easy for different government agencies to share data. The Bureau of Economic Analysis wanted to use Census data to help their calculations of the GDP, and ran into all kinds of roadblocks. The Census Bureau and BEA are part of the same government agency (the Commerce Department). As of this year, they work in the same building. As Nancy Potok put it, “That’s at the far end of the ridiculous spectrum.”
Hearing all this talk about administrative data was interesting after my visit earlier this year to the ACS Data Users Conference. There, the talk was about how administrative data was a fad, and the real solutions were going to come through better use of survey data. It makes sense, though, that a conference devoted to survey data might be skeptical of other data. But it definitely drives home the sense that there are multiple perspectives on this.
One of the points made by various speakers is that the use of administrative data is in its infancy. With survey data, we have decades of standards and methodologies around margins of error, sample sizes, etc. Little of that exists with administrative data, so far. We’ll see at next year’s APDU conference how far along it’s coming.