Has there ever been a more data-driven general election than the 2012 cycle? After the Obama campaign’s stunning win, story after story has emerged of its number-crunching ground game and an organization driven by the kind of nuanced, granular, what’s-in-your-driveway statistics once reserved for Fortune 50 marketing firms. It turns out that where you live and what you wear and watch say more about whom you’ll vote for than anything you actually think.
The depth to which modern campaigns have collected and used information is acknowledged but perhaps not really understood. Campaign Manager Jim Messina has talked publicly about enabling data-heads to develop models, applications and other means to deliver actionable information to operators and deliver voters to the polls. Mitt Romney, by contrast, failed to understand his own data, and the collapse of his campaign’s data application, Orca, contributed to the Republicans’ failure on Election Day.
But only the professionals really understand how much data has been involved in this last election cycle. The rest of us have been focused on Nate Silver and his fivethirtyeight blog for The New York Times, which for two general election cycles has been able to predict with eerie accuracy the lineup of the Electoral College.
By contrast, here’s Messina dismissing the publicly available data: “Most of the public polls you were seeing were completely ridiculous,” Messina told Politico. “A bunch of polling is broken in the country.” Remember, that’s the information in the public domain that allowed Nate Silver to predict the outcome of the Electoral College for all 50 states. Imagine what kind of data Messina had at his disposal.
Actually, you don’t have to imagine. Although Karl Rove hasn’t called a race accurately since 2006, he’s been pretty transparent about how he runs his races. In October 2006, as George W. Bush faced losing the House, Rove sparred with NPR’s Robert Siegel, who asserted that the polls were looking bad for the Republicans. Rove contested that. “I’m looking at the same polls you’re looking at,” Siegel insisted.
“No, you are not,” Rove shot back. “I’m looking at 68 polls a week for candidates for the U.S. House and U.S. Senate and Governor and you may be looking at four-five public polls a week that talk attitudes nationally.” That’s nearly 70 polls a week, nearly 10 polls a day – during the midterms – which is, by Rove’s own count, more than ten times the four or five public polls available to NPR.
We’re talking about an entirely different level of data, privately collected and held not just by the presidential campaigns but by the party campaign committees, statewide campaigns for Senate, district-level races for Congress, and state-level races for assembly, and so on. I’ve even heard of data-assembly for races at the county level. All of that information is proprietary, and the public will never see it.
Why not? Why shouldn’t that information be available, just like the polling published by Gallup, Pew, Rasmussen, Quinnipiac, and the news organizations, and aggregated by such sites as fivethirtyeight and Real Clear Politics? The proprietary information is incredibly dense and rich, interesting and valuable. It would tell us not only how the campaigns are run and what they collect on the voters, but also who we are and how we are changing. A lot has been written about how American demographics are tipping dramatically against the Republican Party. Public exit polling and the U.S. Census Bureau only tell us so much.
The answer to the question is mostly money. Collecting information is expensive and data is valuable, and you don’t give it away for free. And information like that is really only valuable when it’s collected consistently over time. Data collected properly since before the 2006 race has a shelf life. Public opinion surveys are the Twinkies of quantitative sociology.
But the campaigns don’t collect this data. The companies they hire to construct the polls, put them in the field, and deliver the reports do that. To release the information, we’d have to convince or pay them to make the information public. Or we could insist that the publicly funded presidential campaigns – the most recent of which was John McCain’s 2008 race – make the information a kind of public property.
At the very least, the public, historians and political scientists need to know that this data exists – in vastly larger quantities than they are likely aware – and will not simply vanish with each election cycle. In the end that information is about the American people and collected for the purposes of electing their leaders. It should be made available to them.