Have you ever wondered why Montana is the happiest state? Or why Mitt Romney is the front-runner? Questions like these heavily influence our economy, our politics, and even everyday life, but the data that holds the answers is scattered far and wide. That is now changing thanks to the Fusion Project, which is combing the vast data stores of government and research institutions alike to create one massively powerful data set.
The Fusion Project is making the promises of big data a reality, and it is possible only because of the explosion of available data. Back in 2010 Eric Schmidt, then CEO of Google, said that “every two days we now create as much information as we did from the dawn of civilization up until 2003”. The volume of data has grown even faster since then, with the proliferation of smartphones, tablet computers, and other products producing new forms of data, but no one has yet taken full advantage of this overwhelming data source.
“I was surprised by the sheer volume of data available from governments and public institutions,” said Jason Kolb, a senior data scientist at Applied Data Labs. “But there is very little value being extracted from it due to its fragmented nature and our inability to analyze it all as a whole.” When public data is properly fused with private data, Kolb said, companies can uncover insights, ideas, and opportunities that were previously hidden. For example, they can extend customer data with demographic and income information from the US Census, or with quality-of-life data from Pew Research. This unlocks much richer sets of information for use in customer service, marketing, and many other initiatives.
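As a rough illustration of the kind of enrichment Kolb describes, the sketch below attaches public demographic fields to private customer records by matching on ZIP code. It is not the Fusion Project's method. The join key, the field names (`median_income`, `median_age`), the `enrich_customers` helper, and every value are hypothetical placeholders chosen for the example, not real Census or Pew figures.

```python
from typing import Any, Dict, List

# Hypothetical public data keyed by ZIP code. These values are placeholders
# for illustration only; they are not real Census or Pew figures.
PUBLIC_BY_ZIP: Dict[str, Dict[str, float]] = {
    "00001": {"median_income": 50000.0, "median_age": 40.0},
    "00002": {"median_income": 65000.0, "median_age": 35.0},
}


def enrich_customers(
    customers: List[Dict[str, Any]],
    public_by_zip: Dict[str, Dict[str, float]],
) -> List[Dict[str, Any]]:
    """Left join: keep every customer and add public fields where the ZIP matches."""
    enriched = []
    for customer in customers:
        # An unmatched ZIP yields an empty dict, so the added fields become None.
        public = public_by_zip.get(customer["zip"], {})
        row = dict(customer)  # copy so the private record is left untouched
        row["median_income"] = public.get("median_income")
        row["median_age"] = public.get("median_age")
        enriched.append(row)
    return enriched


# Hypothetical private customer records.
customers = [
    {"customer_id": 1, "zip": "00001"},
    {"customer_id": 2, "zip": "99999"},  # no matching public record
]

enriched = enrich_customers(customers, PUBLIC_BY_ZIP)
for row in enriched:
    print(row)
```

Customers without a matching public record are kept with empty fields rather than dropped, so the enriched set always covers the full customer list.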
To address this need, Applied Data Labs is launching the Fusion Project, a public project that uses techniques and technology developed internally at Applied Data Labs to combine multiple public data sets into a single, value-packed data set for public consumption. Using distributed data analysis technology that employs advanced statistical analysis and data integration, the Fusion Project is able to join previously siloed data sets, unlocking insights and ideas that were previously unavailable. The project’s ultimate goal is to unlock the synergy latent in publicly available data sets and put them to work in analytics environments.
“The Fusion Project uses several new and emerging technologies to work its magic,” said Kolb. “It’s one of the first projects to use Semantic Web technology in a meaningful way, and we’ve developed several unique ways to combine statistical research and analysis to stitch data sets together.” Applied Data Labs researches and develops experimental and theoretical analytic technologies internally and then consumerizes them in various ways. The Fusion Project is one of the first publicly available incarnations of this process.