Big data is changing how we view the world, and necessitating new ways of handling data as well, according to Data Summit 2016 keynotes presented on Wednesday by Kalev Hannes Leetaru and George John.
Data mining expert Kalev Hannes Leetaru is a Forbes columnist and founder of the GDELT Project (http://gdeltproject.org). Supported by Google Ideas, the GDELT Project monitors the world's broadcast, print, and web news from nearly every corner of every country in over 100 languages and identifies the people, locations, organizations, counts, themes, sources, emotions, quotes, and events driving our global society.
George John is a solutions architect at Amazon Web Services.
In his keynote presentation on Day 2 of Data Summit, Leetaru looked at how large datasets and computing platforms allow us to reimagine our world. He has examined news as emotion, the geography of social media, cultural computing, and cities as geographic networks. Big data, the algorithms that control data discovery, and the data-driven economy are changing how we interact with information, he said. The GDELT Project collects data from local news coverage worldwide in 100 languages (65 live-translated), including online news preserved via the Internet Archive, ensuring that there is a permanent record of coverage that might otherwise disappear quickly. In addition, the project collects data from television, academic literature, books, human rights reports, and imagery.
According to Leetaru, the goal is to take the incredible volumes of data that we have access to and fuse them together for a global view while breaking down the linguistic barriers that block understanding.
Beyond extracting and recording factual events and their media coverage in real time, the project also creates a narrative of themes imbued with the emotions, dreams, and fears that help put events in perspective. We have more data available now than at any point in human history, but analysis has not grown with it, he observed. The size of the datasets that people tend to analyze has not really grown. Big datasets are whittled down, which means that important details and characteristics visible only at a global scale can be missed.
The cloud presents an important capability for handling large datasets, he said. “The power of the cloud is that we can analyze much larger datasets,” said Leetaru. The cloud has changed the equation, allowing us to take petabytes of data and analyze it, whereas 15 years ago this kind of analysis would not have been possible without highly skilled data experts. “Cloud as a whole is really breaking down barriers,” he noted.
Presenting a sponsored keynote, George John, solutions architect, Amazon Web Services, covered the benefits of managed database services in allowing organizations to focus on using their data rather than on maintaining high availability and infrastructure. Amazon offers a managed service for each database type: Amazon RDS for SQL database engines; Amazon DynamoDB for document and key-value storage; Amazon ElastiCache for in-memory key-value data; and the Amazon Redshift data warehouse.
Slides from these Data Summit 2016 presentations are available at https://www.dbta.com/DataSummit/2016/Presentations.aspx.
Data Summit is an annual 2-day conference, preceded by a day of workshops. Data Summit offers a comprehensive educational experience designed to guide you through all of the key issues in data management and analysis today. The event brings together IT managers, data architects, application developers, data analysts, project managers, and business managers for an intense immersion into the key technologies and strategies for becoming a data-informed business.
Many additional presentations from Data Summit 2016 have also been made available for review at www.dbta.com/DataSummit/2016/Presentations.aspx.