Big Data Notes
“Big data” represents a paradigm shift in the technologies and techniques for storing, analyzing and leveraging information assets. In this column, we track the progress of technologies such as Hadoop, NoSQL and data science and see how they are revolutionizing database management, business practice, and our everyday lives.
The pioneers of big data, such as Google, Amazon, and eBay, generated a "data exhaust" from their core operations that was more than sufficient to allow them to create data-driven process automation. But, for smaller enterprises, data might be the scarcest commodity. Hence, the emergence of data marketplaces.
Posted August 05, 2014
Big data analytics is a complex field, but if you understand the basic concepts—such as the difference between supervised and unsupervised learning—you are sure to be ahead of the person who wants to talk data science at your next cocktail party!
Posted June 11, 2014
About 3 years ago, the AMP (Algorithms, Machines, People) lab was established at U.C. Berkeley to attack the emerging challenges of advanced analytics and machine learning on big data. The resulting Berkeley Data Analytics Stack—particularly the Spark processing engine—has shown rapid uptake and tremendous promise.
Posted April 04, 2014
Solid State Disk (SSD)—particularly flash SSD—promised to revolutionize database performance by providing a storage media that was orders of magnitude faster than magnetic disk, offering the first significant improvement in disk I/O latency for decades. Aerospike is a NoSQL database that attempts to provide a database architecture that can fully exploit the I/O characteristics of flash SSD.
Posted February 10, 2014