Big Data
The well-known three Vs of Big Data - Volume, Variety, and Velocity – are increasingly placing pressure on organizations that need to manage this data as well as extract value from this data deluge for Predictive Analytics and Decision-Making. Big Data technologies, services, and tools such as Hadoop, MapReduce, Hive and NoSQL/NewSQL databases and Data Integration techniques, In-Memory approaches, and Cloud technologies have emerged to help meet the challenges posed by the flood of Web, Social Media, Internet of Things (IoT) and machine-to-machine (M2M) data flowing into organizations.
Big Data Articles
Google's first "secret sauce" for web search was the innovative PageRank link analysis algorithm which successfully identifies the most relevant pages matching a search term. Google's superior search results were a huge factor in their early success. However, Google could never have achieved their current market dominance without an ability to reliably and quickly return those results. From the beginning, Google needed to handle volumes of data that exceeded the capabilities of existing commercial technologies. Instead, Google leveraged clusters of inexpensive commodity hardware, and created their own software frameworks to sift and index the data. Over time, these techniques evolved into the MapReduce algorithm. MapReduce allows data stored on a distributed file system - such as the Google File System (GFS) - to be processed in parallel by hundreds of thousands of inexpensive computers. Using MapReduce, Google is able to process more than a petabyte (one million GB) of new web data every hour.
Posted January 11, 2010
Google introduced the MapReduce algorithm to perform massively parallel processing of very large data sets using clusters of commodity hardware. MapReduce is a core Google technology and key to maintaining Google's website indexes.
Posted September 14, 2009