MapReduce for Business Intelligence and Analytics

Bookmark and Share

Google introduced the MapReduce algorithm to perform massively parallel processing of very large data sets using clusters of commodity hardware. MapReduce is a core Google technology and key to maintaining Google's website indexes.

The general MapReduce concept is simple: The "map" step partitions the data and distributes it to worker processes, which may run on remote hosts. The outputs of the parallelized computations are "reduced" into a merged result. MapReduce works well when performing the sort of aggregate operations typical in business intelligence-daily sales totals for instance-as well as operations such as web search that involve searches through massive data sets.

Read more here.

To stay on top of all the trends, subscribe to Database Trends and Applications magazine.