A Reference Guide to Stream Processing

The goal of streaming systems is to process high-volume data and deliver useful insights before the data is saved to long-term storage. The traditional approach to processing data at scale is batching, which assumes that all of the data is available in the system of record before processing starts. In the case of failures, the whole job can simply be restarted.

While simple and robust, the batching approach clearly introduces a large latency between gathering the data and being ready to act on it. Stream processing aims to overcome this latency: it processes live, raw data immediately as it arrives, while meeting the challenges of incremental processing, scalability, and fault tolerance.
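
To make the contrast concrete, here is a minimal, self-contained Java sketch (the class names and event format are illustrative assumptions, not taken from any particular framework): a batch aggregation that requires the complete dataset up front, next to a streaming counter that updates its result incrementally as each event arrives.

```java
import java.util.List;

// Illustrative sketch only: contrasting batch aggregation with
// incremental, stream-style aggregation over the same events.
public class BatchVsStream {

    // Batch style: all events must be available before processing starts.
    static long batchErrorCount(List<String> allEvents) {
        return allEvents.stream().filter(e -> e.contains("ERROR")).count();
    }

    // Stream style: the aggregate is updated per event, so a current
    // result is available at any time without the complete dataset.
    static class StreamingErrorCounter {
        private long errorCount = 0;

        void onEvent(String event) {
            if (event.contains("ERROR")) {
                errorCount++;
            }
        }

        long currentCount() {
            return errorCount;
        }
    }

    public static void main(String[] args) {
        List<String> events = List.of("INFO start", "ERROR disk full", "INFO done");

        System.out.println("batch result: " + batchErrorCount(events));

        StreamingErrorCounter counter = new StreamingErrorCounter();
        for (String event : events) {        // stands in for a live feed
            counter.onEvent(event);
            System.out.println("running result: " + counter.currentCount());
        }
    }
}
```

The streaming variant never waits for the input to be complete, which is exactly where the latency advantage over batching comes from; the price is that state (here, the running count) must be maintained, scaled, and recovered on failure.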
