some examples of streaming data (3)
what is stream processing?
a processing mode in which individual records or small sets of records are processed continuously, each producing a simple response
can streaming data be processed by batch processing?
yes
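One common way to do this is micro-batching: cut the stream into small finite chunks and run an ordinary batch computation on each chunk. The sketch below is only an illustration; the helper name `micro_batches`, the sample readings and the batch size are all assumptions, not part of these notes.

```python
from itertools import islice

def micro_batches(stream, batch_size):
    """Group an iterator of records into lists of at most batch_size records."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # the source is exhausted (a truly unbounded source never gets here)
        yield batch

# Hypothetical sensor readings standing in for a stream.
readings = [3, 5, 2, 8, 1, 7, 4]

# A plain batch operation (sum) applied to each micro-batch in turn.
batch_sums = [sum(batch) for batch in micro_batches(readings, 3)]
```

Each batch is bounded, so any batch tool can handle it; the trade-off is that results are delayed until a batch fills up.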
what is bounded data?
datasets that are finite in size
what is unbounded data?
datasets that are (at least theoretically) infinite in size, where new data can arrive and become available at any point in time
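A small sketch of the difference, assuming a Python list as the bounded dataset and a never-ending generator as the unbounded one (the name `unbounded_source` is made up for illustration):

```python
from itertools import count, islice

# Bounded: finite, so a full pass and a final answer are possible.
bounded = [1, 2, 3, 4]
total = sum(bounded)

def unbounded_source():
    """Hypothetical source that never ends; each value stands in for a new record."""
    for n in count(1):
        yield n

# sum(unbounded_source()) would never return. All we can do is keep a
# running result and report it as records arrive (here: the first 5 only).
running = 0
running_totals = []
for value in islice(unbounded_source(), 5):
    running += value
    running_totals.append(running)
```

With unbounded data there is no "final" total, only the latest running value.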
What are streaming systems designed with in mind?
Unbounded data
What is a data surge?
a sudden and significant increase in the volume of data flowing through a streaming data processing system.
For real-time systems, why is failing to produce a processing result within a time window as bad as not producing a result at all?
The events may become “insignificant” and the insights or trends produced may no longer be valid or accurate
Examples of streaming data (4)
Frameworks for the ingestion of unbounded data (7)
What are streams?
sequences of immutable records that arrive at some point in time
Other phrases for streams (3)
What type of dataset are streams?
datasets in motion
What type of dataset are tables?
datasets at rest
what are the components of processing elements (PE)? (3)
why are input queues needed? (3)
What is a spout in Apache Storm?
Elements that generate streams from external sources
What is a stream in Apache Storm?
An unbounded sequence of tuples
What is a bolt in Apache Storm?
A processing element that consumes and generates streams
What is a topology in Apache Storm?
a graph of spouts and bolts connected by streams
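To see how the four Storm terms above fit together, here is a toy word count written with plain Python generators. This is NOT the Apache Storm API; the function names and the sample sentences are assumptions chosen only to mirror the spout / stream / bolt / topology vocabulary.

```python
def sentence_spout():
    """Spout: turns an external source (here a hard-coded list) into a stream of tuples."""
    for sentence in ["the cat sat", "the dog sat"]:
        yield (sentence,)

def split_bolt(stream):
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)

def count_bolt(stream):
    """Bolt: consumes word tuples and emits (word, running_count) tuples."""
    counts = {}
    for (word,) in stream:
        counts[word] = counts.get(word, 0) + 1
        yield (word, counts[word])

# Topology: spout -> split bolt -> count bolt, wired together by streams.
topology = count_bolt(split_bolt(sentence_spout()))

# Keep the latest count seen for each word.
final_counts = dict(topology)
```

In real Storm the spouts and bolts run in parallel across a cluster; the chained generators here only show the shape of the data flow.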
what does the order of tuples in a stream represent?
the time at which they arrive at the streaming system
what does the term at-least-once processing mean?
a guarantee that each message or event will be processed at least once, and possibly more than once, so that no data is lost if a failure occurs
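A minimal sketch of how redelivery gives at-least-once behaviour, assuming a handler that returns True to acknowledge a message. The names `deliver_at_least_once`, `flaky_handler` and `lost_acks` are hypothetical, and the lost acknowledgement is simulated.

```python
def deliver_at_least_once(messages, handler, max_attempts=3):
    """Redeliver each message until the handler acknowledges it (returns True)."""
    for message in messages:
        for _attempt in range(max_attempts):
            if handler(message):
                break  # acknowledged, move on to the next message
        else:
            raise RuntimeError(f"gave up on {message!r}")

processed = []
lost_acks = {"b"}  # simulated failure: the first ack for "b" is lost

def flaky_handler(message):
    processed.append(message)       # the work is done...
    if message in lost_acks:
        lost_acks.discard(message)  # ...but the ack is lost once
        return False
    return True

deliver_at_least_once(["a", "b", "c"], flaky_handler)
# "b" ends up processed twice: nothing is lost, but duplicates are possible.
```

Because of the duplicates, handlers under at-least-once delivery are usually written so that processing the same message twice is harmless.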
what is event time?
the time at which an event was generated by its source
what is processing time?
the time at which events are seen by the stream processing system
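The two clocks can disagree, for example when an event is delayed in transit. A small sketch with made-up timestamps (the `Event` class and all values are assumptions for illustration):

```python
from dataclasses import dataclass

@dataclass
class Event:
    name: str
    event_time: float       # when the source generated the event
    processing_time: float  # when the stream processor saw it

# Listed in the order the processor saw them.
arrivals = [
    Event("a", event_time=10.0, processing_time=10.2),
    Event("c", event_time=12.0, processing_time=12.1),
    Event("b", event_time=11.0, processing_time=14.5),  # delayed in transit
]

arrival_order = [e.name for e in arrivals]
event_order = [e.name for e in sorted(arrivals, key=lambda e: e.event_time)]

# Lag between the two clocks for each event, rounded to one decimal place.
lags = {e.name: round(e.processing_time - e.event_time, 1) for e in arrivals}
```

Here "b" was generated before "c" but seen after it, so ordering by processing time and ordering by event time give different answers.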