BDP V’s?
Volume, Variety, Velocity
ETL?
Extract, Transform, Load
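The three ETL stages can be sketched as three small functions. This is a minimal illustration only: the inline CSV text, the `payments` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for a real target store.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read from a file or an API.
RAW_CSV = "name,amount\nalice, 10\nbob, 32\n"

def extract(raw):
    """Extract: parse the raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows):
    """Transform: normalise names and convert amounts to integers."""
    return [(row["name"].strip().title(), int(row["amount"])) for row in rows]

def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()

def run_etl(raw):
    """Run the three stages in order and return the database connection."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(raw)), conn)
    return conn
```

Calling `run_etl(RAW_CSV)` returns a connection whose `payments` table holds the cleaned rows.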
Types of processing?
Stream and Batch
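One way to picture the difference, as a rough sketch with hypothetical function names: batch processing sees the complete dataset and produces one result, while stream processing handles each record as it arrives and keeps an up-to-date result.

```python
def batch_average(readings):
    """Batch: the whole dataset is available, so compute one result at the end."""
    return sum(readings) / len(readings)

def stream_average(readings):
    """Stream: consume readings one at a time and emit a running average."""
    total = 0.0
    count = 0
    for value in readings:
        total += value
        count += 1
        yield total / count  # result is updated after every new record
```

The final value yielded by `stream_average` equals the `batch_average` of the same readings, but the stream version never needs the full dataset in memory.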
What is Data Parallelism?
You split the data into chunks and apply the same algorithm to every chunk in parallel.
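A minimal sketch of data parallelism, assuming a sum-of-squares workload chosen purely for illustration. The helper names are hypothetical, and a thread pool on one machine stands in for the workers of a real cluster.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(data, num_chunks):
    """Split a list into at most num_chunks contiguous pieces."""
    size = max(1, -(-len(data) // num_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

def sum_of_squares(chunk):
    """The single algorithm that is applied to every chunk."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, num_workers=4):
    """Apply the same function to each chunk, then combine the partial results."""
    chunks = split_into_chunks(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)
```

The key point is that `sum_of_squares` is identical for every worker; only the slice of data differs.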
What is Task Parallelism?
You split the work into distinct tasks and run them concurrently, for example across a cluster of machines.
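A minimal sketch of task parallelism, in contrast to data parallelism: here several different tasks run at the same time over the same input. The three text-analysis tasks and the `run_tasks` helper are hypothetical, and a thread pool stands in for separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

def count_words(text):
    return len(text.split())

def count_lines(text):
    return len(text.splitlines())

def longest_word(text):
    return max(text.split(), key=len, default="")

def run_tasks(text):
    """Submit each distinct task to its own worker and collect the results."""
    tasks = {"words": count_words, "lines": count_lines, "longest": longest_word}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in tasks.items()}
        return {name: future.result() for name, future in futures.items()}
```

Each worker runs a different function, so the unit being distributed is the task rather than a slice of the data.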
Properties of BDP?
Robustness, Low latency read/write, Scalability, Minimal maintenance
Revolution of BDP?
Large-scale processing on distributed commodity computers, enabled by advanced software that uses elastic resource allocation.