Kinesis
Kinesis exports data to other services:
EMR;
S3;
Redshift;
Lambda
Kinesis Components
Stream
Producers (data creators)
Consumers (data consumers)
Shards (processing power)
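A toy, in-memory sketch of how the four components relate. This is not the AWS SDK: the class `ToyStream` and its methods are hypothetical names used only for illustration. It assumes records are routed to shards by an MD5 hash of a partition key over evenly split hash ranges, which mirrors the way Kinesis Data Streams assigns records to shards.

```python
import hashlib
from collections import defaultdict


class ToyStream:
    """In-memory stand-in for a stream with a fixed number of shards (hypothetical, not the AWS API)."""

    def __init__(self, shard_count):
        self.shard_count = shard_count
        self.shards = defaultdict(list)  # shard index -> records in arrival order

    def shard_for(self, partition_key):
        # Hash the partition key with MD5 to a 128-bit integer, then map it
        # onto evenly sized hash-key ranges, one range per shard.
        hashed = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
        range_size = 2 ** 128 // self.shard_count
        return min(hashed // range_size, self.shard_count - 1)

    def put_record(self, partition_key, data):
        # Producer side: append the record to the shard that owns the key.
        shard = self.shard_for(partition_key)
        self.shards[shard].append((partition_key, data))
        return shard

    def get_records(self, shard):
        # Consumer side: read one shard's records in the order they arrived.
        return list(self.shards[shard])


stream = ToyStream(shard_count=4)
first = stream.put_record("player-42", "score=10")   # producer writes
second = stream.put_record("player-42", "score=25")  # same key, same shard
records = stream.get_records(first)                   # consumer reads
```

Because both records share a partition key, they land on the same shard and are read back in order. Adding shards gives more parallel lanes for producers and consumers, which is the sense in which shards are the stream's processing power.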
Kinesis Benefits
Kinesis When to use
Gaming; Real-time analytics; Application alerts; Log/Event data collection; Mobile data capture
Kinesis Producer
Kinesis Consumer
EMR (Elastic MapReduce)
EMR is a service that deploys EC2 instances based on the Hadoop big data framework.
EMR Workflow
Other EMR Facts
EMR Master Node
EMR Slave Node
Slave nodes come in two types: core nodes and task nodes.
Core Node
a slave node that has software components which run tasks AND store data in the Hadoop Distributed File System (HDFS) on your cluster.
– does the heavy lifting with the data.
Task Node
a slave node that has software components which only run tasks.
– optional
EMR Map Phase
EMR Reduce Phase
Reducing is a function that aggregates the split data back into one data source.
Reduced data needs to be stored elsewhere, because data processed by the EMR cluster is not persistent.
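The map and reduce phases can be sketched in plain Python as a word count. This is a minimal single-process illustration, not Hadoop or EMR code: the function names `map_phase` and `reduce_phase` and the sample input are assumptions made for the example. On a real cluster the chunks would be processed in parallel across nodes, and the result would have to be written to durable storage.

```python
from collections import defaultdict


def map_phase(chunks):
    # Map: each chunk of the split input is turned into (key, value) pairs
    # independently of the other chunks.
    pairs = []
    for chunk in chunks:
        for word in chunk.split():
            pairs.append((word.lower(), 1))
    return pairs


def reduce_phase(pairs):
    # Reduce: aggregate the values for every key back into one result set.
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Hypothetical input, already split into three chunks.
chunks = ["error warn error", "info error", "warn info info"]
result = reduce_phase(map_phase(chunks))
```

Here `result` holds one count per distinct word, combining the contributions of all three chunks.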