What is Big Data?
Can be described in terms of:
- Volume - too big to fit on a single server
- Velocity - streaming data must be analysed and responded to within milliseconds
- Variety - data in many forms such as structured, unstructured, text, multimedia
What is the most difficult aspect of Big Data?
The lack of structure
Why is the lack of structure in Big Data a challenge?
What are the ways to handle Big Data?
Why does the fact that data doesn’t fit on a single server become a problem for Big Data?
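Relational databases do not scale well across multiple machines, so processing must be distributed across many servers. Functional programming suits this because pure functions and immutable data make distributed code easier to write correctly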
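A minimal sketch of why the functional style suits data split across servers, assuming the partitions, readings and function names below (all hypothetical, made up for illustration). Each partition is summarised by a pure function, so the work could run on separate machines in any order, and the partial results are then merged:

```python
from functools import reduce

# Hypothetical sensor readings, pre-split into partitions as if each
# partition were stored on a different server.
partitions = [
    [3, 8, 1],
    [7, 2],
    [5, 9, 4],
]

def summarise(partition):
    """Pure function: returns (count, total) for one partition, no shared state."""
    return (len(partition), sum(partition))

def combine(a, b):
    """Associative merge of two partial summaries, so merge order does not matter."""
    return (a[0] + b[0], a[1] + b[1])

# "Map" step: each partition is processed independently.
partials = list(map(summarise, partitions))

# "Reduce" step: partial results are merged into one overall summary.
count, total = reduce(combine, partials, (0, 0))
mean = total / count
```

Because summarise never modifies its input and combine is associative, no locking or coordination between partitions is needed. This runs on one machine only; a real system would hand each partition to a different server.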
What is the source of Big Data?
Data from networked sensors, smartphones, video surveillance, mouse clicks, etc. are continuously streamed
Describe the fact-graph model
What are the components used to make up the fact-graph model?