What is big data?
Data that can’t be processed or analysed using traditional processes or tools.
What are the 3 characteristics of big data?
1) Variety: Variety of different forms of information // data may lack structure
Cannot be represented by a relational database: e.g. email messages, videos, images
2) Volume: There is a lot / high volume of data (to process as one dataset) // data will not fit on one server
hundreds of terabytes: e.g. medical datasets for diagnosis, predicting disease outbreaks
3) Velocity: The data is generated/received and processed at high velocity
Data must be processed as it is received - it cannot be batched and processed later: e.g. card payment fraud detection, recommendation systems
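The velocity point (process each item as it is received rather than storing a batch for later) can be illustrated with a minimal Python sketch. The payment format, the LIMIT threshold and the flag_suspicious function are hypothetical names made up for illustration; a real fraud detection system would use far more sophisticated rules.

```python
from typing import Iterable, Iterator

LIMIT = 1000.0  # hypothetical threshold, chosen only for this example


def flag_suspicious(payments: Iterable[dict]) -> Iterator[dict]:
    """Check each payment as soon as it arrives instead of collecting a batch."""
    for payment in payments:
        # The decision is made immediately, before the next payment is read.
        if payment["amount"] > LIMIT:
            yield payment


# Simulated stream of incoming card payments (made-up data).
incoming = iter([
    {"card": "A", "amount": 25.0},
    {"card": "B", "amount": 4200.0},
    {"card": "A", "amount": 12.5},
])

flagged = list(flag_suspicious(incoming))
```

Because flag_suspicious is a generator, it never needs the whole dataset in memory: each payment is examined once and then discarded or passed on.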
What are the challenges that come with big data?
How are Big Data’s challenges overcome?
What are the ethical issues around big data?
Area 3: Ethical and Legal Issues
Features of functional programming languages that make it easier to write code that can be distributed to run across multiple servers
In the fact-based model, each individual piece of information is stored as a […]
In the fact-based model, each individual piece of information is stored as a fact
What is stored with each fact in a fact-based model?
Timestamp of the date and time at which a piece of information was recorded
Why are timestamps stored with facts?
Facts are immutable: they are never deleted or overwritten, so multiple different values can be held for the same attribute. The timestamps allow the computer to discern which value is the most recent.
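The role of timestamps can be shown with a minimal Python sketch of an append-only fact store. The Fact and FactStore classes, their field names and the sample data are all assumptions made for illustration, not a standard API or a prescribed design.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)  # frozen: a fact cannot be modified once created
class Fact:
    entity: str          # what the fact is about, e.g. "user:1"
    attribute: str       # which property, e.g. "email"
    value: str
    timestamp: datetime  # date and time the information was recorded


class FactStore:
    """Append-only store: facts are added, never deleted or overwritten."""

    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def record(self, fact: Fact) -> None:
        # New information is always appended as a new fact.
        self._facts.append(fact)

    def latest(self, entity: str, attribute: str) -> Optional[Fact]:
        """Return the fact with the most recent timestamp, if any exist."""
        matches = [f for f in self._facts
                   if f.entity == entity and f.attribute == attribute]
        return max(matches, key=lambda f: f.timestamp, default=None)


store = FactStore()
store.record(Fact("user:1", "email", "old@example.com", datetime(2023, 1, 5, 9, 0)))
store.record(Fact("user:1", "email", "new@example.com", datetime(2024, 3, 2, 14, 30)))

current = store.latest("user:1", "email")
```

Both email facts remain in the store, so the full history is preserved; the timestamp is the only thing that identifies new@example.com as the current value.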
Why can traditional relational databases not be used for big data?
Big data is usually too large to fit on one server (many terabytes), is often unstructured so it cannot be represented in the tables of a relational database, and needs to be processed extremely rapidly
How is a graph schema constructed?
Why are fact-based models useful for storing Big Data?
Big Data characteristics MS with examples
Big Data problems and solutions MS
Big Data ethics MS