What are continuous data and discrete data?
What is a relational database?
A type of database that organises data into tables.
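A minimal sketch of the idea, using Python's built-in sqlite3 module with an in-memory database. The table names, column names and values (customers, orders, "Ada", "Grace") are made up for illustration; the point is only that data lives in tables that can be linked through a shared column.

```python
import sqlite3

# In-memory database; all table names, columns and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(id), "
    "total REAL)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 9.5), (2, 1, 20.0), (3, 2, 5.0)],
)

# Relate the two tables through customer_id and total each customer's orders.
rows = conn.execute(
    "SELECT c.name, SUM(o.total) "
    "FROM customers c JOIN orders o ON o.customer_id = c.id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()
print(rows)
```

Each order row points at a customer row by id, which is the "relational" part: the tables are separate but can be joined when needed.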
What are the 3 types of data?
Structured, semi-structured and unstructured
Give examples for each data type
structured: spreadsheets, transactional records, continuous data, relational records
semi-structured: emails, zipped files, XML files
unstructured: social media, weather data, entertainment
How can each data type be captured?
structured: IoE sensors, retail checkpoints, online surveys
semi-structured: qualitative data, sensors, satellites, GPS
unstructured: scrapers, posts, comments, photos, videos
What is data preparation?
Identifying the data that has been gathered and transforming it from its raw form so it is ready for analysis.
What steps are taken when cleaning textual data during preparation?
-Get rid of useless words
-Get rid of punctuation
-Convert to lower case
-Fix errors
-Translate language
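The steps above can be sketched in a few lines of Python using only the standard library. This is an illustration under assumptions, not a full pipeline: STOP_WORDS and CORRECTIONS are tiny hypothetical lists, and translation is left out because it would need an external service.

```python
import string

# Tiny illustrative lists; real projects would use much larger ones.
STOP_WORDS = {"the", "a", "an", "is", "and", "of", "to"}
CORRECTIONS = {"teh": "the", "recieve": "receive"}  # hypothetical spelling fixes


def clean_text(text):
    """Return a list of cleaned words from a piece of raw text."""
    # Lower case
    text = text.lower()
    # Get rid of punctuation
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Fix errors (simple lookup of known misspellings)
    words = [CORRECTIONS.get(word, word) for word in text.split()]
    # Get rid of useless (stop) words
    return [word for word in words if word not in STOP_WORDS]


print(clean_text("Teh cat is on the mat, and I recieve a letter!"))
```

Errors are fixed before stop words are removed so that a misspelt stop word such as "teh" is still caught.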
What is data wrangling?
The conversion of data from one form to another.
What are the data wrangling steps?
(Challenge: Define them)
Discovery - identifying the data that has been gathered and working out how it needs altering for analysis
Structuring - converting data so it can be analysed, for example taking an HTML document and putting it into a table
Cleaning - removing duplicates and converting data types
Enriching - combining the data with data from other sources to fill in gaps, add more content or context, or to find additional data that may need gathering
Validation - making sure the data is reasonably complete and consistent
Publishing - the changed data is released in its new form for analysis
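A rough sketch of how the six steps might look on a very small dataset, using only the Python standard library. The raw lines, the regions lookup and the field names are all hypothetical, and each step is reduced to its simplest possible form.

```python
import csv
import io

# Discovery: inspect what was gathered. Here, hypothetical "name,age" lines
# with one duplicate and one missing age.
raw = ["Ada,36", "Grace,45", "Ada,36", "Linus,"]
regions = {"Ada": "UK", "Grace": "US"}  # hypothetical second data source

# Structuring: turn each raw line into a record with named fields.
rows = [dict(zip(("name", "age"), line.split(","))) for line in raw]

# Cleaning: remove duplicates and convert data types.
seen = set()
cleaned = []
for row in rows:
    key = (row["name"], row["age"])
    if key in seen:
        continue
    seen.add(key)
    row["age"] = int(row["age"]) if row["age"] else None
    cleaned.append(row)

# Enriching: combine with data from another source.
for row in cleaned:
    row["region"] = regions.get(row["name"])

# Validation: keep only records that are complete.
valid = [r for r in cleaned if r["age"] is not None and r["region"] is not None]

# Publishing: release the data in its new form (CSV text here).
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "age", "region"], lineterminator="\n")
writer.writeheader()
writer.writerows(valid)
published = out.getvalue()
print(published)
```

In this sketch validation simply drops incomplete records; in practice they might instead be flagged or sent back for more data gathering.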
What is data mining?
The process of automatically finding hidden patterns, relationships and anomalies in large datasets to predict outcomes or gain valuable insights.
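As a small illustration of the "anomalies" part only, the sketch below flags values that sit far from the mean. The z-score rule, the threshold of 2 and the sample readings are assumptions chosen for the example; real data mining works on much larger datasets and uses a wider range of techniques.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / spread > threshold]


readings = [10, 12, 11, 13, 12, 95, 11, 10]  # hypothetical sensor readings
print(find_anomalies(readings))
```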