What is structured data?
What can you do with structured data?
What’s an example of structured data?
CSV file
A Collibra export to Excel, which happened to be in a structured format, brought into Google Sheets for analysis
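A minimal sketch of why a CSV file counts as structured data: every row shares the same named columns, so it can be filtered and totalled directly. The column names and values below are made up for illustration, and the text is read from a string rather than a real file.

```python
import csv
import io

# Hypothetical CSV content; in practice this would come from a file on disk.
csv_text = """product_id,product_name,price
1,T-shirt,12.50
2,Jeans,40.00
3,Trainers,55.00
"""

# DictReader uses the header row to name the fields of every record.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Because each row has the same fields, aggregating and filtering is simple.
total_price = sum(float(row["price"]) for row in rows)
expensive = [row["product_name"] for row in rows if float(row["price"]) > 20]

print(total_price)
print(expensive)
```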
What is unstructured data?
no set structure/format
How can you analyse unstructured data?
Pre-processing methods:
Text mining ⚒️
Natural Language Processing 🗣️
Image Recognition 🖼️
What’s an example of unstructured data?
customer reviews on social media
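A rough sketch of what pre-processing free text can look like: lower-casing, splitting into words, dropping common filler words, then counting what is left. The reviews and the stop-word list are invented for illustration; real text mining or NLP work would normally use a dedicated library.

```python
import re
from collections import Counter

# Hypothetical customer reviews (unstructured free text).
reviews = [
    "Great quality, fast delivery!",
    "Delivery was slow but the quality is great.",
    "Poor quality. Would not recommend.",
]

# A tiny, hand-picked stop-word list (an assumption for this example).
STOP_WORDS = {"was", "but", "the", "is", "would", "not"}

def tokenize(text):
    """Lower-case the text, extract words, and drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in STOP_WORDS]

# Counting words turns free text into something structured (word -> count).
word_counts = Counter(word for review in reviews for word in tokenize(review))

print(word_counts["quality"])
```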
What is a relational database?
database structured into tables made with rows and columns
What are the tables within a relational database joined by?
Keys: a primary key (unique identifier for each record) in one table, matched by a foreign key in the related table
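As an illustration of joining tables on a key, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The customers and sales tables and their contents are hypothetical.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id   INTEGER PRIMARY KEY,  -- unique identifier for each record
        customer_name TEXT NOT NULL
    );
    CREATE TABLE sales (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        qty         INTEGER NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO sales VALUES (101, 1, 2), (102, 2, 1), (103, 1, 5);
""")

# Join each sale to its customer by matching the key columns.
joined = conn.execute("""
    SELECT s.sale_id, c.customer_name, s.qty
    FROM sales AS s
    JOIN customers AS c ON c.customer_id = s.customer_id
    ORDER BY s.sale_id
""").fetchall()

print(joined)
conn.close()
```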
What is normalisation?
process of organising data into tables to reduce duplication and keep the data consistent and easy to maintain
What are the steps in normalisation called?
Normal forms
What are the 3 normal forms?
First, second, third - standards used to structure tables
What is the first normal form (1NF)?
E.g. in a Products table, you may have started with “Clothing, Casual” in the “Categories” column, but 1NF means each cell holds a single value, so “Clothing” and “Casual” are split into separate rows
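The Products example can be sketched in plain Python: a multi-valued “Categories” cell is split so that each row holds exactly one category. The product rows are hypothetical.

```python
# Hypothetical Products rows that break 1NF: one cell holds two values.
products = [
    {"ProductID": 1, "ProductName": "T-shirt", "Categories": "Clothing, Casual"},
    {"ProductID": 2, "ProductName": "Blazer", "Categories": "Clothing, Formal"},
]

# Emit one row per (product, category) pair so every cell is a single value.
first_normal_form = [
    {
        "ProductID": product["ProductID"],
        "ProductName": product["ProductName"],
        "Category": category.strip(),
    }
    for product in products
    for category in product["Categories"].split(",")
]

for row in first_normal_form:
    print(row)
```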
What is the second normal form (2NF)?
E.g. ProductName depends only on ProductID, so product details are moved out of the Sales table (which contains SaleID, Qty and CustomerName) into a separate Products table
What is the third normal form (3NF)?
E.g. a Sales table with SaleID, ProductID, Qty, CustomerName and CustomerID: CustomerName can be derived from CustomerID, so only CustomerID is kept in the Sales table and CustomerName is removed
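A small sketch of the Sales example: CustomerName is moved into a separate lookup keyed by CustomerID, and the sales rows keep only the ID. The data is invented, and a dictionary stands in for what would be a Customers table in a real database.

```python
# Hypothetical Sales rows before 3NF: CustomerName repeats for each sale.
sales = [
    {"SaleID": 101, "ProductID": 1, "Qty": 2, "CustomerID": 7, "CustomerName": "Asha"},
    {"SaleID": 102, "ProductID": 2, "Qty": 1, "CustomerID": 8, "CustomerName": "Ben"},
    {"SaleID": 103, "ProductID": 1, "Qty": 5, "CustomerID": 7, "CustomerName": "Asha"},
]

# Customers lookup: each CustomerID maps to exactly one CustomerName.
customers = {row["CustomerID"]: row["CustomerName"] for row in sales}

# Sales rows without the derivable CustomerName column.
sales_3nf = [
    {key: value for key, value in row.items() if key != "CustomerName"}
    for row in sales
]

def customer_name_for(sale):
    """Recover the name through the CustomerID key instead of storing it twice."""
    return customers[sale["CustomerID"]]

print(customer_name_for(sales_3nf[2]))
```

Storing the name once means a customer's name only ever needs updating in one place.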
What is NoSQL?
Not only SQL: flexible databases that store and manage both structured and unstructured data
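To illustrate what “flexible” can mean, here is a loose sketch of document-style records, where each record carries its own set of fields instead of following one fixed table layout. This only mimics the idea with Python dictionaries and JSON. It is not a real NoSQL database, and the field names are made up.

```python
import json

# Hypothetical documents: records in the same collection can have different fields.
documents = [
    {"_id": 1, "type": "product", "name": "T-shirt", "categories": ["Clothing", "Casual"]},
    {"_id": 2, "type": "review", "product_id": 1, "text": "Great quality!", "stars": 5},
    {"_id": 3, "type": "product", "name": "Jeans"},  # no categories field at all
]

# Queries filter on whichever fields a document happens to have.
product_names = [doc["name"] for doc in documents if doc["type"] == "product"]

# Document stores commonly exchange records as JSON-like text.
serialized = json.dumps(documents[0])

print(product_names)
print(serialized)
```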
What’s an example of NoSQL?
What database system design does JLP use?
What are the challenges of maintaining relational databases, and how do you overcome them?
What is a Data Warehouse?