Examples of Design Models
Analysis Model and Design Model
Data Dictionary parts
Entity-Relationship Diagram(Data Object Description)
Data Flow Diagram (Process Specification)
State Transition Diagram(Control Specification)
Types of RDBMS for Transactions
Manual (Paper-based)
Digital (Computerized but not necessarily automated version of digital like an excel spreadsheet)
Digital (Computer based and automated like a Database Management System)
It refers to the selection of logical representations of data objects identified during the requirements definition and specification phase
Data Design
It abstracts relations into objects allowing simplified interactions with a database without writing raw SQL queries.
Object-Relational Mapping (ORMS ex. Mongoose)
It refers to the process of organizing and structuring the data in a relational database to reduce redundancy and improve data integrity.
It can benefit the long-term performance and cost of maintaining large databases.
Levels of Normalization
Irregular Relation (Table with non-atomic domain values)
1NF: First Normal Form
2NF
3NF
BCNF : Boyce Codd Normal Form
Fourth Normal Form
Fifth Normal Form
Normalization focuses on atomicity and eliminating redundancy, but a surrogate key may be added to meet normal forms of complex relations, for example, a table with a tile that may store more than one value in an “array” type can be separated as different rows in the same table.
First Normal Form
Eliminates partial dependencies
2NF
Eliminates transitive dependencies
3NF
Ensures that every determinant is a super key
BCNF (Boyce Codd Normal Form)
Remove multi-valued dependencies
4NF
Remove Join Dependencies (It doesn’t have to be repeating values like in 4NF)
Fifth Normal Form
It refers to the intentional introduction of redundancy into a normalized database to improve query performance, the opposite of normalization
Denormalization, this is often implemented when there is a strong need for faster read access and the overhead of maintaining the highly normalized tables outweighs the benefits.
When to use denormalization?
When there is a constant need to frequently join tables, joining causes overhead so merging them permanently is better for faster lookup.
Vertical splitting when some attributes are rarely used.
Horizontal Splitting for Large Tables
Adding a previous value attribute.
Adding a duplicate attribute to a different table
Adding a derived attribute for frequently derived data.
Data Design best practices (3)
Understand the Data Model
Define the Schema
Design for Scalability
Data models
RDBMS for business transactions
Simple Key-Value Pairs
Document-Oriented (JSON, BSON, XML)
Column-Family
Graph-based
Vector databases
Hybrid Implementations
How to define the schema? (5)
Define Data types
Define relationships
Define frequently queried fields
Denormalize as necessary
Enforce schema and data validation
How to consider scalability?
Enable autoscaling
Design for distributed architecture
Design using hybrid approaches
Use compression