Database systems Flashcards

(66 cards)

1
Q

What is a DBMS?

A
  • Database management system
  • Software used to retrieve, maniuplate and manage data
  • Acts as middle layer between application and database
  • Extends human logic by enabling complex queries and analysis
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Why is a DBMS better than a simple file system?

A
  • Avoid manual passing of data between programs
  • Provides central, integrated database
  • More efficient data access and management
  • Better for larger scale and supports security controls
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are the three main levels or abstractions?

A
  • Physical memory: how data is stored
  • Logical level: what data is stored
  • View level: how user interacts with data
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are the 3 main threats to database security?

A
  • Loss of integrity: info should be properly modified
  • Loss of availability: losing the ability to access database
  • Loss of confidentiality: protecting database from unauthorised disclosure
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is an ER model and what does it contain?

A
  • Entity-relationship model
  • High-level conceptual data modelling approach that focusses on data requirements
  • Contains entities, attributes and relationships
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Name 6 attribute types

A
  • Simple or composite
  • Single or multivalued
  • Stored or derived
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What are weak entities?

A
  • Entity that cannot be uniquely identified by its own attributes
  • Identified by its composite key
  • Connects to a strong entity
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is functional dependancy?

A
  • It is a constraint between 2 sets of attributes within a relation
  • If X->Y, this means that X determines Y
  • So if 2 rows have the same value for X, they also have the same value for Y
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are the 4 types of functional dependancy?

A
  • Trivial FD: X->Y and Y is a subset of X
  • Non-trivial FD: X->Y and Y is not a subset of X
  • Multivalued FD: X->{Y,Z} and Y and Z are not dependant
  • Transitive FD: X->Y and Y->Z mean X->Z
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is the functional dependancy closure?

A
  • All the functional dependancies that can be derived given a set of FDs F
  • Can help us normalise the database
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is the attribute closure?

A
  • The set of attributes that can be functionally determined from an attribute
  • Can help us determind the keys for the relation, as all attributes must be functionally dependant on the key
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is normalisation and what are its advantages?

A
  • Normalisation is the process of organising the data in the database
  • Prevents redundancy
  • Offers data consistency and flexibility
  • Enforces relational integrity
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What anomalies does normalisation help us prevent?

A
  • Insertion anomalies: omission during insertion
  • Deletion anomalies: unintended loss of data due to deletion
  • Update anomalies: data inconsistency that results from redundant data
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is first normal form 1NF?

A
  • No repeating groups
  • All attributes are atomic (indivisible) so no multi-valued or composite attributed
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is second normal form 2NF?

A
  • Must be 1NF
  • Has no partial dependancies (each non-key attribute depends on the entire primary key)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is third normal form 3NF?

A
  • Must be in 2NF
  • No transitive dependancies involving non-key attributes
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is a trigger?

A
  • An object that automatically executes when an event occurs
  • Without being explicitly called
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Which events can activate triggers?

A
  • Data-modifying operations
  • INSERT, UPDATE, DELETE
  • Each triggers is tied to one event and one table
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is the difference between a BEFORE and AFTER trigger?

A
  • BEFORE executes before data is written: used for validation, data correction etc
  • AFTER executes after data is written: used for auditing, logging etc
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

How can we use OLD and NEW modifiers in triggers?

A
  • OLD is data before the event and NEW is data after the event
  • INSERT: can only use NEW
  • UPDATE: can use NEW and OLD
  • DELETE: can only use OLD
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What are the benefits of triggers?

A
  • Enforce business rules
  • Maintain data integrity
  • Automatically audits and synchronises data
  • Prevent invalid operations
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

How can triggers be used for event handling?

A
  • BEFORE triggers can reject invalid operations
  • Raises SQL error (eg SIGNAL SQLSTATE)
  • Prevents bad data from being committed
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

How can we implement multiple triggers/chaining?

A
  • MYSQL uses FOLLOWS and PRECEDES
  • Enables trigger execution order
  • However can cause difficulty in debugging and readability
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What is a stored procedure?

A
  • A stored procedure is a named block of SQL logic stored on the database server
  • Executed using a single CALL statement
  • Moves logic closer to the data
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
What are 4 differences between triggers and stored procedures?
- Triggers are **automatic** whereas stored procedures are **explicitly called** - Triggers are **event-driven** whereas stored procedures are **task-driven** - Triggers are **table-bound** whereas stored procedures are **independant** - Triggers are **hidden** but stored procedures are **visible**
26
How are stored procedures stored and executed?
- First execution: compiled and stored in cache - Subsequent calls: executed directly from cache - This reduces CPU and parsing overhead
27
What are the three parameters of stored procedures?
* IN: accepts a value * OUT: returns a values * INOUT: accepts and returns a value ## Footnote OUT must be captured using session variables
28
How can stored procedures improve security?
- Users can be granted access to **procedures**, but **not underlying tables** - Prevents unauthorised data access
29
What are some limitations of stored procedures?
- Difficult to **debug** - Require **specialised skills** - Poor **portability** - Increase server resource usage
30
What are cursors?
- A cursor **iterates** through a result set row-by-row - Used inside stored procedures only - Read-only, forward-only
31
How are NULL values represented in SQL?
- Use UNKNOWN logical state and can be checked with IS NULL or IS NOT NULL
32
What is PDO?
- PDO is an **object-oriented** databased access layer in PHP - Provides a **secure**, **consistent** interface for interacting with databases - `$pdo = new PDO($dsn, $username, $password);`
33
How does PDO support error handling?
- Can set modes as attributes - `$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_WARNING);` - Can also use `PDO::ERRMODE_SILENT` and `PDO::ERRMODE_EXCEPTION`
34
What are prepared statements in PHP?
- Use placeholders that place user-supplied data into SQL - ``` $stmt = $pdo->prepare("SELECT * FROM users WHERE email = ?"); $stmt->execute([$email]);``` - Replaces ? with variable ## Footnote Prevents SQL injection
35
How is data fetched from PDO?
- Data is retrieved using fetch() or fetchAll() - `$row = $stmt->fetch(PDO::FETCH_ASSOC);` - Returns an associative array
36
How and why are passwords hashed in PDO?
- `$pwd = password_hash($password, PASSWORD_DEFAULT); ` - Protects users if database is compromised and uses strong up-to-date hashing algorithms
37
How does PDO verify passwords?
- `password_verify($plainPassword, $storedHash); ` - Returns true if password matches and false if not - Passwords are not rehashed- comparison is handled internally
38
What HTML element is used to collect user input?
- A form - `
`
39
How can we check a form has been submitted?
- By checking the request method - `$_SERVER['REQUEST_METHOD']=== 'POST'` - Using `if (...)` would show the form and process it
40
Why is using POST better than using GET?
- GET would expose any sensitive data in the URL
41
What do we need to do so can we copy the post data into a local variable and then check it has been assigned a value?
- `$var = $_POST['variable']` - `isset($_POST['variable'])` - Need to use "name" attribute in the HTML input otherwise it doesn't get sent to PHP
42
What is a transaction in DBMS?
- A transaction is an executing program that forms a **logical unit** of database processing - Can be executed inside the **application** or the **DBMS server** - Can be **Read-only**: SELECT, or **Write-only**: INSERT, UPDATE, DELETE
43
How does a DBMS handle many transactions at the same time?
- Multiple transactions may access the same database items at the same time - Transactions run as **independant programs**: own **namespace** and own **memory** - DBMS manages coordination to ensure **correctness**, **consistency** and **performance**
44
Why does the DMBS use a buffer?
- **Secondary storage** is slow - Frequently accessed data is copied into **buffer** in **main memory** on the DBMS **server** - Applications **read from buffer** which improves performance and response time
45
What happens when a database issues a WRITE operation?
- A WRITE updates the database **conceptually** - Actual update is stored on **DBMS buffer** - Disk/secondary storage is updated later - Writing to disk is slow so buffering allows efficient **transaction processing**
46
What is transaction processing and why is it critical in multi-user database systems?
- Multi-user systems have multiple users trying to access the same database at the same time - Allows **concurrecy** - Critical for systems with **high availability** and that require **fast response times**
47
What is concurrency in database systems and how is it achieved?
- Concurrency is when multiple users are **accessing** and **changing** the database at the same time - Achieved using **multiprogramming**: multiple processes admitted to **ready queue**, and they are **scheduled** so they **interleave** - Not truly parallel
48
What is interleaving and why does it matter for transaction processing?
- Interleaving is when the CPU switches between processes and created the **illusion** of simulataneous execution - Maximises system **throughput**
49
What 4 problems can occur when transactions run concurrently?
- **Lost update**: Two transactions read the same original value; one update overwrites the other - **Dirty read (Temporary update)**: A transaction reads uncommitted data that is later rolled back after abortion - **Incorrect summary**: An aggregate calculation mixes old and new values due to concurrent updates - **Unrepeatable read**: Re-reading the same item returns a different value becuase another transaction updated it
50
What are the ACID properties of database transactions?
- **Atomicity**: transaction fully completes or fully fails- no partial changes - **Consistency**: database moves from one valid state to another- rules and constraints preserved - **Isolation**: concurrent transactions do not interfere- appear as if executed sequentially - **Durability**: commited changes are permanent, even after crashes or failures
51
What is a schedule in transaction execution?
- Concurrent transations are executed in an **interleaved fashion** - Schedule defines the execution **order** of all operations - We want to preserve transaction order and **avoid conflicts**
52
What are the 3 conditions of a scheduling conflict?
1. Belong to different transactions 2. Access the same item 3. At least one operation is a WRITE ## Footnote Conflict = inconsistent data
53
What is a serial vs serialisable schedule?
- **Serial** schedules are when transactions execute one after another - **Non-serial** is interleaved transactions but can produce **errors** - **Serialisable** schedule is a an interleaved schedule that is **equivalent** to some serial schedule - **Equivalence** focusses on **read/write** operations only
54
What is conflict equivalence in scheduling?
- We **can't rely on observing states** to determine if schedules are equivalent- some non-equivalent schedules may produce the same state - **Conflict equivalence**: If the **relative order** of any 2 conflicting operations is the same in both schedules
55
How is graph theory used to test is a schedule is serializable?
- Construct a dependancy graph where **nodes are transactions** and **directed edges are conflicts** - If there are **no cycles** the schedule is serializable - Ensures interleaved schedules **preserve correctness**
56
What are the 4 formats of NoSQL?
- **Key-based**: value is record or object, accessed quickly by key - **Document-based**: stores data in documents accessed by document_id - **Column-based**: each column stored in own file- for large data storage - **Graph-based**: entities=nodes, relations=edges
57
What is the CAP theorem?
- A distributed system can only guaratee 2/3: - **Consistency**: all nodes have same data - **Availability**: system is consistently available for read/write operations - **Partition tolerance**: system is continually available during network faults
58
What is the difference between embedding and referencing in MongoDB?
- **Embedding**: stores data in single document structure - **Referencing**: stores links or references from one document to another
59
When is embedding better than referencing?
- When related items are **frequently used or fetched together** - **One-to-one** relationship between documents - Document is **not a key document** - Data does **not change or grow** - Nested documents have **same volatility**
60
When is referencing better than embedding?
- When embedding would result in **substantial data duplication** - When documents **grow** - **Many-to-many** relationship between documents - The document is a **key document** - **Fast writes** are required
61
How does Couchbase store and distribute data?
- Uses a **key-value model** - Data split into 1024 **shards** called **Buckets** using **hashing** - **Clusters** are multiple nodes that **act as one database** that **automatically share and balance the load**
62
Why is a masterless architecture better than master-slave architecture?
- **Master-slave**: master becomes bottleneck so slows down scalability and if it goes down the service is stopped - **Masterless**: any node can handle reads and writes so avoids bottlenecks ## Footnote Masterless has better performance, scalability and availability
63
What are the benefits of document data modelling over relational data modelling?
- Relational schema requires **migrations and joins** that can be complex and risky - Document data modelling **preserves natural structure** rather than flattening
64
How do we translate relational models into documents?
- Each table mapped directly to **single JSON document** - Column becomes fields and PK becomes document keys - **One-to-many relationships** represented using **nested arrays** so related data is stored in a single document
65
When should we store related data as nested objects, vs seperate documents?
- **Nested**: one-to-one or one-to-many - **Seperate**: many-to-one or many-to-many
66
When should we store children as nested objects vs seperate documents?
- **Nested**: data reads and write are mostly parent+child fields - **Seperate**: data reads are mostly parent fields and writes are mostly parent or child fields