Databases Glossary Flashcards

(69 cards)

1
Q

ACID vs BASE

A

ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties for reliable transaction processing, common in traditional SQL databases. BASE (Basically Available, Soft state, Eventually consistent) is a model for distributed systems that prioritizes availability over strict consistency, common in NoSQL.

2
Q

attribute (field)

A

A property or characteristic of an entity. In a relational table, this corresponds to a column.

3
Q

backup

A

A copy of data taken and stored elsewhere so that it may be used to restore the original after a data loss event.

4
Q

Candidate key

A

A minimal set of one or more attributes that uniquely identifies a tuple in a relation; no attribute can be removed without losing uniqueness. A relation can have multiple candidate keys.

5
Q

Cartesian product

A

A relational algebra operation that combines every tuple from one relation with every tuple from another relation.
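As a minimal sketch, assuming each relation is modelled as a Python list of tuples (the `colours` and `sizes` data are made up for illustration), the product pairs every tuple of one relation with every tuple of the other:

```python
from itertools import product

# Two tiny relations, each modelled as a list of tuples (hypothetical data).
colours = [("red",), ("blue",)]
sizes = [("S",), ("M",), ("L",)]

# Concatenate every tuple of one relation with every tuple of the other.
cartesian = [c + s for c, s in product(colours, sizes)]

# The result has len(colours) * len(sizes) tuples.
print(len(cartesian))
```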

6
Q

CRUD operations

A

The four basic functions of persistent storage: Create, Read, Update, and Delete.

7
Q

DBMS, rDBMS, oDBMS, gDBMS

A

DBMS: Database Management System. A software system for creating and managing databases. rDBMS: Relational DBMS. oDBMS: Object-oriented DBMS. gDBMS: Graph-oriented DBMS.

8
Q

discriminator

A

An attribute of a weak entity that, combined with the primary key of the owning entity, uniquely identifies a weak entity instance. Also known as a partial key.

9
Q

entity (E/R diagram)

A

A real-world object or concept with an independent existence that is to be represented in the database (e.g., a person, a car, a course).

10
Q

Equijoin

A

A type of join that combines tuples from two relations where the values of the compared attributes are equal. Its result is the subset of the Cartesian product that satisfies the equality condition.

11
Q

foreign key

A

An attribute or set of attributes in a relation that refers to the primary key of another (or the same) relation, enforcing referential integrity.

12
Q

index

A

A data structure used to improve the speed of data retrieval operations on a database table at the cost of additional writes and storage space.

13
Q

intersection

A

A relational algebra operation that returns all tuples that are present in both of two specified relations.

14
Q

multi-valued attribute

A

An attribute that can hold multiple values for a single entity instance (e.g., a ‘phone_numbers’ attribute for a person).

15
Q

natural, cross, equi, outer, inner, left, right joins

A

Types of join operations. Natural: joins on all common columns. Cross: Cartesian product. Equi: joins on equality. Inner: returns only matching rows. Outer (Left, Right, Full): returns matching rows plus non-matching rows from one or both tables.
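The difference between inner, left outer, and cross joins can be seen in a small sketch using Python's built-in `sqlite3` module. The `student` and `enrolment` tables and their contents are hypothetical, chosen so that one student has no matching row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrolment (student_id INTEGER, course TEXT);
    INSERT INTO student VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO enrolment VALUES (1, 'Databases');
""")

# Inner join: only students with a matching enrolment row.
inner_rows = conn.execute(
    "SELECT s.name, e.course FROM student s "
    "JOIN enrolment e ON s.id = e.student_id ORDER BY s.id"
).fetchall()

# Left outer join: every student, with NULL (None) where nothing matches.
left_rows = conn.execute(
    "SELECT s.name, e.course FROM student s "
    "LEFT JOIN enrolment e ON s.id = e.student_id ORDER BY s.id"
).fetchall()

# Cross join: every student paired with every enrolment row.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM student CROSS JOIN enrolment"
).fetchone()[0]
```

Here the inner join drops Bob, the left join keeps him with a `None` course, and the cross join size is the product of the two row counts.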

16
Q

natural key

A

A candidate key that is formed of attributes that exist in the real world and have a logical relationship to the entity, such as a vehicle’s VIN or a person’s social security number.

17
Q

predicate

A

The conditional part of a query, typically in the WHERE clause, that specifies the criteria for selecting rows.

18
Q

Primary Key

A

A candidate key chosen by the database designer to uniquely identify tuples within a relation. It cannot contain null values.

19
Q

project, select, groupby

A

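SQL/Relational Algebra operations. Select: retrieves the rows (tuples) that satisfy a predicate. Project: retrieves a chosen subset of columns (attributes). Group By: partitions rows into groups that share the same values in the grouping columns, usually so that an aggregate can be computed per group. Note that the SQL SELECT clause performs projection; relational selection corresponds to the WHERE clause.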
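A rough illustration of the three operations, assuming a relation is modelled as a list of dictionaries (the `rows` data and attribute names are invented for the example):

```python
from collections import defaultdict

# A hypothetical relation: one dict per tuple.
rows = [
    {"name": "Ada", "dept": "CS", "salary": 50},
    {"name": "Bob", "dept": "CS", "salary": 40},
    {"name": "Cy", "dept": "Maths", "salary": 45},
]

# Select: keep only the rows that satisfy a predicate.
selected = [r for r in rows if r["dept"] == "CS"]

# Project: keep only the chosen attributes of every row.
projected = [{"name": r["name"]} for r in rows]

# Group by: partition rows by dept and sum the salary within each group.
totals = defaultdict(int)
for r in rows:
    totals[r["dept"]] += r["salary"]
```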

20
Q

query

A

A request to a DBMS for retrieving, modifying, inserting, or deleting data.

21
Q

record

A

A collection of related data items treated as a single unit. Synonymous with a tuple or a row in a relational database.

22
Q

Referential Integrity

A

A property ensuring that a foreign key value must match an existing primary key value in the referenced table or be null.
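One way to see the rule enforced is a short `sqlite3` sketch. The `department` and `employee` tables are hypothetical, and note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES department(id)
    );
    INSERT INTO department VALUES (1, 'Research');
""")

conn.execute("INSERT INTO employee VALUES (1, 1)")     # matches department 1
conn.execute("INSERT INTO employee VALUES (2, NULL)")  # a NULL foreign key is permitted

try:
    conn.execute("INSERT INTO employee VALUES (3, 99)")  # there is no department 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```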

23
Q

relational, relation (table)

A

A relation is a set of tuples (rows), representing data in a two-dimensional table with columns (attributes).

24
Q

relation - databases, weak relation

A

A relation (table) that represents a weak entity. Its tuples cannot exist on their own and depend on tuples in another ‘strong’ relation.

25
relation - maths, ternary relation
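In mathematics, an n-ary relation is a set of ordered n-tuples, that is, a subset of the Cartesian product of n sets. A ternary relation is the case n = 3, so it involves three sets.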
26
schema
The structure of a database, describing the entities, their attributes, and the relationships among them. It is the 'blueprint' of the database.
27
semantic dependence
The meaning of an attribute's value is dependent on the value of another attribute.
28
semi-structured data
Data that does not conform to a strict data model but has some organizational properties, like tags, that make it easier to analyze. Examples include JSON and XML.
29
SQL
Structured Query Language. A standard language for managing and manipulating data in a relational database.
30
sub-entity
A subgrouping of an entity type that has distinct attributes from other subgroupings. Also known as a subtype or subclass.
31
superkey
Any set of attributes that, taken collectively, uniquely identifies a tuple within a relation. A candidate key is a minimal superkey.
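The distinction between a superkey and a candidate key can be sketched as a check over sample data. This is only an illustration: the helper names `is_superkey` and `is_candidate_key` are invented here, and a real key is a property of the schema, not of whatever rows happen to be present.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = {tuple(r[a] for a in attrs) for r in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, attrs):
    """True if attrs is a superkey and dropping any one attribute breaks uniqueness."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for subset in combinations(attrs, len(attrs) - 1)
    )

# Hypothetical sample rows.
rows = [
    {"id": 1, "email": "a@x.org", "name": "Ada"},
    {"id": 2, "email": "b@x.org", "name": "Ada"},
]
```

With this data, `("id", "name")` is a superkey but not a candidate key, because `("id",)` alone is already unique.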
32
synthetic key
An artificial key created by the database designer when no suitable natural key exists. Often an auto-incrementing integer. Also known as a surrogate key.
33
transaction
A single, logical unit of work consisting of one or more operations. Transactions must be atomic, consistent, isolated, and durable (ACID).
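Atomicity in particular can be demonstrated with a small `sqlite3` sketch. The `account` table, the `transfer` helper, and the CHECK constraint are assumptions made for the example; the point is that a failed transfer leaves neither update applied:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        balance INTEGER CHECK (balance >= 0)
    );
    INSERT INTO account VALUES (1, 100), (2, 0);
""")

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# The debit would make the balance negative, so the earlier credit is undone too.
overdraft_ok = transfer(conn, 1, 2, 150)
balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
```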
34
tuple (record)
A single row in a relational table, representing a single instance of an entity.
35
union
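A relational algebra operation that combines two union-compatible relations (same number and types of attributes) into one relation containing every tuple that appears in either, with duplicate tuples removed.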
36
update
A CRUD operation that modifies existing data in a database.
37
value
The actual data stored within an attribute (field) for a specific tuple (record).
38
value atomicity
A principle of the relational model stating that the value stored at the intersection of each row and column must be indivisible or atomic.
39
view
A virtual table based on the result-set of an SQL statement. It contains rows and columns, just like a real table, but does not store the data itself.
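A short `sqlite3` sketch shows that a view is re-evaluated against its base table each time it is queried. The `product` table and `cheap_product` view are hypothetical names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (name TEXT, price INTEGER);
    INSERT INTO product VALUES ('pen', 2), ('laptop', 900);
    CREATE VIEW cheap_product AS
        SELECT name FROM product WHERE price < 10;
""")

before = conn.execute("SELECT name FROM cheap_product ORDER BY name").fetchall()

# Changing the base table changes what the view returns; the view holds no rows itself.
conn.execute("INSERT INTO product VALUES ('eraser', 1)")
after = conn.execute("SELECT name FROM cheap_product ORDER BY name").fetchall()
```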
40
weak entity
An entity that cannot be uniquely identified by its own attributes and relies on the primary key of an associated 'strong' or 'owning' entity.
41
weak entity and discriminator
A weak entity is identified by the combination of the primary key of its owning entity and its own discriminator (or partial key).
42
big data
Data that is too big to fit in main memory/primary store/core.
43
distributed system
A system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another.
44
disk drive
A secondary storage device for recording, storing, and retrieving digital information using rotating platters coated with magnetic material.
45
filesystem
The methods and data structures that an operating system uses to keep track of files on a disk or partition; that is, the way the files are organized on the disk.
46
heap
An area of dynamically-allocated memory, contrasted with the stack. Used for data structures that can grow or shrink in size during program execution.
47
primary store (main memory) vs secondary store
Primary store (e.g., RAM) is volatile, faster, and smaller, directly accessible by the CPU. Secondary store (e.g., disk drive, SSD) is non-volatile, slower, and larger.
48
SSD
Solid-State Drive. A non-volatile storage device that uses integrated circuit assemblies to store data persistently, typically using flash memory.
49
volatile
A type of memory that loses its content when the computer's power is turned off. For example, RAM.
50
arity
The number of attributes in a relation or the number of arguments in a function or operation.
51
one-to-many, many-to-one, one-to-one, many-to-many
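Cardinality constraints describing how many instances of one entity can be related to instances of another. One-to-many: one department has many employees, but each employee belongs to one department. Many-to-one is the same relationship read in the other direction. One-to-one: each person has one passport and each passport belongs to one person. Many-to-many: a student can enroll in many courses and a course can have many students.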
52
transitive, commutative, associative
Mathematical properties. Associativity and commutativity are properties of binary operations: Associative: (a+b)+c = a+(b+c). Commutative: a+b = b+a. Transitivity is a property of relations: if a=b and b=c, then a=c.
53
JSON and/or XML
JSON (JavaScript Object Notation) and XML (eXtensible Markup Language) are text-based formats for representing semi-structured data.
54
key/value pair, key/value store
A data model where data is stored as a collection of key-value pairs. The key serves as a unique identifier. Common in NoSQL databases like Redis.
55
path-oriented query
A query that navigates a hierarchical or graph-like data structure to retrieve elements. XPath for XML is a common example.
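As a small sketch, Python's standard `xml.etree.ElementTree` module accepts a limited subset of XPath in `findall`. The `library` document below is invented for illustration:

```python
import xml.etree.ElementTree as ET

# A hypothetical XML document.
doc = ET.fromstring("""
<library>
  <shelf topic="databases">
    <book><title>Relational Theory</title></book>
    <book><title>Graph Stores</title></book>
  </shelf>
  <shelf topic="networks">
    <book><title>Routing</title></book>
  </shelf>
</library>
""")

# Follow the path library -> shelf (with a given topic) -> book -> title.
titles = [
    t.text
    for t in doc.findall("./shelf[@topic='databases']/book/title")
]
```

The query names a route through the hierarchy instead of joining tables, which is what makes it path-oriented.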
56
identifier, unique identifier
An attribute or a set of attributes whose values uniquely identify an entity instance or a tuple in a relation.
57
composition (of functions or relations)
An operation that combines two functions or relations by applying one to the result of the other.
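For relations, composition can be sketched over sets of pairs. The `compose` helper and the `parent_of` data are hypothetical names used only for this example:

```python
def compose(r, s):
    """Pairs (a, c) such that (a, b) is in r and (b, c) is in s for some b."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

# Composing "parent of" with itself yields "grandparent of".
parent_of = {("Ann", "Bea"), ("Bea", "Cal")}
grandparent_of = compose(parent_of, parent_of)

# For functions, composition applies one function to the result of the other.
def double(x):
    return 2 * x

def increment(x):
    return x + 1

def double_then_increment(x):
    return increment(double(x))
```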
58
domain set and range set of a function
The domain is the set of all possible input values for a function, while the range is the set of all possible output values.
59
three-valued logic (true/false/null)
A logic system used in SQL databases to handle comparisons involving NULL values, where the result can be true, false, or unknown (represented as NULL).
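The behaviour can be observed directly by asking SQLite to evaluate a few expressions. This is a minimal sketch: the `evaluate` helper is a hypothetical convenience, and SQLite reports true as 1, false as 0, and unknown as NULL (`None` in Python):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def evaluate(expr):
    """Evaluate a constant SQL expression and return the Python value."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]

null_equals_null = evaluate("NULL = NULL")    # unknown, not true
null_is_null = evaluate("NULL IS NULL")       # IS NULL is the correct test
unknown_and_false = evaluate("NULL AND 0")    # false wins regardless of the unknown
unknown_or_true = evaluate("NULL OR 1")       # true wins regardless of the unknown
not_unknown = evaluate("NOT (NULL = 1)")      # negating unknown is still unknown
```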
60
regular expression
A sequence of characters that specifies a search pattern in text.
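For instance, a pattern for ISO-style dates could be sketched with Python's `re` module. The pattern and the sample text are illustrative only:

```python
import re

# Four digits, a hyphen, two digits, a hyphen, two digits, on word boundaries.
pattern = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

text = "Backups ran on 2024-01-05 and 2024-02-09, not on 24-1-5."
dates = pattern.findall(text)
```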
61
graph (directed, node/vertex, edge/arc)
A data structure consisting of nodes (or vertices) that represent entities, and edges (or arcs) that represent connections or relationships between them. A directed graph has edges with direction.
62
serialising (marshalling or pickling)
The process of converting a data structure or object into a format (like a string or byte stream) that can be stored or transmitted and reconstructed later.
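A minimal round trip using Python's standard `json` module (the `record` dictionary is made up for the example):

```python
import json

# An in-memory data structure.
record = {"id": 7, "tags": ["sql", "nosql"], "active": True}

# Serialise to a string that could be written to disk or sent over a network.
encoded = json.dumps(record)

# Deserialise to reconstruct an equal structure later.
decoded = json.loads(encoded)
```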
63
shredded
Describes semi-structured data, such as XML documents, that has been decomposed into a set of relational tables for storage in an RDBMS. Shredding is the process of performing that decomposition.
64
sharded
A type of database partitioning that separates very large databases into smaller, faster, more easily managed parts called data shards. It is a form of horizontal partitioning.
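One common way to decide which shard holds a row is to hash its key. The sketch below assumes hash-based placement and three in-memory dictionaries standing in for shards; `shard_for` and the user data are hypothetical, and real systems often use range-based or consistent-hashing schemes instead:

```python
import hashlib

def shard_for(key, shard_count):
    """Map a string key to a shard number in a stable, repeatable way."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

# Three dictionaries stand in for three separate database servers.
shards = [dict() for _ in range(3)]

users = [("u1", "Ada"), ("u2", "Bob"), ("u3", "Cy"), ("u4", "Di")]
for user_id, name in users:
    shards[shard_for(user_id, 3)][user_id] = name

def lookup(user_id):
    """Read a user by consulting only the shard that can hold the key."""
    return shards[shard_for(user_id, 3)].get(user_id)
```

Each row lives on exactly one shard, which is why sharding is described as horizontal partitioning.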
65
OLAP vs OLTP
OLTP (Online Transaction Processing) handles a large number of short, atomic transactions (e.g., banking). OLAP (Online Analytical Processing) involves complex queries over large amounts of data for analysis and business intelligence.
66
normal form
A property of a relational schema that indicates its quality and freedom from certain types of data redundancy and update anomalies. Common forms are 1NF, 2NF, 3NF, BCNF.
67
fixed-point
In a process of iterative computation, a state that does not change with further iterations.
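A familiar database example is computing the transitive closure of a graph, where pairs are added until an iteration adds nothing new. A minimal sketch, with `transitive_closure` and the edge set as hypothetical names:

```python
def transitive_closure(edges):
    """Repeatedly add implied pairs until the set stops changing."""
    closure = set(edges)
    while True:
        new_pairs = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        }
        if new_pairs <= closure:  # nothing new: the fixed point is reached
            return closure
        closure |= new_pairs

reachable = transitive_closure({(1, 2), (2, 3), (3, 4)})
```

Running the function again on its own output returns the same set, which is what makes the result a fixed point.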
68
data redundancy
The condition in which the same piece of data is held in more than one place within a database, so that an update to one copy but not the others can leave the data inconsistent.
69
scalar reduction
An operation that processes a set of values to return a single summary value, such as SUM(), COUNT(), or AVG().
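The idea of folding many values into one can be sketched with `functools.reduce`; the `values` list is arbitrary sample data:

```python
from functools import reduce

values = [4, 8, 15]

# Each reduction folds the whole collection into a single scalar.
total = reduce(lambda acc, v: acc + v, values, 0)   # like SUM()
count = reduce(lambda acc, _: acc + 1, values, 0)   # like COUNT()
average = total / count                             # like AVG()
```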