SQL + Database Fundamentals Flashcards

(32 cards)

1
Q

What is the difference between a primary key and a foreign key?

A

A primary key uniquely identifies each record in a table and cannot contain NULL values. A foreign key is a field in one table that references the primary key of another table, establishing relationships between tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Explain the concept of normalization in databases.

A

Normalization is the process of organizing data to reduce redundancy and improve data integrity. It involves dividing large tables into smaller, related tables and defining relationships between them through foreign keys.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are the differences between INNER JOIN and LEFT JOIN?

A

INNER JOIN returns only records that have matching values in both tables. LEFT JOIN returns all records from the left table and matching records from the right table; unmatched rows show NULL for right table columns.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Explain the difference between WHERE and HAVING clauses.

A

WHERE filters rows before grouping and is used with individual columns. HAVING filters grouped results after GROUP BY and works with aggregate functions like COUNT or SUM.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What does the DISTINCT keyword do in SQL?

A

DISTINCT removes duplicate rows from query results, returning only unique values. It can be applied to one or multiple columns to show unique combinations.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Name three common SQL data types and their purposes.

A

VARCHAR for variable-length text, INTEGER for whole numbers, and DATE for date values. Other types include DECIMAL for precise numbers, BOOLEAN for true/false, and TIMESTAMP for date and time.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is a NULL value and how does it differ from zero or empty string?

A

NULL represents the absence of data or unknown value, different from zero (a number) or empty string (text with no characters). NULL comparisons use IS NULL or IS NOT NULL, not equals operator.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are table constraints and give examples.

A

Constraints enforce data quality rules. Examples include PRIMARY KEY (uniqueness), FOREIGN KEY (referential integrity), NOT NULL (required data), UNIQUE (no duplicates), CHECK (condition validation), and DEFAULT (preset values).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is the purpose of GROUP BY in SQL?

A

GROUP BY combines rows with identical values in specified columns into summary rows. It’s typically used with aggregate functions like COUNT, SUM, AVG, MIN, MAX to calculate statistics per group.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Explain the difference between COUNT(*), COUNT(column), and COUNT(DISTINCT column).

A

COUNT(*) counts all rows including those with NULL. COUNT(column) counts non-NULL values in that column. COUNT(DISTINCT column) counts unique non-NULL values.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What aggregate functions are commonly used in data analysis?

A

SUM adds values, AVG calculates average, MIN finds minimum, MAX finds maximum, COUNT counts records, and STDDEV calculates standard deviation. These reduce multiple rows to a single result.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is a subquery and when would you use one?

A

A subquery is a query nested inside another query. Use subqueries to filter data based on aggregate results, return multiple values for IN clauses, or break complex queries into manageable parts.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Explain the difference between a subquery in SELECT vs WHERE clause.

A

SELECT subqueries return single values as computed columns. WHERE subqueries filter rows based on conditions. SELECT subqueries run once; WHERE subqueries may run multiple times, affecting performance.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is a Common Table Expression (CTE) and its advantage?

A

A CTE (WITH clause) creates a temporary result set used within a query. Advantages include improved readability for complex queries, easier debugging, and ability to reference the same result multiple times.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What does UNION do and how is it different from UNION ALL?

A

UNION combines results from multiple queries and removes duplicates. UNION ALL combines results but keeps all duplicates. UNION ALL is faster since it skips the duplicate-removal step.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is a self-join and provide an example use case.

A

Self-join joins table to itself to find relationships between rows. Example: finding employees and their managers by joining employee table to itself on manager_id, or finding customers who made consecutive purchases.

17
Q

Explain what a cross join produces.

A

Cross join (Cartesian product) returns all combinations of rows from two tables. Results in rows equal to table1_rows × table2_rows. Use cautiously as it can produce massive result sets.

18
Q

What string functions are commonly used in SQL?

A

Common functions include: UPPER/LOWER for case conversion, SUBSTRING/SUBSTR for extracting portions, CONCATENATE or || for joining strings, LENGTH/LEN for string length, TRIM for removing spaces, REPLACE for substitution.

19
Q

Explain date functions and their importance.

A

Date functions extract/manipulate temporal data. Common ones: DATEADD/DATEDIFF calculate intervals, EXTRACT gets components, DATE_FORMAT converts format, GETDATE/NOW gets current date. Essential for time-based analysis.

20
Q

What is the CASE statement and why is it useful?

A

CASE statement implements conditional logic in queries: CASE WHEN condition THEN value ELSE alternative END. Use for creating categories, applying business rules, or conditional calculations without leaving SQL.

21
Q

What is a schema and what role does it serve?

A

Schema is logical organization of database objects like tables, views, indexes, and functions. It provides structure, organizes related objects, enables access control, and allows multiple logical databases on one server.

22
Q

Explain the concept of referential integrity.

A

Referential integrity ensures relationships between tables remain consistent. Foreign keys enforce that related records exist in parent tables. Prevents orphaned records, maintains data accuracy, and is enforced through constraints.

23
Q

What is the difference between TRUNCATE and DELETE?

A

DELETE removes specific rows and can be rolled back with WHERE conditions. TRUNCATE removes all rows faster but cannot be filtered and uses less transaction log space. DELETE triggers row-by-row, TRUNCATE deallocates data pages.

24
Q

Explain RIGHT JOIN and FULL OUTER JOIN.

A

RIGHT JOIN returns all records from the right table and matching records from the left table. FULL OUTER JOIN returns all records from both tables, with NULL where matches don’t exist. FULL OUTER JOIN combines benefits of LEFT and RIGHT.

25
What is a view and when would you use one?
A view is a virtual table based on a query result. Use views to simplify complex queries, enforce security by limiting column access, provide consistent interfaces, or improve query maintainability.
26
Explain the difference between a clustered and non-clustered index.
Clustered index defines physical order of rows and there is only one per table. Non-clustered index creates separate structure referencing data and multiple can exist. Clustered is faster for range queries.
27
What is the ORDER BY clause and how does it work?
ORDER BY sorts query results by specified columns in ascending (ASC) or descending (DESC) order. Can sort by multiple columns with priority given to first column listed. Essential for organizing results meaningfully.
28
What is the LIMIT clause used for?
LIMIT restricts number of rows returned by a query. Use LIMIT 10 to get first 10 rows, or LIMIT 10 OFFSET 20 to skip first 20 rows and get next 10. Essential for pagination and performance.
29
Explain the concept of database transactions.
Transactions group SQL statements into units where all succeed or all fail. They maintain ACID properties: Atomicity (all-or-nothing), Consistency, Isolation, Durability. Use COMMIT to save changes or ROLLBACK to undo.
30
What is the difference between INNER JOIN and FULL OUTER JOIN?
INNER JOIN returns only matching records from both tables. FULL OUTER JOIN returns all records from both tables with NULLs where matches don't exist. FULL OUTER JOIN is superset of INNER JOIN results.
31
How would you identify and fix a Cartesian product problem?
Cartesian products occur when JOIN conditions are missing or incorrect, creating massive row counts. Fix by: verifying JOIN conditions are correct, checking for missing ON clauses, using INNER JOIN instead of comma-separated tables.
32
What is the difference between HAVING and WHERE in terms of order of execution?
WHERE filters rows BEFORE grouping occurs in a query. HAVING filters grouped results AFTER aggregation. This means WHERE is more efficient for filtering base data, HAVING filters aggregate results.