What is the difference between a primary key and a foreign key?
A primary key uniquely identifies each record in a table and cannot contain NULL values. A foreign key is a field in one table that references the primary key of another table, establishing relationships between tables.
Explain the concept of normalization in databases.
Normalization is the process of organizing data to reduce redundancy and improve data integrity. It involves dividing large tables into smaller, related tables and defining relationships between them through foreign keys.
What are the differences between INNER JOIN and LEFT JOIN?
INNER JOIN returns only records that have matching values in both tables. LEFT JOIN returns all records from the left table and matching records from the right table; unmatched rows show NULL for right table columns.
Explain the difference between WHERE and HAVING clauses.
WHERE filters rows before grouping and is used with individual columns. HAVING filters grouped results after GROUP BY and works with aggregate functions like COUNT or SUM.
What does the DISTINCT keyword do in SQL?
DISTINCT removes duplicate rows from query results, returning only unique values. It can be applied to one or multiple columns to show unique combinations.
Name three common SQL data types and their purposes.
VARCHAR for variable-length text, INTEGER for whole numbers, and DATE for date values. Other types include DECIMAL for precise numbers, BOOLEAN for true/false, and TIMESTAMP for date and time.
What is a NULL value and how does it differ from zero or empty string?
NULL represents the absence of data or unknown value, different from zero (a number) or empty string (text with no characters). NULL comparisons use IS NULL or IS NOT NULL, not equals operator.
What are table constraints and give examples.
Constraints enforce data quality rules. Examples include PRIMARY KEY (uniqueness), FOREIGN KEY (referential integrity), NOT NULL (required data), UNIQUE (no duplicates), CHECK (condition validation), and DEFAULT (preset values).
What is the purpose of GROUP BY in SQL?
GROUP BY combines rows with identical values in specified columns into summary rows. It’s typically used with aggregate functions like COUNT, SUM, AVG, MIN, MAX to calculate statistics per group.
Explain the difference between COUNT(*), COUNT(column), and COUNT(DISTINCT column).
COUNT(*) counts all rows including those with NULL. COUNT(column) counts non-NULL values in that column. COUNT(DISTINCT column) counts unique non-NULL values.
What aggregate functions are commonly used in data analysis?
SUM adds values, AVG calculates average, MIN finds minimum, MAX finds maximum, COUNT counts records, and STDDEV calculates standard deviation. These reduce multiple rows to a single result.
What is a subquery and when would you use one?
A subquery is a query nested inside another query. Use subqueries to filter data based on aggregate results, return multiple values for IN clauses, or break complex queries into manageable parts.
Explain the difference between a subquery in SELECT vs WHERE clause.
SELECT subqueries return single values as computed columns. WHERE subqueries filter rows based on conditions. SELECT subqueries run once; WHERE subqueries may run multiple times, affecting performance.
What is a Common Table Expression (CTE) and its advantage?
A CTE (WITH clause) creates a temporary result set used within a query. Advantages include improved readability for complex queries, easier debugging, and ability to reference the same result multiple times.
What does UNION do and how is it different from UNION ALL?
UNION combines results from multiple queries and removes duplicates. UNION ALL combines results but keeps all duplicates. UNION ALL is faster since it skips the duplicate-removal step.
What is a self-join and provide an example use case.
Self-join joins table to itself to find relationships between rows. Example: finding employees and their managers by joining employee table to itself on manager_id, or finding customers who made consecutive purchases.
Explain what a cross join produces.
Cross join (Cartesian product) returns all combinations of rows from two tables. Results in rows equal to table1_rows × table2_rows. Use cautiously as it can produce massive result sets.
What string functions are commonly used in SQL?
Common functions include: UPPER/LOWER for case conversion, SUBSTRING/SUBSTR for extracting portions, CONCATENATE or || for joining strings, LENGTH/LEN for string length, TRIM for removing spaces, REPLACE for substitution.
Explain date functions and their importance.
Date functions extract/manipulate temporal data. Common ones: DATEADD/DATEDIFF calculate intervals, EXTRACT gets components, DATE_FORMAT converts format, GETDATE/NOW gets current date. Essential for time-based analysis.
What is the CASE statement and why is it useful?
CASE statement implements conditional logic in queries: CASE WHEN condition THEN value ELSE alternative END. Use for creating categories, applying business rules, or conditional calculations without leaving SQL.
What is a schema and what role does it serve?
Schema is logical organization of database objects like tables, views, indexes, and functions. It provides structure, organizes related objects, enables access control, and allows multiple logical databases on one server.
Explain the concept of referential integrity.
Referential integrity ensures relationships between tables remain consistent. Foreign keys enforce that related records exist in parent tables. Prevents orphaned records, maintains data accuracy, and is enforced through constraints.
What is the difference between TRUNCATE and DELETE?
DELETE removes specific rows and can be rolled back with WHERE conditions. TRUNCATE removes all rows faster but cannot be filtered and uses less transaction log space. DELETE triggers row-by-row, TRUNCATE deallocates data pages.
Explain RIGHT JOIN and FULL OUTER JOIN.
RIGHT JOIN returns all records from the right table and matching records from the left table. FULL OUTER JOIN returns all records from both tables, with NULL where matches don’t exist. FULL OUTER JOIN combines benefits of LEFT and RIGHT.