SQL Interview Problems Flashcards

(29 cards)

1
Q

How do you find the top N records per group in SQL?

A

You use a window function like ROW_NUMBER or RANK partitioned by the group and ordered by the value you care about. Then you keep only rows where the rank is less than or equal to N. This lets each group independently pick its top records.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What problem does the Top-N-per-group query solve?

A

Sometimes you want the best results within each category rather than across the entire dataset. For example the top 3 products in each region.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

How can SQL detect duplicate rows?

A

You group rows by the columns that should be unique and count how many rows exist. If the count is greater than one then duplicates exist.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

How do you remove duplicates in SQL?

A

You usually assign ROW_NUMBER within groups of duplicate rows and keep only row number 1. This keeps one clean record while removing extra copies.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

How can SQL calculate a running total?

A

You use a window function with SUM and an ordered window. This adds values progressively so each row sees the sum of all previous rows.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is a running total useful for?

A

It helps track cumulative values over time like total sales growing day by day.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How do you find gaps in sequences in SQL?

A

You compare each row to the previous row using window functions like LAG. If the difference between numbers is larger than expected then a gap exists.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Why is detecting gaps useful?

A

It helps identify missing records such as skipped invoice numbers or missing dates in time series data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

How do you compare a row to the previous row in SQL?

A

You use the LAG window function. It lets the current row see values from earlier rows in the ordered dataset.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How do you compare a row to the next row in SQL?

A

You use the LEAD window function which looks forward to the next row in the sequence.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is cohort analysis?

A

Cohort analysis groups users based on a shared starting event such as signup date and tracks their behavior over time.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Why do analysts use cohort analysis?

A

It helps understand how user behavior changes over time instead of mixing users from different starting periods.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is sessionization in SQL?

A

Sessionization groups events that occur close in time into sessions. For example grouping website clicks into one browsing session.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

How does SQL detect sessions?

A

By comparing timestamps between events. If the gap between events exceeds a threshold like 30 minutes then a new session begins.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

How do you calculate retention in SQL?

A

You track users who return after their initial event by comparing activity dates to the original signup or first activity date.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is a pivot query?

A

A pivot reshapes rows into columns so categories become separate columns for easier comparison.

17
Q

What is an unpivot query?

A

An unpivot turns columns into rows so data becomes easier to analyze in aggregated form.

18
Q

How do you find the second highest value in SQL?

A

You rank values using RANK or DENSE_RANK and then filter for rank equal to 2.

19
Q

How can SQL find missing dates in a dataset?

A

You generate a sequence of expected dates and then compare it to existing records using joins to find dates that are missing.

20
Q

What is a self join?

A

A self join joins a table to itself so rows can be compared within the same dataset.

21
Q

Why are self joins useful?

A

They allow comparisons like employee-manager relationships or detecting relationships between rows in the same table.

22
Q

What is a recursive query?

A

A recursive query repeatedly runs a query using its previous result. It is useful for hierarchical data like organizational trees.

23
Q

What is hierarchical data in databases?

A

Data where rows relate in parent-child relationships such as company org charts or folder structures.

24
Q

What is a moving average query?

A

A moving average calculates the average value across a sliding window of rows such as the last 7 days.

25
Why do analysts use moving averages?
They smooth out short-term fluctuations to reveal longer-term trends.
26
What is a funnel analysis query?
A funnel query measures how many users complete each step in a process such as signup → activation → purchase.
27
Why is funnel analysis important?
It reveals where users drop out of a process so companies know where improvements are needed.
28
What is a churn query?
A churn query identifies users who stopped using a service by detecting long inactivity periods.
29
Why is churn analysis valuable?
Because losing users directly impacts revenue and growth