Data Structures Flashcards

(336 cards)

1
Q

What is a data structure?

A

A specific way of organizing and representing data for efficient operations

It defines how data is held and what operations can be performed on it.

2
Q

At the abstract level, what does a data structure specify?

A
  • What kind of data it holds
  • What operations you’re allowed to do
  • What guarantees those operations have

Examples include sets, maps, and sequences.

3
Q

What are examples of operations allowed on data structures?

A
  • insert(x)
  • delete(x)
  • find(k)
  • min()
  • push()
  • pop()
  • enqueue()
  • dequeue()

Each operation has specific pre-/postconditions.

4
Q

What are the roles of data structures?

A
  • stack
  • queue
  • set
  • map
  • priority queue
  • graph

These roles define the type of data structure and its intended use.

5
Q

What are two ways a list can be implemented?

A
  • dynamic array
  • linked list

The choice of implementation affects performance and memory usage.

6
Q

What are the possible implementations of a map?

A
  • hash table
  • balanced binary search tree
  • trie

Different implementations have varying time complexities and behaviors.

7
Q

What does time complexity refer to in data structures?

A

How an operation’s running time scales with the number of elements; e.g., find(k) in a hash map is usually O(1) on average

In a tree map, it’s O(log n).

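Python’s built-in dict is a hash map; the standard library has no tree map, so in this illustrative sketch a sorted list searched with bisect stands in for the O(log n) case:

```python
import bisect

# Hash map: average O(1) lookups.
ages = {"ada": 36, "alan": 41}
assert ages["ada"] == 36

# No built-in tree map in Python; a sorted list + bisect gives
# the same O(log n) search bound for comparison.
keys = ["alan", "grace"]
bisect.insort(keys, "ada")            # keeps the list sorted
i = bisect.bisect_left(keys, "ada")   # O(log n) binary search
assert keys[i] == "ada"
```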
8
Q

What is the difference between a data structure and an algorithm?

A
  • Data structure: how data is organized and operations supported
  • Algorithm: a procedure that uses data structures to solve a problem

Example: Dijkstra’s algorithm uses a priority queue.

9
Q

In one sentence, how can a data structure be defined?

A

A disciplined way of organizing data in memory with a defined set of operations

This ensures that important operations can be performed correctly and efficiently.

10
Q

What does ADT stand for in data structures?

A

Abstract Data Type

ADTs define the behavior of data structures in terms of their operations and properties.

11
Q

What are the axes used to compare different ADTs?

A
  • Ordering
  • Uniqueness
  • Access model
  • Typical operations & complexity
  • Typical implementations & tradeoffs
  • When to use vs when not to

These axes help in understanding the roles and functionalities of various data structures.

12
Q

Define the Sequence / List ADT.

A

A finite ordered collection of elements, where position matters.

Lists allow duplicates and provide access by position/index.

13
Q

What are the core operations of the Sequence / List ADT?

A
  • get(i)
  • set(i, x)
  • insert(i, x)
  • delete(i)
  • iterate
  • size()
  • isEmpty()

These operations allow manipulation and access of list elements.

14
Q

What is the typical complexity for get/set(i) in a dynamic array?

A

O(1)

This indicates constant time access for elements in a dynamic array.

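A quick illustration with Python’s list, which is a dynamic array underneath:

```python
# get/set by index is O(1): the element's slot is computed
# directly from the index, with no traversal.
a = [10, 20, 30]
assert a[1] == 20      # get(i)
a[1] = 99              # set(i, x)
assert a == [10, 99, 30]
a.append(40)           # amortized O(1) append at the end
assert a[-1] == 40
```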
15
Q

When should you use the Sequence / List ADT?

A
  • When you care about order
  • When you need index-based access
  • When you mostly append/iterate

Lists are not ideal for frequent random insertions at large scales.

16
Q

What is the Stack ADT characterized by?

A

A sequence where you only touch the top.

Stacks follow Last-In-First-Out (LIFO) access.

17
Q

What are the core operations of the Stack ADT?

A
  • push(x)
  • pop()
  • top() / peek()
  • isEmpty()

These operations manage the stack’s elements.

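A minimal sketch using a Python list as the stack; all four operations are O(1):

```python
stack = []
stack.append("a")          # push(x)
stack.append("b")
assert stack[-1] == "b"    # top() / peek()
assert stack.pop() == "b"  # pop() -- LIFO order
assert stack.pop() == "a"
assert not stack           # isEmpty()
```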
18
Q

What is the typical complexity for all core operations in a Stack ADT?

A

O(1)

All stack operations are performed in constant time.

19
Q

What is the Queue ADT defined as?

A

First in, first out (FIFO).

Queues manage elements in the order they arrive.

20
Q

What are the core operations of the Queue ADT?

A
  • enqueue(x)
  • dequeue()
  • front()

These operations allow adding and removing elements from the queue.

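A minimal sketch with collections.deque, which gives O(1) operations at both ends (a plain list would pay O(n) for pop(0)):

```python
from collections import deque

q = deque()
q.append("first")              # enqueue(x)
q.append("second")
assert q[0] == "first"         # front()
assert q.popleft() == "first"  # dequeue() -- FIFO order
assert q.popleft() == "second"
```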
21
Q

What is the typical complexity for all operations in a Queue ADT?

A

O(1)

This indicates that all queue operations can be performed in constant time.

22
Q

Define the Set ADT.

A

A collection of distinct elements. Membership matters; order doesn’t (conceptually).

Sets do not allow duplicates and focus on membership.

23
Q

What are the core operations of the Set ADT?

A
  • insert(x)
  • remove(x)
  • contains(x)
  • set ops: union, intersection, difference

These operations manage the elements and relationships within the set.

24
Q

What is the typical complexity for insert/remove/contains in a Hash set?

A

avg O(1), worst O(n)

This indicates average constant time complexity with potential for linear time in the worst case.

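These operations map directly onto Python’s built-in set, which is hash-based:

```python
s = set()
s.add(3)           # insert(x): average O(1)
s.add(3)           # duplicates are ignored
assert 3 in s      # contains(x)
s.discard(3)       # remove(x)
assert 3 not in s
assert {1, 2} | {2, 3} == {1, 2, 3}  # union
assert {1, 2} & {2, 3} == {2}        # intersection
```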
25
Q

What is the Multiset / Bag ADT?

A

A set that allows duplicates, but still doesn’t care about order.

Multisets track the frequency of elements.

26
Q

What are the core operations of the Multiset / Bag ADT?

A
  • add(x)
  • removeOne(x)
  • count(x)

These operations manage the multiplicity of elements in the multiset.
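collections.Counter behaves as a multiset; a small sketch of the three operations:

```python
from collections import Counter

bag = Counter()
bag["x"] += 1           # add(x)
bag["x"] += 1
assert bag["x"] == 2    # count(x)
bag["x"] -= 1           # removeOne(x)
assert bag["x"] == 1
```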
27
Q

Define the Map / Dictionary / Associative Array ADT.

A

Key → value mapping, where keys are (usually) unique.

Maps allow retrieval of values based on unique keys.

28
Q

What are the core operations of the Map ADT?

A
  • put(k, v) / insert / update
  • get(k)
  • remove(k)
  • containsKey(k)

These operations manage key-value pairs in the map.

29
Q

What is the typical complexity for get/put/remove in a Hash map?

A

Average O(1)

This indicates that hash maps provide average constant time complexity for these operations.
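Python’s dict illustrates these operations directly:

```python
m = {}
m["host"] = "localhost"       # put(k, v)
m["port"] = 8080
assert m.get("port") == 8080  # get(k)
assert "host" in m            # containsKey(k)
del m["host"]                 # remove(k)
assert "host" not in m
```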
30
Q

What is the Priority Queue ADT?

A

A collection where each element has a priority and you always remove the extremal one first (min or max).

Priority queues manage elements based on their priority rather than their order of arrival.

31
Q

What are the core operations of the Priority Queue ADT?

A
  • insert(x, priority)
  • findMin / findMax
  • extractMin / extractMax

These operations manage elements based on their priority.

32
Q

What is the typical complexity for insert in a Priority Queue using a binary heap?

A

O(log n)

This indicates logarithmic time complexity for inserting elements into the priority queue.
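A sketch with the heapq module, which maintains a binary min-heap on a plain list; here the priority is encoded as the first tuple element:

```python
import heapq

pq = []
heapq.heappush(pq, (2, "write"))  # insert(x, priority): O(log n)
heapq.heappush(pq, (1, "read"))
assert pq[0] == (1, "read")       # findMin: O(1)
assert heapq.heappop(pq) == (1, "read")   # extractMin: O(log n)
assert heapq.heappop(pq) == (2, "write")
```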
33
Q

Define the Disjoint-Set / Union–Find ADT.

A

Structure that maintains a partition of elements into disjoint groups, with fast findWhichGroup(x) and mergeGroups(x, y).

This structure is useful for managing connectivity in graphs.

34
Q

What are the core operations of the Disjoint-Set ADT?

A
  • makeSet(x)
  • find(x)
  • union(x, y)

These operations manage the groups and their relationships.

35
Q

What is the typical complexity for operations in a Disjoint-Set with union by rank + path compression?

A

Amortized nearly O(1) per op

This indicates very efficient operations for managing disjoint sets.
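A compact sketch of both optimizations (method names like make_set are illustrative, not a standard API):

```python
class DisjointSet:
    """Union-Find with union by rank and path compression:
    find/union run in nearly O(1) amortized time."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        self.parent.setdefault(x, x)
        self.rank.setdefault(x, 0)

    def find(self, x):
        # Path compression: point x directly at its root.
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        # Union by rank: attach the shallower tree under the deeper.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1

ds = DisjointSet()
for v in "abc":
    ds.make_set(v)
ds.union("a", "b")
assert ds.find("a") == ds.find("b")
assert ds.find("a") != ds.find("c")
```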
36
Q

What is the Tree ADT characterized by?

A

Nodes with parent-child relationships; often used as a search or hierarchical ADT.

Trees are useful for representing hierarchical data.

37
Q

What are the core operations of the Tree ADT?

A
  • insert(k, v)
  • delete(k)
  • find(k)
  • traversals: inorder, preorder, postorder

These operations manage the elements and structure of the tree.

38
Q

Define the Graph ADT.

A

Vertices + edges; general relationships, not just trees.

Graphs are used to model complex relationships between elements.

39
Q

What are the core operations of the Graph ADT?

A
  • Add/remove vertex/edge
  • neighbors(v)

These operations manage the vertices and edges in the graph.

40
Q

What are Spatial / Range Query ADTs designed for?

A

Structures designed to answer geometric / range queries efficiently.

These structures are optimized for spatial data and queries.
41
Q

What are Approximate / Probabilistic ADTs?

A

Structures that support queries like sets/maps, but with controlled errors for huge scale.

These structures are useful in big data contexts where exact answers are less critical.

42
Q

What is the primary use of a Stack?

A

Backtracking, “last thing done next”

Stacks are used for operations where the last element added is the first to be removed.

43
Q

What is a Queue/Deque used for?

A

Fair scheduling / pipelines / BFS

Queues are essential for managing tasks in a first-in-first-out manner.

44
Q

What is the main function of a Set?

A

Membership, uniqueness, set ops

Sets are used to store unique elements and perform operations like union and intersection.

45
Q

What does a Multiset provide in addition to a Set?

A

Membership + multiplicity (counts)

Multisets allow for the storage of duplicate elements with counts.

46
Q

What is the primary function of a Map?

A

Key → value lookup

Maps are used for associating keys with values for efficient retrieval.

47
Q

What is the purpose of a Priority Queue?

A

Always handle “best/cheapest/urgent” first

Priority queues are used in scenarios where certain tasks need to be prioritized over others.

48
Q

What does a Disjoint-Set manage?

A

Groups/connected components under merges

Disjoint-sets are useful for tracking a set of elements partitioned into disjoint subsets.

49
Q

What is the main characteristic of a Tree (search)?

A

Sorted keys + range queries + log-time operations

Trees allow for efficient searching, insertion, and deletion operations.

50
Q

What does a Graph represent?

A

Arbitrary relationships, paths, connectivity

Graphs are used to model pairwise relationships between objects.
51
Q

What is the function of a Spatial Index?

A

Nearest neighbors, range queries in multi-dimensional space

Spatial indexes optimize queries related to spatial data.

52
Q

What do Approximate structures provide?

A

Huge-scale membership/count/frequency with bounded errors

Approximate structures are used when exactness is less critical than performance.

53
Q

What are the three big topologies to compare in data structures?

A
  • Linear
  • Hierarchical
  • General graph

These topologies help in understanding how elements are connected and the types of relationships that can be efficiently queried.

54
Q

What is the shape of linear structures?

A
  • Elements in a single line
  • Each element (except ends) has one predecessor and one successor

Examples include arrays, linked lists, stacks, and queues.

55
Q

What are linear structures good at? List at least three.

A
  • Sequential processing
  • Index-based access
  • FIFO / LIFO disciplines

They are simple and have low overhead, especially contiguous arrays.

56
Q

What are the costs/limits of linear structures?

A
  • Cannot represent branching relationships
  • Insert/delete costs are O(n) for arrays
  • Search operations are O(n) unless sorted

These limitations affect their expressiveness and efficiency.

57
Q

When are linear structures the right shape?

A
  • When you mostly scan data
  • When you push/pop from ends
  • When you need fast contiguous access

They are not suitable for complex relationships or frequent middle insertions.

58
Q

What is the shape of hierarchical structures?

A
  • Rooted hierarchy
  • One root, zero or more children per node
  • No cycles

Examples include binary trees, heaps, and tries.

59
Q

What are hierarchical structures good at? List at least three.

A
  • Hierarchical relationships
  • Ordered data & range queries
  • Priority operations

They efficiently manage parent-child relationships and allow for logarithmic search operations.

60
Q

What are the costs/limits of hierarchical structures?

A
  • More complex invariants
  • Pointer-heavy implementations
  • Single-parent constraint

These factors can impact performance and memory usage.
61
Q

When are hierarchical structures the right shape?

A
  • When you care about sorted order
  • When you need logarithmic insert/delete/search
  • When you require range queries

They are suitable for natural hierarchies like organizational charts.

62
Q

What is the shape of general graphs?

A
  • Vertices (nodes) + edges (connections)
  • Directed/undirected edges
  • Can have cycles and arbitrary connections

Trees are a special case of graphs.

63
Q

What are general graphs good at? List at least three.

A
  • Arbitrary relationships
  • Topology-aware algorithms
  • Modeling real-world systems

They are useful for complex networks and systems with many-to-many connections.

64
Q

What are the costs/limits of general graphs?

A
  • Highest representational overhead
  • Algorithmic complexity depends on both V and E
  • More complex reasoning required

These factors can complicate implementation and analysis.

65
Q

When are general graphs the right shape?

A
  • When you care about connectivity and paths
  • When there are multiple relationships between entities
  • When you need non-hierarchical topologies

They are ideal for complex systems like social networks.

66
Q

What is the typical complexity for scanning in linear structures?

A

O(n)

This reflects the time taken to process each element in a linear structure.

67
Q

What is the typical complexity for search/insert/delete in balanced trees?

A

O(log n)

This efficiency is due to the hierarchical nature of trees.

68
Q

What is the typical complexity for traversal in graphs?

A

O(V + E)

This complexity accounts for both vertices and edges in the graph.

69
Q

What is the memory behavior of linear structures?

A

Best cache locality, minimal overhead

This makes them ideal for performance-sensitive applications.

70
Q

What is the ease of reasoning for linear structures?

A

Easiest: trivial invariant, simple index arithmetic

This simplicity aids in implementation.

71
Q

What are the heuristics for choosing a topology for a problem?

A
  • Next in line? → Linear
  • Natural hierarchy? → Tree
  • Many-to-many relationships? → Graph

These questions guide the selection of the appropriate data structure.
72
Q

What is meant by storage strategy in the context of data structures?

A

How the bytes are laid out and linked together

Different strategies massively change performance, memory use, and complexity.

73
Q

List the main axes to compare different storage strategies.

A
  • Access time
  • Insert/delete behavior
  • Cache / memory behavior
  • Overhead & complexity
  • Typical use cases

These axes help evaluate the efficiency and suitability of various storage strategies.

74
Q

What is the strategy for contiguous storage?

A
  • Elements stored in one continuous block of memory
  • Dynamic versions over-allocate and resize when needed

Examples include C arrays, std::vector, Java ArrayList, and Python list.

75
Q

What is the access time for indexing in contiguous storage?

A

O(1)

Access is calculated using the formula: addr = base + i * size.
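Python’s array module exposes both ingredients of that formula; base and the element size below come from its real buffer_info() and itemsize APIs:

```python
from array import array

# A contiguous array computes each element's address directly:
# addr = base + i * size, so indexing is O(1).
a = array("i", [10, 20, 30])
base, length = a.buffer_info()  # (address of first element, #elements)
assert length == 3
assert a.itemsize >= 2          # bytes per element (platform-dependent)
assert a[2] == 30               # one address computation, no traversal
```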
76
Q

What are the pros of contiguous storage?

A
  • Best cache locality
  • Very simple implementation
  • Ideal for hot inner loops and dense collections

Contiguous storage is efficient for scenarios requiring fast indexed access.

77
Q

What are the cons of contiguous storage?

A
  • Expensive insert/delete in the middle
  • Need to occasionally reallocate & copy
  • Hard to share subranges without copying

These limitations can affect performance in certain use cases.

78
Q

What is the strategy for linked/node-based storage?

A
  • Each element lives in its own node allocated separately
  • Nodes have pointers to neighbors/children

Examples include singly/doubly linked lists and pointer-based trees.

79
Q

What is the access time for random access by index in linked storage?

A

O(n)

Access requires traversal through the nodes.

80
Q

What are the pros of linked/node-based storage?

A
  • Cheap structural changes
  • Flexible shapes (trees, graphs)
  • Can grow/shrink without moving existing nodes

This flexibility is beneficial for dynamic data structures.

81
Q

What are the cons of linked/node-based storage?

A
  • Slow iteration compared to arrays
  • More complex memory management
  • Easier to produce fragmentation and pointer bugs

These issues can lead to performance degradation.

82
Q

What is the strategy for direct-address tables?

A
  • Use the key directly as an index into an array
  • Domain must be small and dense enough

Examples include boolean visited arrays and simple lookups on small integer keys.
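A minimal sketch of a direct-address table, with MAX_ID as a hypothetical small key domain:

```python
# The key itself is the array index: O(1) worst case, but the
# table must reserve a slot for every possible key.
MAX_ID = 100                  # assumed small, dense key domain
visited = [False] * (MAX_ID + 1)

visited[42] = True            # insert(42)
assert visited[42] is True    # contains(42)
assert visited[7] is False
visited[42] = False           # remove(42)
```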
83
Q

What is the access time for operations in direct-address tables?

A

O(1) worst-case

This allows for very fast lookups.

84
Q

What are the pros of direct-address tables?

A
  • Fastest possible lookup
  • Simple and predictable

Direct-address tables are efficient for small, dense integer keys.

85
Q

What are the cons of direct-address tables?

A
  • Not scalable if key space is large
  • Wastes memory for unused slots

These limitations restrict their applicability.

86
Q

What is the strategy for hash-based storage?

A
  • Uses hash functions to map keys to indices
  • Two main bucket strategies: separate chaining and open addressing

Hash tables are widely used for efficient key-based lookups.

87
Q

What is the average-case access time for hash-based storage?

A

O(1)

This assumes a good hash function and load factor.

88
Q

What are the pros of hash-based storage?

A
  • Very fast expected time for key lookups
  • Go-to for membership and dictionaries

Hash-based structures are efficient for many applications.

89
Q

What are the cons of hash-based storage?

A
  • Quality of hash function matters
  • Resizing and load-factor tuning add complexity
  • Poor range/query support

These factors can complicate implementation and performance.

90
Q

What is the strategy for tree/index-structured storage?

A
  • Elements stored in nodes arranged to maintain a search invariant
  • Each node stores keys and child pointers

Examples include balanced trees and B-trees.

91
Q

What is the access time for search/insert/delete in balanced trees?

A

O(log n)

This allows for efficient operations on sorted data.

92
Q

What are the pros of tree/index-structured storage?

A
  • Naturally sorted storage
  • Efficient range queries
  • More predictable worst-case than hash tables

These features make trees suitable for many applications.

93
Q

What are the cons of tree/index-structured storage?

A
  • More complex implementation
  • Slightly higher constant factors for random lookups
  • Pointer-heavy versions have poorer cache behavior

These drawbacks can impact performance.

94
Q

What is the strategy for blocked/page-based storage?

A
  • Data grouped into blocks/pages
  • Minimize I/O operations rather than CPU cycles

This strategy is essential for external memory systems.
95
Q

What is the access complexity in blocked/page-based storage?

A

Measured in page reads/writes

This focuses on I/O complexity.

96
Q

What are the pros of blocked/page-based storage?

A
  • Essential when data doesn’t fit in RAM
  • Yields huge performance gains over naive storage

Proper use of this strategy can significantly improve efficiency.

97
Q

What are the cons of blocked/page-based storage?

A
  • More complex design and tuning
  • Latency dominated by disks/SSDs

These factors can complicate implementation.

98
Q

What is the strategy for packed/compressed storage?

A
  • Store data in tightly packed or compressed form
  • Save memory and improve cache behavior

Examples include bitsets and columnar storage.

99
Q

What are the pros of packed/compressed storage?

A
  • Great for huge datasets
  • Useful for analytics and bitmap indexes

This strategy is effective for memory and bandwidth constraints.

100
Q

What are the cons of packed/compressed storage?

A
  • More complex to modify in-place
  • Harder to implement than plain arrays

These challenges can limit usability.

101
Q

What is the strategy for indirection/handle-based storage?

A
  • Expose handles instead of raw pointers
  • Store objects in arrays or pools

This method is common in performance-critical systems.

102
Q

What are the pros of indirection/handle-based storage?

A
  • Encapsulates storage
  • Good locality & fewer pointer bugs

This approach enhances safety and performance.

103
Q

What are the cons of indirection/handle-based storage?

A
  • Extra level of indirection
  • Need to manage handle lifecycle

These factors can complicate implementation.

104
Q

What are the key differences between in-memory, on-disk, and hybrid storage?

A
  • In-memory: Tuned for CPU and caches
  • On-disk: Tuned for I/O pattern minimization
  • Hybrid: Structures in memory mirror disk-backed structures

Each type has its own performance characteristics and use cases.

105
Q

What is the general tradeoff for resizing in hash-based storage?

A

Must rehash when load factor passes a threshold

This operation is O(n) but amortized.

106
Q

What is the first access strategy described?

A

Positional / Random Access (by index)

This strategy answers the question: “What’s the element at position i?”
107
Q

What is the typical data structure for Positional / Random Access?

A
  • Arrays
  • Dynamic arrays (e.g., vector, ArrayList, Python list)
  • Fixed-size buffers

These structures allow for true random access.

108
Q

What is the complexity for access/update by index in Positional / Random Access?

A

O(1)

This indicates true random access efficiency.

109
Q

What are the strengths of Positional / Random Access?

A
  • Ideal for numerical work
  • Excellent cache behavior
  • Fast sequential scans

This access strategy is particularly effective for dense sequences and time series.

110
Q

What are the weaknesses of Positional / Random Access?

A
  • Arbitrary index semantics
  • O(n) for insert/delete in the middle
  • Not good for membership checks without extra indexing

These weaknesses limit its use in certain scenarios.

111
Q

What is the second access strategy described?

A

Sequential Access (iteration-only)

This strategy answers the question: “Give me the next element, then the next…”

112
Q

What is the typical data structure for Sequential Access?

A
  • Linked lists
  • Streams
  • Iterators
  • File handles

These structures support iteration without random access.

113
Q

What is the complexity for finding an element at index i in Sequential Access?

A

O(i)

This indicates that accessing elements by index is inefficient.

114
Q

What are the strengths of Sequential Access?

A
  • Works well with streaming data
  • Supports iterable-only interfaces

This access strategy is beneficial for lazy structures.

115
Q

What are the weaknesses of Sequential Access?

A
  • Poor random access
  • Limited ability for binary search

These limitations affect its efficiency in certain applications.

116
Q

What is the third access strategy described?

A

Associative / Keyed Access (by key)

This strategy answers the question: “What is the value associated with key k?”

117
Q

What is the typical data structure for Associative / Keyed Access?

A
  • Hash maps
  • Tree maps
  • Dictionaries
  • Symbol tables

These structures allow for efficient key-based lookups.
118
Q

What is the average complexity for a hash map in Associative / Keyed Access?

A

O(1)

This indicates average-case efficiency for lookups.

119
Q

What are the strengths of Associative / Keyed Access?

A
  • Natural for configurations and caches
  • Directly ask for items without scanning

This access strategy simplifies lookups.

120
Q

What are the weaknesses of Associative / Keyed Access?

A
  • No natural order
  • Poor support for range queries

These weaknesses can limit its use in ordered data scenarios.

121
Q

What is the fourth access strategy described?

A

Membership Access (set-style)

This strategy answers the question: “Is x in this collection?”

122
Q

What is the typical data structure for Membership Access?

A
  • Sets
  • Hash sets
  • Tree sets
  • Bitsets
  • Bloom filters

These structures are optimized for membership checks.

123
Q

What is the average complexity for membership checks in a hash set?

A

O(1)

This indicates efficient membership verification.

124
Q

What are the strengths of Membership Access?

A
  • Fast membership checks
  • Uniqueness enforcement

This access strategy is ideal for deduplication tasks.

125
Q

What are the weaknesses of Membership Access?

A
  • No positional semantics
  • Not optimized for retrieving ordered elements

These limitations affect its utility in certain contexts.

126
Q

What is the fifth access strategy described?

A

Ordered / Range Access

This strategy answers questions like: “What’s the smallest/largest element?”

127
Q

What is the typical data structure for Ordered / Range Access?

A
  • Balanced BST (AVL, Red–Black)
  • B-trees
  • Sorted arrays

These structures support ordered queries and range searches.

128
Q

What is the complexity for search/insert/delete in tree-based structures for Ordered / Range Access?

A

O(log n)

This indicates efficient operations in balanced trees.
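For the sorted-array case, Python’s bisect module shows both the O(log n) search and a range query:

```python
import bisect

xs = [3, 7, 7, 12, 20, 31]            # sorted array
lo = bisect.bisect_left(xs, 7)        # first index with value >= 7
hi = bisect.bisect_right(xs, 20)      # first index with value > 20
assert xs[lo:hi] == [7, 7, 12, 20]    # all elements in [7, 20]
```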
129
Q

What are the strengths of Ordered / Range Access?

A
  • Enables range queries
  • Supports sorted traversals

This access strategy is essential for databases and ordered data.

130
Q

What are the weaknesses of Ordered / Range Access?

A
  • More complex than hash sets/maps
  • Slower for point lookups compared to hash maps

These weaknesses can impact performance in certain scenarios.

131
Q

What is the sixth access strategy described?

A

Priority / Extremal Access

This strategy answers the question: “What’s the highest/lowest-priority element right now?”

132
Q

What is the typical data structure for Priority / Extremal Access?

A
  • Heaps (binary heap, Fibonacci heap)
  • Specialized priority queues

These structures are designed for priority management.

133
Q

What is the complexity for insert and extractMin in a binary heap?

A

O(log n)

This indicates efficient priority operations.

134
Q

What are the strengths of Priority / Extremal Access?

A
  • Natural for schedulers and event queues
  • Avoids full sorts for next best elements

This access strategy is useful in pathfinding algorithms.

135
Q

What are the weaknesses of Priority / Extremal Access?

A
  • Does not allow easy access to arbitrary elements
  • Not built for general-purpose storage

These limitations restrict its application in some contexts.

136
Q

What is the seventh access strategy described?

A

Hierarchical / Structural Access

This strategy answers questions like: “What are this node’s children/parent?”

137
Q

What is the typical data structure for Hierarchical / Structural Access?

A
  • Trees (DOM, scene graphs, org charts)
  • File hierarchies

These structures naturally express hierarchies.

138
Q

What are the strengths of Hierarchical / Structural Access?

A
  • Naturally expresses hierarchies and containment
  • Good for recursive algorithms

This access strategy is effective for tree traversals.

139
Q

What are the weaknesses of Hierarchical / Structural Access?

A
  • No direct O(1) access by ID unless indexed
  • Performance depends on tree shape

These weaknesses can affect efficiency in certain scenarios.
140
Q

What is the eighth access strategy described?

A

Graph / Neighbor Access

This strategy answers questions like: “What are the neighbors of this node?”

141
Q

What is the typical data structure for Graph / Neighbor Access?

A
  • Adjacency lists
  • Adjacency matrices
  • Edge lists

These structures represent relationships between nodes.

142
Q

What is the complexity for traversals in Graph / Neighbor Access?

A

O(V + E)

This indicates the efficiency of graph traversal algorithms.

143
Q

What are the strengths of Graph / Neighbor Access?

A
  • Represents arbitrary relationships
  • Supports rich topology queries

This access strategy is essential for network analysis.

144
Q

What are the weaknesses of Graph / Neighbor Access?

A
  • More expensive operations than simple maps/sets/lists
  • Harder to reason about

These limitations can complicate implementation.

145
Q

What is the ninth access strategy described?

A

Spatial / Range-in-Space Access

This strategy answers questions like: “What’s near this point?”

146
Q

What is the typical data structure for Spatial / Range-in-Space Access?

A
  • k-d trees
  • Quadtrees
  • R-trees

These structures are optimized for spatial queries.

147
Q

What are the strengths of Spatial / Range-in-Space Access?

A
  • Critical for physics engines and GIS
  • Efficient for collision detection

This access strategy is vital for spatial analysis.

148
Q

What are the weaknesses of Spatial / Range-in-Space Access?

A
  • Complex invariants and tuning
  • Sensitive to skewed distributions

These limitations can affect performance in certain applications.

149
Q

What is the tenth access strategy described?

A

Approximate / Probabilistic Access

This strategy answers questions like: “Is x probably in this set?”

150
Q

What is the typical data structure for Approximate / Probabilistic Access?

A
  • Bloom filters
  • Counting Bloom filters
  • HyperLogLog

These structures provide memory-efficient approximations.
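A toy Bloom filter sketch; the sizes m and k below are illustrative, whereas real implementations derive them from the expected element count and target false-positive rate:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: 'no' answers are always correct,
    'yes' answers may be false positives."""

    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k   # bit count and hash count (assumed)
        self.bits = 0           # a big int used as a bit array

    def _positions(self, item):
        # Derive k positions by salting one cryptographic hash.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item):
        return all(self.bits & (1 << p) for p in self._positions(item))

bf = BloomFilter()
bf.add("alice")
assert bf.might_contain("alice")   # added items always report True
```

Note that elements cannot be removed: clearing a bit might also erase evidence of another element, which is why counting Bloom filters exist.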
151
What is the **average complexity** for query/update in Approximate / Probabilistic Access?
O(1) ## Footnote This indicates efficient operations for large datasets.
152
What are the **strengths** of Approximate / Probabilistic Access?
* Tiny memory usage * Suitable for big data applications ## Footnote This access strategy is effective for streaming analytics.
153
What are the **weaknesses** of Approximate / Probabilistic Access?
* Inexact results * Not suitable for exactness-required scenarios ## Footnote These limitations can restrict its use in certain contexts.
154
What does **unordered** order semantics mean?
* No defined iteration order * Implementation detail * Must not rely on iteration order ## Footnote Examples include Abstract Set ADT and hash-based sets/maps.
155
What are the **pros** of unordered order semantics?
* Fast membership / lookup * Spatial locality / hashing tricks * Free to change layout without breaking callers ## Footnote This allows for efficient implementations focused on membership checks.
156
What are the **cons** of unordered order semantics?
* Cannot depend on iteration order for correctness * Cannot find the smallest element without extra structure * Cannot do range queries or sorted operations directly ## Footnote These limitations affect how data can be processed.
157
What does **insertion order** mean in data structures?
* Elements come out in the order they were logically inserted * Removing an element preserves the relative order of others ## Footnote Examples include queues and ordered hash maps.
158
What are the **pros** of insertion order semantics?
* Intuitive semantics reflecting time of arrival * Great for logs, history lists, task queues ## Footnote This makes it suitable for scenarios where order of arrival matters.
159
What are the **cons** of insertion order semantics?
* Extra bookkeeping required * Not sorted by key or value ## Footnote Maintaining insertion order in hash-based structures can increase memory usage.
160
What does **positional/index order** mean?
* Structure assigns each element a position/index * Order defined by those positions ## Footnote Examples include arrays and general lists.
161
What are the **pros** of positional/index order?
* Simple mental model * Great for random access by index * Useful for dynamic programming tables ## Footnote This allows for efficient access to specific elements.
162
What are the **cons** of positional/index order?
* Costly insert/remove at arbitrary positions * Index numbers can change after insertions ## Footnote This can complicate the management of data structures.
163
What does **sorted/key order** mean?
* Comparison relation defines order * Guarantees iteration in sorted order ## Footnote Examples include sorted arrays and balanced search trees.
164
What are the **capabilities** of sorted/key order?
* Find min/max easily * Perform range queries * Find predecessor/successor ## Footnote This order allows for efficient querying of data.
165
What are the **pros** of sorted/key order?
* Enables range scans and order statistics * Predictable O(log n) operations in balanced trees ## Footnote This makes it suitable for database-like indexing.
166
What are the **cons** of sorted/key order?
* Maintaining sorted order is costly * Slower for pure key-value lookups than hash maps ## Footnote This can affect performance in certain scenarios.
167
What does **partial order** mean?
* Only some elements are ordered relative to others * Typically know the extremal element (min or max) ## Footnote Example includes heaps or priority queues.
168
What are the **pros** of partial order semantics?
* Efficient for needing the best next item * Simpler and faster than full sorted structures ## Footnote This is useful in scheduling and event queues.
169
What are the **cons** of partial order semantics?
* Not good for full sorted iteration * Cannot access arbitrary elements by priority rank directly ## Footnote This limits the flexibility of data access.
170
What does **structural order** refer to?
* Order defined by how you traverse the structure * Includes depth-first and breadth-first orders ## Footnote Examples include tree and graph traversals.
171
What are the **pros** of structural order?
* Derives different meaningful orders from the same structure * Useful when structural relations matter more than insertion or key order ## Footnote This is applicable in scenarios like AST processing and dependency resolution.
172
What are the **cons** of structural order?
* Order is algorithm-dependent * Not a property of the ADT itself ## Footnote This can complicate the understanding of data structure behavior.
173
What is **deterministic order**?
* Guaranteed and specified in documentation * Users can rely on it for correctness ## Footnote This ensures reproducible behavior in applications.
174
What is **unspecified-but-static order**?
* Implementation has some order, but docs say 'don’t rely on it' * Might change in future versions ## Footnote This can lead to brittle bugs if relied upon.
175
What is **randomized order**?
* Some hash maps randomize iteration order * Avoids predictable hashing attacks ## Footnote This discourages reliance on unspecified order.
176
What does **stability** mean in ordering?
* Keeps equal keys in their original order * Important for multi-field sort keys ## Footnote Stability affects reproducibility and interpretability in analytics.
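A quick demonstration of stability using Python's `sorted`, which is guaranteed stable. The `records` data is made up for illustration:

```python
# Python's sorted() is guaranteed stable: elements with equal keys
# keep their original relative order.
records = [("alice", 2), ("bob", 1), ("carol", 2), ("dave", 1)]

# Sort by the numeric field only; ties preserve input order,
# so "bob" stays before "dave" and "alice" before "carol".
by_score = sorted(records, key=lambda r: r[1])
```

Stability is what makes multi-field sorting by successive passes work: sort by the secondary key first, then by the primary key, and the secondary order survives among ties.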
177
What are the **tradeoffs** of different order semantics?
* Unordered: No defined iteration order * Insertion order: Reflects time of arrival * Positional: Order = position index * Sorted: Order defined by comparator * Partial: Only extremal elements ordered * Structural: Defined by traversal strategy ## Footnote Each type has its own guarantees, use cases, and limitations.
178
What are the **four rough levels** of data structure usage?
* Primitive / foundational data structures * Abstract containers / library collections * Composite / engineered data structures * Domain-specific / application-level structures ## Footnote These levels represent a stack of abstraction from raw building blocks to domain-specific constructs.
179
Name examples of **primitive / foundational data structures**.
* Arrays * Dynamic arrays * Linked lists * Basic trees (binary tree, general tree) * Basic graphs (adjacency list/matrix) * Simple stacks/queues ## Footnote These structures live at the lowest level of abstraction and are close to memory representation.
180
What are the **characteristics** of primitive / foundational data structures?
* Live at the lowest level of abstraction * Close to memory representation * Often language/runtime primitives ## Footnote Examples include C arrays, Rust slices, and pointer-based nodes.
181
What is the **usage** of primitive / foundational data structures?
* Implement higher-level containers * Used in systems-level or performance-critical code * Control allocation strategy and growth policy ## Footnote Common in embedded systems, OS kernels, and high-performance computing.
182
What are the **pros** of using primitive / foundational data structures?
* Maximum performance and control * Minimal overhead * Easy to reason about at the CPU/memory level ## Footnote They provide a high degree of efficiency.
183
What are the **cons** of using primitive / foundational data structures?
* Easy to get wrong (bounds, lifetime, aliasing) * Little semantic information * Lots of boilerplate if used directly everywhere ## Footnote They require careful handling to avoid errors.
184
Name examples of **abstract containers / library collections**.
* std::vector * std::list * std::map * std::unordered_map * Java List, Set, Map, Queue * Python list, dict, set, deque ## Footnote These collections abstract away implementation details and focus on ADT roles.
185
What are the **characteristics** of abstract containers / library collections?
* Abstract away implementation details * Provide well-defined semantics * Used at every layer, especially application and business logic ## Footnote They allow developers to think in terms of abstract data types.
186
What is the **usage** of abstract containers / library collections?
* Reuse them instead of re-implementing * Choose container type based on access strategy, order semantics, and complexity needs ## Footnote They are commonly used in general application code.
187
What are the **pros** of using abstract containers / library collections?
* Huge productivity gain * Communicate intent clearly * Usually optimized and debugged by experts ## Footnote They simplify development and improve code clarity.
188
What are the **cons** of using abstract containers / library collections?
* Less control over internal behavior * May not be tuned for extreme performance * Sometimes need different semantics than offered ## Footnote This can lead to the need for composite structures.
189
Name examples of **composite / engineered data structures**.
* LRU cache * Indexed priority queues * Multi-index collections * Interval trees * Segment trees * Versioned/persistent structures ## Footnote These structures are built from basic containers to fulfill more complex roles.
190
What are the **characteristics** of composite / engineered data structures?
* Data-structure-level but not generic enough to be primitive * Encode a compound set of invariants * Typically implemented in libraries or performance-sensitive infrastructure ## Footnote They are tailored to specific performance and semantic requirements.
191
What is the **usage** of composite / engineered data structures?
* Created when standard containers don’t meet access/complexity requirements * Used in backend infrastructure, game/graphics engines, compilers ## Footnote They address specific needs that standard containers cannot fulfill.
192
What are the **pros** of using composite / engineered data structures?
* Tailored to specific performance and semantic requirements * Reusable within an organization or domain * Simplify upper layers of application code ## Footnote They provide clear abstractions for application logic.
193
What are the **cons** of using composite / engineered data structures?
* More complex to design, implement, and test * Risk of re-inventing buggy versions of known patterns * Misuse can introduce hidden complexity ## Footnote Careful design and documentation are essential.
194
Name examples of **domain-specific / application-level data structures**.
* Order book in a trading system * Scene graph in a game engine * AST in a compiler * Dependency graph in a build system * Workflow/state machine models * ECS (Entity–Component–System) ## Footnote These structures encode specific problem domains.
195
What are the **characteristics** of domain-specific / application-level data structures?
* Highly semantics-rich * Designed to answer domain-specific queries efficiently ## Footnote They align closely with domain concepts and improve application logic clarity.
196
What is the **usage** of domain-specific / application-level data structures?
* Exposed directly in the application’s core logic * API is domain-oriented ## Footnote They facilitate clear manipulation of domain concepts.
197
What are the **pros** of using domain-specific / application-level data structures?
* Align perfectly with domain concepts * Make application logic much clearer * Highly optimized for specific queries ## Footnote They enhance clarity and efficiency in application development.
198
What are the **cons** of using domain-specific / application-level data structures?
* Easy to bake in too much policy or assumptions * Harder for new team members to reason about * May be tightly coupled to one system’s needs ## Footnote Proper documentation is crucial for maintainability.
199
What is the **side-by-side comparison** of the four levels of data structures?
| Level | Who uses it directly? | Abstraction focus | Typical examples |
| --- | --- | --- | --- |
| Primitive / foundational | Low-level / performance-critical code | Memory layout, raw operations | arrays, nodes, adjacency lists, raw trees |
| Abstract containers / collections | Most general application code | ADT roles (list, set, map) | `vector`, `list`, `dict`, `set`, `TreeMap`, `PriorityQueue` |
| Composite / engineered structures | Infra/engine/library devs | Combined invariants & performance | LRU cache, multi-index tables, segment trees, indexed PQs |
| Domain-specific structures | Core domain logic / product features | Domain semantics, domain queries | ASTs, order books, scene graphs, ECS stores, workflow graphs |

## Footnote This comparison highlights the differences in usage and focus across the levels.
200
What should a strong engineer be comfortable with regarding the **four levels** of data structures?
* Understanding performance and memory behavior * Using collections as a default toolbox * Designing composite structures when needed * Modeling problem space explicitly with domain-level structures ## Footnote Engineers should adapt their approach based on the specific needs of their applications.
201
Fill in the blank: If your application logic is full of raw **Map** and nested **List** types everywhere, you’re probably stuck at the _______ level.
collection ## Footnote This indicates a lack of good composite or domain-level structures.
202
Fill in the blank: If you’re hand-rolling raw **arrays** and **pointer graphs** for everything, you’re stuck at the _______ level.
primitive ## Footnote This suggests missing opportunities for reuse and clarity.
203
What is the **abstract data type (ADT)** of a stack?
* LIFO structure * Interact only with the top * Last thing put on is the first taken off ## Footnote The core idea of a stack is that it follows the Last In, First Out principle.
204
What are the **canonical operations** of a stack?
* `push(x)` * `pop()` * `peek()` / `top()` * `isEmpty()` * Sometimes: `size()` ## Footnote These operations define how you interact with a stack.
205
What is the **time complexity** for the operations of a good stack implementation?
* `push`: O(1) * `pop`: O(1) * `peek`: O(1) * `isEmpty`: O(1) * `size`: O(1) ## Footnote All key operations should ideally be constant-time.
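A minimal array-backed stack sketch in Python (the class name and API are illustrative; Python's built-in `list` already provides this behavior):

```python
class ArrayStack:
    """Minimal array-backed stack. pop/peek/is_empty/size are O(1);
    push is amortized O(1) because the backing array occasionally grows."""

    def __init__(self):
        self._items = []

    def push(self, x):
        self._items.append(x)

    def pop(self):
        if not self._items:
            raise IndexError("stack underflow")
        return self._items.pop()

    def peek(self):
        if not self._items:
            raise IndexError("stack underflow")
        return self._items[-1]

    def is_empty(self):
        return not self._items

    def size(self):
        return len(self._items)
```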
206
What is the **space complexity** of a stack with n elements?
O(n) ## Footnote Per-element overhead differs between array-based and linked-list implementations, but total space is O(n) either way.
207
What are the **pros** of an **array-based stack**?
* Excellent cache locality * Very small overhead per element * Simple and fast ## Footnote Array-based stacks are the most common in practice.
208
What are the **cons** of an **array-based stack**?
* Need resizing or fixed capacity * Less flexible for arbitrary growth ## Footnote These limitations can affect performance and usability.
209
What are the **pros** of a **linked-list-based stack**?
* Conceptually unbounded * Push/pop always O(1) without reallocation ## Footnote Linked-list stacks do not require resizing.
210
What are the **cons** of a **linked-list-based stack**?
* More memory overhead * Worse cache behavior * More allocation/GC pressure ## Footnote These factors can impact performance and efficiency.
211
What is the **call stack**?
* Creates stack frames on function calls * Stores return address, parameters, local variables * Frames are stored in contiguous memory ## Footnote The call stack is a runtime structure that manages function calls and returns.
212
What happens during **recursion** in relation to the call stack?
* Each recursive call adds a frame * Deep recursion can cause stack overflow ## Footnote Understanding stack depth and limits is crucial for avoiding errors.
213
What is the difference between **the stack** (memory region) and **a stack** (data structure)?
* The stack: managed by runtime/OS, stores activation records * A stack: logical structure, can be implemented in various memory types ## Footnote Terminology can be confusing; clarity is important.
214
What are some **core patterns** that use stacks?
* Expression evaluation & parsing * Backtracking * Iterative DFS * Undo/redo systems * Monotonic stacks ## Footnote Recognizing these patterns is essential for software engineering.
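The parsing pattern in its simplest form is bracket matching, sketched here in Python (`balanced` is an illustrative name):

```python
def balanced(s):
    """Check bracket balance with a stack: push opens, pop and match on close."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)
        elif ch in pairs:
            # A close must match the most recent unmatched open (LIFO).
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # leftover opens mean the string is unbalanced
```

The same push-on-open, pop-on-close skeleton underlies expression evaluation, backtracking, and iterative DFS.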
215
What is **stack underflow**?
Popping or peeking an empty stack ## Footnote Implementations must handle underflow situations appropriately.
216
What is **stack overflow**?
Pushing when the stack is full ## Footnote This can occur in fixed-capacity stacks or due to deep recursion.
217
What is the difference between a **stack** and a **queue**?
* Stack: LIFO * Queue: FIFO ## Footnote Stacks are used for nested scopes and backtracking, while queues are used for scheduling and pipelines.
218
What should a solid engineer be able to **do** with stacks?
* Explain stacks clearly * Implement a stack * Use stacks for various algorithms * Transform recursion to explicit stack * Debug stack issues * Understand call stack behavior * Recognize advanced patterns ## Footnote Mastery of these skills is crucial for effective software engineering.
219
What is the **abstract data type (ADT)** of a **queue**?
A **FIFO** structure: **First In, First Out** ## Footnote You add elements at the back (tail) and remove them from the front (head).
220
List the core operations of a **queue**.
* `enqueue(x)` / `push(x)` / `offer(x)` – add to back * `dequeue()` / `pop()` / `poll()` – remove and return from front * `front()` / `peek()` – see the first element without removing it * `isEmpty()` / `empty()` – check if queue has no elements * Optional: `size()` ## Footnote These operations define how elements are added and removed from the queue.
221
What is the **time complexity** for the core operations of a correctly designed queue?
* `enqueue` → O(1) * `dequeue` → O(1) * `front` / `peek` → O(1) * `empty` / `size` → O(1) ## Footnote This indicates that all core operations can be performed in constant time.
222
What is the **space complexity** of a queue with n elements?
O(n) ## Footnote This reflects the amount of memory required to store the elements in the queue.
223
What is a **circular buffer** in the context of queue implementation?
* Fixed-size array `A` with capacity `C` * Two indices: `head` (next element to dequeue) and `tail` (next element to enqueue) ## Footnote It allows efficient use of space and time for enqueue and dequeue operations.
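A sketch of the index arithmetic, assuming a fixed capacity with no resizing (class and method names are illustrative):

```python
class RingQueue:
    """Fixed-capacity FIFO queue on a circular buffer.

    head  = index of the next element to dequeue
    count = number of stored elements
    tail  = (head + count) % capacity, computed on demand
    """

    def __init__(self, capacity):
        self._buf = [None] * capacity
        self._head = 0
        self._count = 0

    def enqueue(self, x):
        if self._count == len(self._buf):
            raise OverflowError("queue full")
        tail = (self._head + self._count) % len(self._buf)
        self._buf[tail] = x
        self._count += 1

    def dequeue(self):
        if self._count == 0:
            raise IndexError("queue empty")
        x = self._buf[self._head]
        self._head = (self._head + 1) % len(self._buf)  # wrap around
        self._count -= 1
        return x

    def __len__(self):
        return self._count
```

Tracking `count` instead of a separate `tail` index sidesteps the classic ambiguity where `head == tail` could mean either empty or full.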
224
What are the **pros** of using a circular buffer for queues?
* Great cache locality * Minimal overhead per element * Perfect for fixed-capacity, high-throughput systems ## Footnote These advantages make circular buffers suitable for performance-critical applications.
225
What are the **cons** of using a circular buffer for queues?
* Fixed capacity unless resizing is implemented * Complexity in handling resizing and index remapping ## Footnote These limitations can affect the flexibility of the queue.
226
What is a **linked-list-based queue**?
* Struct `Node { value; Node* next; }` * Keep `Node* head` (front) and `Node* tail` (back) ## Footnote This implementation allows for dynamic sizing but has different performance characteristics.
227
What are the **pros** of using a linked-list-based queue?
* Conceptually unbounded * Pure O(1) enqueue/dequeue without resizing ## Footnote This allows for flexibility in the number of elements in the queue.
228
What are the **cons** of using a linked-list-based queue?
* Heap allocation per node → overhead and fragmentation * Poor cache locality * More GC/allocator pressure ## Footnote These factors can lead to performance issues in certain scenarios.
229
What is the difference between a **queue** and a **stack**?
* **Queue**: FIFO – first in, first out * **Stack**: LIFO – last in, first out ## Footnote This distinction defines their respective use cases in programming.
230
What is the difference between a **queue** and a **deque**?
* **Queue**: fixed roles for ends (enqueue at back, dequeue from front) * **Deque**: can push/pop from both ends ## Footnote Deques provide more flexibility in how elements can be added or removed.
231
What is the difference between a **queue** and a **priority queue**?
* **Queue**: Order = time of arrival * **Priority queue**: Order = priority; highest/lowest priority served first ## Footnote This affects how tasks are processed based on their urgency.
232
What is a **producer-consumer** pattern in relation to queues?
* One or more producers enqueue tasks * One or more consumers dequeue tasks and process them ## Footnote This pattern is fundamental in concurrent programming and task management.
233
What is **BFS (Breadth-First Search)** and how does it use queues?
* BFS uses a queue to explore nodes level by level * Enqueue neighbors as discovered, dequeue to explore in order ## Footnote This algorithm is commonly used in graph traversal.
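A compact BFS sketch using `collections.deque` as the queue (the adjacency-list format and function name are illustrative):

```python
from collections import deque

def bfs_order(graph, start):
    """Breadth-first traversal: dequeue a node, enqueue its unseen neighbors."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()        # FIFO: oldest discovery first
        order.append(node)
        for nbr in graph.get(node, []):
            if nbr not in seen:       # mark on enqueue to avoid duplicates
                seen.add(nbr)
                queue.append(nbr)
    return order
```

Because the queue is FIFO, all nodes at distance d are processed before any node at distance d+1, which is what makes BFS find shortest paths in unweighted graphs.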
234
What is the **blocking behavior** of queues?
* `put(x)` – blocks if the queue is full * `take()` – blocks if the queue is empty ## Footnote This behavior is important for managing flow in concurrent systems.
235
What are the **key metrics** in queueing theory?
* **Utilization** = λ / μ * **Average queue length** * **Average waiting time** ## Footnote These metrics help in analyzing the performance of queue systems.
236
What is **underflow** in the context of queues?
* Dequeue on empty queue must define behavior: throw exception, return sentinel, or block ## Footnote Proper handling of underflow is crucial to prevent errors in queue operations.
237
What is **overflow** in the context of bounded queues?
* Enqueue when full must have well-defined behavior: block, drop, or return error ## Footnote This ensures that the system behaves predictably under load.
238
What are the **API / design best practices** when using queues?
* Treat them as queues, not pseudo-lists * Be explicit about bounded vs unbounded * Document thread-safety and ordering guarantees ## Footnote Following these practices helps in creating robust queue implementations.
239
What should a solid engineer be able to **do** with queues?
* Explain the queue ADT * Implement a queue using ring buffer or linked list * Use queues for BFS, producer-consumer, and event loops * Choose between queue types * Reason about performance * Identify and fix queue-related bugs ## Footnote Mastery of these skills indicates a strong understanding of queues in software engineering.
240
What is a **list** at the abstract level?
A finite, ordered collection of elements where *position* matters and duplicates are allowed ## Footnote Key points include: Order matters, duplicates allowed, indexed, and elements can be inserted/removed at any position.
241
List the **typical operations** for a list.
* `get(i)` – element at index `i` * `set(i, x)` – replace element at `i` * `insert(i, x)` – shift elements [i..end) right * `remove(i)` – remove element at `i` and shift elements left * `push_back(x)` / `append(x)` – add at end * `pop_back()` – remove last element * iteration: `for each element in order` ## Footnote Everything else is variations on top of these.
242
What are the **core performance expectations** for a dynamic array list?
* `get(i)` / `set(i,x)` → **O(1)** * `push_back(x)` → **amortized O(1)** * `pop_back()` → **O(1)** * `insert/remove` near the middle → **O(n)** ## Footnote These expectations apply to implementations like `std::vector`, Java `ArrayList`, and Python `list`.
243
What are the characteristics of **array-backed lists**?
* Contiguous memory block (array) * Random access O(1) via pointer arithmetic * Good cache locality for iteration ## Footnote Examples include C++ `std::vector`, Java `ArrayList`, and Python `list`.
244
What are the **types of linked lists**?
* Singly linked (`next`) * Doubly linked (`prev` + `next`) * Circular variants ## Footnote Examples include C++ `std::list` (doubly linked) and Java `LinkedList`.
245
When should you **choose a list**?
* You care about **order** and **position** * You want to iterate in a consistent sequence * You frequently append at the end * You need random access by index ## Footnote Lists are not ideal for presence checks, key-value lookups, or priority semantics.
246
What is the **memory behavior** of array-backed lists?
* Great spatial locality * Fewer allocations (one big block) * Reallocation only on growth/shrink events ## Footnote This leads to better performance for large data and hot loops.
247
What are the **iteration patterns** you should be comfortable writing?
* Forward iteration * Reverse iteration * Safe removal while iterating ## Footnote Especially important with linked lists, using pointers/iterators carefully.
248
What should every engineer understand about **linked lists**?
* Implement singly and doubly linked lists * Recognize and handle empty lists, single-element lists * Insert/remove at head and tail * Removing the node you’re currently at during traversal ## Footnote Classic bugs include losing part of the list and memory leaks.
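A singly linked list sketch showing the head-removal special case that causes the classic bugs (names are illustrative):

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class SinglyLinkedList:
    """Singly linked list with O(1) push_front and a careful remove."""

    def __init__(self):
        self.head = None

    def push_front(self, value):
        self.head = Node(value, self.head)

    def remove(self, value):
        """Remove the first node holding `value`; handles head removal."""
        prev, cur = None, self.head
        while cur:
            if cur.value == value:
                if prev is None:
                    self.head = cur.next   # removing the head: no prev to relink
                else:
                    prev.next = cur.next   # splice the node out
                return True
            prev, cur = cur, cur.next
        return False

    def to_list(self):
        out, cur = [], self.head
        while cur:
            out.append(cur.value)
            cur = cur.next
        return out
```

Forgetting the `prev is None` branch (or relinking in the wrong order) is precisely how "losing part of the list" happens.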
249
What are the **API / design best practices** when exposing a list?
* Expose the right abstraction * Consider ownership and mutability * Avoid leaking implementation details * Document complexity expectations ## Footnote This ensures clarity and usability for API consumers.
250
What is a solid engineer expected to do with lists?
* Define the list ADT * Implement dynamic array and linked lists * Know the complexity of operations * Use lists correctly in algorithms * Understand memory/cache behavior * Choose the right structure * Handle corner cases * Reason about API design ## Footnote Mastery of these areas indicates a strong generalist/systems-minded engineer.
251
What is a **tree** at the abstract level?
A connected, acyclic graph with a distinguished root ## Footnote Key concepts include nodes, edges, root, parent/child relationships, leaves, internal nodes, subtrees, height, and depth.
252
What are the key concepts associated with a **tree**?
* Nodes (vertices) * Edges (links) * Root * Parent / child * Leaf * Internal node * Subtree * Height * Depth ## Footnote These concepts help define the structure and properties of trees.
253
True or false: Every **tree** is a graph.
TRUE ## Footnote However, not every graph is a tree.
254
What are some **real-world structures** that are inherently hierarchical or based on ordered labels?
* File systems * Org charts * UI widgets * DOM * Binary search trees * B-trees * Heaps * Tries * Expression trees * Quadtrees ## Footnote Trees are prevalent in many structured data representations.
255
What is the time complexity for **search**, **insert**, and **delete** operations on a **balanced search tree**?
* O(log n) ## Footnote This applies to structures like AVL trees, Red-Black trees, and B-trees.
256
What is the time complexity for **traversal** over all nodes in a **generic tree**?
O(n) ## Footnote Operations are often defined in terms of traversals (DFS/BFS).
257
What is a **binary tree**?
Each node has at most 2 children ## Footnote This is a fundamental tree structure.
258
What is a **full binary tree**?
Each node has 0 or 2 children ## Footnote This structure is distinct from other types of binary trees.
259
What is a **complete binary tree**?
Filled level by level, left to right, no gaps except possibly last level ## Footnote This structure ensures efficient use of space.
260
What is a **binary search tree (BST)**?
For each node, all keys in left subtree < node’s key, all keys in right subtree > node’s key ## Footnote This property allows for efficient searching.
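A minimal unbalanced-BST sketch; note that in-order traversal yields keys in sorted order, which is the payoff of the BST property (function names are illustrative):

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def bst_insert(root, key):
    """Insert key into an (unbalanced) BST; returns the possibly-new root."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root  # duplicates are ignored

def bst_contains(root, key):
    # Each comparison discards one subtree: O(height) total.
    while root:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False

def inorder(root):
    """In-order traversal of a BST yields keys in sorted order."""
    return inorder(root.left) + [root.key] + inorder(root.right) if root else []
```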
261
What are the **three types of tree traversals**?
* Depth-First Search (DFS) * Breadth-First Search (BFS) * Level-order traversal ## Footnote Each traversal method has different applications and use cases.
262
What is the **purpose** of a **heap**?
Implement priority queues ## Footnote Heaps maintain a specific order among elements.
263
What is the time complexity for **insert** and **extractMin** operations in a **heap**?
* O(log n) ## Footnote The `peekMin` operation is O(1).
264
What is a **trie**?
A tree structure where nodes correspond to prefixes of strings or bit sequences ## Footnote Tries are used for efficient string operations.
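A sketch of a trie using nested dicts as nodes; the `"$"` end-of-word marker is an illustrative convention, not a standard:

```python
class Trie:
    """Prefix tree: each node is a dict mapping a character to a child node."""

    def __init__(self):
        self.root = {}

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})  # create the child on first use
        node["$"] = True  # end-of-word marker

    def contains(self, word):
        node = self._walk(word)
        return node is not None and "$" in node

    def has_prefix(self, prefix):
        return self._walk(prefix) is not None

    def _walk(self, s):
        node = self.root
        for ch in s:
            if ch not in node:
                return None
            node = node[ch]
        return node
```

Both lookups cost O(length of the string), independent of how many words are stored, which is the trie's core advantage.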
265
What is the purpose of **B-trees / B+ trees**?
Efficient disk-based search trees ## Footnote They are optimized for minimizing I/O operations.
266
What are the characteristics of **pointer-based trees**?
* Flexible * Often used for general trees, BSTs, tries * Pointer-heavy; not cache-optimal ## Footnote Each node typically contains pointers to its children.
267
What are the characteristics of **array-based trees**?
* No pointer overhead * Great cache locality * Works best for complete or near-complete trees ## Footnote This representation computes child indices based on parent indices.
268
When should you use a **tree** instead of an **array/list**?
* Efficient search, insert, delete by key * Hierarchical or branching structure ## Footnote Trees provide advantages in structured data scenarios.
269
What are common **tree-based domains** every engineer encounters?
* Filesystems * UI / DOM trees * Abstract Syntax Trees (ASTs) * Scene graphs / game entities ## Footnote These domains utilize tree structures for organization and representation.
270
What is a common **pitfall** when using naive BSTs?
Inserting sorted keys can lead to a linked list shape ## Footnote This results in degraded performance (O(n)).
271
What is a common **mitigation** for stack overflow in recursive tree traversals?
Use iterative traversals with an explicit stack ## Footnote This approach avoids deep recursion issues.
272
What should a solid engineer be able to **explain** about trees?
* What a tree is * How it differs from a graph * Basic terminology (root, leaf, height, depth, subtree) ## Footnote Understanding these concepts is fundamental for working with trees.
273
What are the **tree-shaped problems** mentioned?
* File/DOM/AST hierarchies * Decision trees * Organizational hierarchies ## Footnote These problems often require tree-based data structures for efficient representation and manipulation.
274
What is the impact of **balanced vs unbalanced** trees on performance?
Balanced trees provide better performance with O(log n) operations, while unbalanced trees can degrade to O(n) ## Footnote Understanding this difference is crucial for optimizing tree operations.
275
Fill in the blank: To **model** a tree structure, you should design a simple tree structure for your domain, such as nodes representing _______.
tasks, dependencies, UI components ## Footnote This helps in visualizing and managing relationships within the data.
276
What are some **debugging** strategies for tree structures?
* Spot stack overflows from deep recursion * Reason about incorrect tree invariants (e.g., BST property broken) ## Footnote Debugging tree structures often involves checking for recursion depth and maintaining properties of the tree.
277
True or false: Understanding tree structures is not essential for general software engineering.
FALSE ## Footnote Comfort with tree structures is important for various software engineering tasks.
278
What are the **next steps** if you want to go further with tree structures?
* Show concrete C++ examples of BSTs / tree traversals / heaps * Walk through a real-world case (e.g., flattening a tree, implementing a simple file tree, or building a tiny AST and interpreter) ## Footnote These options provide practical applications of tree structures in programming.
279
What is a **heap**?
A tree-based data structure that organizes elements by **priority** and guarantees that the **extremal** element is always at the top ## Footnote Supports efficient operations like inserting a new element and getting/removing the min or max element.
280
What are the two most common types of heaps?
* **Min-heap**: smallest key is at the root * **Max-heap**: largest key is at the root ## Footnote The **heap property** ensures that for every node, the key is less than or equal to its children's keys.
281
True or false: A heap is a fully sorted structure.
FALSE ## Footnote Only the root is guaranteed to be min/max; siblings have no particular order.
282
What is the difference between a **heap** and a **priority queue**?
* **Heap**: specific data structure implementation * **Priority queue**: abstract data type with operations like insert, findMin, extractMin ## Footnote Heaps are often used to implement priority queues.
283
List the core operations and their complexities for a **binary heap**.
* `insert(x)` / `push` → **O(log n)** worst-case * `findMin` / `top` → **O(1)** * `extractMin` / `pop` → **O(log n)** worst-case * `heapify` → **O(n)** total * `decreaseKey` → **O(log n)** ## Footnote These complexities highlight the efficiency of heaps for dynamic sets.
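These operations map directly onto Python's standard-library `heapq`, which maintains a binary min-heap inside a plain list:

```python
import heapq

# heapq maintains a binary min-heap over an ordinary Python list.
h = []
for x in (5, 1, 4, 2):
    heapq.heappush(h, x)          # insert: O(log n) per push

smallest = h[0]                   # findMin / top: O(1), no removal
popped = [heapq.heappop(h) for _ in range(len(h))]  # extractMin: O(log n) each
```

Repeatedly popping drains the heap in ascending order, which is also the essence of heapsort.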
284
How is a **binary heap** typically implemented?
As an **array-backed tree** ## Footnote Most production heaps are complete binary trees stored in an array, with implicit tree shape determined by index math.
285
What is the time complexity for **building a heap** using bottom-up heapify?
**O(n)** ## Footnote This is more efficient than repeated insertions, which would take O(n log n).
286
Name three **real-world use cases** for heaps.
* Task scheduling / job priority * Graph algorithms (e.g., Dijkstra’s shortest path) * Top-k queries / streaming ## Footnote Heaps are useful for scenarios where you need to repeatedly extract the next best element.
287
What is the difference between a **heap** and a **sorted array**?
* **Heap**: `insert` = O(log n), `findMin` = O(1) * **Sorted array**: `insert` = O(n), `findMin` = O(1) ## Footnote Heaps are more efficient for many inserts intermixed with finds.
288
What are some **heap variants** worth recognizing?
* **Binary heap**: standard, array-based * **d-ary heap**: each node has `d` children * **Fibonacci heap**: amortized O(1) `insert` and `decreaseKey` ## Footnote These variants have different trade-offs and use cases.
289
What is **lazy deletion** in heaps?
Marking an element as 'invalid' instead of removing it ## Footnote This is useful in algorithms like Dijkstra's when using heaps without decreaseKey.
290
What is the distinction between **heap (data structure)** and **heap (memory)**?
* **Heap (data structure)**: used to implement priority queues * **Heap (memory)**: general-purpose memory area for allocation ## Footnote They are unrelated concepts despite sharing the name.
291
What should a solid software engineer be able to do with heaps? List at least three skills.
* Explain what a heap is and its properties * Know the complexities of heap operations * Use standard library heaps (e.g., C++ `std::priority_queue`) ## Footnote Additional skills include implementing a binary heap and applying heaps to algorithms.
292
Fill in the blank: The **heap property** for a min-heap states that for every node `u`, _______.
key(u) ≤ key(child) for all children ## Footnote This ensures that the root is the minimum element.
293
What is the time complexity for **extract-min** in a binary heap?
**O(log n)** worst-case ## Footnote This operation involves bubbling down the root element to maintain the heap property.
294
True or false: A **d-ary heap** reduces the height of the heap.
TRUE ## Footnote Each node has `d` children, so the height drops to about log_d(n); inserts get cheaper, while each extract does more comparisons per level.
295
Name three **algorithms** that utilize heaps.
* Dijkstra * A* * Prim ## Footnote These algorithms are commonly used in graph-related problems.
296
What is the purpose of a **heap** in algorithms?
* Repeatedly extract extremal values from a changing set * Maintain a k-sized heap for top-k queries * Manage priority tasks ## Footnote Heaps are particularly useful for efficiently implementing priority queues.
297
True or false: A **heap** is a fully sorted data structure.
FALSE ## Footnote A heap is not fully sorted; it maintains a partial order.
298
What is the difference between **heap (data structure)** and **heap (memory)**?
Heap (data structure) is a binary tree structure; heap (memory) refers to dynamic memory allocation ## Footnote Understanding this distinction is crucial for programming and data structure management.
299
Fill in the blank: The **build-heap** operation can be O(n) in complexity.
O(n) ## Footnote A common misconception is that it must cost O(n log n); bottom-up heapify is more efficient than repeated insertion.
300
What are two potential next steps after mastering heaps?
* Show C++ code examples * Sketch a small priority scheduler ## Footnote These steps can deepen understanding of heaps in practical applications.
301
At the abstract level, a **graph** is defined as a set of _______ and a set of _______.
vertices (nodes), edges (links) ## Footnote Formally: G = (V, E) where V = set of vertices and E = set of edges.
302
What is an **undirected graph**?
Edge `{u, v}` has no direction ## Footnote “u is connected to v” is symmetric.
303
What is a **directed graph (digraph)**?
Edge `(u → v)` has direction ## Footnote Think “u can reach v” but not necessarily vice versa.
304
What is a **weighted graph**?
Edges have weights/costs/capacities ## Footnote This allows for more complex relationships between nodes.
305
What is an **unweighted graph**?
All edges considered equal (weight 1) ## Footnote Simplifies calculations and relationships.
306
Name the **key variants** of graphs.
* Undirected graph * Directed graph (digraph) * Weighted graph * Unweighted graph ## Footnote Combination examples include directed weighted and undirected unweighted graphs.
307
Define **vertex / node** in the context of graphs.
Individual item in the graph ## Footnote Each vertex can represent an entity or point of interest.
308
What is an **edge** in graph theory?
Connection between nodes ## Footnote Edges can be directed or undirected.
309
What does **degree** refer to in an undirected graph?
Number of edges incident to a node ## Footnote Indicates the connectivity of a vertex.
310
What is **in-degree** in a directed graph?
Number of edges *entering* a node ## Footnote Reflects how many nodes point to this node.
311
What is **out-degree** in a directed graph?
Number of edges *leaving* a node ## Footnote Indicates how many nodes this node points to.
312
Define a **path** in graph theory.
Sequence of vertices connected by edges ## Footnote Paths can vary in length and complexity.
313
What is a **simple path**?
Path with no repeated vertices ## Footnote Ensures unique traversal of nodes.
314
What is a **cycle** in a graph?
Path starting and ending at the same vertex with at least one edge and no repeats in between ## Footnote Cycles can indicate redundancy or loops in processes.
315
What does it mean for a graph to be **connected**?
There is a path between every pair of nodes ## Footnote Ensures all nodes are reachable from one another.
316
What is a **connected component**?
Maximal set of nodes where each pair is mutually reachable ## Footnote Relevant in undirected graphs.
317
What does **strongly connected** mean in directed graphs?
u and v are mutually reachable via directed paths ## Footnote Indicates a robust interconnection between nodes.
318
What is a **strongly connected component (SCC)**?
Maximal set where every node can reach every other via directed paths ## Footnote Important for analyzing directed graphs.
319
What is a **DAG (Directed Acyclic Graph)**?
Directed graph with **no directed cycles** ## Footnote Crucial for scheduling and dependency management.
320
What is an **adjacency list**?
For each vertex `u`, store a list of its neighbors ## Footnote Typically represented as `vector<vector<int>> adj;`.
321
What is the space complexity of an **adjacency list**?
O(V + E) ## Footnote Efficient for sparse graphs.
322
What is an **adjacency matrix**?
A 2D array `M` where `M[u][v]` indicates if edge `u → v` exists ## Footnote Useful for dense graphs.
323
What is the space complexity of an **adjacency matrix**?
O(V²) ## Footnote Can be inefficient for large sparse graphs.
324
What is an **edge list**?
Just a list/array of edges `(u, v, w)` ## Footnote Space complexity is O(E).
325
What is the complexity of **DFS (Depth-First Search)**?
O(V + E) ## Footnote Explores as far as possible along each branch before backtracking.
326
What is the complexity of **BFS (Breadth-First Search)**?
O(V + E) ## Footnote Explores neighbors by layers based on distance from the start node.
327
What is a **tree** in graph theory?
A connected graph with no cycles ## Footnote Trees have unique properties and traversal algorithms.
328
What is a **bipartite graph**?
Vertices can be partitioned into 2 sets (U, V) such that all edges go between U and V ## Footnote Equivalent condition: graph has **no odd-length cycles**.
329
What is **topological sort**?
Produces an ordering of vertices such that for every edge `u → v`, `u` comes before `v` ## Footnote Applicable only to **DAGs**.
330
What is the **Single-source shortest path (SSSP)** algorithm for unweighted graphs?
BFS ## Footnote Efficiently finds the shortest path in terms of edge count.
331
What algorithm computes **shortest paths** in weighted graphs with non-negative weights?
Dijkstra’s algorithm ## Footnote Utilizes a min-heap for efficiency.
332
What is a **Minimum Spanning Tree (MST)**?
A spanning tree that connects all vertices with no cycles and minimum total edge weight ## Footnote Algorithms include **Kruskal’s** and **Prim’s**.
333
What is the purpose of **Tarjan’s algorithm**?
To find strongly connected components (SCC) in directed graphs ## Footnote Useful in compilers and static analysis.
334
What is the complexity consideration when analyzing graphs?
How many vertices (V) and how many edges (E) ## Footnote Determines if the graph is sparse or dense.
335
What is a common modeling mistake in graph theory?
Using lists/maps where a graph would help ## Footnote Recognizing graphs allows for standard algorithms instead of brittle solutions.
336
What should a solid software engineer be able to explain about graphs?
* What a graph is (directed/undirected, weighted/unweighted) * Key terms: path, cycle, connected component, DAG, SCC, degree ## Footnote Fundamental knowledge for graph-related tasks.