asd_question Flashcards

(126 cards)

1
Q

What is a “specification” in the context of algorithm design?

A

A specification is a pair consisting of a precondition (the requirements for the input data) and a postcondition (the requirements for the output data). It formally defines what the algorithm is expected to do.

2
Q

What does “correct input data” mean?

A

Correct input data is data that satisfies the precondition of the algorithm’s specification.

3
Q

Define “Total Correctness” of an algorithm.

A

An algorithm is totally correct with respect to a specification if, for any input data that satisfies the precondition, the algorithm eventually stops (terminates) and the resulting output satisfies the postcondition.

4
Q

Define “Partial Correctness” of an algorithm.

A

An algorithm is partially correct if, for any input data satisfying the precondition, if the algorithm stops, then the output satisfies the postcondition. It does not guarantee that the algorithm will actually stop.

5
Q

What is the “Stop Property” (Termination)?

A

The stop property is the guarantee that for any input satisfying the precondition, the algorithm will reach a final state in a finite number of steps.

6
Q

What is a “Loop Invariant”?

A

A loop invariant is a logical formula (condition) that is true before entering the loop and remains true after each iteration of the loop body.

7
Q

What are the three steps required to prove a loop’s partial correctness using an invariant?

A

Initialization (Base): Prove the invariant is true before the first iteration.

Maintenance (Step): Prove that if the invariant is true before an iteration, it remains true after the iteration.

Termination (Stop): Prove that when the loop stops, the invariant combined with the stop condition implies the desired target condition (postcondition).

8
Q

Provide an example of a loop invariant for an algorithm finding the maximum value in an array.

A

For an algorithm finding the maximum x in an array Arr of length len, an appropriate invariant is: ∀ 0 ≤ j < i : x ≥ Arr[j] ∧ ∃ 0 ≤ j < len : x == Arr[j]. This states that x is the maximum of the elements seen so far (up to index i) and that x is indeed one of the elements from the array.
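The invariant can be exercised in code. A minimal Python sketch (the function name find_max and the runtime assert are illustrative additions, not from the card):

```python
def find_max(arr):
    """Maximum of a non-empty array, with the loop invariant asserted."""
    x = arr[0]
    i = 1
    while i < len(arr):
        # Invariant: x >= arr[j] for all 0 <= j < i, and x occurs in arr
        assert all(x >= arr[j] for j in range(i)) and x in arr
        if arr[i] > x:
            x = arr[i]
        i += 1
    # At exit i == len(arr); the invariant then gives the postcondition
    return x
```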

9
Q

How do you conclude the proof of correctness for the maximum-finding algorithm when the loop finishes?

A

When the loop terminates, the stop condition is i == len. By combining this with the invariant (which is still true), we get ∀ 0 ≤ j < len : x ≥ Arr[j], which is exactly the target postcondition.

10
Q

How is the “Maintenance” step typically proven in a loop?

A

Maintenance is proven by assuming the invariant holds at the start of an arbitrary iteration and showing that the operations performed inside the loop body (e.g., an if statement that updates a maximum value and an i++ increment) preserve the truth of the formula for the next value of the counter i.

11
Q

What is the relationship between Total Correctness, Partial Correctness, and the Stop Property?

A

Total Correctness is equivalent to Partial Correctness plus the Stop Property.

12
Q

Q1. What are the two obvious conditions that any good algorithm should satisfy?

A

A good algorithm must first compute the correct or desired output for the given problem. Second, it must be effective, meaning it is “fast” in its execution.

13
Q

Q2. What does the “complexity” of an algorithm measure?

A

Complexity measures the efficiency of an algorithm in terms of two basic resources: time and memory. Time complexity measures how fast the algorithm is, while space complexity measures the amount of memory it uses.

14
Q

Q3. What two things must be determined before starting to assess the time complexity of an algorithm?

A

Before assessment, one must identify the dominating operation and the data size (typically denoted as $n$ or $len$).

15
Q

Q4. Define “Dominating Operation.”

A

A dominating operation is an operation in the algorithm’s code whose total number of executions is proportional to the total number of all operations executed. In searching and sorting, this is usually a comparison between two elements.

II. Types of Complexity Functions

16
Q

Q5. Define “Pessimistic (Worst-case) Time Complexity.”

A

Pessimistic time complexity, denoted as $W(n)$, is the maximum amount of work (number of dominating operations) an algorithm must perform for any input of a fixed size $n$.

17
Q

Q6. Define “Average Time Complexity.”

A

Average time complexity, denoted as $A(n)$, is the expected amount of work performed by the algorithm for an input of size $n$, assuming a specific probability distribution of the input data.

18
Q

Q7. What is the purpose of asymptotic notation (e.g., Big O, Big Theta)?

A

Asymptotic notation is used to describe the rank of order of the complexity function. It simplifies the analysis by ignoring constant factors and lower-order terms, focusing on how the resource requirements grow as the input size $n$ increases toward infinity.

III. Asymptotic Ranks and Practical Implications

19
Q

Q8. List the most common ranks of functions in order of increasing complexity.

A

Constant: $\Theta(1)$
Logarithmic: $\Theta(\log n)$
Linear: $\Theta(n)$
Linear-logarithmic: $\Theta(n \log n)$
Square: $\Theta(n^2)$
Cubic: $\Theta(n^3)$
Sub-exponential: $\Theta(n^{\log n})$
Exponential: $\Theta(2^n)$
Factorial: $\Theta(n!)$

20
Q

Q9. What is considered “unacceptably high” complexity in practice?

A

In practice, an over-polynomial rank of time complexity (such as exponential or factorial) is considered unacceptably high. For space complexity, even a linear rank ($\Theta(n)$) can be considered very high in certain contexts.

IV. Case Study: The Search Problem

21
Q

Q10. How is the time complexity of the “Search Problem” (finding a key in an array) assessed?

A
Dominating Operation: Comparison of the array element with the target key.
Data size: The length of the array, $len$.
Pessimistic Complexity: $W(len) = len$ (the key is at the very end or not present), which is $\Theta(len)$.
Average Complexity: Under a uniform distribution (where the key is equally likely to be anywhere), $A(len) = \frac{len+1}{2}$, which is also $\Theta(len)$.
22
Q

Q1. What is the formal specification for the “Searching Problem”?

A

Input: A sequence of integers $S$ (indexed from 0 to $len-1$) and an integer key to be found.
Output: A natural number index less than $len$ such that $S[index] == key$, or -1 if the key is not present in the sequence.
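A sequential-search sketch matching this specification (Python is assumed here; the lecture itself may use different notation):

```python
def search(s, key):
    # Return an index with s[index] == key, or -1 if the key is absent
    for index in range(len(s)):
        if s[index] == key:
            return index
    return -1
```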

23
Q

Q2. What is the “Divide and Conquer” rule in algorithm design?

A

It is a method where a problem is divided into smaller sub-problems such that it is easy to reconstruct the total solution from the sub-solutions. It is often implemented using recursion.

24
Q

Q3. Why is Sequential Search considered to have optimal time complexity for an unordered sequence?

A

Sequential search has linear time complexity $W(len) = \Theta(len)$ with a multiplicative factor of 1. It cannot be improved for unordered data because every element might potentially be the key, and no information about the position is available without checking.

II. Searching in Sorted Sequences

25
Q4. How does the "Skipping" algorithm improve searching in a sorted sequence?
It checks every $k$-th cell, skipping $k-1$ elements in each jump. If it finds a number higher than the key, it only checks the last $k-1$ elements.
Complexity: It is $k$ times faster on average than normal sequential search, but it is still linear, $W(len) = \frac{1}{k} \cdot \Theta(len)$.
Optimal $k$: The optimal choice for $k$ is $\sqrt{len}$.
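A possible rendering of the skipping idea in Python (the name jump_search and the block-scan details are an illustrative reconstruction, not the lecture's code):

```python
import math

def jump_search(arr, key):
    """Search a sorted array by probing every k-th cell, k ~ sqrt(len)."""
    n = len(arr)
    k = max(1, math.isqrt(n))
    prev = 0
    # Jump over blocks of k cells while their last element is below the key
    while prev < n and arr[min(prev + k, n) - 1] < key:
        prev += k
    # Scan the at most k remaining candidates sequentially
    for i in range(prev, min(prev + k, n)):
        if arr[i] == key:
            return i
    return -1
```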
26
Q5. Explain the logic of the Binary Search algorithm.
Binary search uses a "Divide and Conquer" approach:
Check the middle element of the current sequence.
If it equals the key, return the index.
If it is higher than the key, restrict searching to the "left" sub-sequence.
If it is lower than the key, restrict searching to the "right" sub-sequence.
Repeat until the key is found or the sequence is empty.
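The steps above map directly onto an iterative implementation; a Python sketch:

```python
def binary_search(arr, key):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:                 # sequence not yet empty
        mid = (lo + hi) // 2
        if arr[mid] == key:
            return mid
        elif arr[mid] > key:
            hi = mid - 1            # restrict to the "left" sub-sequence
        else:
            lo = mid + 1            # restrict to the "right" sub-sequence
    return -1
```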
27
Q6. What are the complexities of the Binary Search algorithm?
Pessimistic ($W(len)$): $\Theta(\log_2(len))$.
Average ($A(len)$): $\Theta(\log_2(len))$.
Space ($S(len)$): $O(1)$.
Critical Assumption: The sequence must be kept in RAM to allow efficient random access to the middle element.

III. Positional Statistics and Selection
28
Q7. What is a "k-th positional statistic" in a sequence?
It is the $k$-th smallest (or largest) element in a sequence. For example, the 1st positional statistic is the minimum of the sequence.
29
Q8. Describe the "Tournament" algorithm for finding the second smallest element.
The sequence is divided into pairs that "compete," and the smaller one (the winner) moves to the next turn. The root of the resulting tree is the smallest element. The second smallest must be among those who lost directly to the winner in the "tournament history".
30
Q9. What is the "Partition" procedure used in selection algorithms?
It selects an element $M$ (pivot) and reorganizes the sequence such that all elements on the left are not greater than $M$ and all elements on the right are not less than $M$. It returns the final index of $M$.
Complexity: Time $W(n) = n + O(1)$ and Space $S(n) = O(1)$.
31
Q10. How does Hoare's Algorithm find the k-th positional statistic?
It uses the partition procedure and "Divide and Conquer":
Call partition on the sequence.
If the returned index equals $k$, return $S[k]$.
Otherwise, continue searching recursively in the left or right sub-sequence, depending on whether the returned index was greater or less than $k$.
Complexity: On average, the time complexity is linear.
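A sketch of the scheme in Python. The partition below is a Lomuto-style variant chosen for brevity; Hoare's original partition scans from both ends, but the selection logic is the same:

```python
def partition(s, lo, hi):
    """Place the pivot s[hi] in its final position and return its index."""
    pivot = s[hi]
    i = lo
    for j in range(lo, hi):
        if s[j] <= pivot:
            s[i], s[j] = s[j], s[i]
            i += 1
    s[i], s[hi] = s[hi], s[i]
    return i

def select_kth(s, k):
    """k-th positional statistic (0-based k-th smallest), average Theta(n)."""
    lo, hi = 0, len(s) - 1
    while True:
        m = partition(s, lo, hi)
        if m == k:
            return s[m]
        elif m > k:
            hi = m - 1              # statistic lies in the left part
        else:
            lo = m + 1              # statistic lies in the right part
```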
32
Q1. Define the formal specification of the "Sorting Problem."
Input: A sequence $S$ of elements that can be ordered according to a binary total-order relation $R$, and a natural number $len$ representing the length of the sequence.
Output: A non-decreasingly sorted sequence $S'$ consisting of the elements from the multi-set of the input sequence $S$.
33
Q2. Why is sorting considered one of the most important operations in computer science?
Sorting is a basic operation used extensively in real-life data processing. Its primary applications include:
Acceleration of searching.
Acceleration of database operations on relations "by key".
Data visualization.
Computing statistical characteristics.

II. Selection Sort and Insertion Sort
34
Q3. Describe the basic idea and complexity of the Selection Sort algorithm.
Idea: The algorithm repeatedly identifies the minimum element of the unsorted part of the sequence and swaps it into its correct position in the output sequence.
Time Complexity: It has square complexity, $W(len) = \Theta(len^2)$.
Key Property: The average complexity is the same as the worst case because the algorithm always executes the same number of comparisons, even for an already sorted sequence.
35
Q4. Explain why Insertion Sort is considered "intelligent."
Insertion sort is adaptive; it adjusts the amount of work based on the degree of sortedness of the input data. For an already sorted sequence, it needs only $n-1$ comparisons, making it linear ($\Theta(n)$) and very fast in that specific case.
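The adaptive behaviour is visible in code: the inner loop does no work when the prefix is already in order. A Python sketch:

```python
def insertion_sort(arr):
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        # Shift larger elements right; for sorted input this loop never runs,
        # so the whole pass costs only n-1 comparisons
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key
    return arr
```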
36
Q5. Compare the average-case performance of Selection Sort and Insertion Sort.
On average, Insertion Sort is twice as fast as Selection Sort, although both remain in the square time complexity rank ($\Theta(n^2)$).

III. Merge Sort and Divide and Conquer
37
Q6. How does the "Divide and Conquer" approach apply to Merge Sort?
The approach follows three steps:
Divide: Split the sequence into two halves.
Sort: Recursively sort each half separately.
Merge: Combine the two sorted halves into a single sorted sequence.
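The three steps correspond to a short recursive function; a Python sketch using an array-based merge (which is what incurs the $\Theta(n)$ extra space for arrays):

```python
def merge_sort(seq):
    if len(seq) <= 1:
        return seq
    mid = len(seq) // 2                    # Divide
    left = merge_sort(seq[:mid])           # Sort each half recursively
    right = merge_sort(seq[mid:])
    merged, i, j = [], 0, 0                # Merge the sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```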
38
Q7. What are the time and space complexities of Merge Sort when implemented with arrays?
Time Complexity: It is linear-logarithmic, $W(len) = A(len) = \Theta(len \log len)$, across all cases because the sequence is always divided into equal halves.
Space Complexity: When using arrays, the space complexity is high, $S(n) = \Theta(n)$, due to memory allocation and element copying during the merge phase.

IV. Data Structures: Arrays vs. Linked Lists
39
Q8. Compare the advantages and disadvantages of Arrays versus Linked Lists.
Arrays: Offer very fast random access and low memory consumption but have linear time complexity for insertions.
Linked Lists: Provide very fast modification operations (inserting/deleting subsequences) but have slower access (only through the list head) and require additional memory for storing links.
40
Q9. How does using Linked Lists improve the Merge Sort algorithm?
If a sequence is represented as a linked list, the space complexity of Merge Sort improves to constant, $S(n) = O(1)$, because it avoids the need for massive array reallocations and element copying.
41
Q1. Describe the basic idea of the QuickSort algorithm.
QuickSort is a "divide and conquer" recursive algorithm.
For a sequence of length 1, no action is needed.
For longer sequences, the data is reorganized so a "pivot" element $M$ is placed in its final position, such that no larger elements are to its left and no smaller elements are to its right.
This process is then applied recursively to the resulting left and right subsequences.
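The description can be sketched compactly. The version below copies into sub-lists rather than partitioning in place, so it is a readable sketch of the recursion, not the in-place algorithm whose space cost is analysed in a later card:

```python
def quicksort(s):
    if len(s) <= 1:
        return s                              # length 0 or 1: nothing to do
    pivot = s[0]
    left = [x for x in s[1:] if x <= pivot]   # nothing larger left of pivot
    right = [x for x in s[1:] if x > pivot]   # nothing smaller right of it
    return quicksort(left) + [pivot] + quicksort(right)
```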
42
Q2. What are the time and space complexities of the Partition procedure?
For an array of length $n$, the Partition procedure has a time complexity of $W(n) = A(n) = \Theta(n)$. Its space complexity is $S(n) = O(1)$.
43
Q3. What is the average time complexity of QuickSort?
Assuming a uniform distribution of all possible input permutations, the average time complexity is linear-logarithmic: $A(n) = \Theta(n \log n)$.
44
Q4. What is the pessimistic (worst-case) time complexity of QuickSort, and what input causes it?
The pessimistic complexity is square, $W(n) = \Theta(n^2)$. This occurs when the recursion depth is maximal ($\Theta(n)$), which happens when the input data is already sorted or sorted in reverse.
45
Q5. Discuss the space complexity of the QuickSort algorithm.
Although QuickSort sorts in place, it has a pessimistic space complexity of $O(n)$ due to the memory cost of the recursion stack. It is possible to reduce this to $\Theta(\log n)$ by rewriting the recursive call for the longer subsequence as an iterative one.

II. Stability and Limits of Sorting
46
Q6. Define "Stability" in sorting algorithms and explain its practical importance.
A sorting algorithm is stable if it preserves the original order of elements that have equal values (ties). This is highly important in databases where records may be sorted iteratively by different attributes. Stable algorithms allow multi-attribute sorting without destroying the outcome of previous iterations.
47
Q7. What is the mathematically proven best average time complexity for comparison-based sorting algorithms?
The best possible average time complexity for comparison-based algorithms is linear-logarithmic, or $\Theta(n \log n)$. This is explained by the binary decision tree model, where the average height of a tree with $n!$ leaves is $\Theta(\log(n!)) = \Theta(n \log n)$.

III. Non-Comparison Sorting (CountSort and RadixSort)
48
Q8. How does CountSort "beat" the $\Theta(n \log n)$ limit?
CountSort beats the limit by avoiding comparisons altogether, using direct addressing instead. It achieves linear time complexity at the cost of high space complexity (using two helper arrays).
49
Q9. What are the time and space complexities of the CountSort algorithm?
Time Complexity: $A(n,m) = W(n,m) = \Theta(n+m)$, where $n$ is the sequence length and $m$ is the maximum value.
Space Complexity: $S(n,m) = \Theta(n+m)$, for the two helper arrays.
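A CountSort sketch in Python for non-negative integer keys (assuming values in the range 0..m):

```python
def count_sort(seq, m):
    """Sort integers in 0..m via direct addressing, Theta(n + m) time."""
    counts = [0] * (m + 1)
    for x in seq:                 # count occurrences of each value
        counts[x] += 1
    out = []
    for value, c in enumerate(counts):
        out.extend([value] * c)   # emit each value count-many times
    return out
```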
50
Q10. Describe the RadixSort scheme.
RadixSort is a lexicographic sorting scheme for fixed-length objects like strings or multi-digit numbers. It applies a stable internal sorting algorithm (like CountSort) to each consecutive position of the objects, moving from the last position to the first.
51
Q1. Define recursion in the contexts of mathematics, programming, and algorithms.
Mathematics: A recurrent formula or definition, such as $n! = (n-1)! \cdot n$.
Programming: A function that calls itself.
Algorithms: The reduction of an instance of a problem to a smaller instance of the same problem, often referred to as "divide and conquer".
52
Q2. What is the critical "warning" provided regarding the implementation of recursion?
Recursion must always be well-founded on a trivial case (base case) to prevent infinite loops, such as $0! = 1$ for a factorial function.
53
Q3. Compare the positive and negative aspects of recursion as an algorithmic tool.
Positive: It allows for a very compact representation of an algorithm.
Negative: It implicitly costs additional memory because the system must maintain a recursion stack. In some cases, such as with large inputs (e.g., $n = 100{,}000$), recursion can lead to fatal memory errors where an iterative version would succeed.

II. Classical Recursive Problems
54
Q4. Define the Fibonacci sequence and its recurrent formula.
The Fibonacci sequence starts with $0, 1, 1, 2, 3, 5, 8, 13, \dots$
Base cases: $Fibonacci(0) = 0$ and $Fibonacci(1) = 1$.
Step: $Fibonacci(n+1) = Fibonacci(n) + Fibonacci(n-1)$.
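An iterative rendering of the recurrence (the naive doubly-recursive version recomputes subproblems and runs in exponential time):

```python
def fibonacci(n):
    a, b = 0, 1               # Fibonacci(0), Fibonacci(1)
    for _ in range(n):
        a, b = b, a + b       # step: F(n+1) = F(n) + F(n-1)
    return a
```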
55
Q5. What is the recurrence relation for the number of moves in the Hanoi Towers puzzle?
To move $n$ rings, the number of moves is defined as:
Base: $hanoi(1) = 1$.
Step: $hanoi(n) = 2 \cdot hanoi(n-1) + 1$.
Closed-form solution: $hanoi(n) = 2^n - 1$.

III. Solving Recurrence Equations
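The recurrence corresponds directly to a recursive solver; a Python sketch (the peg names "A", "B", "C" are illustrative):

```python
def hanoi(n, src="A", aux="B", dst="C", moves=None):
    """Return the list of moves for n rings; its length is 2**n - 1."""
    if moves is None:
        moves = []
    if n == 1:
        moves.append((src, dst))
    else:
        hanoi(n - 1, src, dst, aux, moves)   # park n-1 rings on aux
        moves.append((src, dst))             # move the largest ring
        hanoi(n - 1, aux, src, dst, moves)   # bring the n-1 rings over
    return moves
```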
56
Q6. Describe the method for solving 2nd-order linear recurrent equations ($s_n = as_{n-1} + bs_{n-2}$).
One must solve the characteristic equation $x^2 - ax - b = 0$.
If there is a single solution $r$: $s_n = c_1 r^n + c_2 n r^n$.
If there are two solutions $r_1, r_2$: $s_n = c_1 r_1^n + c_2 r_2^n$.
The constants $c_1, c_2$ are found using the values for $n=0$ and $n=1$.
57
Q7. What are the three common cases of recursive equations frequently encountered in algorithmics?
Case 1 (Logarithmic): $t(n) = t(n/2) + c \rightarrow \Theta(\log n)$. Example: Binary Search.
Case 2 (Linear): $t(n) = 2t(n/2) + c \rightarrow \Theta(n)$. Example: Finding the maximum in a sequence.
Case 3 (Linear-Logarithmic): $t(n) = 2t(n/2) + cn \rightarrow \Theta(n \log n)$. Example: Merge Sort.

IV. Advanced Complexity Analysis
58
Q8. Explain the Master Theorem for solving equations of the form $T(n) = aT(n/b) + f(n)$.
The theorem compares $f(n)$ with $n^{\log_b a}$ to determine the dominant term:
If $f(n)$ is polynomially lower than $n^{\log_b a}$, then $T(n) = \Theta(n^{\log_b a})$.
If they are of the same rank, then $T(n) = \Theta(n^{\log_b a} \log n)$.
If $f(n)$ is polynomially higher and satisfies a "regularity" condition, then $T(n) = \Theta(f(n))$.
59
Q9. What is the average time complexity of QuickSort, and how is it derived?
The average complexity is $\Theta(n \log n)$. This is derived using harmonic numbers; the final formula is approximately $1.44 \cdot n \log n$. Unlike the worst case ($\Theta(n^2)$), the average case relies on the assumption that the input data is uniformly distributed.
60
Q1. What are the two basic types of operations performed on sequences?
The two basic types of operations are:
Absolute access: Identifying a place by its index. This operation is fast on arrays.
Relative access: Identifying a place as a successor or predecessor of a given element. This operation is slow on arrays but efficient for linked structures.
61
Q2. Define the concept of an "Abstract Data Structure" (ADS).
An Abstract Data Structure is defined by its interface—the set of operations it supports and their specifications—rather than its internal implementation. Examples include Stacks, Queues, and Deques.

II. Linked Lists
62
Q3. Compare Singly Linked Lists (SList) and Doubly Linked Lists (DList).
Singly Linked List: Each node contains data and a single link to the next node.
Doubly Linked List: Each node contains data and two links: one to the next node and one to the previous node.
Performance difference: Doubly linked lists allow for $O(1)$ removal and predecessor access, whereas these operations are typically $\Theta(n)$ in singly linked lists.
63
Q4. What is a "Cyclic List"?
A cyclic list is a linked list where the last node points back to the first node instead of pointing to null.

III. Stacks, Queues, and Deques
64
Q5. Explain the Stack data structure and its primary operations.
A stack follows the LIFO (Last In, First Out) principle. Its primary operations are:
push(T): Adds an element to the top.
pop(): Removes and returns the top element.
top(): Returns the top element without removing it.
65
Q6. Explain the Queue data structure and its primary operations.
A queue follows the FIFO (First In, First Out) principle. Its primary operations are:
inject(T): Adds an element to the back.
out(): Removes and returns the front element.
front(): Returns the front element without removing it.

IV. Complexity and Amortised Analysis
66
Q7. What is "Amortised Complexity"?
Amortised complexity measures the average cost of an operation over a worst-case sequence of operations. It ensures that while a single operation might be expensive, the total cost of $n$ operations is low.
67
Q8. How do Unbounded (Dynamic) Arrays maintain efficiency during growth?
When an unbounded array is full, it allocates a new array that is twice as large ($\alpha = 2$) and copies the elements over. Although this specific push is $O(n)$, the amortised cost is $O(1)$ because the expensive reallocations happen infrequently.
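The doubling scheme can be sketched as a small class (the name UnboundedArray and its field names are illustrative):

```python
class UnboundedArray:
    """Dynamic array with doubling growth (alpha = 2)."""
    def __init__(self):
        self._cap = 1
        self._n = 0
        self._data = [None] * self._cap

    def push_back(self, x):
        if self._n == self._cap:          # full: reallocate at twice the size
            self._cap *= 2
            new = [None] * self._cap
            new[:self._n] = self._data[:self._n]   # O(n) copy, but rare
            self._data = new
        self._data[self._n] = x           # the common O(1) case
        self._n += 1
```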
68
Q9. Provide a comparison of sequence implementations for common operations.
Indexing [.]: $O(1)$ for Arrays (UArray/CArray); $O(n)$ for Linked Lists.
PushBack: $O(1)$ for Linked Lists; $O(1)$ amortised for Arrays.
PushFront: $O(1)$ for Linked Lists and Cyclic Arrays; $O(n)$ for standard Unbounded Arrays.
Splice: $O(1)$ for Linked Lists; $O(n)$ for Arrays.
69
Q1. Define a Priority Queue (PQ) and its core operations.
A Priority Queue is an abstract data structure where each element has an associated "priority". It supports the following primary operations:
insert(T e): Adds a new element with an assigned priority to the PQ.
T findMin(): Returns the element with the minimum priority.
T delMin(): Returns and deletes the element with the minimum priority.
construct(sequence) (optional): Creates a PQ from a given set of elements.
70
Q2. Compare the complexities of a Priority Queue implemented using naive structures (List/Array).
Unsorted List/Array: insert is $O(1)$, but findMin and delMin are both $\Theta(n)$.
Sorted List/Array: findMin is $O(1)$ and delMin is $O(1)$, but insert is $\Theta(n)$.

II. Binary Heaps
71
Q3. What is a Binary Heap and what properties must it satisfy?
A binary heap is a binary tree implemented to represent a priority queue. It must satisfy:
Structure Property: It must be a complete binary tree (all levels full except possibly the last, which is filled left to right).
Heap-Order Property: In a min-heap, the key of every node is less than or equal to the keys of its children.
72
Q4. How is a Binary Heap represented as an array?
A binary heap of $n$ elements can be stored in an array $H[1 \dots n]$. For an element at index $i$:
Its left child is at index $2i$.
Its right child is at index $2i + 1$.
Its parent is at index $\lfloor i/2 \rfloor$.
73
Q5. Explain the insert and delMin operations in a Binary Heap.
insert: The new element is placed at the first available leaf position (end of the array) and "percolates up" (upheap) until the heap-order property is restored. Complexity: $O(\log n)$.
delMin: The root (minimum element) is removed and replaced by the last element in the heap. This element then "percolates down" (downheap) until the heap-order property is restored. Complexity: $O(\log n)$.
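Both operations can be sketched on the 1-indexed array representation described above (the class and method names are illustrative):

```python
class MinHeap:
    def __init__(self):
        self.h = [None]                      # index 0 unused; root at 1

    def insert(self, x):
        self.h.append(x)                     # first free leaf position
        i = len(self.h) - 1
        while i > 1 and self.h[i // 2] > self.h[i]:   # percolate up
            self.h[i // 2], self.h[i] = self.h[i], self.h[i // 2]
            i //= 2

    def del_min(self):
        root = self.h[1]
        self.h[1] = self.h[-1]               # last element replaces the root
        self.h.pop()
        i, n = 1, len(self.h) - 1
        while 2 * i <= n:                    # percolate down
            c = 2 * i
            if c + 1 <= n and self.h[c + 1] < self.h[c]:
                c += 1                       # pick the smaller child
            if self.h[i] <= self.h[c]:
                break
            self.h[i], self.h[c] = self.h[c], self.h[i]
            i = c
        return root
```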
74
Q6. What is the "Fast Construct" method for a Binary Heap?
Instead of $n$ successive insertions ($O(n \log n)$), the heap can be built in $\Theta(n)$ time. All elements are placed in the array arbitrarily, and then downheap is called on every internal node, starting from the bottom-most level up to the root.

III. Applications and Extensions
75
Q7. What is HeapSort and what is its complexity?
HeapSort is a sorting algorithm that builds a heap from the input data and repeatedly extracts the minimum (or maximum) element. Its time complexity is $\Theta(n \log n)$ for all cases, and it can be performed in-place with $O(1)$ extra space.
76
Q8. List common applications of Priority Queues.
Huffman Codes: Used in data compression.
Shortest Path Algorithms: Such as Dijkstra's algorithm.
Minimum Spanning Trees: Such as Prim's algorithm.

IV. Binomial Heaps (Advanced)
77
Q9. What is a Binomial Tree $B_k$?
A binomial tree $B_k$ is defined recursively: $B_0$ is a single node, and $B_k$ is formed by attaching one $B_{k-1}$ as the leftmost child of the root of another $B_{k-1}$. A $B_k$ tree has $2^k$ nodes and a height of $k$.
78
Q10. What is a Binomial Heap and why is it useful?
A binomial heap is a collection (forest) of binomial trees where each tree satisfies the heap-order property and there is at most one tree of any given rank $k$. Its primary advantage is that the merge operation is very efficient, taking $O(\log n)$ time. Other operations like insert and delMin also take $O(\log n)$ time.
79
Q1. What is a "Dictionary" in the context of abstract data structures?
A dictionary is an abstract data structure that represents a mapping from keys to values. Each element stored is identified by a unique key. It supports three core operations:
search(K key): Returns the value associated with the given key (or a special value if the key is absent).
insert(K key, V value): Adds a new key-value pair to the dictionary.
delete(K key): Removes the element identified by the key.
80
Q2. Provide real-world examples of a dictionary data structure.
Contact book: Key is the person's name; value is the telephone number.
Program variable identifiers: Key is the identifier name; value is the memory address.
Property-value collection: Key is the property name; value is the property value.
81
Q3. Compare the complexities of dictionary operations using simple implementations.
Unordered Sequence: Search $\Theta(n)$, Insert $O(1)$, Delete $\Theta(n)$.
Ordered Array: Search $\Theta(\log n)$ (using binary search), Insert $\Theta(n)$, Delete $\Theta(n)$.
Ordered Linked List: Search $\Theta(n)$, Insert $\Theta(n)$, Delete $\Theta(n)$.

II. Hashtables and Hashing
82
Q4. What is the "Direct Addressing" method and what are its limitations?
Direct addressing uses an array where each position corresponds to a key in the universe $U$. While it allows for $O(1)$ operations, it is impractical when the universe $U$ is significantly larger than the number of actual keys $K$ because it wastes a massive amount of memory.
83
Q5. How does a Hashtable resolve the limitations of direct addressing?
A hashtable uses a hash function $h(k)$ to map keys into a smaller range of indices $\{0, \dots, m-1\}$. This allows the size of the table $m$ to be proportional to the number of stored keys $n$, rather than the entire universe $U$.
84
Q6. What is a "collision" and what are the common methods to resolve it?
A collision occurs when two different keys map to the same index in the hashtable ($h(k_1) = h(k_2)$). The two primary resolution methods are:
Chaining: Elements mapping to the same index are stored in a linked list at that position.
Open Addressing: All elements are stored in the table itself; if a position is occupied, the algorithm probes for the next available slot.
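A chaining sketch in Python (the class name, the modular hash function, and the use of Python's built-in hash are illustrative assumptions):

```python
class ChainedHashTable:
    """Dictionary with collisions resolved by chaining."""
    def __init__(self, m=8):
        self.m = m
        self.slots = [[] for _ in range(m)]  # one chain per index

    def _h(self, key):
        return hash(key) % self.m

    def insert(self, key, value):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                     # key present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def search(self, key):
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None                          # absent-key sentinel
```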
85
Q7. Describe the complexity of Hashtable operations.
In the average case, all dictionary operations (search, insert, delete) take $O(1)$ time. In the worst case (e.g., all keys collide in one chain), operations can take $\Theta(n)$ time.

III. Binary Search Trees (BST)
86
Q8. What is a Binary Search Tree (BST) and what is the "BST Property"?
A BST is a binary tree where each node satisfies the following property: for any node $x$, all keys in its left subtree are $\le x.key$, and all keys in its right subtree are $\ge x.key$.
87
Q9. What are the complexities of operations in a standard BST?
Average case: Dictionary operations take $\Theta(\log n)$ time.
Worst case: Operations take $\Theta(n)$ time if the tree is unbalanced (e.g., it becomes a linear chain).

IV. Balanced and Self-Organizing Trees
88
Q10. Define an AVL Tree and its balancing mechanism.
An AVL tree is a BST that maintains a "balance factor" for every node, defined as the difference in heights between its left and right subtrees. The balance factor must always be in the set $\{-1, 0, 1\}$. If an operation violates this, the tree is rebalanced using rotations. It guarantees $O(\log n)$ complexity even in the worst case.
89
Q11. What is a "Splay Tree" (Self-Organizing BST)?
A splay tree is a BST that uses a splay operation to move a recently accessed element to the root using rotations.
Amortised Complexity: Any sequence of $m$ operations on $n$ nodes takes $O(m \log n)$ time.
Property: It automatically adapts to non-uniform access frequencies, making frequently accessed elements faster to find.

V. Comparative Summary
90
Q12. Compare Hashtables and Binary Search Trees.
Hashtables: Provide faster average-case dictionary operations ($O(1)$) but do not support ordering-based operations like finding the minimum, maximum, successor, or predecessor.
BSTs: Support both dictionary and ordering-based operations with $O(\log n)$ average complexity.
91
Q1. What is the mathematical definition of an undirected graph?
An undirected graph is an ordered pair of sets $G=(V,E)$, where $V$ is a set of vertices and $E$ is a set of edges. Each edge $e = \{v, w\}$ is an unordered pair of vertices from $V$.
92
Q2. Define a Directed Graph (Digraph).
A digraph is an ordered pair $G=(V,E)$, where $V$ is the vertex set and $E$ is the arc set. Each arc $e = (v, w)$ is an ordered pair of vertices, where $v$ is the source and $w$ is the target.
93
Q3. What is the difference between adjacency and incidence?
Two vertices are adjacent if they are connected by an edge. An edge is incident to the two vertices that are its endpoints.
94
Q4. Define the degree of a vertex.
The degree of a vertex $v$, denoted $deg(v)$, is the number of edges incident to it. In digraphs, this is split into in-degree (number of incoming arcs) and out-degree (number of outgoing arcs).

II. Graph Representations and Isomorphism
95
Q5. Explain the concept of Graph Isomorphism.
Two graphs $G_1$ and $G_2$ are isomorphic if there exists a bijection between their vertex sets that preserves the adjacency relation. Intuitively, they are the "same" graph drawn differently.
96
Q6. Describe the Adjacency Matrix and the Incidence Matrix.
Adjacency Matrix: A $|V| \times |V|$ matrix where the entry at $(i, j)$ is 1 if vertex $i$ and vertex $j$ are adjacent, and 0 otherwise.
Incidence Matrix: A $|V| \times |E|$ matrix where the entry at $(i, j)$ is 1 if vertex $i$ is incident to edge $j$, and 0 otherwise.
III. Paths, Cycles, and Connectedness
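The two matrix definitions from the previous card can be built directly from an edge list. A minimal sketch for an undirected graph, assuming vertices are indexed $0..n-1$; the function names are illustrative:

```python
# Sketch: constructing adjacency and incidence matrices for an undirected graph.
def adjacency_matrix(n, edges):
    # a[i][j] = 1 iff vertices i and j are adjacent.
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1
        a[j][i] = 1            # symmetric for undirected graphs
    return a

def incidence_matrix(n, edges):
    # b[i][j] = 1 iff vertex i is an endpoint of edge j.
    b = [[0] * len(edges) for _ in range(n)]
    for j, (u, v) in enumerate(edges):
        b[u][j] = 1
        b[v][j] = 1
    return b
```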
97
Q7. What is the difference between a path and a cycle?
Path: A sequence of vertices where each consecutive pair is connected by an edge. A simple path does not repeat any vertices.
Cycle: A path that starts and ends at the same vertex. A simple cycle does not repeat any vertices other than the start and end.
98
Q8. Define Connectedness in undirected and directed graphs.
Undirected Graph: A graph is connected if there is a path between every pair of vertices.
Digraph: A digraph is strongly connected if there is a directed path between every pair of vertices in both directions. It is weakly connected if the underlying undirected graph is connected.
IV. Trees and Rooted Structures
99
Q9. What is a "Tree" in graph theory?
A tree is a connected undirected graph that contains no simple cycles.
100
Q10. Define a Rooted Tree and its properties.
A rooted tree is a tree in which one vertex is designated as the root. It introduces hierarchical concepts such as:
Parent/Child: Related nodes in the hierarchy.
Leaf: A vertex with no children.
Height: The maximum depth of any node in the tree.
101
Q11. What characterizes a Binary Tree?
A binary tree is a rooted tree where each vertex has at most two children, typically referred to as the "left child" and the "right child".
102
Q1. Define a Directed Graph (Digraph) and an Undirected Graph.
Directed Graph (Digraph): A pair $G=(V,E)$ where $V$ is a set of vertices (nodes) and $E \subseteq V \times V$ is a set of arcs. Each arc is an ordered pair $(u, v)$, where $u$ is the source and $v$ is the target.
Undirected Graph: Similar to a digraph, but each edge $\{u, v\}$ is an unordered pair.
103
Q2. What are the definitions of a tree, a forest, and a rooted tree?
Tree: An undirected, connected graph without cycles.
Forest: A graph without cycles (a collection of trees).
Rooted Tree: A tree with one node designated as the root; it is usually represented as a directed graph where arcs point from parents to children.
II. Tree Traversals (Visiting Binary Trees)
104
Q3. Describe the three standard methods for visiting nodes in a binary tree.
Pre-order: Visit the root, then recursively visit the left subtree, then the right subtree.
In-order: Recursively visit the left subtree, visit the root, then recursively visit the right subtree.
Post-order: Recursively visit the left subtree, then the right subtree, then visit the root.
III. Graph Traversal Algorithms (BFS and DFS)
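The three binary-tree traversals can be sketched recursively. A minimal version, assuming a toy node class; `BNode` and the function names are illustrative:

```python
# Sketch: the three standard binary-tree traversal orders.
class BNode:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def preorder(t, out):
    if t:
        out.append(t.key)          # root first
        preorder(t.left, out)
        preorder(t.right, out)

def inorder(t, out):
    if t:
        inorder(t.left, out)
        out.append(t.key)          # root in the middle
        inorder(t.right, out)

def postorder(t, out):
    if t:
        postorder(t.left, out)
        postorder(t.right, out)
        out.append(t.key)          # root last
```

On a BST, in-order visits the keys in sorted order.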
105
Q4. Explain the Breadth-First Search (BFS) algorithm and its properties.
Mechanism: BFS explores a graph level-by-level starting from a source node $s$ using a queue.
Properties: It computes the shortest distance (minimum number of edges) from the source to all reachable vertices.
Complexity: The time complexity is $\Theta(V + E)$.
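The queue-based mechanism above can be sketched in a few lines; the adjacency-dict format and function name are illustrative:

```python
from collections import deque

# Sketch: BFS computing edge-count distances from a source vertex.
def bfs_distances(adj, s):
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:          # first visit = shortest distance
                dist[v] = dist[u] + 1
                q.append(v)
    return dist
```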
106
Q5. Explain the Depth-First Search (DFS) algorithm and its properties.
Mechanism: DFS explores as deep as possible along each branch before backtracking.
Timestamping: It records two timestamps for each vertex: discovery time ($d$) and finishing time ($f$).
Complexity: The time complexity is $\Theta(V + E)$.
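A minimal sketch of DFS with the two timestamps; the adjacency-dict input and names are illustrative:

```python
# Sketch: recursive DFS recording discovery (d) and finishing (f) times.
def dfs_timestamps(adj):
    time = [0]
    d, f = {}, {}

    def visit(u):
        time[0] += 1
        d[u] = time[0]             # discovery time
        for v in adj[u]:
            if v not in d:
                visit(v)
        time[0] += 1
        f[u] = time[0]             # finishing time

    for u in adj:                  # restart to cover every component
        if u not in d:
            visit(u)
    return d, f
```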
107
Q6. How are edges classified during BFS and DFS traversals?
Edges are classified into four types:
Tree edges: Edges in the BFS/DFS forest.
Back edges: Connect a node to one of its ancestors in the forest.
Forward edges: Connect a node to a descendant that is not its child (occur only in DFS).
Cross edges: All remaining edges, connecting nodes neither of which is an ancestor of the other.
IV. Advanced Graph Applications
108
Q7. What is a Topological Sort and how is it computed?
Definition: A linear ordering of vertices such that for every directed edge $(u, v)$, $u$ comes before $v$.
Algorithm: 1) Run DFS to compute finishing times $v.f$ for all vertices. 2) Sort vertices in decreasing order of their finishing times.
Constraint: It is only possible for Directed Acyclic Graphs (DAGs).
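The finishing-time recipe above can be sketched by appending each vertex when its DFS call finishes and reversing the list; the input is assumed to be a DAG, and the names are illustrative:

```python
# Sketch: topological sort via DFS finishing times (input assumed acyclic).
def topological_sort(adj):
    visited, order = set(), []

    def visit(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                visit(v)
        order.append(u)            # appended at finishing time

    for u in adj:
        if u not in visited:
            visit(u)
    return order[::-1]             # decreasing finishing time
```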
109
Q8. How does one compute Strongly Connected Components (SCCs)?
Using DFS in a four-step process:
1) Compute finishing times $v.f$ for all vertices using DFS on graph $G$.
2) Compute the transposed graph $G^T$ (reverse all arcs).
3) Run DFS on $G^T$, but in the main loop, consider vertices in decreasing order of the finishing times from step 1.
4) The vertices of each tree in the resulting DFS forest form a separate SCC.
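The four steps above (Kosaraju's scheme) fit in one short function; the adjacency-dict input and names are illustrative:

```python
# Sketch: SCCs via DFS finish order on G, then DFS on the transpose G^T.
def strongly_connected_components(adj):
    visited, finish_order = set(), []

    def visit(u, graph, acc):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v, graph, acc)
        acc.append(u)              # recorded at finishing time

    # Step 1: finishing order in G.
    for u in adj:
        if u not in visited:
            visit(u, adj, finish_order)

    # Step 2: transpose G (reverse every arc).
    gt = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            gt[v].append(u)

    # Steps 3-4: DFS on G^T in decreasing finishing time; each tree is one SCC.
    visited.clear()
    sccs = []
    for u in reversed(finish_order):
        if u not in visited:
            comp = []
            visit(u, gt, comp)
            sccs.append(sorted(comp))
    return sccs
```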
110
Q1. Define the formal specification of the Shortest Path Problem.
Input: A weighted directed graph $G=(V, E)$, a weight function $w: E \rightarrow \mathbb{R}$, and a source vertex $s \in V$.
Weight of a Path: The sum of the weights of the edges forming the path $p = (v_0, v_1, \dots, v_k)$, defined as $w(p) = \sum_{i=1}^{k} w(v_{i-1}, v_i)$.
Shortest Path Distance: The minimum weight $\delta(s, v)$ to reach vertex $v$ from $s$. If no path exists, $\delta(s, v) = \infty$.
111
Q2. What is "Edge Relaxation" and why is it critical?
Relaxation is the process of testing whether the shortest path to a vertex $v$ found so far can be improved by going through vertex $u$.
Logic: If $v.d > u.d + w(u, v)$, then update $v.d = u.d + w(u, v)$ and set $v.p = u$. All shortest path algorithms in this lecture rely on repeated relaxation.
II. Shortest Paths in Specific Graph Types
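The relaxation rule ($v.d > u.d + w(u,v)$) is small enough to state as a helper; the `dist`/`pred` dicts and the function name are illustrative:

```python
# Sketch: the relaxation step shared by all shortest-path algorithms.
def relax(u, v, w, dist, pred):
    # If going through u improves the estimate for v, update it.
    if dist[v] > dist[u] + w:
        dist[v] = dist[u] + w
        pred[v] = u                # remember the predecessor on the path
        return True
    return False
```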
112
Q3. Explain the algorithm for Shortest Paths in a Directed Acyclic Graph (DAG).
Mechanism: 1) Perform a topological sort of the vertices. 2) Iterate through the vertices in that sorted order and relax all outgoing edges of each vertex.
Complexity: $\Theta(V + E)$.
Advantage: It is the most efficient method but is strictly limited to graphs without cycles.
III. Dijkstra’s vs. Bellman-Ford Algorithms
113
Q4. Describe Dijkstra's Algorithm and its primary constraint.
Mechanism: A greedy algorithm that repeatedly selects the vertex with the minimum distance estimate from a priority queue and relaxes its outgoing edges.
Complexity: $O(E \log V)$ when using a binary heap.
Constraint: It requires all edge weights to be non-negative ($w(e) \ge 0$). If negative weights are present, the greedy choice may lead to incorrect results.
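A minimal binary-heap sketch of the mechanism, using lazy deletion of stale queue entries; the weighted adjacency-dict format `{u: [(v, w), ...]}` and the function name are illustrative:

```python
import heapq

# Sketch: Dijkstra's algorithm with a binary heap and lazy deletion.
def dijkstra(adj, s):
    dist = {s: 0}
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue               # stale entry, skip
        for v, w in adj[u]:        # relax all outgoing edges of u
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist
```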
114
Q5. When should the Bellman-Ford Algorithm be used?
It should be used for graphs that contain negative edge weights.
Mechanism: It relaxes all edges in the graph $(|V| - 1)$ times.
Negative Cycles: After the iterations, it performs a final check. If any edge can still be relaxed, it identifies a negative cycle (a cycle where the sum of weights is less than zero), in which case no shortest path exists.
Complexity: $\Theta(V \cdot E)$.
IV. Advanced Concepts
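The Bellman-Ford mechanism, including the final negative-cycle check, can be sketched over an edge list; the `(u, v, w)` tuple format and function name are illustrative:

```python
# Sketch: Bellman-Ford with the final negative-cycle check.
def bellman_ford(vertices, edges, s):
    dist = {v: float('inf') for v in vertices}
    dist[s] = 0
    for _ in range(len(vertices) - 1):     # |V| - 1 full passes
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # If any edge still relaxes, a negative cycle is reachable from s.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return None
    return dist
```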
115
Q6. What is the "All-Pairs Shortest Paths" problem and how is it solved?
Goal: To find the shortest path between every pair of vertices $(u, v)$ in a graph.
Algorithm Idea (Johnson's algorithm): 1) Add an artificial node $s$ with zero-weight edges to all other nodes. 2) Run Bellman-Ford from $s$ to compute potentials. 3) Use these potentials to "re-weight" edges so they are all non-negative. 4) Run Dijkstra's algorithm from every vertex.
116
Q7. What are the common variants of the shortest path problem?
Single-source: From one source to all other vertices.
Single-destination: To one destination from all other vertices (can be solved by reversing edges).
Single-pair: Between two specific vertices.
All-pairs: Between every possible pair of vertices in the graph.
117
Q1. What is the formal definition of the Minimum Spanning Tree problem?
Given an undirected connected graph $G=(V,E)$ with positive weights on edges (defined by a weight function $w: E \rightarrow \mathbb{R}^{+}$), find a connected subgraph $G'=(V,E')$ such that the sum of weights of the edges in $E'$ is the minimum possible. The resulting graph must connect all nodes of the original graph and contain no cycles.
118
Q2. Explain the "Cut Property" of MSTs.
For any cut of the graph, if an edge $e$ is the unique minimum-weight edge crossing that cut, then $e$ must belong to every MST of the graph.
119
Q3. Explain the "Cycle Property" of MSTs.
For any cycle in the graph, if an edge $e$ is the unique maximum-weight edge in that cycle, then $e$ cannot belong to any MST of the graph.
II. Prim's Algorithm
120
Q4. Describe the basic idea behind Prim's algorithm.
Prim's algorithm is a greedy approach that is quite similar to Dijkstra's algorithm. It starts from an arbitrary root node and grows the tree one edge at a time by always adding the lowest-weight edge that connects a vertex in the tree to a vertex outside the tree.
Partial Solution: The partial solution in Prim's algorithm is always connected (it is a tree).
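The grow-one-edge-at-a-time idea can be sketched with a binary heap of crossing edges; the weighted adjacency-dict format `{u: [(v, w), ...]}` for an undirected graph and the function name are illustrative:

```python
import heapq

# Sketch: Prim's algorithm growing a tree from an arbitrary root.
def prim(adj, root):
    in_tree = {root}
    mst, total = [], 0
    pq = [(w, root, v) for v, w in adj[root]]
    heapq.heapify(pq)
    while pq:
        w, u, v = heapq.heappop(pq)
        if v in in_tree:
            continue               # both endpoints already in the tree
        in_tree.add(v)
        mst.append((u, v, w))
        total += w
        for x, wx in adj[v]:       # newly crossing edges
            if x not in in_tree:
                heapq.heappush(pq, (wx, v, x))
    return mst, total
```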
121
Q5. What is the complexity of Prim's algorithm?
When implemented with a binary heap, the time complexity is $O(E \log V)$.
III. Kruskal's Algorithm
122
Q6. Describe the basic idea behind Kruskal's algorithm.
Kruskal's algorithm is a greedy approach where edges are considered in non-decreasing order of their weights. An edge is added to the partial solution only if it does not create a cycle.
Partial Solution: Unlike Prim's, the partial solution in Kruskal's is not necessarily connected; it is a forest that eventually merges into a single tree.
123
Q7. What is the complexity of Kruskal's algorithm and when is it preferred?
The complexity is $O(E \log E)$, which is effectively $O(E \log V)$. It can be faster than Prim's on sparse graphs where $E = O(V)$ if a fast implementation of Union-Find is used.
124
Q8. What is the "Streaming Mode" of Kruskal's algorithm?
Kruskal's can be used in scenarios where edges arrive one by one through a network connection. Even if edges are not sorted by weight, a fast version can be devised with $O(E \log V)$ time complexity and $O(V)$ space complexity.
IV. Supporting Data Structures
125
Q9. What is the Union-Find abstract data structure and its role in MST?
Union-Find is used to manage disjoint sets of vertices. It is essential for Kruskal's algorithm to efficiently check if two vertices belong to the same connected component (to avoid cycles) and to merge components when an edge is added.
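Kruskal's algorithm and its Union-Find support fit together in one short sketch, here with path compression and union by size; the `(w, u, v)` edge-tuple format and names are illustrative:

```python
# Sketch: Kruskal's algorithm using Union-Find to detect cycles.
def kruskal(vertices, edges):
    parent = {v: v for v in vertices}
    size = {v: 1 for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression (halving)
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False                    # same component: would create a cycle
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra                     # union by size
        size[ra] += size[rb]
        return True

    mst, total = [], 0
    for w, u, v in sorted(edges):           # non-decreasing weight
        if union(u, v):
            mst.append((u, v, w))
            total += w
    return mst, total
```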
126
Q10. Compare Prim's and Kruskal's algorithms.
Connectivity: Prim's partial solution is always a connected tree, while Kruskal's is a forest.
Priorities: Prim's uses edge weights as priorities (similar to distances in Dijkstra), while Kruskal's relies on sorting edges by weight.
Graph Density: Prim's is generally a good choice for most cases, but Kruskal's is often faster for sparse graphs.