DEFINE B-Tree:
B-Tree of order m is a type of n-ary tree
- A Multi array tree each node has many children
NOT BINARY TREE !!
What are the Properties of a B-TREE
Why is this:
“Every NON leaf node, bare root, has at least m/2 children” Ideal?
Guarantees:
-Not having long paths from Root => Leaf
- Order log(n) ==> Guarantees a Balanced Tree
Why is this:
“All leaf nodes appear on same Level”
Ideal?
Guarantees:
- EVERY Path From Root => Leaf is the same Length
- This also guarantees order log(n)
What is the Purpose of :
“ NON- leaf with c children have c-1 search key values which guide searches down correct subtree” ?
Decides which path we take by comparing values we are searching for with these values.
Thus no need to correspond to actual records we are storing
(DIRECTION THINGS)
What causes the Definition of B-Tree to vary amongst Authors?
m
as it could be:
- no. children
- min no. children
- max no. children
- max/min no. keys in NODE
Does B-Tree have any SIGNIFICANT advantages over AVL or other Height Balanced Binary trees, when used as in-memory data structure?
NO
- Its just more complex to implement
What variant of B-Tree is used as the main-stay of Databases and the most comment external data structure in use?
B+ Tree
What does Data Record mean?
An element of data information to be managed by the B+Tree
Define Key Value:
Value by which we identify the record
NOTE: can have key values with no records
Define Discriminator:
Value used to decide which path to take down the tree in searching for record
NOTE: can have key values with no records
Define Disk Address:
Offset from the start of the disk file to a particular block in file
THINK of it as a pointer to Disk Memory
- Tells us which block number on disc are we storing our block
Describe the External Data Structure?
Is one which is Stored on eternal or secondary memory rather than internal memory RAM.
MEANING that the Data Structure:
- Persists beyond the end of the program execution without
the programmer having to explicitly save/ load the
structure to/from disk (Program Crashes the DATA is still in
B+ Tree)
What happens if Program ENDs?
It will remain in the disc so next time all data in B+ Trees is still there
What is Record-Embedded?
What is an Index B+ Tree?
Stores only key values & disk addresses in the leaf nodes, Identifies the disk block in the separate file of data where the record with corresponding key value resides
Primary Index B+ Trees
1 Pointer for each block on lowest level
E.g 1000 discriminator values then you have 1001 disc pointers to 1001 record files
Secondary Index B+ Trees
Search Operations in RECORD EMBEDDING?
SEARCH :
Start at root & follow path down indicated by discriminator values.
RANGE:
Search for record with a key at one end of the range and iterate using next/pre disk addresses in leaf level, over all records in range
Complexity of B+ Trees?
n-ary tree, with height balanced, so search & Insertion is going to be O(log n)
Cost of the disk read is much large than cost of the processing of in-memory aspects of the data structure that we can ignore this cost