VDBMS Vector Databases Flashcards

(13 cards)

1
Q

Semantic search

A

Query Embedding -> Nearest Neighbor Search

2
Q

Nearest Neighbor queries: naive method

A

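*Full scan: compute the distance from the query to every vector
*Sort by distance
*Return the top-k
*Inefficient: every query touches all n vectors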
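
The naive method can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and toy 2-D vectors; the names `euclidean` and `naive_knn` are hypothetical helpers, not part of any library.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def naive_knn(query, vectors, k):
    """Return the k nearest vectors as (distance, index) pairs, nearest first."""
    # Full scan: one distance computation per stored vector.
    scored = [(euclidean(query, v), i) for i, v in enumerate(vectors)]
    # Sort by distance (ties broken by index).
    scored.sort()
    # Top-k.
    return scored[:k]

# Toy 2-D "embeddings" (illustrative values only).
vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
top2 = naive_knn((1.0, 1.0), vectors, k=2)
top2_ids = [i for _, i in top2]
```

The cost grows linearly with the number of stored vectors, which is why the card calls it inefficient.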

3
Q

Nearest Neighbor queries: Index-based method

A
  • Build a spatial index on the embeddings
    and use kNN search
  • Limitation: curse of dimensionality
4
Q

definition of RAG

A

Retrieval Augmented Generation

5
Q

What does Retrieval Augmented Generation do when given a query?

A

*Create a query embedding
*Run an approximate nearest neighbor (ANN) query
to find the most relevant documents
*With an appropriate prompt, feed the documents
through a generative AI model to get the final
answer
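
The three steps above can be sketched end to end. This is a toy illustration under stated assumptions: `embed` is a bag-of-words stand-in for a real embedding model, `retrieve` uses exact search as a stand-in for an ANN query, and `generate` is any callable that maps a prompt to text. All names here are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Exact similarity search, used here in place of an ANN query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, context_docs):
    """Wrap the retrieved documents and the question in a prompt."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def rag_answer(query, documents, generate, k=2):
    """Retrieve, build the prompt, then hand it to the generative model."""
    return generate(build_prompt(query, retrieve(query, documents, k)))

documents = [
    "HNSW builds a hierarchy of proximity graphs",
    "Product quantization compresses vectors into short codes",
    "Bananas are rich in potassium",
]

def echo_model(prompt):
    """Hypothetical stand-in for a generative model: returns the prompt unchanged."""
    return prompt

question = "how does product quantization compress vectors"
answer = rag_answer(question, documents, echo_model, k=1)
```

Swapping `echo_model` for a call to a real model and `retrieve` for a vector-index lookup keeps the same overall shape.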

6
Q

Vector Indexing

A
  • A way to index multidimensional data points
  • Given a query point, find the nearest data point
  • Traditional spatial indexes (based on Euclidean
    distance) work for low-dimensional data
  • As the number of dimensions increases, distances
    concentrate: all points become nearly equally near
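
The last point can be checked numerically. The sketch below is an assumption-laden experiment, not a proof: it draws uniform random points in the unit cube and compares the nearest and farthest distance from a random query. The function name `distance_contrast` is hypothetical.

```python
import math
import random

def distance_contrast(dim, n_points=200, seed=0):
    """Ratio of nearest to farthest distance from a random query.

    A ratio close to 1 means the nearest and farthest points are
    almost equally far from the query.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return min(dists) / max(dists)

low = distance_contrast(dim=2)      # low-dimensional data
high = distance_contrast(dim=1000)  # high-dimensional data
```

In this setup the ratio for 1000 dimensions sits much closer to 1 than the ratio for 2 dimensions, which is the effect that makes "nearest" hard for a spatial index to exploit.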
7
Q

Locality Sensitive Hashing (LSH)

A
  • Traditional hashing is good for point queries with
    exact match
  • Locality Sensitive Hashing (LSH) works for proximity
    queries
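
One common LSH family, random-hyperplane hashing for cosine similarity, is sketched below as an illustration; the card does not say which family is meant, so treat the choice as an assumption. Vectors pointing in similar directions tend to land in the same bucket. All function names are hypothetical.

```python
import random
from collections import defaultdict

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (as normal vectors) through the origin."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def lsh_signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )

def build_buckets(vectors, hyperplanes):
    """Hash table from signature to the indices of the vectors stored under it."""
    buckets = defaultdict(list)
    for i, v in enumerate(vectors):
        buckets[lsh_signature(v, hyperplanes)].append(i)
    return buckets

planes = make_hyperplanes(dim=3, n_bits=8)
vectors = [(1.0, 2.0, 3.0), (2.0, 4.0, 6.0), (-1.0, -2.0, -3.0)]
buckets = build_buckets(vectors, planes)

# A proximity query only inspects the bucket the query hashes to.
query = (4.0, 8.0, 12.0)
candidates = buckets.get(lsh_signature(query, planes), [])
```

Vectors 0 and 1 point in the same direction as the query, so they share its bucket; vector 2 points the opposite way and hashes elsewhere. Practical systems typically use several hash tables to reduce missed neighbors.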
8
Q

Product Quantization (PQ) Operations

A

*Break down a big vector into subvectors
*Quantize each subvector (replace it with the ID of its nearest codebook centroid) to reduce the size of each vector
*Store a list of vector IDs (record IDs) and their corresponding quantized vector
*For a query vector (q), do the same quantization
*Compute the distances to the quantized vectors and then find the nearest one
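
The operations above can be sketched as follows. This is a minimal illustration, assuming codebooks trained by a small k-means with farthest-first initialization (chosen here only to keep the toy example deterministic) and a query that is quantized the same way as the stored vectors. All names are hypothetical.

```python
def split(vector, m):
    """Break a vector into m equal-length subvectors."""
    step = len(vector) // m
    return [tuple(vector[i * step:(i + 1) * step]) for i in range(m)]

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, sub):
    """Index of the centroid closest to a subvector."""
    return min(range(len(centroids)), key=lambda c: sq_dist(centroids[c], sub))

def train_codebook(subs, k, iters=10):
    """Small k-means over one subspace, with farthest-first initialization."""
    centroids = [subs[0]]
    while len(centroids) < k:
        centroids.append(max(subs, key=lambda s: min(sq_dist(s, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for s in subs:
            groups[nearest(centroids, s)].append(s)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return centroids

def encode(vector, codebooks):
    """Quantize: replace each subvector by the ID of its nearest centroid."""
    subs = split(vector, len(codebooks))
    return tuple(nearest(cb, s) for cb, s in zip(codebooks, subs))

def code_distance(code_a, code_b, codebooks):
    """Squared distance between the centroids that two codes stand for."""
    return sum(sq_dist(cb[a], cb[b]) for cb, a, b in zip(codebooks, code_a, code_b))

def pq_search(query, codes, codebooks):
    """Quantize the query, then return the record ID with the closest code."""
    q_code = encode(query, codebooks)
    return min(codes, key=lambda rid: code_distance(q_code, codes[rid], codebooks))

vectors = {  # record ID -> 4-D vector, forming two obvious clusters
    "a": (0.0, 0.1, 0.0, 0.2), "b": (0.1, 0.0, 0.2, 0.0),
    "c": (9.0, 9.1, 9.0, 9.2), "d": (9.1, 9.0, 9.2, 9.0),
}
m, k = 2, 2  # 2 subvectors per vector, 2 centroids per codebook
columns = [[split(v, m)[j] for v in vectors.values()] for j in range(m)]
codebooks = [train_codebook(subs, k) for subs in columns]
codes = {rid: encode(v, codebooks) for rid, v in vectors.items()}
nearest_id = pq_search((8.9, 9.0, 9.1, 9.0), codes, codebooks)
```

Each stored vector shrinks from four floats to two small integers. Vectors that share a code become indistinguishable, which is the accuracy cost of the compression.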

9
Q

Navigable Small World (NSW)
basic concept

A

*Connect each data point to its
nearest neighbors
*Connect each data point to a few
faraway neighbors
*Search: Follow the edges to the
target point
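
A rough sketch of the idea follows. It assumes a brute-force graph build and random long-range links, which is a simplification chosen for brevity rather than how production NSW indexes are constructed; `build_nsw` and `greedy_search` are hypothetical names.

```python
import math
import random

def build_nsw(points, k_near=2, n_far=1, seed=0):
    """Undirected graph: k nearest neighbors plus a few random long-range links."""
    rng = random.Random(seed)
    n = len(points)
    graph = {i: set() for i in range(n)}
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        near = others[:k_near]
        rest = others[k_near:]
        far = rng.sample(rest, max(0, min(n_far, len(rest))))
        for j in near + far:
            graph[i].add(j)
            graph[j].add(i)
    return graph

def greedy_search(graph, points, query, entry):
    """Follow edges toward the query until no neighbor is closer."""
    current = entry
    current_d = math.dist(points[current], query)
    while True:
        best = min(graph[current],
                   key=lambda j: math.dist(points[j], query),
                   default=current)
        best_d = math.dist(points[best], query)
        if best_d >= current_d:
            return current  # local minimum: an approximate nearest neighbor
        current, current_d = best, best_d

points = [(float(x), float(y)) for x in range(5) for y in range(5)]
graph = build_nsw(points)
query = (3.2, 3.9)
found = greedy_search(graph, points, query, entry=0)
```

The long-range links let the search cross the data set in a few hops, while the short links refine the result near the target. The search can stop at a local minimum, so the answer is approximate.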

10
Q

Hierarchical Navigable Small World (HNSW)
basic concept

A

*Create a hierarchy of graphs
*Each vertex is promoted to the next higher level with some probability, so higher levels are sparser
*Search starts at the highest level and descends to the next lower level when the search at the current level converges
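
The layered search can be sketched as below. This is a heavily simplified illustration under stated assumptions: levels are assigned by repeated coin flips with a hypothetical promotion probability `p`, each layer is built by brute force, and search is plain greedy descent. Real HNSW implementations insert points incrementally and keep a candidate list during search.

```python
import math
import random

def assign_levels(n, p=0.5, seed=0):
    """Each vertex climbs one more level with probability p."""
    rng = random.Random(seed)
    levels = []
    for _ in range(n):
        level = 0
        while rng.random() < p:
            level += 1
        levels.append(level)
    return levels

def build_layers(points, levels, k=3):
    """Layer i holds every vertex whose level is >= i, linked to its k nearest peers."""
    layers = []
    for lvl in range(max(levels) + 1):
        members = [i for i, v in enumerate(levels) if v >= lvl]
        graph = {i: set() for i in members}
        for i in members:
            peers = sorted((j for j in members if j != i),
                           key=lambda j: math.dist(points[i], points[j]))
            for j in peers[:k]:
                graph[i].add(j)
                graph[j].add(i)
        layers.append(graph)
    return layers

def greedy(graph, points, query, start):
    """Move to the closest neighbor until no neighbor improves on the current vertex."""
    current = start
    while True:
        best = min(graph[current],
                   key=lambda j: math.dist(points[j], query),
                   default=current)
        if math.dist(points[best], query) >= math.dist(points[current], query):
            return current
        current = best

def hnsw_search(layers, points, query):
    """Start in the top layer; whenever greedy search converges, drop one layer."""
    current = next(iter(layers[-1]))  # any vertex of the sparsest layer
    for graph in reversed(layers):
        current = greedy(graph, points, query, current)
    return current

rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(50)]
levels = assign_levels(len(points))
layers = build_layers(points, levels)
query = (0.5, 0.5)
found = hnsw_search(layers, points, query)
```

The sparse upper layers play the role of long-range links, and the dense bottom layer, which contains every vertex, refines the final answer.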

11
Q

Key indexing techniques:

A

LSH, HNSW, Product Quantization (PQ)

12
Q

What do vector DBs integrate?

A

embedding storage, ANN search, and RAG

13
Q
A