BDR General Knowledge Flashcards

(62 cards)

1
Q

What is a Tech Stack?

A

A tech stack is the combination of tools (programming languages, frameworks, databases, and infrastructure) used to build and run software.

2
Q

What is a Database?

A

A database is a system for storing, organizing, and accessing data.

3
Q

What are the different types of databases?

A
  1. Relational (like PostgreSQL): stores data in tables of rows and columns.
  2. NoSQL: better suited to flexible or unstructured data.
4
Q

What is SQL?

A

SQL (Structured Query Language) is the language used to work with relational databases.
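
To make "rows and columns" and "SQL" concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for PostgreSQL only so the example runs without a server, and the table name and values are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for any relational database.
conn = sqlite3.connect(":memory:")

# A relational table: each column has a name and type, each row is one record.
# The "readings" table and its contents are hypothetical.
conn.execute("CREATE TABLE readings (sensor TEXT, ts INTEGER, temp REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("a", 1, 20.0), ("a", 2, 22.0), ("b", 1, 18.5)],
)

# SQL states what data is wanted; the database works out how to fetch it.
avg_by_sensor = conn.execute(
    "SELECT sensor, AVG(temp) FROM readings GROUP BY sensor ORDER BY sensor"
).fetchall()
print(avg_by_sensor)  # [('a', 21.0), ('b', 18.5)]
```

The same SELECT statement would work largely unchanged on PostgreSQL, which is the point of a shared query language.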

5
Q

What is PostgreSQL, and why is TigerData built on Postgres?

A
  1. PostgreSQL is a very flexible and powerful open-source database.
  2. Developers use it for everything from side projects to big apps.
  3. It’s reliable, battle-tested, and supports lots of different use cases.
6
Q

What is TigerData?

A
  1. TigerData makes Postgres even better for time-series data.
  2. Developers get the best of both worlds: full SQL and time-series performance.
  3. No need to learn a new language or system - just use the Postgres you know and love.
7
Q

What are the key takeaways about TigerData?

A
  1. TigerData is a time-series database built on PostgreSQL
  2. It helps developers store and analyze data that changes over time—like metrics, logs, and events
  3. The company was founded to solve real-world problems developers were facing with time-series data
  4. Our vision: make powerful data tools that are easy to use and built for the long haul
8
Q

What is Time-Series Data?

A

Time-series data is a collection of data points indexed in time order. The points are measurements or events that are tracked, monitored, downsampled, and aggregated over time.
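
As a rough illustration of "downsampled and aggregated over time", the sketch below groups raw readings into fixed 60-second buckets and averages each one. The sample points and the downsample helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw readings: (unix_timestamp_seconds, value), in time order.
points = [(0, 10.0), (20, 12.0), (40, 14.0), (60, 20.0), (90, 22.0)]


def downsample(points, bucket_seconds):
    """Average the values that fall into each fixed-width time bucket."""
    buckets = defaultdict(list)
    for ts, value in points:
        bucket_start = ts - (ts % bucket_seconds)  # round down to bucket start
        buckets[bucket_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


per_minute = downsample(points, 60)
print(per_minute)  # {0: 12.0, 60: 21.0}
```

Five raw points become two per-minute averages, which is what downsampling means in practice: fewer, coarser points that still describe the trend.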

9
Q

What is a Time-Series Database?

A

A Time-Series Database is a type of database specifically designed for handling time-stamped or time-series data.

10
Q

What are some of TigerData’s differentiators?

A
  1. Built on top of PostgreSQL
  2. Ecosystem innovation flywheel
  3. Performance at scale
  4. Resource Efficiency
  5. Developer Productivity
  6. Production Grade
11
Q

What are Continuous Aggregates?

A

CAGGs (Continuous Aggregates) are precomputed summary views that update automatically as new data arrives. They make aggregate queries (like averages or totals over time) much faster by storing results instead of recalculating every time.
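
The idea of storing results instead of recalculating can be sketched in a few lines of Python. This is a conceptual toy, not how TigerData implements continuous aggregates; the RunningAverages class and its methods are invented for illustration.

```python
class RunningAverages:
    """Keeps a per-bucket sum and count so averages never need a full rescan."""

    def __init__(self, bucket_seconds):
        self.bucket_seconds = bucket_seconds
        self.totals = {}  # bucket start -> [sum, count]

    def insert(self, ts, value):
        # Update only the one bucket this new point belongs to.
        start = ts - (ts % self.bucket_seconds)
        entry = self.totals.setdefault(start, [0.0, 0])
        entry[0] += value
        entry[1] += 1

    def average(self, bucket_start):
        # Reading is a lookup of stored results, not a recalculation.
        total, count = self.totals[bucket_start]
        return total / count


agg = RunningAverages(bucket_seconds=60)
for ts, value in [(5, 10.0), (30, 20.0), (65, 40.0)]:
    agg.insert(ts, value)
print(agg.average(0), agg.average(60))  # 15.0 40.0
```

Each insert touches one small summary entry, so the cost of asking for an average stays the same no matter how many raw points have arrived.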

12
Q

What is Hypercore?

A
  1. Hypercore is a new storage engine built for speed and scale.
  2. It’s optimized for high-ingest time-series workloads (think lots of data, very fast)
  3. Combines the flexibility of row-based storage with performance optimizations under the hood.
  4. Makes TigerData ideal for dashboards, observability, and real-time use cases.
13
Q

What is a Hypertable?

A

A hypertable is a big time-series table that TimescaleDB automatically splits into smaller pieces (called chunks) to keep things fast — but you still query it like one normal table.
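
A toy model of the chunking idea, assuming one-hour chunks: rows are routed to a chunk by timestamp, and a time-range query skips chunks that cannot contain matching rows. The ChunkedTable class and the chunk width are assumptions made for this sketch, not TimescaleDB internals.

```python
from collections import defaultdict

CHUNK_SECONDS = 3600  # assumed chunk width: one hour


class ChunkedTable:
    """One logical table, physically split into time-based chunks."""

    def __init__(self):
        self.chunks = defaultdict(list)  # chunk start time -> rows

    def insert(self, ts, value):
        chunk_start = ts - (ts % CHUNK_SECONDS)
        self.chunks[chunk_start].append((ts, value))

    def query(self, start_ts, end_ts):
        """Return rows with start_ts <= ts < end_ts, skipping irrelevant chunks."""
        rows = []
        for chunk_start, chunk_rows in self.chunks.items():
            # Skip chunks whose time range cannot overlap the query range.
            if chunk_start >= end_ts or chunk_start + CHUNK_SECONDS <= start_ts:
                continue
            rows.extend(r for r in chunk_rows if start_ts <= r[0] < end_ts)
        return sorted(rows)


table = ChunkedTable()
for ts, value in [(10, 1.0), (3700, 2.0), (7300, 3.0)]:
    table.insert(ts, value)
result = table.query(3600, 7200)
print(len(table.chunks), result)  # 3 [(3700, 2.0)]
```

The caller only ever sees insert and query on one table; the splitting into three chunks happens behind that interface.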

14
Q

How does TigerData’s columnar compression reduce storage and improve performance?

A
  1. TigerData compresses historical data using columnar storage.
  2. Compression significantly reduces storage costs, with savings of up to 90%.
  3. Compressed data remains queryable with standard SQL.
  4. Helps improve performance for historical queries.
15
Q

How is TigerData’s storage different from native PostgreSQL?

A

TigerData optimizes PostgreSQL with automatic compression and tiering, reducing storage needs and speeding up queries — unlike native Postgres, which stores everything in row format without built-in time-series optimizations.

16
Q

What is tiered storage in Timescale/TigerData?

A

Tiered storage moves older data to cheaper object storage (like S3) while keeping recent data on fast local storage — so you can scale PostgreSQL affordably without losing query access.

17
Q

How does tiered storage work in TigerData and how does it help manage large volumes of time-series data efficiently?

A
  1. Tiered storage separates hot (recent) and cold (historical) data
  2. Recent data stays on fast local storage for quick access
  3. Older data is offloaded to object storage (like Amazon S3), which is much cheaper
  4. All data remains queryable via SQL—no need to manually manage or rehydrate it
  5. Great for reducing storage costs while retaining long-term historical data
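
The hot/cold split above can be sketched as a toy two-tier store. Plain dictionaries stand in for local disk and object storage, and the TieredStore class, its method names, and the cutoff value are all hypothetical.

```python
class TieredStore:
    """Toy two-tier store: dicts stand in for local disk and object storage."""

    def __init__(self):
        self.hot = {}   # recent data on fast storage
        self.cold = {}  # older data on cheap storage (object-store stand-in)

    def insert(self, ts, value):
        self.hot[ts] = value

    def tier(self, cutoff_ts):
        """Move every row older than cutoff_ts from the hot to the cold tier."""
        for ts in [t for t in self.hot if t < cutoff_ts]:
            self.cold[ts] = self.hot.pop(ts)

    def query(self, start_ts, end_ts):
        """Read across both tiers; the caller never says where the data lives."""
        merged = {**self.cold, **self.hot}
        return sorted((ts, v) for ts, v in merged.items() if start_ts <= ts < end_ts)


store = TieredStore()
for ts in (100, 200, 300):
    store.insert(ts, f"reading-{ts}")
store.tier(cutoff_ts=250)
print(sorted(store.cold), sorted(store.hot))  # [100, 200] [300]
print(store.query(0, 400))
```

The query method returns all three rows even though two of them have moved tiers, which mirrors the "all data remains queryable" point.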
18
Q

Why does TigerData use compression?

A

To help teams handle large volumes of time-series data affordably by reducing storage costs, improving query performance, and keeping infrastructure lean.

19
Q

What kind of data does TigerData compress?

A

Historical (older) time-series data

20
Q

Can compressed data still be queried normally?

A

Yes, it’s fully queryable with standard SQL.

21
Q

How is TigerData’s compression organized?

A

Data is compressed by column (columnar storage), whereas regular PostgreSQL stores data by row.

22
Q

What are the benefits of columnar compression?

A

Smaller storage footprint, less disk I/O, and faster analytical queries.

23
Q

Why is columnar compression good for time-series data?

A

Because time-series data is often append-only and rarely updated, making it perfect for column-based blocks.
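
To see why column-based blocks compress well, the sketch below pivots a few rows into columns and applies two generic encodings, delta encoding and run-length encoding. These are illustrative choices only; the cards do not say which algorithms TigerData uses, and the sample rows are made up.

```python
def delta_encode(values):
    """Store the first value, then the difference between neighbours."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def run_length_encode(values):
    """Collapse runs of repeated values into [value, count] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


# Hypothetical rows: (timestamp, sensor, temperature), appended every 10 seconds.
rows = [(1000, "a", 21), (1010, "a", 21), (1020, "a", 22), (1030, "a", 22)]

# Pivot rows into columns so similar values sit next to each other.
timestamps, sensors, temps = (list(col) for col in zip(*rows))

ts_runs = run_length_encode(delta_encode(timestamps))
sensor_runs = run_length_encode(sensors)
print(ts_runs)      # [[1000, 1], [10, 3]]
print(sensor_runs)  # [['a', 4]]
```

Regularly spaced timestamps and repeated sensor names collapse to a couple of pairs each. That only works when a column's values are stored together and are not being rewritten, which is the append-only property the card describes.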

24
Q

What is PostgreSQL’s default method for handling large values?

A

TOAST (The Oversized-Attribute Storage Technique).

25
Why isn’t TOAST great for time-series data?
It’s not optimized for massive, append-only datasets and can cause storage bloat.
26
How does TigerData improve on TOAST?
TigerData’s compression is purpose-built for large time-series workloads, offering higher efficiency and faster queries.
27
What is tiered storage?
A way to separate “hot” recent data from “cold” historical data for cost efficiency.
28
Where is “hot” data stored?
On fast local storage for quick access.
29
Where is “cold” data stored?
On cheaper object storage (like Amazon S3).
30
Can you still query data stored in object storage?
Yes — all data remains accessible via SQL without manual rehydration.
31
What’s the main benefit of tiered storage?
It allows teams to scale PostgreSQL affordably without losing access to historical data.
32
How has AI evolved over time?
From traditional rule-based AI, to machine learning, to GenAI and large language models (LLMs).
33
Why is data retrieval important for AI?
Because GenAI systems need the right data at the right time for accurate and relevant outputs.
34
How does AI relate to TigerData?
TigerData helps retrieve data quickly and at scale, which is key for GenAI apps that depend on context-rich retrieval.
35
What is pgvector?
pgvector is an open-source Postgres extension that adds vector similarity search directly inside PostgreSQL.
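
Vector search means ranking stored embeddings by how close they are to a query embedding. The plain-Python sketch below shows the idea with cosine similarity. It does not use pgvector's API, and the three-dimensional embeddings and document names are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional embeddings; real ones have hundreds of dimensions.
documents = {
    "invoice overdue": [0.9, 0.1, 0.0],
    "payment late": [0.8, 0.2, 0.1],
    "cat pictures": [0.0, 0.1, 0.9],
}


def nearest(query_vec, k):
    """Return the k document names most similar to query_vec."""
    ranked = sorted(
        documents,
        key=lambda name: cosine_similarity(query_vec, documents[name]),
        reverse=True,
    )
    return ranked[:k]


top = nearest([1.0, 0.0, 0.0], k=2)
print(top)  # ['invoice overdue', 'payment late']
```

This brute-force version compares the query against every stored vector. A vector index exists to avoid that full scan on large tables.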
36
Why is Postgres a good fit for AI workloads?
  1. Developers can store and query embeddings in one familiar system.
  2. No need to add a separate vector database like Pinecone or Weaviate.
  3. Leverages Postgres's SQL ecosystem and simplicity.
37
How does TigerData enhance Postgres for AI?
It adds scale, performance, and cost-efficiency for running AI workloads in production.
38
How does TigerData + pgvector compare to other vector databases like Pinecone?
  1. Matches or beats Pinecone in search performance
  2. Costs about 75% less
  3. 100% open source and fully SQL-based
39
What is the business value with TigerData's offering?
It addresses fragmentation: many enterprises use multiple databases for different tasks (transactions, logs, search, embeddings), which is expensive and hard to manage.
40
Tiger Postgres = all-in-one system for:
  1. Time-series ingestion
  2. Real-time analytics
  3. Keyword (BM25) + vector search
41
How does TigerData handle high-throughput pipelines?
Timescale hypertables allow 10M+ rows/sec inserts with compression.
42
What’s “real-time retrieval”?
The ability to quickly run hybrid searches (vector + keyword) for use cases like fraud detection or “show me similar cases.”
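
One common way to merge a keyword ranking and a vector ranking into a single hybrid result is reciprocal rank fusion. The sketch below is an assumption for illustration; the cards do not say how TigerData combines the two result lists, and the case IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each item scores 1 / (k + rank) per list it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest combined score first; ties broken alphabetically for stability.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical results for a "show me similar cases" query.
keyword_hits = ["case-17", "case-03", "case-42"]  # e.g. from a BM25 keyword search
vector_hits = ["case-03", "case-99", "case-17"]   # e.g. from a vector search

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['case-03', 'case-17', 'case-99', 'case-42']
```

Items that rank well in both lists rise to the top, which is the behaviour a hybrid search is after.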
43
Which extensions make hybrid search possible in Postgres?
pgvectorscale + BM25 — both run directly inside Postgres.
44
Why do enterprises prefer Tiger Cloud?
They get one fully Postgres-compatible stack instead of managing multiple datastores.
45
What’s the main goal of this section?
To understand who TigerData is built for, what problems it solves best, and how to identify ideal customers and use cases.
46
How would you describe TigerData in one line?
It’s Postgres for demanding, real-time, large-scale applications — not just time series.
47
What makes TigerData unique?
It breaks the tradeoff between performance and flexibility — as fast as ClickHouse, but as versatile as Postgres.
48
Who is TigerData designed to serve?
Technical teams and data leaders who manage large-scale, high-ingest data systems — especially those using PostgreSQL for real-time analytics, time-series data, or AI workloads.
49
What type of customer is a great fit for TigerData?
Companies with high-ingest workloads, time-stamped data, and a preference for SQL/Postgres who want performance and scalability without switching databases.
50
What roles or personas do we focus on?
Technical leaders, data engineers, and infrastructure-focused teams, not front-end or low-code developers.
51
What main problem does TigerData solve?
It helps teams handle massive data volumes in real time — fast ingest, fast queries, long retention — all while keeping costs low and staying on Postgres.
52
How does TigerData make Postgres better?
It removes the tradeoff between performance and flexibility — as fast as ClickHouse, but as flexible and familiar as Postgres.
53
What kind of data is TigerData best for?
Time-series data — metrics, events, and logs that grow quickly and need fast access for analysis.
54
What are TigerData’s top use cases?
  1. Real-time analytics (dashboards, monitoring)
  2. Observability & metrics (DevOps, infra, telemetry)
  3. IoT monitoring (sensors, devices)
  4. Financial & transactional data (payments, trading)
  5. User behavior analytics (app usage, events)
55
When do customers choose TigerData?
When they need real-time ingest, fast queries, and long-term storage — without breaking their SQL workflows.
56
Why is real-time analytics hard in Postgres?
Because traditional Postgres slows down as data volume grows — TigerData fixes that with optimized storage, compression, and architecture built for scale.
57
What’s TigerData’s core technology built on?
PostgreSQL — enhanced for time-series and real-time workloads.
58
What makes TigerData unique compared to other databases?
  1. Faster ingest and queries for big datasets
  2. SQL compatible (no new language to learn)
  3. Columnar compression and tiered storage to cut costs
  4. Optimized for scale + developer experience
59
What databases does TigerData outperform?
InfluxDB, MongoDB, and native Postgres — especially for large, time-based workloads.
60
What makes TigerData easy for teams to adopt?
It’s SQL-compatible and Postgres-based — so developers can scale without changing tools or retraining teams.
61
What’s TigerData’s biggest differentiator?
We help teams scale Postgres to handle massive real-time workloads with no performance tradeoff — combining speed, flexibility, and cost efficiency.
62
What kind of customer pain points does TigerData solve?
  1. Postgres slowing down at scale
  2. Data storage costs exploding
  3. Fragmented data systems (multiple databases)
  4. Limited real-time visibility into fast-moving data