Data Platform & AI Flashcards

(40 cards)

1
Q

What is market data?

A

Dynamic financial data such as prices, quotes, trades, volumes, order book updates, yields, and implied volatilities.

2
Q

What is reference data?

A

Relatively static instrument metadata such as ticker, ISIN, exchange, currency, lot size, and trading calendar.

3
Q

What is a security master?

A

A centralized system or dataset used to maintain canonical instrument identities and metadata.

4
Q

Why is a security master important?

A

It ensures the same instrument is recognized consistently across research, trading, risk, and operations.

5
Q

What is symbology in finance?

A

The identifiers used to refer to instruments, such as ticker, ISIN, CUSIP, or SEDOL.

6
Q

Why is symbology hard in finance?

A

Different vendors and systems may use different identifiers or represent instruments differently.
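
One common way to tame this is a cross-reference table that maps every external identifier to a single internal ID. The sketch below is illustrative only: the `XREF` table, the `SEC-0001` style internal IDs, and all identifier values are made-up sample data, not real securities.

```python
# Hypothetical cross-reference table: (identifier type, value) -> internal ID.
# Every identifier here is invented for illustration.
XREF = {
    ("TICKER", "ACME"): "SEC-0001",
    ("ISIN", "US0000000001"): "SEC-0001",
    ("TICKER", "BOLT"): "SEC-0002",
}


def resolve(id_type, value):
    """Return the internal ID for an external identifier, or None if unknown."""
    key = (id_type.strip().upper(), value.strip().upper())
    return XREF.get(key)


# Two vendors naming the same instrument differently resolve to one ID.
vendor_a = resolve("ticker", "acme")
vendor_b = resolve("ISIN", "US0000000001")
```

Returning `None` for unknown identifiers, rather than guessing, keeps unmapped instruments visible so they can be reviewed.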

7
Q

What is corporate actions data?

A

Data describing events like dividends, splits, mergers, and spin-offs that affect instruments and positions.

8
Q

Why is corporate actions handling a hard data problem?

A

It changes positions, pricing histories, entitlements, and instrument relationships over time.
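
As a small illustration of the pricing-history part, here is a minimal sketch of back-adjusting prices for a stock split. The function name `back_adjust` and the sample prices are assumptions for illustration; real adjustment also has to handle dividends, volumes, and chained events.

```python
from datetime import date


def back_adjust(prices, split_date, ratio):
    """Divide prices dated before split_date by ratio.

    ratio is new shares per old share, e.g. 2.0 for a 2-for-1 split.
    Prices on or after split_date are left unchanged.
    """
    return [(d, p / ratio if d < split_date else p) for d, p in prices]


# Made-up closes around a 2-for-1 split effective 2024-01-04.
prices = [
    (date(2024, 1, 2), 100.0),
    (date(2024, 1, 3), 102.0),
    (date(2024, 1, 4), 51.5),
]
adjusted = back_adjust(prices, date(2024, 1, 4), 2.0)
```

Without the adjustment, the raw series would show a spurious drop of roughly 50% on the split date.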

9
Q

What is event time?

A

The time an event actually occurred in the market.

10
Q

What is processing time?

A

The time a system received or processed the event.

11
Q

Why does event time matter?

A

Accurate sequencing and backtesting require the true market-time ordering of events.

12
Q

What is out-of-order data?

A

Events arriving in a different order than they occurred.
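
A minimal sketch of restoring order for a batch of events, assuming each event carries an event timestamp and a source sequence number. The `Tick` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tick:
    event_ts: int   # when it happened at the venue (hypothetical field, ns)
    recv_ts: int    # when our system received it (hypothetical field, ns)
    seq: int        # source sequence number, used to break timestamp ties
    price: float


def resequence(ticks):
    """Order ticks by event time, then by source sequence number."""
    return sorted(ticks, key=lambda t: (t.event_ts, t.seq))


# Arrival order differs from the order in which the events occurred.
arrived = [
    Tick(event_ts=100, recv_ts=150, seq=1, price=10.00),
    Tick(event_ts=300, recv_ts=310, seq=3, price=10.02),
    Tick(event_ts=200, recv_ts=320, seq=2, price=10.01),
]
ordered = resequence(arrived)
```

This only works on a bounded batch; a streaming system additionally has to decide how long to wait for stragglers.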

13
Q

What is late data?

A

Events that arrive after the system has already processed or closed the time window they belong to.

14
Q

What is a corrected trade or cancel/correct message?

A

A message that changes or reverses a previously reported trade.
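
To make this concrete, here is a minimal sketch of replaying a message stream into a trade tape. The message shape (`type`, `trade_id`, `price`, `qty`) and the type names `NEW`, `CORRECT`, and `CANCEL` are assumptions; real feeds define their own formats.

```python
def apply_messages(messages):
    """Replay NEW / CORRECT / CANCEL messages into a tape keyed by trade_id."""
    tape = {}
    for msg in messages:
        kind, trade_id = msg["type"], msg["trade_id"]
        if kind == "NEW":
            tape[trade_id] = {"price": msg["price"], "qty": msg["qty"]}
        elif kind == "CORRECT" and trade_id in tape:
            # A correction replaces the previously reported values.
            tape[trade_id] = {"price": msg["price"], "qty": msg["qty"]}
        elif kind == "CANCEL":
            # A cancel removes the trade as if it had never been reported.
            tape.pop(trade_id, None)
    return tape


messages = [
    {"type": "NEW", "trade_id": "t1", "price": 10.0, "qty": 100},
    {"type": "NEW", "trade_id": "t2", "price": 10.2, "qty": 50},
    {"type": "CORRECT", "trade_id": "t1", "price": 10.1, "qty": 100},
    {"type": "CANCEL", "trade_id": "t2"},
]
tape = apply_messages(messages)
```

Anything derived from the tape, such as VWAP or daily volume, has to be recomputed after corrections or it will disagree with the final record.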

15
Q

What is survivorship bias in market data?

A

The error of only analyzing securities that still exist and ignoring delisted or failed ones.
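
A toy calculation shows the size of the effect. The three tickers and their returns are made-up numbers, and `average_return` is a hypothetical helper.

```python
from statistics import mean

# Made-up one-year returns; "CCC" was delisted after a total loss.
returns = {"AAA": 0.10, "BBB": 0.06, "CCC": -1.00}
delisted = {"CCC"}


def average_return(returns, exclude=frozenset()):
    """Equal-weighted average return, optionally excluding some names."""
    return mean([r for name, r in returns.items() if name not in exclude])


survivors_only = average_return(returns, exclude=delisted)  # about +8%
full_universe = average_return(returns)                     # about -28%
```

Dropping the one failed name turns a clearly negative average into a positive one.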

16
Q

What is look-ahead bias?

A

Using information in a backtest that would not have been known at the time.
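
One common guard is to lag a signal so that a value computed from bar t is only acted on from bar t+1. The sketch below assumes a plain list of per-bar signal values; `lag` is a hypothetical helper.

```python
def lag(signal, periods=1):
    """Shift a per-bar signal forward so bar t only sees values up to t - periods."""
    if periods < 0:
        raise ValueError("periods must be non-negative")
    return [None] * periods + signal[: len(signal) - periods]


# Signal computed at each bar's close; tradable only on the following bar.
signal = [0.2, -0.1, 0.4]
tradable = lag(signal)
```

The leading `None` marks the bar for which no signal was yet known.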

17
Q

Why are survivorship bias and look-ahead bias dangerous?

A

They make historical performance look unrealistically good, because losers are left out or future information leaks into past decisions.

18
Q

What is a feature store in a quant or AI context?

A

A managed repository of reusable model features with consistent definitions and lineage.

19
Q

Why is lineage important in financial data platforms?

A

It helps trace where data came from, how it was transformed, and whether it is trustworthy.

20
Q

What is data provenance?

A

The documented origin and history of a dataset.

21
Q

Why is auditability important in finance?

A

Systems often need to explain how outputs were produced for risk, compliance, and operational review.

22
Q

What is a data quality check in market data?

A

A rule that detects missing, stale, outlier, duplicated, or inconsistent data.

23
Q

What is stale data?

A

Data that has not updated within the interval in which an update was expected, such as a price that stays frozen while the market is open.
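
A staleness check can be as simple as comparing each symbol's last update time with a maximum allowed age. This is a minimal sketch; `stale_symbols`, the sample symbols, and the 30-second threshold are assumptions, and a real check would also consult the market calendar.

```python
from datetime import datetime, timedelta


def stale_symbols(last_update, now, max_age):
    """Return symbols whose last update is older than max_age, sorted by name."""
    return sorted(sym for sym, ts in last_update.items() if now - ts > max_age)


now = datetime(2024, 1, 2, 15, 0, 0)
last_update = {
    "AAA": datetime(2024, 1, 2, 14, 59, 58),  # updated 2 seconds ago
    "BBB": datetime(2024, 1, 2, 14, 40, 0),   # updated 20 minutes ago
}
stale = stale_symbols(last_update, now, timedelta(seconds=30))
```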

24
Q

What is a market calendar?

A

The schedule of trading days, holidays, and session times for a venue or market.

25
Q

Why are market calendars critical?

A

Trading sessions, holidays, and settlement rules differ across markets, so data gaps and settlement dates can only be interpreted against the correct calendar.
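
A minimal sketch of a per-venue trading-day check. The venue code `XHYP` and its holiday set are hypothetical, and the Saturday/Sunday weekend is itself an assumption, since some markets trade on a different weekly schedule.

```python
from datetime import date

# Hypothetical venue and holiday list, for illustration only.
HOLIDAYS = {"XHYP": {date(2024, 12, 25)}}


def is_trading_day(venue, d):
    """True if d is a weekday (Mon-Fri assumed) and not a listed holiday."""
    return d.weekday() < 5 and d not in HOLIDAYS.get(venue, set())


christmas = is_trading_day("XHYP", date(2024, 12, 25))  # a Wednesday holiday
saturday = is_trading_day("XHYP", date(2024, 12, 21))
tuesday = is_trading_day("XHYP", date(2024, 12, 24))
```

A missing bar on a holiday is expected; the same gap on a trading day is a data quality problem.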

26
Q

What is a point-in-time dataset?

A

A dataset reconstructed as it would have been known at a specific historical moment.
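
One way to support this is to store every revision with the date it became known and answer queries "as of" a given date. The sketch below uses made-up revisions of a single reported figure; `as_of` is a hypothetical helper.

```python
import bisect
from datetime import date

# (date the value became known, value), sorted by knowledge date.
# Made-up example: a figure first reported as 1.20, later revised to 1.05.
revisions = [
    (date(2024, 2, 1), 1.20),
    (date(2024, 3, 15), 1.05),
]


def as_of(revisions, when):
    """Return the latest value known on or before `when`, or None if none."""
    known_dates = [d for d, _ in revisions]
    i = bisect.bisect_right(known_dates, when)
    return revisions[i - 1][1] if i else None


before_release = as_of(revisions, date(2024, 1, 15))
first_print = as_of(revisions, date(2024, 2, 10))
after_revision = as_of(revisions, date(2024, 3, 15))
```

A backtest dated 2024-02-10 sees 1.20, not the later 1.05, which is what keeps the revision from leaking into the past.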

27
Q

Why are point-in-time datasets important?

A

They are essential for honest backtesting and model validation.

28
Q

What is a research data platform in a hedge fund?

A

Infrastructure that supports ingestion, normalization, storage, querying, and modeling of financial and alternative data.

29
Q

What is alternative data?

A

Non-traditional datasets such as satellite, card spend, web, shipping, geolocation, or textual datasets used in research.

30
Q

What is an alpha signal?

A

A model output or indicator intended to predict excess return.

31
Q

What is model drift in finance?

A

When model performance degrades because market behavior or data relationships change.
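
A very simple way to watch for this is to compare recent prediction error with an earlier baseline. The sketch below is one of many possible monitors; `drift_ratio`, the error values, and any alert threshold are assumptions.

```python
from statistics import mean


def drift_ratio(errors, window):
    """Ratio of mean error in the last `window` points to the mean before them."""
    if window <= 0 or len(errors) <= window:
        raise ValueError("need a positive window and some baseline points")
    baseline = mean(errors[:-window])
    if baseline == 0:
        raise ValueError("baseline error is zero; ratio undefined")
    return mean(errors[-window:]) / baseline


# Made-up absolute errors: stable at first, then roughly doubling.
errors = [0.10, 0.12, 0.11, 0.09, 0.20, 0.22]
ratio = drift_ratio(errors, window=2)
```

A ratio well above 1 suggests the model is doing worse than it used to, which is a prompt to investigate rather than proof of a regime change.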

32
Q

Why is model monitoring important in hedge funds?

A

Market regimes change, data can break, and poor models can cause losses.

33
Q

What is a realistic AI use case in a hedge fund?

A

Entity resolution, document parsing, data quality anomaly detection, reconciliation break classification, or research search.

34
Q

What is entity resolution in finance?

A

Matching records that refer to the same real-world entity or instrument across inconsistent sources.
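
A minimal rule-based sketch: normalize company names before comparing them. The suffix list and the sample names are made up, and production systems usually combine identifiers, fuzzy matching, and human review rather than relying on exact matches of normalized names.

```python
import re

# Illustrative, incomplete list of legal suffixes to ignore.
SUFFIXES = {"inc", "corp", "corporation", "ltd", "plc", "co"}


def normalize(name):
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def same_entity(a, b):
    """True if both names normalize to the same non-empty string."""
    na, nb = normalize(a), normalize(b)
    return bool(na) and na == nb


match = same_entity("Acme Corp.", "ACME Corporation")
no_match = same_entity("Acme Corp.", "Acme Holdings Ltd")
```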

35
Q

What is reconciliation break classification?

A

Using rules or models to identify and categorize mismatches between systems.
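
The rules-based end of this can be sketched as a function that compares one internal record with its external counterpart. The record shape (`qty`, `price`), the category names, and the price tolerance are assumptions for illustration.

```python
def classify_break(internal, external, price_tol=0.01):
    """Categorize the mismatch, if any, between two records of the same trade."""
    if internal is None:
        return "MISSING_INTERNAL"
    if external is None:
        return "MISSING_EXTERNAL"
    if internal["qty"] != external["qty"]:
        return "QTY_MISMATCH"
    if abs(internal["price"] - external["price"]) > price_tol:
        return "PRICE_MISMATCH"
    return "MATCHED"


ours = {"qty": 100, "price": 10.00}
result_qty = classify_break(ours, {"qty": 90, "price": 10.00})
result_price = classify_break(ours, {"qty": 100, "price": 10.50})
result_missing = classify_break(ours, None)
```

Rules like these handle the obvious cases; a model is typically considered only for the residual breaks that rules cannot label.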

36
Q

What is an operational copilot in a hedge fund?

A

An AI assistant that helps engineers or operations teams diagnose incidents, navigate lineage, or query runbooks.

37
Q

What is a safe framing for AI in a hedge fund interview?

A

AI is often most valuable for data quality, operational efficiency, and research workflows rather than blindly generating trades.

38
Q

What is one strong technical sentence about finance data?

A

In finance, a large share of the engineering challenge is symbol mastering, event-time correctness, corporate actions, and cross-source reconciliation.

39
Q

What is one strong technical sentence about AI in finance?

A

The best AI use cases in hedge funds usually improve data trust, research productivity, and operational resilience rather than replacing disciplined risk processes.

40
Q

What is a strong summary of a hedge fund data platform?

A

It connects research, market data, execution, risk, PnL, and post-trade workflows with reliable, auditable, point-in-time data products.