PowerBI Flashcards

Question

🟥 CARDINALITY: PERFORMANCE FACTOR
🃏 Card 10: Cardinality Definition
Q: What is cardinality in the context of relationships?

Answer 1

A: The uniqueness of values in the related columns, which determines the relationship type:
* One-to-many
* Many-to-one
* Many-to-many
High cardinality (many distinct values in a column) → lower compression.

Answer 2

A: Because compression relies on repeated values. Unique values → larger memory footprint → slower scans.

Answer 3

A:
* Ambiguous filter paths
* Double counting
* Complex DAX
Often indicates poor model design.

Answer 4

A: It introduces an intermediate dimension that:
* Normalizes relationships
* Controls filter flow
* Restores star structure

Answer 5

A: Although normalized, it:
* Increases relationship complexity
* Requires more joins
* Reduces performance
Star schema is preferred for analytics.

Answer 6

A: They:
* Ensure uniqueness
* Support slowly changing dimensions
* Improve performance
* Avoid business logic changes

Answer 7

A: Because time intelligence requires:
* Continuous date range
* Hierarchies
* Fiscal calendars
* Custom attributes

Answer 8

A:
* Incorrect totals
* Need for complex DAX workarounds
* Slow performance
* Circular relationships
* Many-to-many everywhere

Answer 9

A:
Excel:
* Cell-by-cell calculation
* Static references
DAX:
* Columnar evaluation
* Context-dependent
* Query-based
DAX formulas answer: “What value should be returned under the current filter context?”
Not: “What is this cell equal to?”

Answer 10

A: You specify what result you want, not how to compute it.
The engine decides execution strategy using:
* Storage engine (VertiPaq)
* Formula engine

Answer 11

A: A set of column filters that restrict visible rows before aggregation.
It originates from:
* Report filters
* Slicers
* Visual axes
* DAX functions
Filter context determines which rows exist for calculation.
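A rough sketch of this idea in Python rather than DAX: the filter context is modeled as a mapping from column name to allowed values, and rows are restricted before the aggregation runs. The SALES rows, column names, and helper functions are invented for illustration and are not part of Power BI.

```python
# Hypothetical sales rows; table and column names are illustrative only.
SALES = [
    {"Region": "East", "Product": "Laptop", "Amount": 100},
    {"Region": "East", "Product": "Phone", "Amount": 50},
    {"Region": "West", "Product": "Laptop", "Amount": 70},
]


def visible_rows(rows, filter_context):
    """Keep rows whose value is allowed for every filtered column."""
    return [
        row for row in rows
        if all(row[col] in allowed for col, allowed in filter_context.items())
    ]


def total_amount(rows, filter_context):
    """Aggregate only after the filter context has restricted the rows."""
    return sum(row["Amount"] for row in visible_rows(rows, filter_context))


print(total_amount(SALES, {}))                    # no filters: 220
print(total_amount(SALES, {"Region": {"East"}}))  # slicer on Region: 150
```

The same aggregation returns different values only because the set of visible rows changes, which is the behavior the card describes.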

Answer 12

A: Row context = current row pointer during iteration.
Exists in:
* Calculated columns
* Iterators (SUMX, FILTER, etc.)
Does NOT automatically filter tables.

Answer 13

A: Because it modifies filter context.
It can:
* Add filters
* Remove filters
* Replace filters
* Trigger context transition
Nearly all advanced measures rely on CALCULATE.

Answer 14

A: Row context → converted into an equivalent filter context.
This allows row-based values to affect aggregations.
Without it: aggregations evaluated inside a row context ignore the current row.
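A minimal Python sketch of the difference, assuming two invented tables (PRODUCTS and SALES) and a filter context modeled as a mapping from column to allowed values. It only imitates the behavior and is not how the DAX engine is implemented.

```python
# Hypothetical tables; names are illustrative only.
PRODUCTS = [{"Product": "Laptop"}, {"Product": "Phone"}]
SALES = [
    {"Product": "Laptop", "Amount": 100},
    {"Product": "Phone", "Amount": 50},
    {"Product": "Laptop", "Amount": 70},
]


def sum_sales(filter_context):
    """Aggregate SALES under a filter context (column -> allowed values)."""
    return sum(
        row["Amount"] for row in SALES
        if all(row[col] in allowed for col, allowed in filter_context.items())
    )


def per_product_without_transition(filter_context):
    # Iterating PRODUCTS gives a "current row", but nothing filters SALES.
    return {p["Product"]: sum_sales(filter_context) for p in PRODUCTS}


def per_product_with_transition(filter_context):
    result = {}
    for p in PRODUCTS:
        # Context transition: the current row's values become filters.
        row_filters = {col: {value} for col, value in p.items()}
        result[p["Product"]] = sum_sales({**filter_context, **row_filters})
    return result


print(per_product_without_transition({}))  # every product shows the grand total
print(per_product_with_transition({}))     # each product shows its own total
```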

Answer 15

A: Measures evaluate only in filter context. Row context exists but is invisible to them.

Answer 16

A: Removes filters from specified columns or tables.
Creates a new filter context with those filters cleared.
Used for:
* Percent of total
* Baseline calculations
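The percent-of-total pattern can be sketched in Python, assuming an invented SALES table and a filter context modeled as a mapping from column to allowed values. The helper names are hypothetical and only mimic the idea of clearing filters before computing the denominator.

```python
# Hypothetical sales rows; names are illustrative only.
SALES = [
    {"Region": "East", "Product": "Laptop", "Amount": 100},
    {"Region": "East", "Product": "Phone", "Amount": 50},
    {"Region": "West", "Product": "Laptop", "Amount": 70},
]


def total_amount(filter_context):
    """Sum Amount over rows allowed by every column filter."""
    return sum(
        row["Amount"] for row in SALES
        if all(row[col] in allowed for col, allowed in filter_context.items())
    )


def remove_filters(filter_context, *columns):
    """Return a new context with the filters on the given columns cleared."""
    return {c: v for c, v in filter_context.items() if c not in columns}


def percent_of_total(filter_context, column):
    """Current value divided by the value with one column's filter removed."""
    baseline = total_amount(remove_filters(filter_context, column))
    return total_amount(filter_context) / baseline if baseline else None


# Laptops as a share of all products: 170 / 220
print(percent_of_total({"Product": {"Laptop"}}, "Product"))
```

Filters on other columns stay in place, so the same function gives a share within a region when a Region filter is also present.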

Answer 17

A: It expresses intent clearly: Remove filters without returning a table. Improves readability and avoids confusion.

Answer 18

A: When you want to remove all filters from a table except those on specific grouping columns.
Example: Total sales across all products, but still per region.

Answer 19

A: Because simple aggregations cannot evaluate row-level expressions before aggregation.
Iterators:
* Create row context
* Evaluate expression per row
* Aggregate results

Answer 20

A: They can invoke the formula engine row by row rather than using storage engine aggregations.

Answer 21

A: They modify filter context on the date table. Example: SAMEPERIODLASTYEAR shifts date filters backward one year.
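A simplified Python sketch of shifting a date filter back one year, assuming an invented daily sales mapping. The handling of February 29 here is a choice made for the sketch and is not a statement about how SAMEPERIODLASTYEAR treats leap days.

```python
from datetime import date


def shift_back_one_year(dates):
    """Return the same dates one year earlier (Feb 29 falls back to Feb 28)."""
    shifted = set()
    for d in dates:
        try:
            shifted.add(d.replace(year=d.year - 1))
        except ValueError:  # Feb 29 has no counterpart in the previous year
            shifted.add(d.replace(year=d.year - 1, day=28))
    return shifted


# Hypothetical daily sales keyed by date.
SALES_BY_DATE = {date(2023, 3, 1): 40, date(2024, 3, 1): 55}


def sales(date_filter):
    """Sum sales for the dates that are visible in the filter."""
    return sum(amount for d, amount in SALES_BY_DATE.items() if d in date_filter)


current_filter = {date(2024, 3, 1)}
print(sales(current_filter))                       # 55
print(sales(shift_back_one_year(current_filter)))  # 40
```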

Answer 22

A:
Measures:
* Computed at query time
* Not stored
Calculated columns:
* Stored physically
* Increase model size

Answer 23

A:
Storage Engine:
* Retrieves compressed data
* Performs scans and aggregations
Formula Engine:
* Executes DAX logic
* Handles complex operations
Performance depends on minimizing Formula Engine work.

Answer 24

A: (Advanced) Power BI Service is the cloud SaaS platform where BI content is:
* Published
* Shared
* Collaborated on
* Consumed
It enables organizational analytics distribution.
Key roles:
* Authoring workflow: Power BI Desktop → Publish → Service
* Consumption workflow: Service → Dashboards → Apps → Users
Key capabilities:
* Sharing reports
* Creating dashboards
* Managing workspaces
* Data refresh
* Security & permissions
The Service is the collaboration and distribution layer of Power BI.
👉 Desktop = Development
👉 Service = Deployment + Consumption
Publishing reports to the cloud enables sharing and collaboration.

Answer 25

A: 🔹 Dataset (Semantic Model)
The structured data model used by reports.
Contains:
* Tables
* Relationships
* Measures
* Calculations
Acts as the single source of truth.

Answer 26

Multi-page interactive document built from a dataset.
Features:
* Visualizations
* Filters
* Drillthrough
* Slicers
Reports answer specific analytical questions.

Answer 27

Single-page canvas composed of pinned visuals from reports.
Purpose:
* Executive overview
* Monitoring KPIs
* Cross-report summary
Dashboards aggregate visuals from multiple reports.

Answer 28

A: Hierarchy: Dataset → feeds → Reports → feed → Dashboards
Key principle:
* MANY reports can use ONE dataset
* ONE dashboard can use visuals from MANY reports
Dashboards = curated insight layer

Answer 29

A: A workspace is a collaboration container for BI assets.
Contains:
* Datasets
* Reports
* Dashboards
* Dataflows
Workspaces enable:
* Team collaboration
* Access control
* Version management
Personal vs Organizational:
* My Workspace: private sandbox for individual use
* App Workspace: team-shared environment
Workspaces allow teams to collaborate on reports.

Answer 30

A: Typical role hierarchy:
* Admin: full control (including permissions)
* Member: can edit content
* Contributor: can publish & update
* Viewer: read-only access
Enterprise BI depends heavily on role governance.

Answer 31

A: Apps are packaged BI content distributed to end users.
They provide:
* Controlled distribution
* Read-only experience
* Versioned releases
Apps are ideal for:
* Business users
* Executives
* Non-technical stakeholders
Apps provide a curated consumption experience.

Answer 32

A:
Workspace = Development & collaboration
App = Consumption & distribution
Workspace users build content
App users consume content
Think: Workspace = Factory, App = Finished product

Answer 33

A: Power BI Service maintains up-to-date data through refresh mechanisms:
* Scheduled Refresh: automatic updates at intervals
* Manual Refresh: triggered by user
Real-time options:
* DirectQuery
* Live Connection
These allow near real-time analytics.

Answer 34

A: A gateway enables a secure connection between:
On-premises data sources ↔ Power BI Service
Use cases:
* SQL Server in corporate network
* Local databases
* Internal APIs
A gateway is essential for enterprise deployments.

Answer 35

A: RLS restricts data visibility based on user identity.
Example:
* Manager sees all regions
* Sales rep sees only their region
Security is enforced inside the dataset.
RLS ensures users only see authorized data.
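The manager and sales rep example can be imitated in Python. In Power BI, RLS is defined as DAX filter expressions on roles in the model. The user names, the USER_REGIONS mapping, and the rows_for helper below are all hypothetical and exist only to show the effect.

```python
# Hypothetical data rows.
ROWS = [
    {"Region": "East", "Amount": 100},
    {"Region": "West", "Amount": 70},
]

# Hypothetical role mapping: None means "no region restriction".
USER_REGIONS = {
    "manager@example.com": None,
    "rep.east@example.com": {"East"},
}


def rows_for(user):
    """Return only the rows this user is authorized to see."""
    if user not in USER_REGIONS:
        return []  # unknown users see nothing
    allowed = USER_REGIONS[user]
    if allowed is None:
        return list(ROWS)
    return [row for row in ROWS if row["Region"] in allowed]


print(len(rows_for("manager@example.com")))   # 2
print(len(rows_for("rep.east@example.com")))  # 1
```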

Answer 36

A: Power BI Service supports:
* Cross-filtering
* Cross-highlighting
* Drill-down
* Drill-through
These create exploratory analytics.

Answer 37

A: Power BI Mobile allows access to dashboards anywhere. Mobile BI is crucial for decision-makers on the move.

Answer 38

A: A tenant is the organizational container for all Power BI activity.
It defines:
* Users
* Security policies
* Licensing
* Data governance rules
* Admin settings
Everything in Power BI Service exists inside a tenant.
Enterprise insight: Tenant configuration determines what users are allowed to do.

Answer 39

A: Without governance:
* Dataset sprawl
* Duplicate metrics
* Security risks
* Inconsistent reporting
Governance ensures:
* Single source of truth
* Controlled sharing
* Compliance
Enterprise BI success depends more on governance than dashboards.

Answer 40

A:
Shared Capacity:
* Resources shared among tenants
* Lower cost
* Performance variability
Premium Capacity:
* Dedicated resources
* Predictable performance
* Large model support
* Advanced features
Premium is required for enterprise-scale BI.

Answer 41

A: Not just model design, but also capacity resources:
* CPU
* Memory
* Concurrent queries
* Refresh workload
Performance = Model Quality × Capacity

Answer 42

A: Dataflows move data preparation to the cloud.
Benefits:
* Reusable transformations
* Centralized ETL
* Consistent data across reports
* Reduced duplication
Think: Power Query in the Service.

Answer 43

A:
Dataflow = Data preparation layer
Dataset = Semantic modeling layer
Separation improves scalability and governance.

Answer 44

A: Deployment pipelines enable controlled release of BI content.
Stages:
* Development: authoring and experimentation
* Test: validation and QA
* Production: business consumption
Prevents breaking reports used by executives.

Answer 45

A: They bring software engineering practices to BI:
* Version control mindset
* Safe updates
* Rollback capability
Enterprise BI requires change management.

Answer 46

A: Certification levels communicate trust:
* Promoted: recommended by authors
* Certified: approved by governance team
Encourages reuse of authoritative data.

Answer 47

A: Because of:
* Duplicate datasets
* Independent report creation
* Lack of governance
Centralized certified datasets solve this.

Answer 48

A: Large organizations separate workspaces by:
* Department
* Project
* Sensitivity level
This limits data exposure and simplifies management.

Answer 49

A: Allows restricting access to specific tables or columns within a dataset.
Used when: Different users require different data visibility.

Answer 50

A: Because shared capacity imposes size and memory limits.
Premium enables:
* Large models
* Incremental refresh
* Better concurrency

Answer 51

A: Refresh only recent partitions instead of the entire dataset.
Benefits:
* Faster refresh
* Reduced resource usage
* Enables very large historical datasets
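A small Python sketch of the partition selection idea, assuming monthly partitions keyed by "YYYY-MM" strings. The key format and the partitions_to_refresh helper are assumptions for illustration and do not reflect how Power BI names or manages partitions.

```python
def partitions_to_refresh(partition_keys, refresh_last_n):
    """Return the most recent partitions; older ones are left untouched."""
    ordered = sorted(partition_keys)  # "YYYY-MM" keys sort chronologically
    return ordered[-refresh_last_n:] if refresh_last_n > 0 else []


# Hypothetical monthly partitions of a fact table.
partitions = ["2023-11", "2024-01", "2023-12", "2024-02"]
print(partitions_to_refresh(partitions, 2))  # ['2024-01', '2024-02']
```

Only the returned partitions would be reloaded, so refresh cost depends on the recent window rather than on the full history.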

Answer 52

A: Centralized datasets (hub) feed multiple reports (spokes).
Advantages:
* Consistency
* Reusability
* Reduced maintenance
Common in mature BI environments.

Answer 53

A:
Self-service:
* Flexibility
* Speed
Governance:
* Control
* Consistency
Enterprise BI must balance both.

Answer 54

A: VertiPaq is the in-memory columnar storage engine used by Power BI (Import mode) and Analysis Services Tabular.
Why it’s fast: COLUMNAR STORAGE (not row-based):
* Stores each column separately
* Queries scan only needed columns
* Eliminates unnecessary I/O
Row store (SQL):
Row1: A B C D E
Row2: A B C D E
Column store (VertiPaq):
A: A A A A A
B: B B B B B
C: C C C C C
👉 Aggregations (SUM, COUNT, AVG) operate on one column vector

Answer 55

Example:
Product column: Laptop, Phone, Laptop, Tablet, Phone
Dictionary: 1 = Laptop, 2 = Phone, 3 = Tablet
Data stored: 1,2,1,3,2
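The example on this card can be reproduced with a short Python sketch of dictionary encoding. The dictionary_encode function is a generic illustration of the technique, with IDs starting at 1 to match the card, and is not VertiPaq's actual implementation.

```python
def dictionary_encode(values):
    """Map each distinct value to a small integer ID and encode the column."""
    dictionary = {}
    encoded = []
    for value in values:
        if value not in dictionary:
            dictionary[value] = len(dictionary) + 1  # IDs start at 1, as on the card
        encoded.append(dictionary[value])
    return dictionary, encoded


products = ["Laptop", "Phone", "Laptop", "Tablet", "Phone"]
dictionary, encoded = dictionary_encode(products)
print(dictionary)  # {'Laptop': 1, 'Phone': 2, 'Tablet': 3}
print(encoded)     # [1, 2, 1, 3, 2]
```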

Answer 56

Best for sorted columns:
111111122222333 → (1x7), (2x5), (3x3)
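A minimal Python sketch of run-length encoding that reproduces the card's example. It uses itertools.groupby from the standard library and is a generic illustration, not the engine's internal format.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


sorted_column = [1] * 7 + [2] * 5 + [3] * 3
print(run_length_encode(sorted_column))  # [(1, 7), (2, 5), (3, 3)]

# The same values unsorted produce many short runs, so little is saved.
unsorted_column = [1, 2, 1, 3, 2, 1]
print(len(run_length_encode(unsorted_column)))  # 6
```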

Answer 57

If the numeric range is small:
* Actual values: 1000–1005
* Stored as: 0–5
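A Python sketch of the offset idea on this card, assuming the stored value is simply the difference from the column minimum. The function names are hypothetical, and the real engine's value encoding may differ in detail.

```python
def value_encode(values):
    """Store each value as an offset from the column minimum."""
    base = min(values)
    return base, [value - base for value in values]


def value_decode(base, offsets):
    """Recover the original values from the base and the offsets."""
    return [base + offset for offset in offsets]


column = [1000, 1003, 1005, 1001]
base, offsets = value_encode(column)
print(base, offsets)  # 1000 [0, 3, 5, 1]

# Smaller numbers need fewer bits per value.
print(max(column).bit_length())   # 10
print(max(offsets).bit_length())  # 3
```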

Answer 58

Typical compression ratio: 10x to 100x smaller than raw data

Answer 59

A: VertiPaq optimizes for:
* Few large fact tables
* Many small dimension tables
* One-to-many relationships

Answer 60

* More joins
* More relationship traversals
* Larger filter propagation cost

Answer 61

VertiPaq can:
* Push filters efficiently
* Reduce scan space
* Use compressed dictionaries effectively

Answer 62

Bad examples:
* GUIDs
* Transaction IDs
* Timestamps (to the second)
* Text fields
Why bad? Dictionary becomes huge → memory ↑ → scan cost ↑
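A quick Python illustration of why dictionary size tracks cardinality, using two invented columns with the same number of rows. The row count and the "TXN-" ID format are assumptions made for the sketch.

```python
def dictionary_entries(values):
    """Number of distinct values, i.e. entries a dictionary would need."""
    return len(set(values))


ROW_COUNT = 10_000

# Low cardinality: three product names repeated over all rows.
product = [("Laptop", "Phone", "Tablet")[i % 3] for i in range(ROW_COUNT)]

# High cardinality: a hypothetical unique transaction ID per row.
transaction_id = [f"TXN-{i:06d}" for i in range(ROW_COUNT)]

print(dictionary_entries(product))         # 3
print(dictionary_entries(transaction_id))  # 10000
```

With one dictionary entry per row, the unique ID column gains nothing from dictionary encoding, which matches the card's point.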

Answer 63

A: 1️⃣ Storage Engine (SE)
VertiPaq engine
Handles:
* Data scans
* Aggregations
* Filtering
* Compression usage
Runs in parallel
Highly optimized C++ code
FAST 🔥

Answer 64

Handles:
* DAX evaluation
* Complex logic
* Iterators (SUMX, FILTER, etc.)
* Context transitions
Single-threaded
Not vectorized
Slower

Answer 65

The more work pushed to Storage Engine, the faster the query.

Answer 66

A: Iterators can force row-by-row evaluation in the Formula Engine.
Example: SUMX(Sales, Sales[Quantity] * Sales[Price])
Process:
* Iterate each row
* Compute expression
* Aggregate results
Instead of vectorized column aggregation.

Answer 67

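Better: Create a calculated column:
Sales[LineTotal] = Sales[Quantity] * Sales[Price]
Then: SUM(Sales[LineTotal])
Now the Storage Engine can aggregate the stored column directly, at the cost of a larger model because calculated columns are stored physically.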

(91 cards)