Unity Catalog Flashcards

(19 cards)

1
Q

What is Unity Catalog?

A

Unity Catalog is a centralized data catalog that provides access control, auditing, lineage, quality monitoring, and data discovery across Databricks workspaces.

2
Q

What are the 3 architectural layers that surround UC?

A

1. Compute (Apache Spark)
2. UC
3. Data storage

3
Q

What are the four levels of the UC object model?

A

1. Metastore
2. Catalog and non-data objects
3. Schema
4. Data objects
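
As a rough illustration of how these levels nest: within a metastore, a data object is addressed by a three-level name, catalog.schema.object. The sketch below is only a toy; the helper `qualified_name` and the names `sales`, `emea`, and `orders` are made up for this example.

```python
def qualified_name(catalog: str, schema: str, obj: str) -> str:
    """Join the three namespace levels below the metastore into one name.

    Hypothetical helper for illustration only; it does no quoting or
    validation beyond rejecting empty parts.
    """
    parts = (catalog, schema, obj)
    if not all(parts):
        raise ValueError("catalog, schema, and object name are all required")
    return ".".join(parts)


# Example with made-up names: catalog "sales", schema "emea", table "orders".
print(qualified_name("sales", "emea", "orders"))  # sales.emea.orders
```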

4
Q

What is a Metastore?

What is the best practice for using them?

A

A top-level container that registers metadata about data and the permissions that govern access to it.

Use one metastore per cloud region.

5
Q

What is a Catalog?

How are they typically used?

A

A catalog is an organizational unit for schemas.

Catalogs should mirror functional organizations (e.g., Sales, HR).

6
Q

What are 4 examples of a securable object for external data source access?

A

1. Storage credentials
2. Service credentials
3. External locations
4. Connections

7
Q

What are 4 examples of a securable object for shared access?

A

1. Clean Rooms
2. Shares
3. Recipients
4. Providers

8
Q

What is a Schema?

A

A logical container within a catalog that holds data objects.

9
Q

What are 5 examples of Objects?

A

1. UDF
2. Volume
3. Table
4. View
5. Function

10
Q

What is a Volume?

How is it used?

A

A logical object that sits under a Schema.

Volumes are used to store, organize, and access files containing structured and unstructured data.
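
Files in a volume are addressed with paths of the form /Volumes/<catalog>/<schema>/<volume>/<path>. The sketch below only builds such a path as a string; the helper `volume_path` and the catalog, schema, volume, and file names are assumptions made up for this example.

```python
import posixpath


def volume_path(catalog: str, schema: str, volume: str, *relative_parts: str) -> str:
    """Build a /Volumes/<catalog>/<schema>/<volume>/<path> string.

    Hypothetical helper for illustration; it performs no validation and
    does not touch any storage.
    """
    return posixpath.join("/Volumes", catalog, schema, volume, *relative_parts)


# Made-up names: catalog "sales", schema "emea", volume "raw_files".
print(volume_path("sales", "emea", "raw_files", "2024", "orders.csv"))
# /Volumes/sales/emea/raw_files/2024/orders.csv
```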

11
Q

What are the 2 key features of a Managed volume?

A

1. UC manages both the governance and the underlying files, so no external location or storage credential is needed.

2. Automatically stored in the default managed storage location of its schema.

12
Q

What are the 2 key features of an external volume?

What is its use case?

A

1. Placed in a pre-existing cloud storage location

2. Allows read/write without cloud-specific privileges

Use case: onboarding existing data lakes and governing non-Delta data.

13
Q

What are the 3 roles that have default admin access to Unity Catalog?

A

1. Account Admins
2. Workspace Admins
3. Metastore Admins

14
Q

What is a Managed Table?

A

UC manages both the governance and the underlying data files.
When the table is dropped, the underlying data is deleted as well.

15
Q

What is an External Table?

A

A table whose data lives in a specified storage location; UC governs access, but the data persists after the table is dropped.
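
In SQL, the visible difference between the two table types is the LOCATION clause: a CREATE TABLE statement without it yields a managed table, while one with it yields an external table. The sketch below only assembles the statement text and runs nothing; the helper `create_table_ddl`, the table name, and the abfss:// URL are placeholders assumed for illustration.

```python
from typing import Optional


def create_table_ddl(name: str, columns: str, location: Optional[str] = None) -> str:
    """Return CREATE TABLE text; adding a LOCATION makes the table external.

    Hypothetical helper for illustration only. It does no escaping, so it
    is not suitable for untrusted input.
    """
    ddl = f"CREATE TABLE {name} ({columns})"
    if location is not None:
        # External table: data files stay at this path after DROP TABLE.
        ddl += f" LOCATION '{location}'"
    return ddl


# Managed table: UC chooses and manages the storage path.
print(create_table_ddl("sales.emea.orders", "id INT, amount DOUBLE"))

# External table: the path below is a made-up placeholder.
print(create_table_ddl(
    "sales.emea.orders_ext",
    "id INT, amount DOUBLE",
    location="abfss://container@account.dfs.core.windows.net/orders",
))
```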

16
Q

How is a Connection created?

How is it used?

A

1. Create a new access connector in the Azure Portal
2. Assign the Storage Blob Data Contributor role to the access connector in the storage account's IAM settings
3. Create a storage credential in UC
4. Create an external location and link it to the storage path and the storage credential

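
Step 4 can be expressed in SQL as a CREATE EXTERNAL LOCATION statement that names the storage URL and the storage credential. The sketch below only assembles that statement as text and runs nothing; the helper `external_location_ddl` and the location name, URL, and credential name are placeholders assumed for illustration.

```python
def external_location_ddl(name: str, url: str, credential: str) -> str:
    """Return CREATE EXTERNAL LOCATION text linking a path to a credential.

    Hypothetical helper for illustration only. It does no escaping, so it
    is not suitable for untrusted input.
    """
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS {name} "
        f"URL '{url}' "
        f"WITH (STORAGE CREDENTIAL {credential})"
    )


# All three values below are made-up placeholders.
print(external_location_ddl(
    "landing_zone",
    "abfss://container@account.dfs.core.windows.net/landing",
    "my_storage_credential",
))
```
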
17
Q

What happens when a batch of inserted records violates one or more table constraints? What happens if this situation occurs in a stream?

A

The whole batch fails and no records are written. In a stream, the job fails in the same way.
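
The all-or-nothing behavior can be pictured with a toy model in plain Python. This is not how the engine is implemented; the list standing in for a table, the `insert_batch` function, and the `amount >= 0` rule are all assumptions made up for this sketch.

```python
from typing import Callable, Dict, List

Record = Dict[str, int]


class ConstraintViolation(Exception):
    """Raised when any record in a batch fails the check."""


def insert_batch(table: List[Record], batch: List[Record],
                 check: Callable[[Record], bool]) -> None:
    """Append the batch only if every record passes the check."""
    for record in batch:
        if not check(record):
            # Fail before anything is written, so the table is unchanged.
            raise ConstraintViolation(f"record violates constraint: {record}")
    table.extend(batch)


table: List[Record] = [{"id": 1, "amount": 10}]
bad_batch = [{"id": 2, "amount": 5}, {"id": 3, "amount": -1}]

try:
    insert_batch(table, bad_batch, lambda r: r["amount"] >= 0)
except ConstraintViolation as err:
    print("batch rejected:", err)

print(len(table))  # 1: the valid record with id 2 was not written either
```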

18
Q

Can a constraint be added to a table that contains records that violate it?