S7 - Operations & Reliability Flashcards

(20 cards)

1
Q

SRE

A

Site Reliability Engineering: applying software engineering principles to operations work to keep services reliable and performant.

2
Q

SLI, SLO, SLA

A

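SLI = measured indicator of service health (e.g., request success rate); SLO = target value for an SLI; SLA = contract with customers that defines consequences if the agreed service level is missed.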

3
Q

error budget

A

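The amount of unreliability the SLO permits (100% minus the SLO target) over a measurement window; once it is spent, the SLO is missed. SLAs are usually set looser than SLOs, so the budget runs out before the SLA is breached.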
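A minimal sketch of the error budget arithmetic for an availability SLO. The function names and the 30-day window are illustrative assumptions, not part of any Google Cloud API.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime (in minutes) for an availability SLO over the window."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, window_days: int, downtime_minutes: float) -> float:
    """Minutes of budget left; a negative value means the SLO has been missed."""
    return error_budget_minutes(slo_target, window_days) - downtime_minutes


# A 99.9% SLO over 30 days allows about 43.2 minutes of downtime.
print(round(error_budget_minutes(0.999, 30), 1))
# After 50 minutes of downtime the budget is overspent (negative).
print(round(budget_remaining(0.999, 30, 50.0), 1))
```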

4
Q

Cloud Operations suite

A

Cloud Monitoring, Cloud Logging, Cloud Trace, Cloud Profiler, Error Reporting. (Cloud Debugger was also part of the suite but has since been shut down.)

5
Q

export logs for long-term compliance

A

Log sinks (aggregated sinks at the folder or organization level to cover all child projects) exporting to Cloud Storage or BigQuery.

6
Q

backup and DR

A

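Backup = a copy of data that can be restored; DR = the plan and infrastructure for restoring the whole service after a major outage (a standby environment, from cold to hot, is one option).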

7
Q

RTO and RPO

A

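RTO (Recovery Time Objective) = maximum acceptable time to restore service after an outage; RPO (Recovery Point Objective) = maximum acceptable window of data loss, measured back from the moment of failure.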
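A small worked example of checking an incident against RTO and RPO targets. The function name, the timestamps, and the 4-hour and 1-hour targets are made up for illustration.

```python
from datetime import datetime, timedelta


def recovery_report(last_backup: datetime, outage_start: datetime,
                    service_restored: datetime,
                    rpo: timedelta, rto: timedelta) -> dict:
    """Compare actual data loss and downtime with the RPO and RTO targets."""
    data_loss = outage_start - last_backup      # work done since the last backup is lost
    downtime = service_restored - outage_start  # how long the service was unavailable
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= rpo,
        "rto_met": downtime <= rto,
    }


report = recovery_report(
    last_backup=datetime(2024, 1, 1, 2, 0),
    outage_start=datetime(2024, 1, 1, 5, 30),
    service_restored=datetime(2024, 1, 1, 6, 15),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=1),
)
print(report["rpo_met"], report["rto_met"])
```

Here 3.5 hours of data is lost against a 4-hour RPO and the service is down for 45 minutes against a 1-hour RTO, so both targets are met. Taking backups more often tightens the achievable RPO; faster restore or failover tightens the RTO.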

8
Q

HA for VMs

A

Use Regional Managed Instance Groups with load balancing.

9
Q

HA for Cloud SQL

A

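Enable the high availability (regional) configuration: a primary and a standby instance in different zones of the same region, with automatic failover.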

10
Q

alerts trigger automation

A

Cloud Monitoring alert → Pub/Sub → Cloud Function → remediation.
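A rough sketch of the function stage of that pipeline, runnable locally without any cloud libraries. It assumes the Pub/Sub-triggered function receives an event whose `data` field is base64-encoded JSON, and that the alert payload carries an `incident` object with `state`, `policy_name`, and `resource_name` fields; verify these names against the current notification schema. `RUNBOOK` and the action names are hypothetical.

```python
import base64
import json

# Hypothetical mapping from alerting policy name to a remediation action.
RUNBOOK = {"High CPU": "resize_instance_group"}


def parse_alert(event: dict) -> dict:
    """Decode the Pub/Sub message body and pull out the incident fields."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    incident = payload.get("incident", {})
    return {
        "policy": incident.get("policy_name", "unknown"),
        "state": incident.get("state", "unknown"),
        "resource": incident.get("resource_name", "unknown"),
    }


def choose_action(alert: dict) -> str:
    """Act only on open incidents; fall back to paging a human."""
    if alert["state"] != "open":
        return "noop"
    return RUNBOOK.get(alert["policy"], "page_oncall")


def handle_alert(event: dict, context=None) -> str:
    """Entry point in the style of a Pub/Sub-triggered function."""
    alert = parse_alert(event)
    action = choose_action(alert)
    print(f"policy={alert['policy']} state={alert['state']} action={action}")
    return action


# Local simulation of an incoming alert message.
sample = {"incident": {"state": "open", "policy_name": "High CPU",
                       "resource_name": "web-mig"}}
event = {"data": base64.b64encode(json.dumps(sample).encode("utf-8"))}
handle_alert(event)
```

Keeping the decision logic (`choose_action`) separate from the decoding makes the remediation rules testable without deploying anything.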

11
Q

reduce toil

A

Automate repetitive ops tasks to increase reliability.

12
Q

visualize network performance issues

A

Network Intelligence Center or Monitoring dashboards.

13
Q

Blue-Green deployment

A

Run two identical environments (blue = current, green = new); switch traffic to green only after verifying the new version, and keep blue available for fast rollback.
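A toy model of the blue-green switch, assuming a single router that sends all traffic to one environment at a time. The class and method names are illustrative only.

```python
class BlueGreenRouter:
    """Tracks which of two environments is live and which is idle."""

    def __init__(self, live: str = "blue"):
        self.live = live
        self.versions = {"blue": None, "green": None}

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    def deploy_to_idle(self, version: str) -> None:
        """Install the new version where no user traffic is flowing."""
        self.versions[self.idle] = version

    def cut_over(self, verify) -> bool:
        """Switch traffic only if the idle environment passes verification."""
        candidate = self.idle
        if not verify(self.versions[candidate]):
            return False
        self.live = candidate
        return True

    def rollback(self) -> None:
        """Point traffic back at the previous environment, which is still running."""
        self.live = self.idle


router = BlueGreenRouter()
router.versions["blue"] = "v1"
router.deploy_to_idle("v2")
switched = router.cut_over(lambda version: version == "v2")
print(router.live, switched)
```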

14
Q

rolling deployment

A

Replace instances in small batches so that enough capacity stays in service throughout and there is no downtime.
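A simplified simulation of a rolling update, assuming instances are updated in fixed-size batches and a failed health check reverts the current batch and stops the rollout. The function, its parameters, and the instance names are hypothetical and are not how a managed instance group is actually driven.

```python
from typing import Callable, Dict, Tuple


def rolling_update(fleet: Dict[str, str], new_version: str, batch_size: int = 1,
                   is_healthy: Callable[[str], bool] = lambda name: True
                   ) -> Tuple[Dict[str, str], int, bool]:
    """Return (final fleet state, minimum instances serving, success flag)."""
    state = dict(fleet)
    names = list(state)
    min_serving = len(names)
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        # Instances in the batch are out of service while they are replaced.
        min_serving = min(min_serving, len(names) - len(batch))
        previous = {name: state[name] for name in batch}
        for name in batch:
            state[name] = new_version
        if not all(is_healthy(name) for name in batch):
            state.update(previous)  # revert this batch and halt the rollout
            return state, min_serving, False
    return state, min_serving, True


fleet = {"vm-1": "v1", "vm-2": "v1", "vm-3": "v1", "vm-4": "v1"}
final, min_serving, ok = rolling_update(fleet, "v2", batch_size=1)
print(final, min_serving, ok)
```

With four instances and a batch size of one, at least three instances keep serving at every step, which is what makes the update zero-downtime as long as three instances can carry the load.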

15
Q

Infrastructure as Code

A

Defining infrastructure declaratively in version-controlled files so that environments are repeatable (Deployment Manager/Terraform).

16
Q

service analyzes CPU/memory performance

A

Cloud Profiler.

17
Q

service traces latency across distributed apps

A

Cloud Trace.

18
Q

Detect suspicious IAM activity

A

Event Threat Detection, a service of Security Command Center (SCC).

19
Q

Prevent overspend while maintaining uptime

A

Budget alerts + sustained/committed use discounts.

20
Q

Self-healing compute layer

A

Managed Instance Groups with health checks.