Chapter 4 Flashcards

Designing a Data Processing Solution (14 cards)

1
Q

Which four compute services are available in GCP?

A
  • Compute Engine
  • Kubernetes Engine
  • App Engine
  • Cloud Functions
2
Q

Explain Compute Engine

A

Provides scalable virtual machines (VMs) on Google’s infrastructure. It offers deep customization, full control over OS, networking, and storage, ideal for traditional applications, batch processing, and workloads requiring consistent utilization or custom OS kernels.

  • Runs VMs
  • Highly configurable
  • Maximum control
  • Scales with managed instance groups (MIGs)
  • Good for lift and shift
3
Q

Explain Kubernetes Engine

A

A managed Kubernetes service for deploying, managing, and scaling containerized applications. Ideal for microservices architectures, hybrid or multi-cloud deployments, and Kubernetes orchestration.

  • Runs containers
  • Container orchestration
  • Managed service
  • Multiple environments
4
Q

Explain App Engine

A

A fully managed platform-as-a-service (PaaS) for building scalable web applications without infrastructure management. It automatically handles scaling and is suitable for stateless apps, rapid development, and API/backend services.

  • Focus on app code
  • Language runtimes
  • Containers (flexible environment)
  • Serverless
5
Q

Explain Cloud Functions

A

Serverless, managed compute for event-driven applications. Users write small pieces of code that run in response to events in the cloud, without managing servers.

Supported event sources include Cloud Pub/Sub, Cloud Storage, HTTP requests, Firebase, and Stackdriver Logging (now Cloud Logging).

  • Focus on function code
  • Event driven
  • Serverless
6
Q

How does the hub-and-spoke message broker pattern work?

A

The hub-and-spoke message broker pattern is an architectural style where a central hub acts as the message broker, and connected applications or systems act as spokes. The hub is responsible for receiving messages from the various spokes, performing message routing, transformation, and protocol mediation, then distributing the messages to the appropriate destination spokes. This central hub decouples the sender and receiver applications, allowing them to communicate without awareness of each other’s details.
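The routing and decoupling described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real broker: the `Hub` class, the topic name, and the envelope format are all hypothetical names chosen for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class Hub:
    """Central broker: spokes publish to it and subscribe through it."""

    def __init__(self) -> None:
        # Maps a topic name to the handlers of the spokes subscribed to it.
        self._routes: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._routes[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        # Transformation step: wrap the payload in a common envelope.
        envelope = {"topic": topic, "body": message}
        # Routing step: deliver to every spoke subscribed to the topic.
        handlers = self._routes.get(topic, [])
        for handler in handlers:
            handler(envelope)
        return len(handlers)


hub = Hub()
billing_inbox: List[dict] = []  # hypothetical receiving spoke
audit_inbox: List[dict] = []    # hypothetical receiving spoke
hub.subscribe("orders", billing_inbox.append)
hub.subscribe("orders", audit_inbox.append)

# The sending spoke knows only the hub and the topic, not the receivers.
delivered = hub.publish("orders", {"order_id": 42})
```

The sender never references the billing or audit spokes directly, which is the decoupling the pattern is meant to provide.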

7
Q

A startup is designing a data processing pipeline for its IoT platform. Data from sensors
will stream into a pipeline running in GCP. As soon as data arrives, a validation process,
written in Python, is run to verify data integrity. If the data passes the validation, it
is ingested; otherwise, it is discarded. What services would you use to implement the
validation check and ingestion?

A

Cloud Pub/Sub and Cloud Functions. IoT sensors can write data to a Cloud Pub/Sub
topic. When a message is written, it can trigger a Cloud Function that runs the
associated code. The Cloud Function can execute the Python validation check; if the
data passes, it is ingested, and if the check fails, the message is removed from the
queue.
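A minimal sketch of such a validation function is shown below, assuming the background-function style in which a Pub/Sub-triggered function receives an `event` dict whose `data` field is base64-encoded. The required fields, the `ingest` placeholder, and the function names are hypothetical; a real pipeline would define its own schema and ingestion target.

```python
import base64
import json

# Hypothetical schema: the fields a sensor reading must contain.
REQUIRED_FIELDS = {"sensor_id", "timestamp", "value"}


def is_valid(reading: dict) -> bool:
    """Check that all required fields exist and the value is numeric."""
    return REQUIRED_FIELDS.issubset(reading) and isinstance(
        reading["value"], (int, float)
    )


def ingest(reading: dict) -> None:
    """Hypothetical placeholder for the real ingestion step."""
    print(f"ingested reading from {reading['sensor_id']}")


def validate_and_ingest(event: dict, context=None) -> bool:
    """Entry point: decode the message, validate it, ingest or discard."""
    try:
        reading = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    except (KeyError, ValueError):
        return False  # malformed message: discard
    if not isinstance(reading, dict) or not is_valid(reading):
        return False  # failed validation: discard
    ingest(reading)
    return True
```

In this sketch the function returns normally for invalid data rather than raising, on the assumption that a normal return lets the message be acknowledged and dropped instead of retried.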

8
Q

Your finance department is migrating a third-party application from an on-premises
physical server. The system was written in C, but only the executable binary is available.
After the migration, data will be extracted from the application database, transformed, and
stored in a BigQuery data warehouse. The application is no longer actively supported by
the original developer, and it must run on an Ubuntu 14.04 operating system that has been
configured with several required packages. Which compute platform would you use?

A

Compute Engine. This scenario calls for full control over the choice of operating
system, and because the application is moving from a physical server, it is not
containerized. Compute Engine can run the application in a VM configured with
Ubuntu 14.04 and the additional packages.

9
Q

A team of developers has been tasked with rewriting the ETL process that populates an
enterprise data warehouse. They plan to use a microservices architecture. Each microservice
will run in its own Docker container. The amount of data processed during a run can
vary, but the ETL process must always finish within one hour of starting. You want to
minimize the amount of DevOps tasks the team needs to perform, but you do not want to
sacrifice efficient utilization of compute resources. What GCP compute service would you
recommend?

A

Kubernetes Engine, because the application will be designed
using containerized microservices that should be run in a way that minimizes DevOps
overhead.

10
Q

Your consulting company is contracted to help an enterprise customer negotiate a contract
with a SaaS provider. Your client wants to ensure that they will have access to the SaaS
service and it will be functioning correctly with only minimal downtime. What metric
would you use when negotiating with the SaaS provider to ensure that your client’s
reliability requirements are met?

A

Mean time between failures (MTBF) is used for measuring reliability.
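As a rough illustration, mean time between failures is commonly computed as total operating time divided by the number of failures in that period. The function name and the figures below are hypothetical, chosen only to show the arithmetic.

```python
def mean_time_between_failures(operating_hours: float, failure_count: int) -> float:
    """Return MTBF in hours: total operating time divided by failure count."""
    if failure_count <= 0:
        raise ValueError("MTBF is undefined when no failures were observed")
    return operating_hours / failure_count


# Hypothetical figures: one year of operation (8,760 hours) with 4 failures.
mtbf = mean_time_between_failures(operating_hours=8760, failure_count=4)
print(mtbf)  # 2190.0
```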

11
Q

To ensure high availability of a mission-critical application, your team has determined that
it needs to run the application in multiple regions. If the application becomes unavailable
in one region, traffic from that region should be routed to another region. Since you are
designing a solution for this set of requirements, what would you expect to include?

A

A global load balancer, which can route traffic from an unavailable region to a region where the application is still running.

12
Q

To ensure high availability of a mission-critical application, your team has determined that
it needs to run the application in multiple regions. If the application becomes unavailable
in one region, traffic from that region should be routed to another region. Since you are
designing a solution for this set of requirements, what would you expect to include?
A. Cloud Storage bucket
B. Cloud Pub/Sub topic
C. Global load balancer
D. HA VPN

A

C. A global load balancer is needed to distribute the workload
across multiple regions.

13
Q

What is a global load balancer?

A

A global load balancer in Google Cloud Platform (GCP) is a load balancing service that distributes incoming network traffic across backend instances located in multiple regions worldwide.
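The cross-region routing idea can be modeled with a toy function. This is only a sketch of the failover concept and not how Google's load balancer is implemented; the `pick_region` function, the preference table, and the health map are hypothetical, and the region names are used purely as labels.

```python
from typing import Dict, List, Optional


def pick_region(
    client_region: str,
    preference: Dict[str, List[str]],
    healthy: Dict[str, bool],
) -> Optional[str]:
    """Return the first healthy backend region in the client's preference order."""
    for region in preference.get(client_region, []):
        if healthy.get(region, False):
            return region
    return None  # no healthy backend available


# Hypothetical setup: each client region prefers itself, then the other region.
preference = {
    "europe-west1": ["europe-west1", "us-central1"],
    "us-central1": ["us-central1", "europe-west1"],
}
healthy = {"europe-west1": False, "us-central1": True}

# Traffic from the unavailable region is sent to the healthy one.
chosen = pick_region("europe-west1", preference, healthy)
```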

14
Q

What is SOAP?

A

SOAP (Simple Object Access Protocol) is a messaging protocol used for exchanging structured information in decentralized, distributed application environments, primarily through web services. It facilitates communication between applications over a network by encapsulating messages in XML format, allowing different systems to interact regardless of their underlying platforms or languages.
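To make the XML encapsulation concrete, the sketch below builds and parses a small SOAP 1.1 envelope with Python's standard library. The envelope namespace is the standard SOAP 1.1 one; the `GetPrice` operation, the `Symbol` element, and the `http://example.com/stock` namespace are hypothetical and stand in for whatever a real service would define.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/stock"  # hypothetical application namespace


def build_envelope(symbol: str) -> bytes:
    """Wrap a hypothetical GetPrice request in a SOAP Envelope and Body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}GetPrice")
    ET.SubElement(request, f"{{{APP_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")


def read_symbol(xml_bytes: bytes) -> str:
    """Extract the Symbol value from a GetPrice request envelope."""
    root = ET.fromstring(xml_bytes)
    node = root.find(f"{{{SOAP_NS}}}Body/{{{APP_NS}}}GetPrice/{{{APP_NS}}}Symbol")
    if node is None or node.text is None:
        raise ValueError("envelope does not contain a GetPrice/Symbol element")
    return node.text


message = build_envelope("GOOG")
symbol = read_symbol(message)
```

Because the message is plain XML, the receiver can be written in any language with an XML parser, which is the platform independence the definition refers to.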
