What is the definition of “Cloud”?
a) Place in the cloud where data is stored.
b) Hardware component inside the computer.
c) Type of malware.
d) Ability to store data and access programs over the internet rather than on local devices.
D
Ability to store data and access programs over the internet rather than on
local devices.
What type of data does BigQuery store?
a) Flat files.
b) Machine learning models.
c) Application logs.
d) Structured tabular data.
D
Structured tabular data.
Which tool helps orchestrate data pipelines on Google Cloud?
a) Cloud Composer
b) Data Catalog
c) Looker
d) Firestore
A
Cloud Composer
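Cloud Composer is Google Cloud's managed Apache Airflow service, where a pipeline is declared as a directed acyclic graph (DAG) of tasks. The sketch below illustrates only the dependency-ordering idea, using Python's standard library rather than the Airflow API; the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "validate": {"extract"},
    "transform": {"extract"},
    "load": {"validate", "transform"},
    "report": {"load"},
}

def run_order(dag):
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())

order = run_order(pipeline)
print(order)
```

An orchestrator does more than ordering (scheduling, retries, monitoring), but every run still has to respect an order like this one.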
What type of cloud service allows users to have a complete development
environment to deploy and manage applications without worrying about the
underlying infrastructure?
a) IaaS (Infrastructure as a Service).
b) PaaS (Platform as a Service).
c) SaaS (Software as a Service).
d) DaaS (Data as a Service).
B
PaaS (Platform as a Service).
What is the main library for model training in Google Cloud?
a) TensorFlow
b) Pandas
c) Scikit-learn
d) Numpy
A
TensorFlow
What is the main goal of data engineering on Google Cloud?
a) Building mobile apps
b) Training machine learning models
c) Creating UI components
d) Designing and managing scalable data processing systems
D
Designing and managing scalable data processing systems
Which Google Cloud service allows you to create analytical dashboards?
a) Looker Studio.
b) BigQuery.
c) Cloud Functions.
d) AI Platform.
A
Looker Studio.
Which service is used to implement data pipelines in Google Cloud?
a) Cloud Functions.
b) BigQuery ML.
c) Cloud Dataflow.
d) AI Platform Training.
C
Cloud Dataflow.
What does Cloud Pub/Sub do?
a) Processes batch data
b) Provides SQL-like querying
c) Ingests and delivers event streams
d) Hosts relational databases
C
Ingests and delivers event streams
An organization has servers running critical workloads in its facilities around
the world. It wants to be able to manage these workloads in a uniform,
centralized manner, with basic infrastructure management.
What should the organization do?
a) Migrate the workloads to a central office building.
b) Migrate the workloads to multiple joint local facilities.
c) Migrate the workloads to a public cloud.
d) Migrate the workloads to multiple local private clouds.
C
Migrate the workloads to a public cloud.
What does the term “multitenancy” mean in cloud computing?
a) The installation of multiple operating systems on a single server.
b) The creation of multiple backup copies of a file.
c) The ability to access the cloud from multiple devices at the same time.
d) The ability of a cloud service to serve multiple users or clients independently within a single infrastructure.
D
The ability of a cloud service to serve multiple users or clients independently within a single infrastructure.
What are the cloud categories according to the role and control exercised by
user and provider?
a) Private, Public and Hybrid.
b) IaaS - PaaS - SaaS.
c) Compute Engine, App Engine and Kubernetes Engine.
d) Agile, DevOps and Containers.
A
Private, Public and Hybrid.
In the context of IaaS (Infrastructure as a Service), what responsibility
typically falls on the customer?
a) Physical hardware maintenance.
b) Network and virtual server management.
c) Configuration and management of the operating system and applications.
d) Network and storage infrastructure.
C
Configuration and management of the operating system and applications.
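The division of responsibility across service models can be sketched as a simple lookup. The layer names and the exact boundaries below are a common simplification assumed for illustration; each provider's documentation defines the real split.

```python
# Simplified stack, from the physical layer up. Layer names are illustrative.
LAYERS = [
    "hardware", "network", "virtualization",
    "operating_system", "runtime", "application", "data",
]

# Lowest layer the customer manages in each model (assumed simplification).
FIRST_CUSTOMER_LAYER = {
    "IaaS": "operating_system",
    "PaaS": "application",
    "SaaS": "data",
}

def customer_layers(model):
    """Return the layers the customer is responsible for under a given model."""
    start = LAYERS.index(FIRST_CUSTOMER_LAYER[model])
    return LAYERS[start:]

print(customer_layers("IaaS"))
```

Under this simplification, moving from IaaS to PaaS to SaaS shifts more of the stack to the provider.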
What format is recommended for storing large datasets in Google Cloud?
a) Parquet
b) JSON
c) CSV
d) TXT
A
Parquet
Which Google Cloud resource is included in the IaaS (Infrastructure as a
Service) model?
a) Google Cloud Functions.
b) Google Compute Engine.
c) Google Dataflow.
d) Google App Engine.
B
Google Compute Engine.
Which storage class in Cloud Storage is best for long-term, infrequently
accessed data?
a) Nearline
b) Standard
c) Archive
d) Coldline
C
Archive
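Nearline: accessed less than once a month (30-day minimum storage)
Coldline: accessed less than once every 3 months (90-day minimum storage)
Archive: accessed less than once a year (365-day minimum storage)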
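The thresholds in the notes above can be turned into a small decision helper. This is only a sketch: the function name and the idea of choosing purely from the expected number of days between accesses are assumptions, and a real decision would also weigh retrieval and early-deletion costs.

```python
def suggest_storage_class(days_between_accesses):
    """Suggest a Cloud Storage class from the expected access interval in days."""
    if days_between_accesses >= 365:
        return "ARCHIVE"
    if days_between_accesses >= 90:
        return "COLDLINE"
    if days_between_accesses >= 30:
        return "NEARLINE"
    return "STANDARD"

for days in (1, 45, 120, 400):
    print(days, suggest_storage_class(days))
```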
Which service is best for running Apache Spark or Hadoop jobs on Google
Cloud?
a) Dataproc
b) Cloud SQL
c) BigQuery
d) Cloud Run
A
Dataproc
What is Google Cloud Operations Suite (Google Cloud Observability)?
a) A massive data processing platform.
b) A cloud application monitoring and diagnostics service.
c) A real-time messaging system.
d) A relational database service.
B
A cloud application monitoring and diagnostics service.
Which service allows scaling deep learning models in Google Cloud?
a) AI Platform Training.
b) BigQuery.
c) Vertex AI.
d) Cloud Run.
C
Vertex AI.
Which of the following is a benefit of using serverless technologies in data
engineering?
a) More manual control.
b) Lower developer productivity.
c) Automatic scaling and reduced ops overhead.
d) Increased hardware costs.
C
Automatic scaling and reduced ops overhead.
What is digital transformation?
a) The process of converting physical files to digital.
b) A type of photo editing software.
c) The integration of digital technologies to improve processes, increase
efficiency and offer new value propositions to customers.
d) The creation of mobile applications for all companies.
C
The integration of digital technologies to improve processes, increase
efficiency and offer new value propositions to customers.
In one organization, updates to virtual machine-based applications take a
long time to complete due to operating system boot times.
What should the organization do to speed up its application upgrades?
a) Migrate the virtual machines to the cloud and add more resources to them.
b) Increase virtual machine resources.
c) Automate application update deployments.
d) Convert the applications in the virtual machines to container-based
applications.
D
Convert the applications in the virtual machines to container-based
applications.
You are deploying 10,000 new Internet of Things (IoT) devices to collect temperature data in your warehouses globally. You need to process, store, and analyze these very large datasets in real time.
What should you do?
a) Send the data to Google Cloud Pub/Sub, stream Cloud Pub/Sub to Google Cloud Dataflow, and store the data in Google BigQuery.
b) Send the data to Google Cloud Datastore and then export to BigQuery.
c) Send the data to Cloud Storage and then spin up an Apache Hadoop cluster as needed in Google Cloud Dataproc whenever analysis is required.
d) Export logs in batch to Google Cloud Storage and then spin up a Google Cloud SQL instance, import the data from Cloud Storage, and run an analysis as needed.
A
Send the data to Google Cloud Pub/Sub, stream Cloud Pub/Sub to Google
Cloud Dataflow, and store the data in Google BigQuery.
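The pattern in the correct option is ingest, stream-process, then store for analysis. The sketch below imitates those three stages in plain Python, without any Google Cloud client library: a queue stands in for Pub/Sub, a fixed-window average stands in for a Dataflow transform, and a list of rows stands in for a BigQuery table. All names and the 60-second window are hypothetical.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60  # assumed window size for this illustration

def publish(topic, device_id, timestamp, temperature):
    """Stand-in for publishing a message to a topic."""
    topic.append({"device_id": device_id, "ts": timestamp, "temp": temperature})

def process(topic):
    """Drain the queue and average temperature per device per fixed window."""
    buckets = defaultdict(list)
    while topic:
        msg = topic.popleft()
        window_start = msg["ts"] - msg["ts"] % WINDOW_SECONDS
        buckets[(msg["device_id"], window_start)].append(msg["temp"])
    return [
        {"device_id": dev, "window_start": win, "avg_temp": sum(vals) / len(vals)}
        for (dev, win), vals in sorted(buckets.items())
    ]

topic = deque()
publish(topic, "sensor-1", 5, 20.0)
publish(topic, "sensor-1", 30, 22.0)
publish(topic, "sensor-1", 65, 25.0)
publish(topic, "sensor-2", 10, 18.0)

table = []                    # stand-in for the analytics table
table.extend(process(topic))  # one aggregated row per device per window
for row in table:
    print(row)
```

In the managed services the stages run continuously and scale independently; here they run once, in sequence, only to show how data flows from ingestion to storage.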
An organization is developing an application that will capture a large amount of data from millions of sensors distributed around the world. The organization needs a database that is suitable for high-speed storage of unstructured data. Which Google Cloud product should this organization choose?
a) Cloud Firestore
b) Cloud Bigtable
c) Cloud Data Fusion
d) Cloud SQL
B
Cloud Bigtable
Why is option B correct? ✅
Cloud Bigtable is a fully managed, wide-column NoSQL database designed for workloads at massive scale. It fits this scenario for the following reasons:
Massive scalability: It is designed from the ground up to handle petabytes of data. It is the same technology Google uses internally for services such as Gmail, Google Maps, and Search. That makes it a good match for the "large amount of data" in the question.
High performance for IoT: A core use case is ingesting time-series data, such as readings from millions of sensors (IoT) or financial data. It can handle millions of reads and writes per second at very low latency, which meets the "high-speed" requirement. 🚀
Suited to unstructured or semi-structured data: Its wide-column model is flexible. It does not require a strict schema, so it works well for sensor data, where different devices may send different kinds of measurements.
Analogy: Think of Cloud Bigtable as a gigantic spreadsheet with practically unlimited width. Each row has a unique key (for example, sensor_id#timestamp), and you can add as many columns as you need to that specific row without affecting the other rows.
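The spreadsheet analogy can be modeled as a map from row key to a sparse set of columns, with rows kept sorted by key so that all readings for one sensor sit next to each other. The sketch below is plain Python, not the Bigtable client API; the row-key format follows the sensor_id#timestamp example, and the class, method, and column names are hypothetical.

```python
from bisect import bisect_left, insort

class WideColumnTable:
    """Toy wide-column table: sorted row keys, sparse columns per row."""

    def __init__(self):
        self._keys = []   # row keys kept in sorted order
        self._rows = {}   # row key -> {column name: value}

    def write(self, row_key, **columns):
        """Add or update columns on one row; other rows are untouched."""
        if row_key not in self._rows:
            insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key].update(columns)

    def scan_prefix(self, prefix):
        """Yield (row_key, columns) for every row whose key starts with prefix."""
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            key = self._keys[i]
            yield key, self._rows[key]
            i += 1

table = WideColumnTable()
table.write("sensor-7#1700000000", temperature=21.5)
table.write("sensor-7#1700000060", temperature=21.7, humidity=40)
table.write("sensor-9#1700000000", pressure=1013)

rows = list(table.scan_prefix("sensor-7#"))
for key, columns in rows:
    print(key, columns)
```

Because the sensor ID comes first in the key, a prefix scan returns one sensor's readings in time order without touching other sensors' rows.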
Why are the other options incorrect? ❌
a) Cloud Firestore
Cloud Firestore is a NoSQL document database, but it is optimized mainly for web and mobile application development. Its strengths are real-time data synchronization across many clients and offline support, not massive ingestion of time-series data for large-scale analytics.
c) Cloud Data Fusion
This is incorrect because Cloud Data Fusion is not a database. It is a data integration (ETL/ELT) tool used to build and manage pipelines that move and transform data from a source to a destination (such as a database); it does not store the data itself.
d) Cloud SQL
Cloud SQL is a relational database service (MySQL, PostgreSQL, SQL Server). It is not the right choice for two main reasons:
Requires a structured schema: Relational databases need a predefined, rigid table structure, which conflicts with the "unstructured data" requirement.
Write scalability: Although it is powerful, it is not designed for the extreme level of sustained write throughput that millions of sensors require, an area where NoSQL databases such as Bigtable excel.