The requirements for this architecture include dynamic scaling, streaming data on the fly and batch processing data that arrives late, and using SQL to query massive scales of batch data, all through a managed service. Dataflow, GCS, Pub/Sub, and BigQuery are the only solutions that meet all these requirements.
Dataflow is a serverless solution that can be leveraged for both batch and stream processing. Dataproc is not fully serverless.
Dataproc is designed for Spark and Hadoop workloads.
Pub/Sub offers first in, first out (FIFO) ordering of messages on a per-key basis when ordering keys are enabled, but when the content is stored, it will need to be stored in an ACID-compliant system such as Cloud SQL.
This is the only valid solution here. They’re looking to investigate a production VM, so taking the server down is not a recommended action at this point. They also want to conduct forensics in a secure location to ensure the evidence is not tampered with.
There are a few indicators here as to why BigQuery is the right answer: large-scale migration, requirement to use SQL, and an analytical use case.
The dead giveaway here is leveraging a time-series database for IoT sensors. This is where Bigtable shines.
Using an archival storage class will be sufficient and the most cost-effective here because the use case is infrequently accessing the data, at most once a year.
Cloud Spanner is the OLTP solution that is relational and offers petabyte scalability. Cloud SQL is not designed for petabyte-scale data.
Cloud Firestore, the next generation of Datastore, is a great solution for storing user profiles and purchase history.
When you’re taking the exam, knowing which Google Cloud storage technologies map to file, object, and block storage may help you reach a clearer answer.
Be careful, though, and don’t assume a Google-managed service is always the answer. Read through each question very carefully for the requirements.
Persistent Disk
If you need to modify the size of your persistent disk, it’s as easy as increasing the size in the Cloud Console. If you need to resize your mounted file system, you can use the standard resize2fs command in Linux to do online resizing.
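As a minimal sketch, the two steps above might look like the following; the disk name, zone, new size, and device path are all hypothetical, so substitute your own (check the device with lsblk before resizing):

```shell
# Grow the persistent disk from the control plane. Remember that
# persistent disks can only be resized up, never down.
gcloud compute disks resize my-data-disk \
    --size=500GB \
    --zone=us-central1-a

# Then, on the VM itself, grow the mounted ext4 file system online.
# /dev/sdb is an assumed device name; verify yours with lsblk first.
sudo resize2fs /dev/sdb
```

If the disk holds a partition table rather than a bare file system, you would also grow the partition (for example with growpart) before running resize2fs.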
PDs are not actually physically attached to the servers that host your VMs, but they are virtually attached. You can only resize up, but not down!
Persistent Disk
The command to modify the persistent disk auto-delete behavior for instances attached to VMs is gcloud compute instances set-disk-auto-delete.
Auto-delete is on by default, so you will need to disable it if you don’t want your PD to be deleted when the instance attached to it is deleted.
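A sketch of disabling that behavior for one attached disk, using hypothetical instance, disk, and zone names:

```shell
# Keep the disk "my-data-disk" around even if the VM "my-instance"
# is deleted, by switching auto-delete off for that attachment.
gcloud compute instances set-disk-auto-delete my-instance \
    --disk=my-data-disk \
    --no-auto-delete \
    --zone=us-central1-a
```

Running the same command with --auto-delete instead of --no-auto-delete restores the default behavior.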
Local SSD
Local SSDs disappear when you stop an instance, whereas all three types of persistent disks persist when you stop an instance—hence the name, persistent disk.
Each Local SSD is only 375 GB, but you can attach up to 24 Local SSDs per instance. Because of their benefits and limitations, Local SSDs are a great fit for temporary storage such as caches, processing space, or low-value data.
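To make the sizing concrete, here is a sketch of creating a VM with two Local SSDs attached; the instance name and zone are hypothetical. Each --local-ssd flag adds one 375 GB device, and everything on them is lost when the instance stops:

```shell
# Create a scratch VM with two NVMe Local SSDs (2 x 375 GB).
# Repeat --local-ssd up to the per-instance limit.
gcloud compute instances create scratch-worker \
    --zone=us-central1-a \
    --local-ssd=interface=NVME \
    --local-ssd=interface=NVME
```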