Cloud Information Systems Flashcards

(50 cards)

1
Q

Promises of Cloud Computing to the Customer

A

-Scalability
-Elasticity
-Resource Decoupling

2
Q

What is EC2 and what does it provide?

A

Amazon Elastic Compute Cloud

Provides scalable computing capacity in the cloud

3
Q

What is an EC2 instance?

A

Virtual Computing Environment

(physical servers are sliced into smaller virtual machines: instances)

4
Q

Data centers

A

centralized compute facilities

5
Q

Racks

A

Allow squeezing many servers into a small space

6
Q

FAT Tree Topology

A

Links higher up in the tree are wider or faster so they can carry more traffic and avoid bottlenecks.

7
Q

Leaf Spine Topology

A

Spine Switches
Top-level switches

Leaf Switches
Switches connected directly to the servers (like the ToR switches in each rack)

8
Q

Hypervisors (Xen, KVM)

A

-allow virtualizing hardware
-emulate a specific complete computer (x86, ENA, NVMe)

9
Q

I/O Devices in a VM

A

-hypervisor provides virtual hardware devices for each VM/guest OS

  • when an operating system attempts to access a device, control passes to the hypervisor, which invokes virtual device software
10
Q

How to split resources?

A

-fair share: static split of resources
-competitive: dynamically split resources depending on workloads
-token-bucket: temporarily allow overcommitting, but converge to fair share
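
A minimal sketch of the token-bucket idea, not any provider's actual implementation. The class name `TokenBucket` and the parameters `rate` and `capacity` are illustrative assumptions. A tenant can burst above its fair share while it has saved-up tokens, but its long-run usage is pulled back toward `rate`.

```python
class TokenBucket:
    """Toy token bucket; names and numbers are illustrative assumptions."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens earned per second (the fair share)
        self.capacity = capacity  # maximum tokens that can be saved up (burst budget)
        self.tokens = capacity    # start with a full bucket

    def refill(self, seconds: float) -> None:
        # Elapsed time earns tokens, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + self.rate * seconds)

    def consume(self, amount: float) -> float:
        # Grant as much as the bucket holds; the remainder is throttled.
        granted = min(amount, self.tokens)
        self.tokens -= granted
        return granted


bucket = TokenBucket(rate=1.0, capacity=10.0)
first = bucket.consume(8.0)    # burst: all 8 units are granted from saved tokens
bucket.refill(1.0)             # one second later the bucket holds 2 + 1 = 3 tokens
second = bucket.consume(8.0)   # only 3 units are granted this time
```

Once the saved tokens are spent, each further request can only get what was earned since the last one, which is the "converge to fair share" behavior.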

11
Q

Virtual Disk

A

AWS: EBS (elastic block store)
-Can be provisioned on demand
-Somewhat elastic pricing
-allows stopping a VM without losing the data
-also allows changing instance type: stop VM, start on other instance with same virtual disk

12
Q

How Are Virtual Disks Implemented?

A

virtual disks are connected through the network

1. Network Attached Storage (NAS) reuses the existing network

(like a shared network folder where several VMs store their data)

2. Storage Area Network (SAN) has a separate network just for virtual storage (more expensive, but has more predictable performance)
13
Q

VM vs container

A

Virtual machines:

-heavyweight (boot time, memory)
-better security
-better performance isolation
-allow changing OS

Containers:

-lightweight, isolated environments
-lower startup times
-better memory utilization

if you run a container in a public cloud, it will probably be within a VM

14
Q

Unikernels

A

Simple single-address-space operating systems (no multiple processes or users)

Examples: Unikraft, OSv, Nanos

15
Q

TCO

A
  • Capital expenses (CAPEX): facilities, compute, storage, networking, … (up-front investment costs), e.g., servers, data centers
  • Operational expenses (OPEX): energy, maintenance, employees, … (ongoing costs), e.g., server electricity

TCO = data center depreciation + data center OPEX + server depreciation + server OPEX

very cheap machines do not necessarily have the lowest TCO due to other costs (cooling, energy)

watch the electricity price per kWh
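
A small worked example of the TCO formula on this card. All prices, lifetimes, and power draws are made-up assumptions, and server OPEX is simplified to electricity only. The function name `yearly_tco` is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def yearly_tco(dc_depreciation, dc_opex, server_price,
               server_lifetime_years, server_power_kw, price_per_kwh):
    """Yearly TCO = DC depreciation + DC OPEX + server depreciation + server OPEX."""
    server_depreciation = server_price / server_lifetime_years
    # Simplification: server OPEX is only the electricity bill.
    server_opex = server_power_kw * HOURS_PER_YEAR * price_per_kwh
    return dc_depreciation + dc_opex + server_depreciation + server_opex


# Hypothetical machines: a cheap, power-hungry one and a pricier, efficient one.
cheap = yearly_tco(100, 50, server_price=2000, server_lifetime_years=4,
                   server_power_kw=0.5, price_per_kwh=0.30)
efficient = yearly_tco(100, 50, server_price=3000, server_lifetime_years=4,
                       server_power_kw=0.2, price_per_kwh=0.30)
print(f"cheap: {cheap:.2f} per year, efficient: {efficient:.2f} per year")
```

With these assumed numbers the cheaper machine ends up with the higher yearly TCO, because its electricity cost outweighs the lower purchase price.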

16
Q

AWS EC2 characteristics

A

Doesn’t support live migration (an instance cannot be moved to another physical server without shutting it down)

to change an instance (e.g., increase its size), it has to be stopped

to avoid service disruption: load balancer in front of compute nodes
The load balancer distributes the traffic across several servers

Google Cloud and Azure support live migration

17
Q

Region/AZ Choice: Considerations

A

legal reasons
availability
fault tolerance

18
Q

Pricing Models

A

on demand: per-second pricing
spot: variable costs, interruption possible
reserved: reservation for 1 or 3 years

19
Q

SLAs

A

SLAs are a way to commit to quality standards for a service, e.g., availability, performance, durability
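
To make an availability commitment concrete, here is a rough sketch that converts an availability percentage into the downtime it still allows. The percentages and the 30-day window are illustrative assumptions, not the terms of any real SLA. The function name is hypothetical.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Minutes of downtime that still satisfy the given availability over `days`."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


for pct in (99.0, 99.9, 99.99):
    print(f"{pct}% over 30 days -> {allowed_downtime_minutes(pct):.1f} min of downtime")
```

Each additional "nine" cuts the allowed downtime by a factor of ten.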

20
Q

Burstable instances

A

Designed for workloads that normally use little CPU but occasionally need short bursts of high performance

available burstable instance families: t4g (ARM), t3 (Intel), t3a (AMD), t2 (Intel)

21
Q

AWS Lambda

A

-Function as a Service (Serverless)

-Automatic scaling

-intended for short invocations, not suitable for long-running services

-Pay only for the time the code is running

-Maximum duration of 15 minutes

-No network communication with other Lambdas

-Implementation: KVM (Kernel-based Virtual Machine) + special virtual machine monitor (Firecracker instead of QEMU); the resulting VMs are microVMs

22
Q

Lambda vs EC2

A

Advantages of Lambda (over EC2)

-Automatic elasticity and scalability, no server management/administration

-Low-latency startup (starts in milliseconds thanks to microVMs)

-Fine-grained pricing (you pay for exact usage)

-Available in very small sizes: low cost for small jobs

Disadvantages

-more than 2× higher cost for long-running jobs (Lambda runs on EC2)

-little hardware customization (you cannot choose the CPU)

-limitations (duration, size, networking) (max 15 min of execution)

23
Q

AWS Fargate

A

(Lambda, but for containers)
* container-based compute service alternative to EC2

-minimum billing time: 1 min

24
Q

Fargate vs EC2

A

Advantages of Fargate (over EC2)

-no manual instance management
-finer CPU/RAM granularity than EC2
-CPU and RAM can be configured separately

Disadvantages

-slightly more expensive than EC2
-requires ECS/EKS
-fewer hardware options (you cannot choose a GPU)

25
Instance Storage
(physical disk directly attached to the server where your EC2 instance runs)
-Not truly persistent: if you stop or lose the instance (for example, due to a hardware failure), the data is erased
-Survives power failures (the data is not erased if the instance reboots)
26
EBS (Elastic Block Store)
A virtual disk that you can attach to your EC2 instances. It works like an external hard drive, but in the cloud.
-virtual disk
-block device, not a file system
-used as root volume for the operating system
-an EBS volume should usually be attached to only one VM at any point in time, but multiple volumes can be attached to one instance
-can be replicated across multiple servers in one AZ
-EBS variants: io1, io2, st1, sc1
27
S3
A cloud storage service where you can store files (photos, documents, videos, backups, etc.) securely, scalably, and cheaply.
-redundant storage across AZs in one region (buckets live in the region but are replicated across all of its AZs)
-sometimes called object storage
Terminology:
-“object” = file
-“prefix” = file path
-“bucket” = named collection of files in a region (a bucket lives in a region, not in an AZ)
http(s) API: GET, PUT, LIST, DELETE (to access the files)
A bucket can be internet public or private
28
Other Storage Alternatives
-Keep everything in main memory (RAM)
-Amazon Elastic File System (EFS): network file system (shared over the network)
-DynamoDB: distributed key/value store
-Relational Database Service (RDS): relational OLTP (online transaction processing) database system
29
PUE (Power Usage Effectiveness)
PUE = total energy consumption of the data center / total energy consumption of the IT equipment (servers, network, storage)
-Optimal PUE = 1.0
-Legal requirement: PUE <= 1.2
Total energy consumption (DC) = IT equipment + HVAC + lighting + power losses + infrastructure + miscellaneous (CCTV)
Total IT consumption = servers + storage + networking (switches, routers, firewalls, load balancers) + IT peripherals (backup systems)
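
A short worked example of the PUE ratio defined on this card. The kWh figures and the dictionary keys are invented for illustration only.

```python
def pue(total_dc_kwh: float, it_kwh: float) -> float:
    """Power Usage Effectiveness: total data center energy / IT equipment energy."""
    return total_dc_kwh / it_kwh


# Hypothetical yearly energy use in kWh (made-up numbers).
it_equipment = {"servers": 800.0, "storage": 120.0, "networking": 80.0}
overhead = {"hvac": 150.0, "lighting": 10.0, "power_losses": 40.0}

it_total = sum(it_equipment.values())
dc_total = it_total + sum(overhead.values())
value = pue(dc_total, it_total)
print(f"PUE = {value:.2f}")
```

Only the overhead terms push PUE above 1.0, so reducing cooling and power losses is what moves the ratio toward the optimum.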
30
Measures to reduce PUE
-Increase number of AC units
-Server fan replacement
-Location in countries with low temperature (Finland)
31
Scaling
Scale up: vertical scaling (bigger machines)
-Enables high-bandwidth and low-latency communication
-May not be enough
Scale out: horizontal scaling (more machines)
-Cloud enables elasticity: add/remove machines when the workload changes
-Enables fault tolerance through redundancy
32
Downsides of scaling out
-More nodes -> higher probability that one of them fails
-Network bandwidth and latency often become the bottleneck (in distributed systems, waiting hurts more than limited throughput)
-New failure modes: network partitions (some nodes cannot talk to each other anymore), transient failures (temporary faults), clock skew and drift (one node may think an event happened earlier while another thinks it happened later)
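
The first point can be quantified with a simple model. This sketch assumes nodes fail independently and with the same probability, which is a simplification. The 1% per-node failure probability is a made-up number and the function name is hypothetical.

```python
def p_any_failure(p_node: float, n_nodes: int) -> float:
    """Probability that at least one of n independent nodes fails."""
    return 1.0 - (1.0 - p_node) ** n_nodes


# Hypothetical: each node fails with probability 1% in some time window.
for n in (1, 10, 100, 1000):
    print(f"{n:>4} nodes -> P(at least one failure) = {p_any_failure(0.01, n):.3f}")
```

Even with reliable individual nodes, a large enough cluster should expect some node to be down most of the time.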
33
Lift And Shift
Taking an application that already runs on your own servers (on-premises) and moving it to the cloud with almost no changes (copying and pasting your current system into the cloud).
-Downsides of the public cloud without its benefits (you pay cloud prices but get none of the benefits)
34
Partitioning (Sharding)
(instead of a single database, the data is split into parts, and those parts are spread across different servers)
-multi-tenant service where each tenant has a small, independent database
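
One common way to decide which server holds which tenant's data is hash-based partitioning. The sketch below is an assumption for illustration, not a scheme named on the card. The tenant names, shard count, and the `shard_for` helper are hypothetical.

```python
import hashlib


def shard_for(tenant_id: str, num_shards: int) -> int:
    """Map a tenant to a shard with a stable hash (same input, same shard)."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 4  # hypothetical number of database servers
tenants = ["acme", "globex", "initech", "umbrella"]
placement = {tenant: shard_for(tenant, NUM_SHARDS) for tenant in tenants}
print(placement)
```

A cryptographic hash is used here only because it is stable across runs and machines, unlike Python's built-in `hash()` for strings. Plain modulo hashing moves most tenants when `NUM_SHARDS` changes, so real systems often use a lookup table or consistent hashing instead.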
35
Separation of Compute and Storage
(separate the machines that compute from the machines that store the data)
* scale compute and storage independently
* separate billing for compute and storage
36
Control Plane and Data Plane
Common pattern:
-control plane (brain): coordination, scheduling, monitoring, cost counting, etc.
-data plane (muscle): does the actual work
37
Microservices
Each microservice:
* runs in its own VM(s)/container(s)
* should be independent and loosely coupled (changes should not affect others)
* should be scaled independently
38
Criteria for splitting a system into microservices
-Independent functionality
-Different scaling needs
-Different technology stack
39
Serverless Architecture
Advantages:
-Decoupled architecture
-All components scale independently
-Easy to add another detector
Observed problems:
-Scalability bottlenecks due to per-customer account limits
40
How to improve Latency?
-Parallelism
-Consolidation
-More powerful machines
-Faster network
41
Automation of data centers
-Creation and deployment of new virtual resources
-Software update and upgrade (OS, applications, libraries)
-Administration of security policies (firewalls, secrets)
42
Infrastructure as Code (IaC)
-Treat infrastructure like software (version control, code review, testing)
-Immutable infrastructure: do not patch running servers; replace them with new, updated images
43
Orchestration
It is usually not a good idea to manually configure and deploy a large number of heterogeneous containers.
Orchestration software automates this and provides additional features like
* dynamic scaling of services
* coordination across multiple servers
* resilience and automatic recovery
44
Kubernetes
-Service naming and discovery
In Kubernetes each service can be given a name, either
* a domain name
* an IP address

-Load balancing
Kubernetes uses load balancer software to divide requests among the instances of the service

-Storage orchestration
When the container is launched, Kubernetes connects each attachment point to external storage

-Optimized container placement
Assigns containers to nodes to optimize the use of the nodes (bin packing)

-Automated initiation and recovery
To handle automated failure detection and replacement, Kubernetes
* continually probes each container with a user-defined health check
* terminates any container that stops responding

-Management of configurations and secrets
Kubernetes separates
* service configuration and management information
* container images used to provide the service
The reason: separation allows the owner of a service to change configuration and management policies that affect deployment without rebuilding the container images used in the service

-Automated rollouts and rollbacks

What Kubernetes does not do:
* focus on a specific type of application or have application-specific optimizations
* manage source code or build containers (e.g., assumes Docker handles it)
* supply event-passing middleware
* have a built-in facility to collect, log, or otherwise report measurements or events
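
To illustrate what "bin packing" means for container placement, here is a sketch of the classic first-fit-decreasing heuristic. This is not Kubernetes' actual scheduler, which applies its own filtering and scoring. The container names, CPU requests, and node capacity are made-up assumptions.

```python
def first_fit_decreasing(cpu_requests: dict, node_capacity: int) -> list:
    """Place containers on as few equal-sized nodes as the heuristic manages."""
    free = []       # remaining CPU per node
    placement = []  # container names per node
    # Largest requests first; each goes to the first node with enough room.
    for name, cpu in sorted(cpu_requests.items(), key=lambda kv: kv[1], reverse=True):
        if cpu > node_capacity:
            raise ValueError(f"{name} does not fit on any node")
        for i in range(len(free)):
            if cpu <= free[i]:
                free[i] -= cpu
                placement[i].append(name)
                break
        else:
            # No existing node has room: open a new one.
            free.append(node_capacity - cpu)
            placement.append([name])
    return placement


# Hypothetical CPU requests (in cores) and 4-core nodes.
requests = {"web": 2, "db": 3, "cache": 1, "batch": 2}
nodes = first_fit_decreasing(requests, node_capacity=4)
print(nodes)
```

In this example the four containers fit on two fully used nodes instead of needing one node each.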
45
Cluster, Node, Pod
* Cluster: set of nodes that run containerized applications
* Node: physical or virtual machine in the cluster
* Pod: smallest deployable unit
Kubernetes Pods
-All containers for a pod run on the same node
-All containers in a pod share an IP address
46
Pros and cons of orchestration
+ Increase overall efficiency of operations
+ Coordinate computational, communication, and storage resource management
+ Avoid human errors in configuration and operation (= cost savings)
− System-wide (cascading) failures
− Risk of run-away resource use
− Increased security attack surface
− Complexity, overlapping functionality
47
S3
-Each S3 access is slow compared with RAM, similar to reading from a disk: latency of more than about 10 ms, throughput of about 50 MB/s
-The big advantage: S3 has millions of disks in parallel. It is not fast per individual access, but it is fast if you issue many requests at once
-With many simultaneous requests, you can reach very high total bandwidth
-Requesting small objects is expensive: the per-request cost and latency weigh heavily
-Large objects are ideal for S3: the per-request cost becomes negligible, and it competes well with (or beats) EC2
-Other S3-like services (Azure Blob, GCS, etc.) behave very similarly
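
A back-of-the-envelope model of these points, using the card's rough figures (about 10 ms latency and about 50 MB/s per request). It assumes parallel requests do not slow each other down and that the client's network is not the bottleneck. The function names and object sizes are illustrative.

```python
LATENCY_S = 0.010       # about 10 ms per request (rough figure from the card)
BANDWIDTH_MB_S = 50.0   # about 50 MB/s per request (rough figure from the card)


def request_time_s(size_mb: float) -> float:
    """Time for one request: fixed latency plus transfer time."""
    return LATENCY_S + size_mb / BANDWIDTH_MB_S


def throughput_mb_s(size_mb: float, parallel_requests: int = 1) -> float:
    """Effective throughput, assuming parallel requests are independent."""
    return parallel_requests * size_mb / request_time_s(size_mb)


small = throughput_mb_s(0.01)                            # 10 KB objects, one at a time
large = throughput_mb_s(100.0)                           # 100 MB objects, one at a time
aggregate = throughput_mb_s(100.0, parallel_requests=64)
print(f"small: {small:.2f} MB/s, large: {large:.2f} MB/s, 64 parallel: {aggregate:.0f} MB/s")
```

For small objects the fixed latency dominates, so effective throughput collapses. For large objects it approaches the per-request bandwidth, and it scales with the number of parallel requests under the stated assumptions.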
48
Amazon Aurora
OLTP
49
Snowflake
OLAP
50