Analytics & AI/ML Flashcards

Question 1

Q

Amazon Athena

Answer

A

🔍 Amazon Athena = “SQL for your S3 data”
What it is: A serverless query service that lets you run SQL queries directly on data stored in Amazon S3.

How it works: You point Athena to your data in S3 (like CSV, JSON, or Parquet), write SQL queries, and get results—no servers or ETL needed.

Why it’s useful:
- Serverless—no infrastructure to manage 🛠️
- Pay only for the data you scan 💰
- Supports standard SQL and integrates with AWS Glue for schema discovery 📊
- Great for ad hoc analysis, dashboards, and big data exploration 🔍

🧠 Easy way to remember:
“Athena is your cloud data detective.”

It digs through your S3 data using SQL—fast, flexible, and no setup required. Perfect for quick insights without building a whole pipeline.

Question 2

Q

AWS Glue (Serverless ETL)

Answer

A

🧪 AWS Glue = “Serverless data prep lab”

What it is: A fully managed ETL (Extract, Transform, Load) service that helps you clean, enrich, and move data between sources—without managing servers.

How it works: You define jobs that extract data (e.g., from S3, RDS), transform it (e.g., filter, join, format), and load it into a destination (e.g., Redshift, S3, or a data lake).

Why it’s useful:
- Serverless—no infrastructure to manage 🛠️
- Built-in data catalog to discover and organize data 🔍
- Supports Python and Scala for custom transformations 🐍
- Scales automatically and integrates with Athena, Redshift, Lake Formation, and more 🔗

🧠 Easy way to remember:
“AWS Glue is your cloud’s data janitor and mover.”

It finds your data, cleans it up, and delivers it where it needs to go—automatically and at scale.

Question 3

Q

Amazon Kinesis Data Firehose

Answer

A

🔥 Kinesis Data Firehose = “Real-time data delivery pipeline”

What it is: A fully managed service that captures, transforms, and loads streaming data into destinations like S3, Redshift, OpenSearch, or third-party tools (e.g., Splunk).

How it works: Your app or devices send streaming data to Firehose, which can buffer, batch, compress, and convert it before delivering it to the target.

Why it’s useful:
- Serverless—no infrastructure to manage 🛠️
- Handles real-time data ingestion at scale 📈
- Supports data transformation with AWS Lambda 🔄
- Automatically scales and retries on failure 🔁
- Ideal for logs, metrics, clickstreams, and IoT data 📊

🧠 Easy way to remember:
“Firehose is your real-time data delivery truck.”

It picks up streaming data, cleans it if needed, and drops it off at the right destination—fast, reliable, and hands-free.

Question 4

Q

Amazon QuickSight

Answer

A

📊 Amazon QuickSight = “Business dashboard for your AWS data”

What it is: A cloud-powered business intelligence (BI) service that lets you create interactive dashboards, visualizations, and reports from your data.

How it works: You connect QuickSight to data sources like Redshift, S3, Athena, RDS, or even Excel files, then build charts and dashboards using its drag-and-drop interface.

Why it’s useful:
- Fast, scalable, and serverless 🛠️
- Uses SPICE (Super-fast, Parallel, In-memory Calculation Engine) for lightning-fast performance ⚡
- Accessible via web and mobile 🌐📱
- Supports ML insights, forecasting, anomaly detection, and natural language queries 🤖

🧠 Easy way to remember:
“QuickSight is your cloud’s data storyteller.”

It turns raw data into beautiful, interactive visuals—so you can explore, explain, and act on insights with ease.

Question 5

Q

Amazon Bedrock

Answer

A

🧠 Amazon Bedrock = “Your gateway to generative AI in AWS”

What it is: A fully managed service that lets you build and scale generative AI applications using foundation models from leading AI companies—without managing infrastructure.

How it works: You access models from providers like Anthropic (Claude), AI21, Meta, Mistral, Cohere, and Amazon Titan via a simple API. You can customize, fine-tune, and integrate them into your apps securely.

Why it’s useful:
- No need to host or train models yourself 🛠️
- Supports text, image, and embedding generation 🎨
- Integrates with other AWS services like SageMaker, Lambda, and API Gateway 🔗
- Built-in security, governance, and scalability 🔐

🧠 Easy way to remember:
“Bedrock is your launchpad for AI magic.”

It gives you powerful models, ready to use—so you can build smart apps fast, securely, and at scale.

Question 6

Q

Amazon Rekognition

Answer

A

👁️ Amazon Rekognition = “Eyes for your apps”

What it is: A deep learning–based image and video analysis service that can detect objects, people, text, scenes, and activities—and even recognize faces.

How it works: You upload an image or video, and Rekognition returns detailed labels, facial attributes, or unsafe content flags. It can also compare faces or track people across frames.

Why it’s useful:
- Detects faces, emotions, and celebrities 😃
- Finds objects and scenes in images and videos 🏞️
- Flags inappropriate or unsafe content 🚫
- Supports facial comparison and search (e.g., for identity verification) 🕵️
- Scales automatically and integrates with S3, Lambda, and more 🔗

🧠 Easy way to remember:
“Rekognition is your app’s visual brain.”

It sees what’s in your images and videos—so you can build smart, secure, and responsive experiences.

Question 7

Q

Amazon Comprehend

Answer

A

🧠 Amazon Comprehend = “Language understanding for your text”
What it is: A natural language processing (NLP) service that uses machine learning to extract insights and meaning from text.

How it works: You feed it text (like emails, reviews, documents), and it returns things like sentiment, key phrases, entities, language, and topics.

Why it’s useful:
- Detects positive, negative, neutral, or mixed sentiment 😊😠
- Extracts names, places, dates, and more 🏷️
- Identifies topics and themes in large document sets 📚
- Supports custom entity recognition and PII detection 🔐
- Works in multiple languages 🌍

🧠 Easy way to remember:
“Comprehend is your text’s translator and analyzer.”

It reads between the lines—so you can understand what your customers, documents, or data are really saying.

Question 8

Q

Amazon Polly

Answer

A

🗣️ Amazon Polly = “Text-to-speech for your apps”

What it is: A cloud service that turns text into lifelike speech using advanced deep learning.

How it works: You send text to Polly via an API, and it returns an audio stream in a natural-sounding voice. You can choose from dozens of voices and languages.

Why it’s useful:
- Adds voice to apps, websites, and devices 🎧
- Supports real-time streaming or file output 🔊
- Offers neural voices for ultra-realistic speech 🧠
- Customizes pronunciation with SSML and lexicons 🛠️
- Great for audiobooks, IVR systems, accessibility, and more 📚📞

🧠 Easy way to remember:
“Polly gives your text a voice.”

It speaks your words out loud—clearly, naturally, and in the language your users understand.

Question 9

Q

Amazon SageMaker Serverless Inference

Answer

A

🧠⚡ SageMaker Serverless Inference = “ML predictions without managing servers”
What it is: A deployment option in Amazon SageMaker that lets you run machine learning inference without provisioning or managing infrastructure.

How it works: You deploy your trained model, and SageMaker automatically spins up compute resources when a request comes in, then scales down to zero when idle.

Why it’s useful:
- No need to manage EC2 instances or clusters 🛠️
- Auto-scaling based on traffic 📈
- Cost-effective for intermittent or unpredictable workloads 💰
- Supports popular ML frameworks like TensorFlow, PyTorch, XGBoost 🔬
- Integrated with SageMaker Studio and Pipelines 🔗

🧠 Easy way to remember:
“Serverless Inference is your model’s quiet genius.”

It waits silently, springs into action when needed, and vanishes when done—smart, efficient, and invisible.-

Question 10

Q

Redshift

Answer

A

🚀 Amazon Redshift = “Super-fast data warehouse in the cloud”

What it is: A fully managed data warehouse that lets you run complex SQL queries on huge amounts of structured and semi-structured data—quickly and at scale.

How it works: You load data from sources like S3, RDS, or streaming services, and Redshift uses columnar storage and parallel processing to deliver lightning-fast analytics.

Why it’s useful:
- Ideal for business intelligence (BI), dashboards, and reporting 📊
- Handles petabyte-scale data with high performance ⚡
- Integrates with tools like QuickSight, Tableau, and Power BI 🔗
- Offers Redshift Serverless for on-demand, no-cluster setup 🛠️

🧠 Easy way to remember:
“Redshift is your cloud’s data rocket.”

It takes massive data, crunches it fast, and launches insights—perfect for teams that need speed, scale, and simplicity.

Question 11

Q

Amazon CloudWatch Logs Insights

Answer

A

🔎 CloudWatch Logs Insights = “Search engine for your logs”

What it is: An interactive log analytics tool built into Amazon CloudWatch that lets you query, analyze, and visualize log data in real time.

How it works: You write queries in a simple query language to filter, aggregate, and extract insights from logs stored in CloudWatch Logs.

Why it’s useful:
- Helps you troubleshoot applications and infrastructure quickly 🛠️
- Supports powerful queries with filters, stats, sort, and parse 📊
- Visualize results in dashboards and graphs 📈
- Works with logs from Lambda, EC2, ECS, API Gateway, VPC Flow Logs, and more 🔗
- Serverless—no setup or provisioning needed ⚡

🧠 Easy way to remember:
“Logs Insights is your cloud’s log detective.”

It helps you dig through mountains of logs to find the root cause, trends, or anomalies—fast and without managing any servers.

Question 12

Q

AWS Textract

Answer

A

📄🔍 Amazon Textract = “Your document’s data extractor”

What it is: A machine learning service that automatically extracts text, tables, forms, and checkboxes from scanned documents and images.

How it works: You upload a document (PDF, image, etc.), and Textract returns structured data—not just OCR, but also context-aware extraction.

Why it’s useful:
- Reads printed and handwritten text ✍️
- Detects form fields and values (e.g., Name: John Doe) 🧾
- Extracts tables with rows and columns 📊
- Integrates with S3, Lambda, Comprehend, and more 🔗
- Ideal for invoices, receipts, medical forms, and contracts 🏥📑

🧠 Easy way to remember:
“Textract is your document’s data detective.”

It doesn’t just read—it understands the layout and meaning, so you can automate data entry and analysis.

Question 13

Q

AWS Lake Formation

Answer

A

🌊 AWS Lake Formation = “Your data lake builder and bodyguard”

What it is: A fully managed service that helps you build, secure, and manage data lakes on AWS—quickly and with fine-grained access control.

How it works: You ingest data from sources (like S3, RDS, Redshift), organize it into a catalog, and apply permissions to control who can access what. It integrates with services like Athena, Redshift, Glue, and EMR.

Why it’s useful:
- Simplifies data lake setup and governance 🛠️
- Centralized data catalog and access control 🔐
- Supports row-level and column-level security 🧩
- Works with multiple analytics services seamlessly 🔗
- Ideal for secure, scalable, multi-team data sharing 🤝

🧠 Easy way to remember:
“Lake Formation is your data lake architect and security chief.”

It builds the lake, organizes the data, and makes sure only the right people can dive in.

Analytics & AI/ML Flashcards

(13 cards)