AI, Machine Learning, Analytics Technology and Services Flashcards

(185 cards)

1
Q

What is Redshift?

A

Redshift is a data warehousing service for reporting and analytics that can store and query petabytes of data.

2
Q

What does Redshift allow you to do with multiple sources of data?

A

Redshift allows you to combine multiple sources of data in one place so that you can run analytics across all of it.

3
Q

What does MPP stand for and what does it mean in the context of Redshift?

A

MPP stands for massively parallel processing. In the context of Redshift, it means that Redshift can run complex queries in parallel.
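
To make the idea of parallel query execution concrete, here is a minimal sketch in plain Python. It is not Redshift code: the function name `parallel_sum`, the round-robin split, and the thread pool are all assumptions chosen for illustration. It only shows the general pattern of splitting data into slices, aggregating each slice independently, and combining the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(values, num_slices=4):
    """Sum `values` by aggregating independent slices and combining the results."""
    # Spread the rows round-robin across slices (an assumed distribution scheme).
    slices = [values[i::num_slices] for i in range(num_slices)]
    # Each slice is aggregated independently; threads stand in for separate nodes.
    with ThreadPoolExecutor(max_workers=num_slices) as pool:
        partial_sums = list(pool.map(sum, slices))
    # A final step combines the partial results into one answer.
    return sum(partial_sums)


sales = list(range(1, 101))
print(parallel_sum(sales))  # 5050
```

Python threads do not give real CPU parallelism here; the point is the split, aggregate, combine shape of the computation.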

4
Q

What are the benefits of using Redshift for OLAP?

A

Redshift is designed for Online Analytical Processing (OLAP), which makes it well suited to analytics and reporting. It provides automated data management, including backup, replication, and scaling without downtime.

5
Q

What is Redshift Serverless?

A

Redshift Serverless is the serverless option for Redshift. It removes the need to manage any infrastructure by automatically provisioning and scaling everything.

6
Q

What are some use cases for Redshift?

A

Use cases for Redshift include complex querying and reporting for businesses that need to analyze large volumes of data, integration with data lakes for querying structured and unstructured data, and operational analytics for making time-sensitive decisions based on real-time data.

7
Q

What are the advantages of using Redshift Serverless for unpredictable workloads?

A

Redshift Serverless eliminates the need to manage infrastructure, allowing you to focus on analyzing your data. It automatically provisions and scales everything, making it a great option for unpredictable workloads.

8
Q

What is a data lake and how can Redshift integrate with it?

A

A data lake is a central repository of structured and unstructured data, often stored in S3. Redshift can integrate with a data lake, allowing you to query that data using Redshift.

9
Q

What type of workload is Redshift designed for?

A

Redshift is designed for business intelligence workloads, specifically for reporting and analytics.

10
Q

What are some automated data management features provided by Redshift?

A

Redshift provides automated data backup, replication, and scaling without any downtime.

11
Q

What is the main purpose of Kinesis?

A

The main purpose of Kinesis is to collect, process, and analyze streaming data in real time.

12
Q

What does the name “Kinesis” mean and why is it a fitting name for this service?

A

“Kinesis” is a Greek word that means movement or motion. It is a fitting name for this service because Kinesis deals with data that is in motion, moving from one place to another.

13
Q

How does Kinesis Data Streams store and retain data?

A

Kinesis Data Streams stores data in shards, which are sequences of data records. Data is retained for 24 hours by default, and retention can be extended up to a maximum of 365 days.

14
Q

What is the difference between streaming data and static data?

A

Streaming data refers to data that is generated continuously by multiple data sources or producers, while static data is data that is stored on disk, in S3, or in a database.

15
Q

Give examples of the types of data that can be handled by Kinesis.

A

Examples of data that can be handled by Kinesis include financial transactions, stock prices, in-game data, social media feeds, location tracking data, IoT sensor data, clickstream data, and application log files.

16
Q

What are shards in Kinesis Data Streams?

A

Shards in Kinesis Data Streams are storage units that hold data records. Each data record has a unique sequence number, and a Kinesis stream is made up of one or more shards.
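
The relationship between a stream, its shards, and sequence numbers can be sketched with a toy model. This is not the Kinesis API: `ToyStream`, `Shard`, and `put_record` are hypothetical names, the modulo routing is a simplification (the real service maps a hashed partition key onto hash key ranges), and the small integer counter stands in for the much longer sequence numbers the service assigns.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Shard:
    # Each shard holds an ordered sequence of (sequence_number, key, data) records.
    records: list = field(default_factory=list)


class ToyStream:
    """Toy stream made of one or more shards (illustrative only)."""

    def __init__(self, shard_count=2):
        self.shards = [Shard() for _ in range(shard_count)]
        self._next_sequence = 0

    def put_record(self, partition_key, data):
        # Hash the partition key so the same key always lands on the same shard.
        digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
        shard_index = int(digest, 16) % len(self.shards)  # simplified routing
        # Assign a unique, increasing sequence number to the record.
        self._next_sequence += 1
        record = (self._next_sequence, partition_key, data)
        self.shards[shard_index].records.append(record)
        return shard_index, self._next_sequence


stream = ToyStream(shard_count=3)
print(stream.put_record("user-1", "click"))
print(stream.put_record("user-1", "scroll"))
```

Because routing depends only on the key, both records for `user-1` end up in the same shard, in the order they were written.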

17
Q

What role do data consumers play in the Kinesis architecture?

A

Data consumers in the Kinesis architecture consume data from the shards and process it. They can perform various actions on the data, such as running algorithms, analyzing sentiment, or generating recommendations.

18
Q

Give examples of actions that data consumers can perform on the data.

A

Data consumers can perform actions such as running algorithms on stock prices, sentiment analysis on social media feeds, or analyzing clickstream data to generate product recommendations.

19
Q

What are some possible destinations for data after it has been processed by data consumers?

A

After being processed by data consumers, the data can be sent to permanent storage destinations such as DynamoDB, S3, Elastic MapReduce, or Redshift.

20
Q

Explain the main purpose of Kinesis Data Streams and Kinesis Video Streams.

A

Kinesis Data Streams is designed for handling streaming data, while Kinesis Video Streams is specifically designed for streaming video data from connected video devices.

21
Q

What is Kinesis Data Firehose?

A

Kinesis Data Firehose, also known as Kinesis Firehose, is a fully managed service that allows you to capture, transform, and load data streams into AWS data stores for near real-time analytics.

22
Q

What are the primary functions of Kinesis Data Firehose?

A

The primary functions of Kinesis Data Firehose are capturing, transforming, and loading data.

How does Kinesis Data Firehose handle varying data volumes?
Kinesis Data Firehose dynamically adjusts its resources to handle varying data volumes, scaling automatically.

What is the typical processing time for data in Kinesis Data Firehose?
Kinesis Data Firehose processes and delivers data within 60 seconds for timely insights.

23
Q

Is there any data retention in Kinesis Data Firehose?

A

No, Kinesis Data Firehose does not retain data.

24
Q

Can you transform data with Kinesis Data Firehose before loading it into storage?

A

Yes, you can transform and customize the data using AWS Lambda before loading it into permanent storage.
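
As a rough illustration of a Lambda transformation, the sketch below follows the record contract Firehose uses for transformation functions: each incoming record carries a `recordId` and base64-encoded `data`, and each returned record echoes the `recordId` with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The enrichment itself (adding a `source` field to JSON payloads) is a made-up example, not something from the cards.

```python
import base64
import json


def lambda_handler(event, context):
    """Add a field to each JSON record passing through the delivery stream."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["source"] = "clickstream"  # hypothetical enrichment
            transformed = json.dumps(payload) + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(transformed.encode("utf-8")).decode("utf-8"),
            })
        except (ValueError, TypeError):
            # Hand the record back unchanged and mark it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

The handler can be exercised locally by passing a dictionary shaped like the event, with `None` as the context.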

25
What tools can be used for analytics after data is loaded by Kinesis Data Firehose?
Business intelligence tools can be used for analytics after data is loaded into its final destination by Kinesis Data Firehose.
26
What monitoring tools are integrated with Kinesis Data Firehose?
Kinesis Data Firehose includes integrated monitoring with CloudWatch.
27
What happens if there is an error in data processing within Kinesis Data Firehose?
Kinesis Data Firehose has automatic error retries if something goes wrong.
28
Does Kinesis Data Firehose retain data temporarily?
No, Kinesis Data Firehose does not retain data. It only buffers records briefly before delivering them to the destination.
29
What does a data lake refer to in the context of Kinesis Data Firehose?
A data lake refers to a large-scale data repository for storing streaming data.
30
What AWS service can you use to transform data in Kinesis Data Firehose?
You can use AWS Lambda to transform data in Kinesis Data Firehose.
31
What are some use cases for Kinesis Data Firehose?
Use cases include real-time analytics, feeding data into data lakes, log data management, and IoT data integration.
32
What are some common destinations for data after processing in Kinesis Data Firehose?
Common destinations include Amazon S3, Amazon Redshift, and Amazon OpenSearch Service.
33
What is the difference between Kinesis Data Streams and Kinesis Data Firehose?
Kinesis Data Streams captures and stores streaming data (Kinesis Video Streams does the same for video), whereas Kinesis Data Firehose captures, transforms, and loads data continuously into data stores.
34
What is Amazon Athena?
Amazon Athena is an interactive query service that enables you to run standard SQL queries on data stored in Amazon S3.
35
What type of queries can you run with Amazon Athena?
You can run standard SQL queries with Amazon Athena.
36
What is a key feature of Amazon Athena regarding infrastructure?
Amazon Athena is serverless, meaning there is nothing to provision and manage.
37
How do you pay for using Amazon Athena?
You pay per query, priced by the terabytes of data scanned, when using Amazon Athena.
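A quick way to reason about scan-based pricing is to compute the cost from the bytes a query scans. The sketch below is only arithmetic: the function name is hypothetical, the price is a parameter rather than a quoted AWS rate, and treating a terabyte as 1024^4 bytes is an assumption. Check the current pricing page for real figures and any minimum charges.

```python
BYTES_PER_TB = 1024 ** 4  # assumption: binary terabytes


def estimate_query_cost(bytes_scanned, price_per_tb_usd):
    """Estimate the cost of one query from the amount of data it scans."""
    return bytes_scanned / BYTES_PER_TB * price_per_tb_usd


# Placeholder price of 5.0 USD per TB, used only to show the calculation.
full_scan = estimate_query_cost(2 * BYTES_PER_TB, price_per_tb_usd=5.0)
partial_scan = estimate_query_cost(BYTES_PER_TB // 2, price_per_tb_usd=5.0)
print(full_scan, partial_scan)  # 10.0 2.5
```

The calculation shows why scanning less data per query, for example by reading fewer columns or partitions, lowers the bill.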
37
Is there a need for complex ETL processes when using Amazon Athena?
No, there is no need for complex extract, transform, and load (ETL) processes when using Amazon Athena. It works directly with data stored in S3.
38
What are some use cases for Amazon Athena?
Use cases for Amazon Athena include querying log files stored in S3, analyzing AWS cost and usage reports, generating business reports on data stored in S3, and running queries on clickstream data stored in S3.
39
What is AWS Glue used for?
AWS Glue is used to prepare your data for analytics and machine learning.
40
Why is AWS Glue important for analytics and machine learning?
AWS Glue is important because it prepares and transforms data, making it ready for use by analytics applications and machine learning models.
41
What is the purpose of the data catalog created by AWS Glue?
The data catalog serves as the central repository containing metadata about the data, including its type and format.
42
Into which AWS services can transformed data be loaded using AWS Glue?
Transformed data can be loaded into AWS services like RDS, Redshift, S3, or Athena.
43
What are some specific transformations AWS Glue can perform on data?
AWS Glue can categorize data, clean it, remove duplicates, and join multiple datasets.
44
What does AWS Glue do with your data?
AWS Glue crawls your data and creates the data catalog, which is the central repository containing the metadata, such as the type or format of your data.
45
What can AWS Glue do after creating the data catalog?
After creating the data catalog, AWS Glue can extract data from various sources, transform it (e.g., categorize, clean, remove duplicates, or join multiple datasets), and then load it into other AWS services.
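The transform steps named above (clean, remove duplicates, join) can be sketched in plain Python. This is not the AWS Glue API; the function names and the sample `customers` and `orders` data are invented for illustration, and a real Glue job would run similar logic at scale against cataloged sources.

```python
def clean(rows):
    """Trim whitespace from string fields and drop rows without a customer_id."""
    cleaned = []
    for row in rows:
        row = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        if row.get("customer_id"):
            cleaned.append(row)
    return cleaned


def remove_duplicates(rows, key):
    """Keep the first row seen for each value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def join(left, right, key):
    """Inner-join two datasets on `key`."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left if row[key] in right_by_key]


customers = [
    {"customer_id": "c1", "name": " Ada "},
    {"customer_id": "c1", "name": "Ada"},
    {"customer_id": "", "name": "unknown"},
    {"customer_id": "c2", "name": "Grace"},
]
orders = [{"customer_id": "c1", "total": 40}, {"customer_id": "c2", "total": 15}]

prepared = join(remove_duplicates(clean(customers), "customer_id"), orders, "customer_id")
print(prepared)
```

The result is one cleaned row per customer with the order total attached, ready to be loaded into a target store.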
46
What is AWS Data Exchange used for?
AWS Data Exchange allows you to securely exchange and use data provided by third parties on a subscription basis.
47
Who provides the data products available on AWS Data Exchange?
Data products are available from a variety of suppliers, including financial services, healthcare, weather, manufacturing, and telecommunications.
48
What can the data from AWS Data Exchange be used for?
The data can be used for analytics, machine learning workloads, and decision-making.
49
Can you give an example use case for AWS Data Exchange?
An example use case is analyzing customer spending patterns based on geographic location using data products provided by companies like MasterCard, Experian, and Equifax.
50
What is Elastic MapReduce (EMR)?
Elastic MapReduce (EMR) is a big data platform provided by AWS that supports large-scale parallel data processing and petabyte-scale interactive analysis.
51
What types of data does EMR support?
EMR supports structured data (e.g., financial transaction data), semi-structured data (e.g., text or documentation), and unstructured data (e.g., application logs or click-stream data).
52
Give an example of a use case for EMR.
One example of a use case for EMR is processing genomic data using statistical algorithms and predictive models to discover hidden patterns and find correlations.
53
How can EMR help in analyzing click-stream data?
EMR can analyze click-stream data to understand customer preferences or market trends.
54
What are some of the data sources from which EMR can extract data?
EMR can extract data from sources like S3, DynamoDB, or Redshift.
55
Which real-time data streaming service is compatible with EMR for event analysis?
EMR can be used to analyze events from streaming data sources in real time using Amazon Kinesis.
56
Name some popular open-source frameworks supported by EMR.
EMR supports popular open-source frameworks like Apache Spark, Apache Hive, Presto, and Hadoop.
57
What are the benefits of using EMR as a fully managed big data solution?
With EMR you do not have to worry about provisioning and managing infrastructure, configuring and managing open-source applications, or capacity planning, and it scales dynamically as the workload requires. It is also optimized for performance, and AWS claims it is faster and less costly than deploying an on-premises big data solution.
58
How does AWS claim EMR compares in cost to deploying your own big data solution on-premises?
AWS claims that EMR is less than 50% of the cost of deploying your own big data solution on-premises.
59
What is Amazon OpenSearch?
Amazon OpenSearch is a fully-managed service based on open-source Elasticsearch technology, compatible with Elasticsearch open-source APIs, Logstash for data collection and processing, and Kibana for search and data visualization.
60
Which open-source technologies is Amazon OpenSearch compatible with?
Amazon OpenSearch is compatible with industry-standard Elasticsearch open-source APIs, Logstash, and Kibana.
61
Why might a business choose to use Amazon OpenSearch?
A business might choose to use Amazon OpenSearch because it is a fully managed service that simplifies the use of Elasticsearch open-source technology, while also supporting data collection, processing, and visualization tools like Logstash and Kibana. It is suitable for various analytics use cases, including log, application, security, and business data analytics.
62
What AWS services can you ingest data from into Amazon OpenSearch?
You can ingest data into Amazon OpenSearch from AWS services such as CloudWatch Logs, S3, DynamoDB, and Firehose.
63
Name a tool that is used for data collection and processing in conjunction with Amazon OpenSearch.
Logstash is used for data collection and processing in conjunction with Amazon OpenSearch.
64
What tool can be used with Amazon OpenSearch for search and data visualization?
Kibana is used with Amazon OpenSearch for search and data visualization.
65
List some use cases for Amazon OpenSearch.
Use cases for Amazon OpenSearch include log analytics, application monitoring, security analytics, and business data analytics.
66
How does Amazon OpenSearch relate to Elasticsearch?
Amazon OpenSearch is a fully-managed service that is based on open-source Elasticsearch technology and is compatible with Elasticsearch open-source APIs.
67
What kind of analytics can you perform using Amazon OpenSearch?
Using Amazon OpenSearch, you can perform log analytics, application monitoring, security analytics, and business data analytics.
68
Can you use Amazon OpenSearch with AWS CloudWatch Logs? If so, how?
Yes, you can use Amazon OpenSearch with AWS CloudWatch Logs by ingesting data from CloudWatch Logs into Amazon OpenSearch.
69
You are building an application that will be used to analyze customer spending patterns. Your application relies on data from third parties in the retail and financial services sector. Which of the following services can be used to securely exchange and use data provided by third parties on a subscription basis?
AWS Data Exchange
70
Which AWS service can be used to perform sentiment analysis on customer feedback data?
Amazon Comprehend
71
Which AWS service captures, transforms, and loads data continuously into data stores?
Kinesis Data Firehose
72
You would like to use Apache Kafka to process a continuous stream of data that you need to track and analyze in real-time. You are looking for a fully managed service to avoid building and maintaining your own Kafka platform. Which AWS service can you use?
Amazon MSK (Managed Streaming for Apache Kafka)
73
You would like to use deep learning technology to add natural-sounding speech to your website so that the contents of certain web pages can be read out loud to help people who are visually impaired. Which AWS service can you use to implement this?
Polly
74
Your company needs a solution to collect website clickstream data in real time so that it can be processed for real-time insights. Which AWS service do you suggest?
Kinesis enables you to collect, process, and analyze streaming data in real time.
75
Which AWS service can be used to extract text and data from documents, including extracting driver's license numbers or passport numbers to help verify the identity of loan applicants?
Amazon Textract
76
Which AWS service can be used to create a data catalog and perform ETL (Extract, Transform, and Load) on your data so that it can be used by your data analytics and machine learning applications?
AWS Glue
77
You need to query data stored in S3 using standard SQL queries. Which of the following AWS services will enable you to do this?
Athena is an interactive query service for data in S3. It enables you to query data stored in S3 using standard SQL.
78
Your genomics application generates a large amount of unstructured and semi-structured data. You now require a Big Data solution so that you can process the data to identify patterns and trends, using open-source technologies like Apache Spark and Apache Hive. Which of the following services would you recommend?
Amazon EMR (Elastic MapReduce)
79
What can you use to group and visualize AWS resources by project, environment, or application?
Tags
80
Which service allows you to create dashboards where you visualize metrics produced by services and applications on AWS?
Amazon CloudWatch
81
Which service provides account-wide recommendations around cost optimization, service limits, and security best practices?
Trusted Advisor
82
Which service allows you to most easily convert an existing application into a cloud-hosted Software-as-a-Service?
AppStream will handle hosting, scaling, and user management for your application and help you convert it into a SaaS product for your employees or customers.
83
Which of the following is NOT a function of AWS Audit Manager? 1. Centralize audit data from AWS Config and various security services. 2. Generate insights and recommendations to help you adhere to the Well-Architected Framework. 3. Use pre-built frameworks to help you meet industry-specific security and configuration standards. 4. Find root causes of noncompliance and generate reports.
Generate insights and recommendations to help you adhere to the Well-Architected Framework.
84
How long are CloudWatch Logs stored by default?
Indefinitely
85
What feature can be used to analyze your workloads and generate action plans to help you achieve more reliable and cost-effective architecture? 1. Systems Manager 2. Audit Manager 3. AWS Config 4. The Well-Architected Tool
The Well-Architected Tool helps you use the Well-Architected Framework as a set of lenses through which to analyze your workloads. You can use it to learn about the Well-Architected Framework and generate action plans to bring your architectures into alignment with it.
86
What does AWS Config do? 1. AWS Config allows you to take automated actions on large groups of cloud resources. 2. AWS Config allows you to set up account-wide rules and detect non-compliant resources. 3. AWS Config allows you to set up account-wide rules and enforce compliance by disallowing the creation of non-compliant resources. 4. AWS Config allows you to audit non-compliant resources and generate audit reports.
AWS Config allows you to set up account-wide rules and detect non-compliant resources.
87
Which service will notify you about service events, outages, planned changes, and account notifications? 1. Trusted Advisor 2. CloudWatch Alarms 3. AWS Health Dashboard 4. Systems Manager
AWS Health Dashboard will give you a view of all outages across AWS, as well as a personal dashboard that displays only those services and Regions that are relevant to your cloud resources.
88
How can you receive a notification when CPU utilization of your EC2 instance reaches 90%? 1. Trusted Advisor will automatically generate a recommendation when CPU utilization of your EC2 instance reaches 90%. 2. Systems Manager automatically tracks CPU utilization and will notify administrators when it exceeds 90% on any given instance. 3. Create a CloudWatch alarm that triggers when CPU utilization reaches 90%. 4. Create a CloudTrail Alarm that triggers when CPU utilization reaches 90%.
CloudWatch alarms can be used to send notifications or trigger automated events when metrics reach defined thresholds.
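As a hedged sketch of what such an alarm looks like in code, the helper below builds the arguments for the CloudWatch `put_metric_alarm` call. The helper name, the alarm name format, the 5-minute period, and the example instance ID and SNS topic ARN are all illustrative assumptions; only the parameter names and the `AWS/EC2` `CPUUtilization` metric come from the service itself.

```python
def cpu_alarm_params(instance_id, sns_topic_arn, threshold=90.0):
    """Build keyword arguments for CloudWatch put_metric_alarm on EC2 CPU usage."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # evaluate 5-minute averages (assumed)
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],  # notify this SNS topic when in ALARM
    }


params = cpu_alarm_params(
    "i-0123456789abcdef0",                            # placeholder instance ID
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",  # placeholder topic ARN
)
# With boto3 installed and credentials configured, the alarm could be created with:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
print(params["AlarmName"])
```

Keeping the parameters in a plain dictionary makes them easy to review and test before any call is made against an AWS account.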
89
90
What is machine learning?
The science of developing algorithms that learn patterns from historical data to make predictions without explicit instructions.
91
What is artificial intelligence?
A field focused on solving cognitive problems such as learning, problem solving, and pattern recognition.
92
What are the three levels of AWS ML services?
AI services, ML services, and ML frameworks and infrastructure.
93
What does Amazon Translate do?
Provides text translation and localization through API calls.
94
What does Amazon Polly do?
Converts text into lifelike speech using text-to-speech capabilities.
95
What does Amazon Lex do?
Builds conversational chatbots using voice and text interactions.
96
What does Amazon Rekognition do?
Adds image and video analysis such as object, person, and text detection.
97
What is Amazon Comprehend?
A natural language processing service for sentiment and entity analysis.
98
What is Amazon Forecast?
A service for building accurate time-series forecasts using ML.
99
What is Amazon CodeGuru?
A service that uses ML to analyze code quality and optimize performance.
100
What is Amazon SageMaker?
A managed ML platform to build, train, and deploy ML models at scale.
101
What is Amazon CodeWhisperer?
An ML-powered code generator offering real-time code suggestions and security scans.
102
Which ML frameworks are available on AWS?
TensorFlow, PyTorch, and Apache MXNet.
103
What instances are optimized for ML training?
EC2 P3 and P3dn instances for accelerated compute.
104
Which AI service adds visual analysis features to applications?
Amazon Rekognition.
105
What is data analytics?
The process of converting raw data into actionable insights for decisions and optimization.
106
What is Amazon Athena?
A serverless interactive query service for analyzing S3 data using SQL.
107
How is Athena priced?
You pay per query, priced by the terabytes of data scanned.
108
When should you use Athena?
When querying S3 data such as logs, reports, clickstreams, or cost and usage data.
109
Which AWS service detects and protects PII?
Amazon Macie.
110
What does Amazon Macie do?
Uses ML to discover, classify, and protect sensitive data in S3 including PII.
111
What is the keyword hint for Macie and PII?
Macie contains an 'I', like 'PII'.
112
What is Amazon Redshift?
A petabyte-scale, columnar data warehouse for OLAP workloads.
113
What is Redshift Spectrum?
A feature allowing Redshift to run queries directly against S3 data.
114
How is Athena different from Redshift Spectrum?
Athena is serverless while Spectrum requires a Redshift cluster.
115
What is Amazon Kinesis used for?
Ingesting and processing real-time streaming data at scale.
116
What is the keyword for Kinesis?
Real-time data streaming.
117
What is AWS Glue?
A serverless data integration and ETL service for preparing and loading data.
118
What does AWS Glue help with?
Discovering, preparing, and integrating data for analytics and ML.
119
What is Amazon QuickSight?
A BI service that provides dashboards and ML-powered insights.
120
Which service supports Spark for transformations: QuickSight, OpenSearch, or EMR?
EMR supports Apache Spark for transformation and analytics workloads.
121
Amazon Comprehend: What is it?
A natural language processing service that extracts insights like key phrases, sentiment, and language from text.
122
Amazon Transcribe: What is it?
A speech-to-text service with features like speaker ID, custom vocabulary, and real-time transcription.
123
Amazon Transcribe: Use cases?
Call transcription, subtitling, and metadata generation for media content.
124
Amazon Kendra: What is it?
An enterprise search service using NLP to return accurate answers from large content repositories.
125
Amazon Kendra: Use cases?
Intelligent search, chatbots, and application search integrations.
126
Amazon Rekognition: What is it?
A video and image analysis service that identifies objects, people, text, and activities.
127
Amazon Rekognition: Use cases?
Content moderation, identity verification, and media analysis.
128
Amazon Textract: What is it?
A service that extracts text and data from scanned documents, including forms and tables.
129
Amazon Textract: Use cases?
Financial, healthcare, and government form processing.
130
Amazon Lex: What is it?
A service to build conversational interfaces using NLU and ASR.
131
Amazon Lex: Use cases?
Virtual assistants, FAQ bots, and automated customer interactions.
132
Amazon Personalize: What is it?
A service that provides real-time personalized recommendations using historical data.
133
Amazon Personalize: Use cases?
Personalized product, content, and trend recommendations.
134
What is Amazon SageMaker JumpStart?
A machine learning hub with foundation models, built-in algorithms, and prebuilt ML solutions for quick deployment.
135
What models does SageMaker JumpStart provide?
It provides proprietary and publicly available foundation models for text, image, and video tasks.
136
What algorithms are available in JumpStart?
Hundreds of built-in ML algorithms with pretrained models from TensorFlow, PyTorch, Hugging Face, and others.
137
What solutions does JumpStart include?
Prebuilt solutions with reference architectures for common ML use cases.
138
What is Amazon Bedrock?
A platform for building generative AI applications with access to foundation models and agent development tools.
139
What models does Bedrock offer access to?
Hundreds of foundation models from leading AI companies, including Amazon, Anthropic, Meta, Mistral, and Stability AI.
140
What customization options does Bedrock support?
Model customization using knowledge bases, prompt engineering, fine-tuning, and data automation.
141
What security features does Bedrock provide?
Built-in guardrails, privacy controls, encrypted data, and automated reasoning checks to minimize harmful content.
142
What is Amazon Q Business?
A generative AI-powered enterprise assistant that retrieves information and completes tasks using company data.
143
What tasks can Amazon Q Business perform?
Answer questions, generate content, automate workflows, and provide unified insights across enterprise systems.
144
Which data sources can Amazon Q Business connect to?
It connects to systems like Amazon S3, SharePoint, Salesforce, and databases for enterprise search.
145
How does Amazon Q Business present answers?
It provides citations and references for transparency and delivers responses through a conversational interface.
146
What is Amazon Q Developer?
A generative AI assistant that helps build, operate, and transform software with capabilities across the SDLC.
147
What coding features does Amazon Q Developer support?
Code generation, testing, reviewing, refactoring, debugging, and documentation.
148
How does Amazon Q Developer assist with AWS?
It provides guidance on AWS architecture, cost optimization, troubleshooting, and cloud operations.
149
Where is Amazon Q Developer available?
In code editors like VS Code and JetBrains, in GitHub (preview), and in the AWS Console and CLI.
150
What advanced capabilities does Q Developer include?
Agentic capabilities that automate multi-step tasks and accelerate development workflows.
151
What is Amazon Kinesis Data Streams used for?
Real-time data ingestion with low-latency processing.
152
Why is Kinesis Data Streams important for real-time apps?
It allows multiple apps to consume data from the same stream simultaneously.
153
Example use case for Kinesis Data Streams?
Ingesting real-time stock market data for immediate trading decisions.
154
What is Amazon Data Firehose?
A fully managed near-real-time streaming ETL service that delivers data from source to destination.
155
Example use case for Amazon Data Firehose?
Collecting smart home device data for long-term storage and analysis.
156
What is the purpose of a centralized data catalog?
To store metadata about an organization's datasets.
157
Which AWS service includes the Data Catalog?
AWS Glue.
158
What is AWS Glue used for?
Cleaning and transforming data with visual ETL tools and scheduling.
159
Who is AWS Glue ideal for?
Organizations wanting simplified code-free or low-code ETL.
160
What is Amazon EMR used for?
Large-scale data processing with frameworks like Spark, Hadoop, and Hive.
161
Who is Amazon EMR best suited for?
Teams with big data expertise needing custom configurations.
162
What is Amazon Athena?
A serverless SQL query service for analyzing data stored in S3 or hybrid sources.
163
What types of data can Athena analyze?
Relational, nonrelational, object, and custom data sources.
164
What is Amazon Redshift ideal for?
Complex queries on large datasets and high-performance analytical workloads.
165
What is Amazon QuickSight used for?
Business intelligence dashboards and reports.
166
What does Amazon Q in QuickSight enable?
Natural language insights and dashboard creation.
167
What is Amazon OpenSearch Service used for?
Real-time search, monitoring, and analysis of operational and business data.
168
Common use cases for OpenSearch Service?
Application monitoring, log analytics, observability, and website search.
169
What is Amazon Kinesis Data Streams used for?
Real-time ingestion of terabytes of data from applications, streams, and sensors, with automatic scaling.
170
What is Amazon Data Firehose?
A fully managed near-real-time streaming ETL service that delivers data to data lakes, warehouses, and analytics services.
171
What is Amazon S3 commonly used for?
A popular choice for data lakes capable of securely storing large amounts of structured or unstructured data.
172
How does Amazon S3 scale?
Amazon S3 automatically scales storage as data is added or removed.
173
What is Amazon Redshift?
A fully managed data warehouse that stores petabytes of structured or semistructured data for analysis.
174
What makes Amazon Redshift ideal for analytics?
Its columnar storage and massively parallel processing support complex queries on large datasets.
175
What is AWS Glue Data Catalog?
A centralized, scalable, managed metadata repository that enhances data discovery and supports analytics services.
176
What is AWS Glue used for?
A fully managed ETL service that simplifies data preparation and uses Data Catalog metadata for transformations.
177
What is Amazon EMR ideal for?
Large-scale data processing using frameworks like Apache Spark, Hadoop, and Hive.
178
What is Amazon Athena?
A serverless SQL query service that analyzes data stored in Amazon S3 and other data sources.
179
What is Amazon QuickSight used for?
A business intelligence service for creating dashboards and reports with natural language insights.
180
What is Amazon OpenSearch Service?
A service for real-time search, monitoring, and visualization of logs, traces, and metrics.
181
Which two AWS services are suitable for data ingestion?
Amazon Kinesis Data Streams and Amazon Data Firehose.
182
Which AWS service is best for storing unstructured data?
Amazon S3.
183
Which AWS service is best for data processing in a pipeline?
AWS Glue.
184
Which two AWS services can be used for data visualization?
Amazon QuickSight and Amazon OpenSearch Service.