AI, Machine Learning, Analytics Technology and Services Flashcards

Question

What tools can be used for analytics after data is loaded by Kinesis Data Firehose?

Answer 1

Business intelligence tools can be used for analytics after data is loaded into its final destination by Kinesis Data Firehose.

Answer 2

Kinesis Data Firehose includes integrated monitoring with CloudWatch.

Answer 3

Kinesis Data Firehose has automatic error retries if something goes wrong.

Answer 4

No. Kinesis Data Firehose is not a storage service; it only buffers data briefly before delivering it to the destination.

Answer 5

A data lake refers to a large-scale data repository for storing streaming data.

Answer 6

You can use AWS Lambda to transform data in Kinesis Data Firehose.
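A minimal sketch of such a transformation function, assuming the incoming records are JSON objects with a hypothetical `message` field. Firehose passes each payload base64-encoded and expects every record back with the same `recordId`, a `result` status, and the re-encoded `data`; the uppercase step is only a stand-in for a real transformation.

```python
import base64
import json


def lambda_handler(event, context):
    """Uppercase a hypothetical 'message' field in each Firehose record."""
    output = []
    for record in event["records"]:
        # Firehose delivers each payload base64-encoded.
        payload = json.loads(base64.b64decode(record["data"]))
        payload["message"] = payload.get("message", "").upper()
        # A trailing newline keeps delivered records on separate lines.
        transformed = (json.dumps(payload) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],  # must echo the incoming ID
            "result": "Ok",                  # or "Dropped" / "ProcessingFailed"
            "data": base64.b64encode(transformed).decode("utf-8"),
        })
    return {"records": output}
```

Returning `"Dropped"` for a record filters it out of the delivery, while `"ProcessingFailed"` marks it as an error.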

Answer 7

Use cases include real-time analytics, feeding data into data lakes, log data management, and IoT data integration.

Answer 8

Common destinations include Amazon S3, Amazon Redshift, and Amazon OpenSearch Service.

Answer 9

Kinesis Data Streams captures and stores streaming data, whereas Kinesis Data Firehose captures, transforms, and loads data continuously into data stores.

Answer 10

Amazon Athena is an interactive query service that enables you to run standard SQL queries on data stored in Amazon S3.
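As a rough illustration, the sketch below builds the arguments for an Athena query over log data in S3. The table `access_logs`, the database `weblogs`, and the bucket `example-bucket` are hypothetical names; the commented-out lines show how the request might be submitted with boto3 when AWS credentials are available.

```python
def build_athena_request(sql, database, output_location):
    """Build keyword arguments for boto3's Athena start_query_execution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to an S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Hypothetical table of web access logs stored in S3.
sql = """
SELECT status, COUNT(*) AS requests
FROM access_logs
GROUP BY status
ORDER BY requests DESC
"""

request = build_athena_request(sql, "weblogs", "s3://example-bucket/athena-results/")

# With AWS credentials configured, the request could be submitted like this:
#   import boto3
#   boto3.client("athena").start_query_execution(**request)
```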

Answer 11

You can run standard SQL queries with Amazon Athena.

Answer 12

Amazon Athena is serverless, meaning there is nothing to provision and manage.

Answer 13

You pay per query, based on the amount of data scanned (priced per terabyte) when using Amazon Athena.
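A small worked example of how scan volume drives cost. The rate of $5.00 per terabyte is an assumption for illustration only; actual pricing varies by Region and should be checked before relying on the numbers.

```python
BYTES_PER_TB = 1024 ** 4
ASSUMED_PRICE_PER_TB_USD = 5.00  # assumption for illustration, not a quoted price


def estimate_query_cost(bytes_scanned, price_per_tb=ASSUMED_PRICE_PER_TB_USD):
    """Return the estimated cost in USD of a query that scans bytes_scanned bytes."""
    return bytes_scanned / BYTES_PER_TB * price_per_tb


# The same query before and after cutting the scanned data to 10 percent,
# for example by partitioning or using a columnar file format.
full_scan = estimate_query_cost(2 * BYTES_PER_TB)
pruned_scan = estimate_query_cost(0.2 * BYTES_PER_TB)
print(f"full scan: ${full_scan:.2f}, pruned scan: ${pruned_scan:.2f}")
```

Under this assumed rate, reducing the data scanned reduces the bill proportionally, which is why the layout of data in S3 matters for Athena costs.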

Answer 14

No, there is no need for complex extract, transform, and load (ETL) processes when using Amazon Athena. It works directly with data stored in S3.

Answer 15

Use cases for Amazon Athena include querying log files stored in S3, analyzing AWS cost and usage reports, generating business reports on data stored in S3, and running queries on clickstream data stored in S3.

Answer 16

AWS Glue is used to prepare your data for analytics and machine learning.

Answer 17

AWS Glue is important because it prepares and transforms data, making it ready for use by analytics applications and machine learning models.

Answer 18

The data catalog serves as the central repository containing metadata about the data, including its type and format.

Answer 19

Transformed data can be loaded into AWS services like RDS, Redshift, S3, or Athena.

Answer 20

AWS Glue can categorize data, clean it, remove duplicates, and join multiple datasets.

Answer 21

AWS Glue crawls your data and creates the data catalog, which is the central repository containing the metadata, such as the type or format of your data.

Answer 22

After creating the data catalog, AWS Glue can extract data from various sources, transform it (e.g., categorize, clean, remove duplicates, or join multiple datasets), and then load it into other AWS services.
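The transform steps named above (clean, remove duplicates, join) can be sketched in plain Python. This is not the AWS Glue API; the functions and the sample `customers` and `orders` records are hypothetical and only show what each step does to the data.

```python
def clean(rows):
    """Normalize emails and drop rows that lack one."""
    cleaned = []
    for row in rows:
        email = (row.get("email") or "").strip().lower()
        if email:
            cleaned.append({**row, "email": email})
    return cleaned


def remove_duplicates(rows, key):
    """Keep the first row seen for each value of key."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def join(left, right, key):
    """Inner-join two lists of dicts on key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


customers = [
    {"email": " Ana@Example.com ", "name": "Ana"},
    {"email": "ana@example.com", "name": "Ana"},   # duplicate after cleaning
    {"email": None, "name": "Unknown"},            # dropped by clean()
]
orders = [{"email": "ana@example.com", "total": 42}]

result = join(remove_duplicates(clean(customers), "email"), orders, "email")
```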

Answer 23

AWS Data Exchange allows you to securely exchange and use data provided by third parties on a subscription basis.

Answer 24

Data products are available from suppliers across a variety of industries, including financial services, healthcare, weather, manufacturing, and telecommunications.

Answer 25

The data can be used for analytics, machine learning workloads, and decision-making.

Answer 26

An example use case is analyzing customer spending patterns based on geographic location using data products provided by companies like MasterCard, Experian, and Equifax.

Answer 27

Amazon EMR (Elastic MapReduce) is a big data platform provided by AWS that supports large-scale parallel data processing and petabyte-scale interactive analysis.

Answer 28

EMR supports structured data (e.g., financial transaction data), semi-structured data (e.g., text or documentation), and unstructured data (e.g., application logs or click-stream data).

Answer 29

One example of a use case for EMR is processing genomic data using statistical algorithms and predictive models to discover hidden patterns and find correlations.

Answer 30

EMR can analyze click-stream data to understand customer preferences or market trends.

Answer 31

EMR can extract data from sources like S3, DynamoDB, or Redshift.

Answer 32

EMR can be used to analyze events from streaming data sources in real time using Amazon Kinesis.

Answer 33

EMR supports popular open-source frameworks like Apache Spark, Apache Hive, Presto, and Hadoop.

Answer 34

With EMR, you do not have to provision and manage infrastructure, configure and manage open-source applications, or plan capacity, and the service scales dynamically as the workload requires. It is also optimized for performance, and AWS claims it is faster and less costly than deploying an on-premises big data solution.

Answer 35

AWS claims that EMR is less than 50% of the cost of deploying your own big data solution on-premises.

Answer 36

Amazon OpenSearch is a fully-managed service based on open-source Elasticsearch technology, compatible with Elasticsearch open-source APIs, Logstash for data collection and processing, and Kibana for search and data visualization.

Answer 37

Amazon OpenSearch is compatible with industry-standard Elasticsearch open-source APIs, Logstash, and Kibana.

Answer 38

A business might choose to use Amazon OpenSearch because it is a fully-managed service that simplifies the use of Elasticsearch open-source technology, while also supporting data collection, processing, and visualization tools like Logstash and Kibana. It is suitable for various analytics use cases, including log, application, security, and business data analytics.

Answer 39

You can ingest data into Amazon OpenSearch from AWS services such as CloudWatch Logs, S3, DynamoDB, and Firehose.

Answer 40

Logstash is used for data collection and processing in conjunction with Amazon OpenSearch.

Answer 41

Kibana is used with Amazon OpenSearch for search and data visualization.

Answer 42

Use cases for Amazon OpenSearch include log analytics, application monitoring, security analytics, and business data analytics.

Answer 43

Amazon OpenSearch is a fully-managed service that is based on open-source Elasticsearch technology and is compatible with Elasticsearch open-source APIs.

Answer 44

Using Amazon OpenSearch, you can perform log analytics, application monitoring, security analytics, and business data analytics.

Answer 45

Yes, you can use Amazon OpenSearch with Amazon CloudWatch Logs by ingesting data from CloudWatch Logs into Amazon OpenSearch.

Answer 46

AWS Data Exchange

Answer 47

Amazon Comprehend

Answer 48

Kinesis Data Firehose

Answer 49

Amazon MSK (Managed Streaming for Apache Kafka)

Answer 50

Kinesis enables you to collect, process, and analyze streaming data in real time.

Answer 51

Amazon Textract

Answer 52

Athena is an interactive query service for data in S3. It enables you to query data stored in S3 using standard SQL.

Answer 53

Amazon EMR (Elastic MapReduce)

Answer 54

Amazon CloudWatch

Answer 55

Trusted Advisor

Answer 56

AppStream will handle hosting, scaling, and user management for your application and help you convert it into a SaaS product for your employees or customers.

Answer 57

Generate insights and recommendations to help you adhere to the Well-Architected Framework.

Answer 58

Indefinitely

Answer 59

The Well-Architected Tool helps you use the Well-Architected Framework as a set of lenses through which to analyze your workloads. You can use it to learn about the Well-Architected Framework and generate action plans to bring your architectures into alignment with it.

Answer 60

AWS Config allows you to set up account-wide rules and detect non-compliant resources.
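Conceptually, a rule is a check applied to every resource's recorded configuration. The sketch below is not the AWS Config API; the rule function and the resource records are hypothetical, and only the compliance labels mirror the ones the service reports.

```python
def s3_bucket_encrypted(resource):
    """Hypothetical rule: every S3 bucket must have encryption enabled."""
    if resource["type"] != "s3_bucket":
        return "NOT_APPLICABLE"
    return "COMPLIANT" if resource.get("encrypted") else "NON_COMPLIANT"


# Hypothetical snapshot of resource configurations in an account.
resources = [
    {"id": "logs-bucket", "type": "s3_bucket", "encrypted": True},
    {"id": "scratch-bucket", "type": "s3_bucket", "encrypted": False},
    {"id": "web-server", "type": "ec2_instance"},
]

# Evaluate the rule against every resource and collect the violations.
non_compliant = [r["id"] for r in resources if s3_bucket_encrypted(r) == "NON_COMPLIANT"]
```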

Answer 61

AWS Health Dashboard will give you a view of all outages across AWS, as well as a personal dashboard that displays only those services and Regions that are relevant to your cloud resources.

Answer 62

CloudWatch alarms can be used to send notifications or trigger automated events when metrics reach defined thresholds.
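A simplified sketch of threshold evaluation, assuming an alarm that fires when the most recent datapoints all exceed the threshold. Real CloudWatch alarms offer more options (for example, how missing data is treated); the `notify` callback here is a hypothetical stand-in for a notification or automated action.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Return "ALARM" if the latest evaluation_periods datapoints all exceed threshold."""
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    return "ALARM" if all(value > threshold for value in recent) else "OK"


def notify(state, send):
    """Call the hypothetical send callback only when the alarm fires."""
    if state == "ALARM":
        send("CPU utilization above threshold")


cpu_percent = [40, 55, 82, 91, 88]  # sample metric values, one per period
state = alarm_state(cpu_percent, threshold=80, evaluation_periods=3)

sent = []
notify(state, sent.append)
```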

Answer 63

The science of developing algorithms that learn patterns from historical data to make predictions without explicit instructions.

Answer 64

A field focused on solving cognitive problems such as learning, problem solving, and pattern recognition.

Answer 65

AI services, ML services, and ML frameworks and infrastructure.

Answer 66

Provides text translation and localization through API calls.

Answer 67

Converts text into lifelike speech using text-to-speech capabilities.

Answer 68

Builds conversational chatbots using voice and text interactions.

Answer 69

Adds image and video analysis such as object, person, and text detection.

Answer 70

A natural language processing service for sentiment and entity analysis.

Answer 71

A service for building accurate time-series forecasts using ML.

Answer 72

A service that uses ML to analyze code quality and optimize performance.

Answer 73

A managed ML platform to build, train, and deploy ML models at scale.

Answer 74

An ML-powered code generator offering real-time code suggestions and security scans.

Answer 75

TensorFlow, PyTorch, and Apache MXNet.

Answer 76

EC2 P3 and P3dn instances for accelerated compute.

Answer 77

Amazon Rekognition.

Answer 78

The process of converting raw data into actionable insights for decisions and optimization.

Answer 79

A serverless interactive query service for analyzing S3 data using SQL.

Answer 80

You pay per query, based on the terabytes of data scanned.

Answer 81

When querying S3 data such as logs, reports, clickstreams, or cost and usage data.

Answer 82

Amazon Macie.

Answer 83

Uses ML to discover, classify, and protect sensitive data in S3 including PII.

Answer 84

Macie contains an 'I', like 'PII'.

Answer 85

A petabyte-scale, columnar data warehouse for OLAP workloads.

Answer 86

A feature allowing Redshift to run queries directly against S3 data.

Answer 87

Athena is serverless while Spectrum requires a Redshift cluster.

Answer 88

Ingesting and processing real-time streaming data at scale.

Answer 89

Real-time data streaming.

Answer 90

A serverless data integration and ETL service for preparing and loading data.

Answer 91

Discovering, preparing, and integrating data for analytics and ML.

Answer 92

A BI service that provides dashboards and ML-powered insights.

Answer 93

EMR supports Apache Spark for transformation and analytics workloads.

Answer 94

A natural language processing service that extracts insights like key phrases, sentiment, and language from text.

Answer 95

A speech-to-text service with features like speaker ID, custom vocabulary, and real-time transcription.

Answer 96

Call transcription, subtitling, and metadata generation for media content.

Answer 97

An enterprise search service using NLP to return accurate answers from large content repositories.

Answer 98

Intelligent search, chatbots, and application search integrations.

Answer 99

A video and image analysis service that identifies objects, people, text, and activities.

Answer 100

Content moderation, identity verification, and media analysis.

Answer 101

A service that extracts text and data from scanned documents, including forms and tables.

Answer 102

Financial, healthcare, and government form processing.

Answer 103

A service to build conversational interfaces using NLU and ASR.

Answer 104

Virtual assistants, FAQ bots, and automated customer interactions.

Answer 105

A service that provides real-time personalized recommendations using historical data.

Answer 106

Personalized product, content, and trend recommendations.

Answer 107

A machine learning hub with foundation models, built-in algorithms, and prebuilt ML solutions for quick deployment.

Answer 108

It provides proprietary and publicly available foundation models for text, image, and video tasks.

Answer 109

Hundreds of built-in ML algorithms with pretrained models from TensorFlow, PyTorch, Hugging Face, and others.

Answer 110

Prebuilt solutions with reference architectures for common ML use cases.

Answer 111

A platform for building generative AI applications with access to foundation models and agent development tools.

Answer 112

Hundreds of foundation models from leading AI companies, including Amazon, Anthropic, Meta, Mistral, and Stability AI.

Answer 113

Model customization using knowledge bases, prompt engineering, fine-tuning, and data automation.

Answer 114

Built-in guardrails, privacy controls, encrypted data, and automated reasoning checks to minimize harmful content.

Answer 115

A generative AI-powered enterprise assistant that retrieves information and completes tasks using company data.

Answer 116

Answer questions, generate content, automate workflows, and provide unified insights across enterprise systems.

Answer 117

It connects to systems like Amazon S3, SharePoint, Salesforce, and databases for enterprise search.

Answer 118

It provides citations and references for transparency and delivers responses through a conversational interface.

Answer 119

A generative AI assistant that helps build, operate, and transform software with capabilities across the SDLC.

Answer 120

Code generation, testing, reviewing, refactoring, debugging, and documentation.

Answer 121

It provides guidance on AWS architecture, cost optimization, troubleshooting, and cloud operations.

Answer 122

In code editors like VS Code and JetBrains, in GitHub (preview), and in the AWS Console and CLI.

Answer 123

Agentic capabilities that automate multi-step tasks and accelerate development workflows.

Answer 124

Real-time data ingestion with low-latency processing.

Answer 125

It allows multiple apps to consume data from the same stream simultaneously.
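A toy model of why this works: records stay in the stream after being read, and each consumer tracks its own read position. The `Stream` and `Consumer` classes are hypothetical illustrations, not the Kinesis API.

```python
class Stream:
    """A toy append-only stream; records stay available after being read."""

    def __init__(self):
        self.records = []

    def put(self, record):
        self.records.append(record)


class Consumer:
    """Each consumer keeps its own read position in the shared stream."""

    def __init__(self, stream):
        self.stream = stream
        self.position = 0

    def poll(self):
        """Return records added since this consumer's last poll."""
        new_records = self.stream.records[self.position:]
        self.position = len(self.stream.records)
        return new_records


stream = Stream()
dashboard, archiver = Consumer(stream), Consumer(stream)

stream.put("trade-1")
stream.put("trade-2")
first = dashboard.poll()       # dashboard reads the first two records
stream.put("trade-3")
second = dashboard.poll()      # dashboard sees only the new record
everything = archiver.poll()   # archiver independently reads all three
```

Because reading does not remove records, a slow consumer never affects what a fast consumer sees.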

Answer 126

Ingesting real-time stock market data for immediate trading decisions.

Answer 127

A fully managed near-real-time streaming ETL service that delivers data from source to destination.
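Delivery is near-real-time rather than real-time because incoming records are buffered and delivered in batches once a size or time limit is reached. The sketch below models that idea with a hypothetical `BufferedDelivery` class; the limits and the `deliver` callback are illustrative, and time is passed in explicitly to keep the example deterministic.

```python
class BufferedDelivery:
    """Collect records and deliver them in a batch when a limit is reached."""

    def __init__(self, max_bytes, max_seconds, deliver):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.deliver = deliver      # hypothetical callback that writes a batch
        self.buffer = []
        self.buffered_bytes = 0
        self.opened_at = None       # time the current buffer received its first record

    def put(self, record, now):
        if self.opened_at is None:
            self.opened_at = now
        self.buffer.append(record)
        self.buffered_bytes += len(record)
        self.flush_if_due(now)

    def flush_if_due(self, now):
        if not self.buffer:
            return
        too_big = self.buffered_bytes >= self.max_bytes
        too_old = now - self.opened_at >= self.max_seconds
        if too_big or too_old:
            self.deliver(list(self.buffer))
            self.buffer.clear()
            self.buffered_bytes = 0
            self.opened_at = None


batches = []
delivery = BufferedDelivery(max_bytes=10, max_seconds=60, deliver=batches.append)
delivery.put(b"12345", now=0)    # 5 bytes buffered, nothing delivered yet
delivery.put(b"67890", now=1)    # size limit reached, first batch delivered
delivery.put(b"abc", now=2)      # a new buffer opens
delivery.flush_if_due(now=62)    # time limit reached, second batch delivered
```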

Answer 128

Collecting smart home device data for long-term storage and analysis.

Answer 129

To store metadata about an organization's datasets.

Answer 130

Cleaning and transforming data with visual ETL tools and scheduling.

Answer 131

Organizations wanting simplified code-free or low-code ETL.

Answer 132

Large-scale data processing with frameworks like Spark, Hadoop, and Hive.

Answer 133

Teams with big data expertise needing custom configurations.

Answer 134

A serverless SQL query service for analyzing data stored in S3 or hybrid sources.

Answer 135

Relational, nonrelational, object, and custom data sources.

Answer 136

Complex queries on large datasets and high-performance analytical workloads.

Answer 137

Business intelligence dashboards and reports.

Answer 138

Natural language insights and dashboard creation.

Answer 139

Real-time search, monitoring, and analysis of operational and business data.

Answer 140

Application monitoring, log analytics, observability, and website search.

Answer 141

Real-time ingestion of terabytes of data from applications, streams, and sensors with automatic scaling.

Answer 142

A fully managed near-real-time streaming ETL service that delivers data to data lakes, warehouses, and analytics services.

Answer 143

A popular choice for data lakes capable of securely storing large amounts of structured or unstructured data.

Answer 144

Amazon S3 automatically scales storage as data is added or removed.

Answer 145

A fully managed data warehouse that stores petabytes of structured or semistructured data for analysis.

Answer 146

Its columnar storage and massively parallel processing support complex queries on large datasets.
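A small illustration of both ideas, using hypothetical order data. Storing values column by column lets an aggregate read only the column it needs, and splitting that column into chunks lets separate workers compute partial results that are combined at the end. The chunks here are summed sequentially; in a real warehouse they would be processed by separate nodes.

```python
# Row-oriented layout: each record holds every field.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]

# Columnar layout: one list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate over "amount" touches only that column, not order_id or region.
total_amount = sum(columns["amount"])


def chunked(values, size):
    """Split a column into fixed-size chunks, one per hypothetical worker."""
    return [values[i:i + size] for i in range(0, len(values), size)]


# Each chunk yields a partial sum; the partial sums are then combined.
partial_sums = [sum(chunk) for chunk in chunked(columns["amount"], 2)]
combined = sum(partial_sums)
```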

Answer 147

A centralized, scalable, managed metadata repository that enhances data discovery and supports analytics services.

Answer 148

A fully managed ETL service that simplifies data preparation and uses Data Catalog metadata for transformations.

Answer 149

Large-scale data processing using frameworks like Apache Spark, Hadoop, and Hive.

Answer 150

A serverless SQL query service that analyzes data stored in Amazon S3 and other data sources.

Answer 151

A business intelligence service for creating dashboards and reports with natural language insights.

Answer 152

A service for real-time search, monitoring, and visualization of logs, traces, and metrics.

Answer 153

Amazon Kinesis Data Streams and Amazon Data Firehose.

Answer 154

Amazon S3.

Answer 155

Amazon QuickSight and Amazon OpenSearch Service.


(185 cards)