Module 8 - AI ML and Data Analytics Flashcards

(76 cards)

1
Q

What is artificial intelligence?

A

A broad field of computer science focused on the development of intelligent computer systems capable of performing humanlike tasks.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is machine learning?

A

A type of AI for training machines to perform complex tasks without explicit instructions. Machine learning training finds the patterns hidden in vast amounts of historical data to produce an ML model. This ML model can then be applied to new data to make predictions or decisions based on the patterns it’s learned.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Fill in the blank

With AI model training, the goal is to create a mathematical model that accurately identifies an ____________ while balancing the many different possible variables, outliers, and complications in data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is deep learning?

A

Deep learning (DL) is a subset of machine learning where models are trained using layers of artificial neurons that mimic the human brain. Each layer of these neural networks summarizes and feeds information to the next layer until a final model is produced.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is a foundation model?

A

A very large pre-trained, deep learning model trained on an enormous and diverse training set. While traditional ML models are trained to perform singular tasks, FMs can be adapted to perform multiple tasks.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is a Large Language Model (LLM)?

A

A popular type of Foundation Model trained, on massive data sets of text documents, to predict the probability of the next word(s) that should appear following a previous word. LLMs can generate clear, easy-to-understand text for a wide variety of target audiences.

(Culled from various sources and over-simplified.)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is generative AI?

A

Models that can create (“generate”) content that is complex, coherent and original. LLMs are an example of generative AI.

See also Generative AI in the Google Developer Machine Learning Glossary

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is natural language processing (NLP)?

A

The field of teaching computers to process what a user said or typed using linguistic rules. Almost all modern natural language processing relies on machine learning.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is a GPU?

A

A Graphics Processing Unit. Originally designed to handle rendering of 3D graphics. GPUs feature thousands of small efficient cores optimized for high-throughput parallel processing. Thus they are now being leveraged for AI/ML.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is the difference between a CPU and a GPU?

A

A CPU is a general purpose processor composed of a few powerful cores with lots of cache memory that can handle a few software threads at a time. A GPU is composed of hundreds (or thousands) of small cores that handle thousands of threads simultaneously.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What are the common uses of ML models in business today?

A
  1. Customer prediction & targeting
  2. Recommendation systems
  3. Fraud detection
  4. Pricing & revenue optimization
  5. Forecasting
  6. Operations & process automation
  7. Searching, ranking & matching
  8. Customer support
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Name the three tiers of the AWS AI/ML stack

A
  1. AWS AI Services
  2. AWS ML Services
  3. ML Frameworks and Infrastructures
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Fill in the blank

The AI services tier of the AI/ML stack includes pre-built models that are already ____________ to perform specific functions.

A

trained

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Fill in the blanks

The ML services tier of the AWS AI/ML stack enables AWS customers to build, ____________ and deploy their own ML models with ____________ infrastructure.

A
  1. train
  2. fully-managed
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Fill in the blanks

The ML frameworks and infrastructure tier of the AWS AI/ML stack is a custom approach to building models using ____________ chips that integrate with popular ML ____________.

A
  1. purpose-built
  2. frameworks
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Name the three types of AI services in the AI Services tier of the AWS AI/ML stack

A
  1. Language services
  2. Computer vision and search services
  3. Conversational AI and personalization services
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Name the four (4) language services in the AI Services tier of the AWS AI/ML stack

A
  1. Amazon Comprehend
  2. Amazon Polly
  3. Amazon Transcribe
  4. Amazon Translate
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Fill in the name of the AWS AI Service Offering

____________ uses natural language processing (NLP) to extract insights about the content of documents without the need of any special preprocessing. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document.

A

Amazon Comprehend. The documents are text files. With Amazon Comprehend you can search social networking feeds for mentions of products, scan an entire document repository for key phrases, or determine the topics contained in a set of documents.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Fill in the name of the AWS AI Service Offering

____________ is a Text-to-Speech (TTS) cloud service that converts text into lifelike speech.

A

Amazon Polly. It supports multiple languages and includes a variety of lifelike voices, so you can build speech-enabled applications that work in multiple locations and use the ideal voice for your customers.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Fill in the name of the AWS AI Services Offering

____________ provides transcription services for your audio files and audio streams. It uses advanced machine learning technologies to recognize spoken words and transcribe them into text.

A

Amazon Transcribe

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Fill in the name of the AWS AI Services Offering

____________ is a neural machine translation service for translating text to and from English across a breadth of supported languages.

A

Amazon Translate

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Name the three (3) computer vision and search services in the AI Services Tier of the AWS AI/ML stack

A
  1. Amazon Kendra
  2. Amazon Rekognition
  3. Amazon Textract
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Fill in the name of the AWS AI Services Offering

____________ is a search service, powered by machine learning, that helps users to search unstructured text using natural language.

A

Amazon Kendra
Because it understands the context of a query, it can return more precise and relevant answers than just a list of documents with matching keywords.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Fill in the name of the AWS AI Services Offering

____________ is a video analysis service. It can identify objects, people, text, scenes, and activities within images and videos stored in Amazon Simple Storage Service (Amazon S3).

A

Amazon Rekognition. It can detect any inappropriate content as well. It also provides highly accurate facial analysis and facial recognition.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
# Fill in the name of the AWS AI Services Offering ____________ enables you to add document text detection and analysis to your applications. You provide a document image and the service detects the document text, including both typed and handwritten text.
Amazon Textract ## Footnote See [What is Amazon Textract?](https://docs.aws.amazon.com/textract/latest/dg/what-is.html)
26
Name the two (2) conversational AI and personalization services in the AI Services Tier of the AWS AI/ML stack
1. Amazon Lex 2. Amazon Personalize
27
# Fill in the name of the AWS AI Services Offering ____________ is an AWS service for building conversational interfaces into applications using voice and text.
Amazon Lex With Amazon Lex, the same deep learning engine that powers Amazon Alexa is now available to any developer, enabling you to build sophisticated, natural language chatbots into your new and existing applications. ## Footnote See [Amazon Lex Documentation](https://docs.aws.amazon.com/lex/)
28
# Fill in the name of the AWS AI Services Offering ____________ is a fully managed machine learning service that uses your data to generate item recommendations for your users.
Amazon Personalize ## Footnote See [What is Amazon Personalize?](https://docs.aws.amazon.com/personalize/latest/dg/what-is-personalize.html)
29
Name the primary AWS offering in the ML services tier
SageMaker AI
30
# Fill in the blank Amazon SageMaker AI is a ____________ machine learning (ML) service.
Fully managed. With SageMaker AI, you can store and share your data without having to build and manage your own servers. ## Footnote See [What is Amazon SageMaker AI?](https://docs.aws.amazon.com/sagemaker/latest/dg/whatis.html)
31
# Fill in the blanks With SageMaker AI, data scientists and developers can quickly ____________, ____________ and ____________ ML models into a production-ready hosted environment.
build, train, deploy ## Footnote See [What is Amazon SageMaker AI?](https://docs.aws.amazon.com/sagemaker/latest/dg/whatis.html)
32
# True or False? SageMaker AI offers access to hundreds of pre-trained models that you can deploy in a few quick steps.
True
33
# Fill in the blank SageMaker AI includes an IDE for data scientists and a ____________ for business analysts.
no-code interface
34
What are the two (2) core components of the AWS AI/ML ML frameworks and infrastructure tier?
1. ML frameworks 2. AWS ML infrastructure
35
What are ML frameworks?
A software library that provides ML practitioners with pre-built optimized components for building ML models.
36
Name three (3) ML frameworks supported by AWS
1. PyTorch 1. Apache M-X Net 1. TensorFlow ## Footnote See [PyTorch](https://pytorch.org/), [Apache MXNet](https://en.wikipedia.org/wiki/Apache_MXNet) (no longer in development), and [TensorFlow](https://www.tensorflow.org/).
37
What are 3 examples of AWS ML infrastructure that can support ML workloads?
1. ML-optimized Amazon Elastic Compute Cloud (EC2) instances 2. Amazon EMR 3. Amazon Elastic Container Service (ECS)
38
What is Amazon EMR?
Previously called Amazon Elastic MapReduce, EMR is a managed cluster platform that simplifies running big data frameworks, such as Apache Hadoop and Apache Spark, on AWS to process and analyze vast amounts of data. ## Footnote See also [What is Amazon EMR?](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-what-is-emr.html)
39
What is the relationship between Amazon SageMaker AI and an ML framework such as PyTorch?
PyTorch is the ML engine, "doing the math". SageMaker AI is a platform from which you can run PyTorch.
40
Name three (3) AWS Generative AI solutions
1. Amazon SageMaker JumpStart 2. Amazon Bedrock 3. Amazon Q
41
Describe Amazon SageMaker JumpStart
A machine learning hub with foundation models and prebuilt ML solutions that you can deploy with a few clicks. These pre-trained models are fully customizable for your specific use case, by using your own data. ## Footnote See [SageMaker JumpStart pretrained models](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-jumpstart.html)
42
What is Amazon Bedrock?
Amazon Bedrock is a **fully managed** service that offers a broad choice of high-performing, pre-trained foundation models from Amazon and other leading AI companies. You can privately adapt these models with your data and deploy them without managing servers and infrastructure. ## Footnote See [What is Amazon Bedrock?](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html)
43
# True or False Amazon Bedrock enables experimentation with different Foundation Models.
True. All through a single unified API.
44
What is Amazon Q Business?
A fully managed, generative-AI powered assistant that you can configure to answer questions, provide summaries, generate content, and complete tasks based on your enterprise data. ## Footnote See [What is Amazon Q Business?](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/what-is.html)
45
What is Amazon Q Developer?
A generative AI powered conversational assistant that can help you understand, build, extend, and operate AWS applications. You can ask questions about AWS architecture, your AWS resources, best practices, documentation, support, and more. ## Footnote See [What is Amazon Q Developer?](https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/what-is.html)
46
# True or False The Amazon Q solutions are built on Amazon Bedrock
True. Because Amazon Q Business and Amazon Q Developer are built on Amazon Bedrock, users can take full advantage of the controls implemented in Amazon Bedrock to enforce safety, security, and the responsible use of artificial intelligence (AI). ## Footnote See [Amazon Q Documentation](https://docs.aws.amazon.com/amazonq/)
47
What is Data Analytics?
Examining and interpreting large sets of data to uncover trends, patterns, and actionable insights. ## Footnote See [What is data analytics? at adobe.com](https://business.adobe.com/blog/basics/data-analytics#what-is-data-analytics)
48
What is ETL?
Extract, Transform and Load A process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system. ## Footnote See [What is ETL? at ibm.com](https://www.ibm.com/think/topics/etl)
49
What is a data pipeline?
A process that moves data from a source to a destination. An ETL data pipeline makes the ETL process efficient and repeatable.
50
Described the Extract step of ETL
Ingesting data from various sources and storing it
51
Describe the Transform step of ETL
Transforming data into a consistent, usable format for downstream tools to consume.
52
Describe the Load step of ETL
Loading data into a destination system, like a data warehouse or analytics platform.
53
What are some common phases of a typical AWS data pipeline?
* Ingestion * Storage * Cataloging * Transformation (data processing) * Analysis, Queries and Visualization
54
# Fill in the blank Data ingestion can be performed in ____________ or as a ____________.
real-time batch (when some latency is ok)
55
Name two (2) Amazon services for ingestion
1. Amazon Kinesis Data Streams 2. Amazon Data Firehose
56
# Fill in the blank You can use ____________ for real-time ingestion of terabytes of data from applications, streams, and sensors.
Kinesis Data Streams ## Footnote See [What is Amazon Kinesis Data Streams?](https://docs.aws.amazon.com/streams/latest/dev/introduction.html)
57
# Fill in the blank ____________ is a fully managed service for delivering streaming data, in near real-time, to destinations such as Amazon S3, Amazon Redshift, Amazon OpenSearch Service, Amazon OpenSearch Serverless, Splunk, Apache Iceberg Tables, and any custom HTTP endpoint or HTTP endpoints owned by supported third-party service providers.
Amazon Data Firehose ## Footnote See [What is Amazon Data Firehose?](https://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html)
58
How are Kinesis Data Streams and Amazon Data Firehose different?
* Kinesis Data streams is intended as a real-time low-latency ingestion service while Data Firehose is intended as a data transfer service (near real-time) for loading data into S3, Splunk, RedShift, etc. * Kinesis Data streams is a managed service while Data Firehose is a **fully** managed service * Kinesis is more customizable while Firehose is simpler to configure
59
What is the difference between a data lake and a data warehouse?
A data lake stores vast amounts of raw data (and is thus more flexible) while data warehouses store more structured data and are optimized for business intelligence.
60
Which Amazon service is a popular choice for data lakes?
Amazon S3 (Simple Storage Service)
61
What Amazon service is a fully managed petabyte-scale data warehouse?
Amazon Redshift ## Footnote See [Amazon Redshift Documentation](https://docs.aws.amazon.com/redshift/)
62
What Amazon service can you use to catalog data?
AWS Glue Data Catalog provides a centralized, scalable, and managed metadata repository that enhances data discovery. ## Footnote See [Data discovery and cataloging in AWS Glue](https://docs.aws.amazon.com/glue/latest/dg/catalog-and-crawler.html)
63
Which AWS service allows you to transform, prepare, and clean data for analysis?
AWS Glue. You can also: * Discover and organize data * Build and monitor data pipelines ## Footnote See also [What is AWS Glue?](https://docs.aws.amazon.com/glue/latest/dg/what-is-glue.html) and [AWS Glue concepts](https://docs.aws.amazon.com/glue/latest/dg/components-key-concepts.html)
64
Name two (2) Amazon services for transforming data.
1. AWS Glue 2. Amazon EMR ## Footnote See [What is AWS Glue?](https://docs.aws.amazon.com/glue/latest/dg/what-is-glue.html) and [Amazon EMR Documentation](https://docs.aws.amazon.com/emr/)
65
Name three (3) big data frameworks supported by Amazon EMR
1. Apache Spark 2. Apache Hadoop 3. Apache Hive ## Footnote See [Apache Spark](https://spark.apache.org/), [Apache Hadoop](https://hadoop.apache.org/) and [Apache Hive](https://hive.apache.org/).
66
# Fill in the blank Apache ____________, ____________ and ____________ work together to form a powerful big data processing and analytics stack.
Hadoop, Spark, and Hive ## Footnote See [Apache Spark](https://spark.apache.org/), [Apache Hadoop](https://hadoop.apache.org/) and [Apache Hive](https://hive.apache.org/).
67
What is the HDFS component of Hadoop?
Hadoop Distributed File System Stores data on compute nodes in a cluster, providing very high aggregate bandwidth and fault tolerance. ## Footnote See [Hadoop wiki](https://cwiki.apache.org/confluence/display/hadoop)
68
What is the MapReduce component of Hadoop?
A computational paradigm in which the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. ## Footnote See [Hadoop wiki](https://cwiki.apache.org/confluence/display/hadoop)
69
What are some limitations of Hadoop MapReduce?
* Not suitable with huge numbers of small files. * Support for batch processing but not streamed data. * Slower performance because it does not employ in-memory caching.
70
What Hadoop (MapReduce) problems does Apache Spark solve?
It is designed to perform both batch processing and new workloads like streaming, interactive queries, and machine learning. Performance. It utilizes in-memory caching, and optimized query execution for fast analytic queries against data of any size. ## Footnote See [What is Apache Spark?](https://aws.amazon.com/what-is/apache-spark/)
71
# Fill in the blank Apache Hive is a data warehouse built on top of Apache Hadoop for that provides a ____________ interface.
SQL ## Footnote See [Apache Hive](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hive.html)
72
Name four (4) Amazon query and visualization tools for data analytics
1. Amazon Athena 2. Amazon Redshift 3. Amazon QuickSight 4. Amazon OpenSearch Service
73
# Fill in the blank ____________ is a fully-managed serverless interactive query service that makes it easy to analyze data directly in Amazon Simple Storage Service (S3) using standard SQL.
Amazon Athena ## Footnote See [What is Amazon Athena?](https://docs.aws.amazon.com/athena/latest/ug/what-is.html)
74
Amazon Redshift supports complex SQL queries on data stored in a Redshift ____________.
data warehouse Its columnar storage and massively parallel processing architecture make it ideal for analyzing large datasets.
75
# Fill in the blank With ____________, both technical and non-technical users can quickly create modern interactive dashboards and reports from various data sources without managing infrastructure.
Amazon QuickSight (for data visualization). Now part of Amazon Quick Suite. QuickSight supports natural language queries from business analysts. ## Footnote See [What is Amazon Quick Suite?](https://docs.aws.amazon.com/quicksuite/latest/userguide/what-is.html)
76
# Fill in the blank Amazon OpenSearch Service is a managed service that makes it easy to deploy, operate, and scale ____________, a popular open-source search and analytics engine.
OpenSearch ## Footnote See [Amazon OpenSearch Service Documentation](https://docs.aws.amazon.com/opensearch-service/)