Large Language Models (LLMs) Flashcards

Learn how large language models are built, trained, and how they generate responses. (20 cards)

1
Q

What is a Large Language Model?

(LLM)

A

It is a type of artificial intelligence designed to understand and generate human-like text based on large amounts of text data.

LLMs are used in applications like chatbots, translation services, and content creation.

2
Q

What does the architecture of LLMs primarily rely on?

A

transformers

Transformers are a neural network architecture that handles sequential data effectively while allowing computation to be parallelized across the sequence.

3
Q

What is a transformer in the context of AI?

A

It is a neural network architecture that uses self-attention mechanisms to process and generate sequences of data efficiently.

4
Q

True or False:

Transformers are only used in language models.

A

False

Transformers are also used in other domains, such as image processing (e.g. Vision Transformers) and protein structure prediction (e.g. AlphaFold).

5
Q

What is the primary function of the self-attention mechanism in transformers?

A

To weigh how relevant each word in a sequence is to every other word, helping the model capture context and relationships.
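The weighting described above can be sketched as scaled dot-product self-attention. The following is a minimal single-head NumPy illustration (matrix sizes and random weights are purely illustrative, not from any particular model):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of token embeddings X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # pairwise relevance of tokens
    weights = softmax(scores)                # each row sums to 1
    return weights @ V                       # context-weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                  # 4 tokens, embedding dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one context-aware vector per token
```

Each output row is a weighted average of all value vectors, with the weights expressing how much each token attends to every other token.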

6
Q

What does BERT stand for?

A

Bidirectional Encoder Representations from Transformers

7
Q

What is the main difference between BERT and GPT?

A
  • BERT is designed for understanding the context of words in a sentence (bidirectional).
  • GPT is designed for generating text (unidirectional).

BERT processes text in both directions for better understanding, whereas GPT generates text in a forward manner.
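The directionality difference can be illustrated with attention masks: a GPT-style decoder applies a causal mask so each position only attends to earlier tokens, while a BERT-style encoder attends over the whole sentence. A minimal NumPy sketch (the scores are dummy zeros, purely for illustration):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

T = 4
scores = np.zeros((T, T))  # dummy, equal attention scores for illustration

# GPT-style (unidirectional): a causal mask blocks attention to future tokens.
future = np.triu(np.ones((T, T), dtype=bool), k=1)
causal_weights = softmax(np.where(future, -np.inf, scores))

# BERT-style (bidirectional): no mask, every token sees the whole sentence.
full_weights = softmax(scores)

print(causal_weights)  # lower-triangular: row t splits weight over tokens 0..t
print(full_weights)    # uniform 0.25: each token attends to all four tokens
```

The lower-triangular weight matrix is what prevents a generative model from "looking ahead" at words it has not produced yet.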

8
Q

True or False:

BERT is mainly used for text generation.

A

False

BERT is primarily used for understanding text, such as in tasks like sentiment analysis and question answering.

9
Q

What does fine-tuning mean in the context of LLMs?

A

It is the process of adapting a pre-trained language model to a specific task by training it further on a smaller, task-specific dataset.

10
Q

Why is fine-tuning important for LLMs?

A

It allows general-purpose models to be specialized for specific tasks, improving their performance on those tasks.

For example, fine-tuning can customize a model for better results in medical text analysis or customer service.
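As a toy illustration of the idea (not real LLM fine-tuning — the "frozen backbone" here is just a fixed random projection standing in for pre-trained features), one can keep the pre-trained part fixed and train only a small task-specific head:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a frozen pre-trained encoder: a fixed projection, never updated.
W_pretrained = rng.normal(size=(16, 8)) * 0.1
def encode(x):
    return np.tanh(x @ W_pretrained)

# Small labeled task dataset (e.g. a binary classification task).
X = rng.normal(size=(64, 16))
y = (X.sum(axis=1) > 0).astype(float)

# "Fine-tune" only a lightweight logistic-regression head on the frozen features.
w, b, lr = np.zeros(8), 0.0, 0.5
for _ in range(200):
    H = encode(X)
    p = 1 / (1 + np.exp(-(H @ w + b)))  # sigmoid head
    grad = p - y                        # logistic-loss gradient
    w -= lr * H.T @ grad / len(y)
    b -= lr * grad.mean()

p_final = 1 / (1 + np.exp(-(encode(X) @ w + b)))
acc = ((p_final > 0.5).astype(float) == y).mean()
```

Real fine-tuning updates some or all transformer weights on task data, but the division of labor is the same: reuse general-purpose representations, adapt a small amount of capacity to the task.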

11
Q

Fill in the blank:

A common issue with LLMs is ______, where the model generates incorrect or nonsensical information.

A

hallucination

12
Q

What is hallucination in LLMs?

A

It occurs when an LLM generates information that is not based on the input data or factual reality.

13
Q

How can hallucination be mitigated in LLMs?

A

By carefully curating training data, grounding responses with external verification or retrieval systems, and applying techniques like reinforcement learning from human feedback (RLHF).

14
Q

Which LLM type is better suited for tasks requiring deep understanding of text context?

A

BERT

BERT’s bidirectional nature makes it excellent at understanding complex text contexts.

15
Q

What is the role of the attention mechanism in transformers?

A

To focus on different parts of the input sequence, allowing the model to weigh the importance of each part when making predictions.

16
Q

True or False:

GPT is a generative model.

A

True

GPT generates new text based on input prompts, making it a powerful tool for content creation.

17
Q

In what type of tasks is GPT commonly used?

A

Text generation, such as writing essays, creating dialogues, and completing text prompts.

18
Q

Why are transformers more efficient than previous sequence models like RNNs?

A

Transformers can process data in parallel, unlike RNNs which process sequentially, making transformers faster and more scalable.
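The contrast can be sketched in NumPy: the RNN loop below carries a step-to-step dependency that forces sequential computation, while the attention-style update covers every position with a single matrix product (a toy sketch with random weights, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
T, d = 6, 4                          # sequence length, hidden size
X = rng.normal(size=(T, d))
Wx, Wh = rng.normal(size=(d, d)), rng.normal(size=(d, d))

# RNN: step t needs the hidden state from step t-1, so the T steps
# must run one after another.
h = np.zeros(d)
rnn_states = []
for t in range(T):
    h = np.tanh(X[t] @ Wx + h @ Wh)
    rnn_states.append(h)
rnn_states = np.array(rnn_states)

# Attention-style mixing: one matrix product handles all T positions
# at once, so the work parallelizes across the sequence.
A = X @ X.T / np.sqrt(d)                        # all pairwise scores
A = np.exp(A - A.max(axis=-1, keepdims=True))
A /= A.sum(axis=-1, keepdims=True)              # row-wise softmax
attn_states = A @ X                             # every position in one shot

print(rnn_states.shape, attn_states.shape)  # (6, 4) (6, 4)
```

On parallel hardware like GPUs, the matrix-product formulation is what lets transformers train on long sequences far faster than recurrent models.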

19
Q

Which model would you choose for a task that requires generating a coherent paragraph from a short prompt?

A

GPT

GPT is designed for generating fluent and coherent text based on minimal input.

20
Q

Fill in the blank:

The key innovation of transformers is the use of ______ attention.

A

self

Self-attention lets every token weigh its relationship to every other token in the sequence.