Natural Language Processing Flashcards

Question 1

Q

Define natural language processing

Answer

A

The goal is to make machines understand and interpret human language as it is written or spoken.

Question 2

Q

What are the different levels of linguistic analysis used in NLP

Answer

A

syntax – which parts of the given text are grammatically correct
semantics – what the given text means

Question 3

Q

What does NLU do

Answer

A

Tries to understand the meaning of the given text – the nature and structure of each word in the text must be known for NLU.

Question 4

Q

What are applications of natural language understanding

Answer

A

Search and Information Retrieval
Enhancing search results by identifying key information, like people or locations, for better relevance.
Word Prediction
Predicting the next word in a sentence based on context and prior words.
Text Classification
Categorising text into predefined categories (e.g., spam vs. not spam, topic classification).

Question 5

Q

What is sentence segmentation

Answer

A

Identify sentence boundaries in the given text, i.e., where one sentence ends and
where another begins; the end of a sentence is often marked with the punctuation mark ‘.’
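
A minimal sketch of rule-based sentence segmentation, assuming that a sentence ends at '.', '!' or '?' followed by whitespace. The function name split_sentences is hypothetical, and the rule is deliberately naive:

```python
import re

def split_sentences(text):
    # Split at whitespace that directly follows '.', '!' or '?'.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

sentences = split_sentences("NLP is fun. It has many uses! Do you agree?")
print(sentences)
```

A rule this simple also splits after abbreviations such as "Dr.", which is one reason practical segmenters need more than a punctuation check.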

Question 6

Q

What is tokenisation

Answer

A

Identify the individual tokens in the text: words, numbers, and punctuation symbols.
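
A minimal sketch of a regular-expression tokeniser, assuming that tokens are numbers, runs of word characters, or single punctuation symbols. The function name tokenize and the pattern are illustrative choices, not a standard:

```python
import re

def tokenize(text):
    # Alternatives, tried in order: a number (optionally with a decimal part),
    # a run of word characters, or one non-space, non-word symbol.
    return re.findall(r"\d+(?:\.\d+)?|\w+|[^\w\s]", text)

tokens = tokenize("It costs 3.50 dollars, right?")
print(tokens)
```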

Question 7

Q

What is stemming

Answer

A

Strip the endings of words, e.g., ‘eating’ is reduced to ‘eat’.

Question 8

Q

What is part of speech (POS) tagging

Answer

A

Assign each word in a sentence its part-of-speech tag, such as marking a
word as a noun or an adverb.

Question 9

Q

What is named entity recognition

Answer

A

Identify entities such as persons, locations and times within documents.

Question 10

Q

What is coreference (discourse) resolution

Answer

A

Determine how a given word in a sentence relates to the previous and
next sentences, for example which earlier entity a pronoun refers to.

Question 11

Q

What does stemming do

Answer

A

Tries to find the base form of words.

Stemming usually refers to a crude heuristic process that chops off the ends of words
in the hope of achieving this goal correctly most of the time, and often includes the removal
of derivational affixes.
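
A minimal sketch of such a crude suffix-chopping heuristic. The suffix list and the minimum stem length of 3 are assumptions made for illustration; this is not the Porter stemmer or any other published algorithm:

```python
SUFFIXES = ("ing", "ed", "es", "s")  # illustrative list, checked in this order

def crude_stem(word):
    # Chop the first matching suffix, provided at least 3 letters remain.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

for w in ["eating", "walked", "boxes", "running", "sing"]:
    print(w, "->", crude_stem(w))
```

Here 'running' becomes 'runn', which illustrates why the heuristic is only right most of the time.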

Question 12

Q

What does a lemmatiser do

Answer

A

Lemmatization usually refers to doing things properly with the use of a vocabulary and
morphological analysis of words, normally aiming to remove inflectional endings only
and to return the base or dictionary form of a word, which is known as the lemma.
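
A minimal sketch of the vocabulary-lookup side of lemmatisation, assuming a tiny hand-written table. LEMMA_TABLE and lemmatize are hypothetical names; a real lemmatiser relies on a full dictionary and morphological analysis:

```python
# Tiny illustrative vocabulary mapping inflected forms to their lemma.
LEMMA_TABLE = {
    "ate": "eat",
    "eating": "eat",
    "eats": "eat",
    "mice": "mouse",
    "was": "be",
}

def lemmatize(word):
    # Look the word up; unknown words are returned unchanged.
    return LEMMA_TABLE.get(word.lower(), word)

for w in ["ate", "mice", "was", "table"]:
    print(w, "->", lemmatize(w))
```

Unlike suffix chopping, a lookup can handle irregular forms such as 'ate' and 'mice'.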

Question 13

Q

What are stop words

Answer

A

Stop words usually refer to the most common words in a language, such as “and”, “the”, “a”,
but there is no single universal list of stop words. The list of stop words can change
depending on your application.
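
A minimal sketch of stop-word removal, assuming a tiny illustrative list. STOP_WORDS here is not a standard list, and a real application would choose its own:

```python
STOP_WORDS = {"and", "the", "a"}  # illustrative only; lists vary by application

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    # Keep only tokens whose lowercase form is not a stop word.
    return [t for t in tokens if t.lower() not in stop_words]

filtered = remove_stop_words(["The", "cat", "and", "a", "dog"])
print(filtered)
```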

Question 14

Q

What is part of speech

Answer

A

The part of speech explains how a word is used in a sentence. There are eight main parts of speech - nouns,
pronouns, adjectives, verbs, adverbs, prepositions, conjunctions and interjections.

Question 15

Q

What is bag of words

Answer

A

Any information about the order or structure of words is discarded. That’s why it’s called
a bag of words. The model only records whether a known word occurs in a
document, not where in the document it occurs.
The intuition is that similar documents have similar content. Also, from the content alone, we can
learn something about the meaning of the document.
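
A minimal sketch of a bag-of-words representation, assuming lowercase whitespace tokenisation and word counts stored in a Counter. The function name bag_of_words is hypothetical:

```python
from collections import Counter

def bag_of_words(text):
    # Count each word; the position of the word in the text is not stored.
    return Counter(text.lower().split())

a = bag_of_words("the dog bit the man")
b = bag_of_words("the man bit the dog")
print(a)
print(a == b)
```

The two sentences mean different things but produce the same bag, because word order is discarded.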

Question 16

Q

What is Term Frequency (TF)

Answer

A

a scoring of the frequency of the word in the current document.
This part measures how often a word appears in a document.
The more frequently a word appears, the higher its TF score for that document.
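
A minimal sketch of term frequency, assuming one common definition: the count of a word divided by the total number of tokens in the document. Other variants exist, such as raw counts or log-scaled counts:

```python
from collections import Counter

def term_frequency(tokens):
    # TF(word) = occurrences of word / total number of tokens in the document.
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}

tf = term_frequency(["the", "cat", "sat", "on", "the", "mat"])
print(tf["the"], tf["cat"])
```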

Question 17

Q

What is inverse document frequency (IDF)

Answer

A

a scoring of how rare the word is across documents.
This part measures how unique or rare a word is across multiple documents in the corpus.
The rarer the word, the higher its IDF score.
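
A minimal sketch of inverse document frequency, assuming the basic unsmoothed form log(N / df), where N is the number of documents and df is the number of documents containing the word. Libraries often use smoothed variants, so exact values differ:

```python
import math

def inverse_document_frequency(documents):
    # documents: a list of token lists, one per document.
    n_docs = len(documents)
    vocabulary = set().union(*documents)
    idf = {}
    for word in vocabulary:
        doc_count = sum(1 for doc in documents if word in doc)
        idf[word] = math.log(n_docs / doc_count)
    return idf

docs = [["the", "cat"], ["the", "dog"], ["the", "bird"]]
idf = inverse_document_frequency(docs)
print(idf["the"], idf["cat"])
```

In this toy corpus 'the' appears in every document and scores 0, while 'cat' appears in only one and scores log(3).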

Question 18

Q

What does an N-gram model do

Answer

A

The basic idea underlying the statistical approach to word prediction is to use the probabilities of sequences of
words to choose the most likely next word.
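
A minimal sketch of next-word prediction with a bigram model (N = 2), assuming a toy corpus and simple counting. The names train_bigrams and predict_next are hypothetical:

```python
from collections import Counter, defaultdict

def train_bigrams(tokens):
    # For each word, count which words follow it in the corpus.
    following = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        following[prev][nxt] += 1
    return following

def predict_next(following, word):
    # Return the most frequent follower, or None for an unseen word.
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]

tokens = "i like tea . i like coffee . i like tea .".split()
model = train_bigrams(tokens)
print(predict_next(model, "like"))
```

In this corpus 'tea' follows 'like' twice and 'coffee' once, so 'tea' is chosen as the most likely next word.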

Question 19

Q

What is the Markov assumption

Answer

A

Only the prior local context – the last few words – affects the next word.
* Making the Markov assumption for word prediction means assuming
that the probability of a word depends only on the previous N-1 words
(N-gram model).
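
Written as a formula, with w_n denoting the n-th word of the text (the notation is an assumption, not taken from the cards), the approximation reads:

```latex
P(w_n \mid w_1, \dots, w_{n-1}) \approx P(w_n \mid w_{n-N+1}, \dots, w_{n-1})
```

For a bigram model (N = 2) the right-hand side reduces to P(w_n \mid w_{n-1}): only the single previous word is used.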

(19 cards)