What is Natural Language Processing?
The ability of a computer program to understand human language as it is spoken and written.
Various kinds of knowledge of language?
What is Ambiguity?
one phrase or sentence often has multiple possible meanings (interpretations)
Most important models?
machine learning tools for language tasks?
2. sequence models
What are Regular Expressions (RE)?
Regex search requires?
pattern and corpus
Simplest kind of regex?
a sequence of simple characters
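A minimal sketch of the two cards above, assuming Python's standard `re` module as the regex engine. The corpus string and the helper name `find_all` are made up for illustration. The pattern is the simplest kind of regex: a plain sequence of characters.

```python
import re

# Hypothetical one-line corpus, used only for illustration.
corpus = "the woodchuck saw another woodchuck near the log"

def find_all(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return re.findall(pattern, text)

# A regex search needs both a pattern and a corpus to search through.
first = re.search("woodchuck", corpus)    # first match only
matches = find_all("woodchuck", corpus)   # all matches

print(first.start(), matches)
```

Note that such a pattern is case sensitive: "woodchuck" would not match "Woodchuck" unless a flag such as `re.IGNORECASE` is passed.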
most common anchors in regex?
1. caret (^), matches the start of a line
2. dollar sign ($), matches the end of a line
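A small illustration of line anchors, again assuming Python's `re` module. The sample text is invented. In Python, `^` and `$` anchor to the whole string by default, so the sketch passes `re.MULTILINE` to make them work line by line.

```python
import re

# Hypothetical three-line text, used only for illustration.
text = "The dog barked.\nA dog slept.\nThe end"

# ^The matches "The" only when it begins a line.
starts = re.findall(r"^The", text, flags=re.MULTILINE)

# \.$ matches a period only when it ends a line.
ends = re.findall(r"\.$", text, flags=re.MULTILINE)

print(starts, ends)
```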
RE: \d
any digit
ex: [0-9]
RE: \D
any non-digit
ex: [^0-9]
RE: \w
any alphanumeric or underscore
ex: [a-zA-Z0-9_]
RE: \W
a non-alphanumeric
ex: [^\w]
RE: \s
whitespace (space, tab)
ex: [ \r\t\n\f]
RE: \S
non-whitespace
ex: [^\s]
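The shorthand classes on the cards above can be tried out with a short sketch. It assumes Python's `re` module and an invented sample string. One caveat: in Python 3 these classes also match non-ASCII digits, letters, and whitespace by default, so they equal the bracket forms exactly only on ASCII input or with the `re.ASCII` flag.

```python
import re

# Hypothetical sample string, used only for illustration.
sample = "Room 42, floor_3!"

digits = re.findall(r"\d", sample)        # any digit
non_digits = re.findall(r"\D", sample)    # anything that is not a digit
words = re.findall(r"\w+", sample)        # runs of letters, digits, underscore
non_words = re.findall(r"\W", sample)     # anything \w does not match
spaces = re.findall(r"\s", sample)        # whitespace characters
tokens = re.findall(r"\S+", sample)       # runs of non-whitespace

print(digits, words, non_words, tokens)
```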
special characters that need a backslash?
\* (asterisk) \. (period) \? (question mark) \n (newline) \t (tab)
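A quick sketch of why the backslash matters, assuming Python's `re` module and an invented sample sentence. Unescaped, `.` matches any character; escaped, it matches only a literal period. `re.escape` is the standard-library helper that adds the backslashes for you.

```python
import re

# Hypothetical sample sentence, used only for illustration.
text = "Is 2*3 equal to 6.0? Yes."

stars = re.findall(r"\*", text)       # literal asterisk
dots = re.findall(r"\.", text)        # literal periods only
questions = re.findall(r"\?", text)   # literal question mark
any_char = re.findall(r".", text)     # unescaped dot: every character

# re.escape builds a pattern that matches the given string literally.
escaped = re.escape("6.0?")
literal = re.search(escaped, text)

print(len(dots), len(any_char), escaped)
```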
A regular language can be described by?
regular expressions and finite-state automata
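To make the equivalence concrete, here is a minimal sketch that describes one small language two ways: as the regular expression `baa+!` and as a hand-written deterministic finite-state automaton. The language, the state numbering, and the names `TRANSITIONS`, `dfa_accepts`, and `regex_accepts` are all chosen for illustration and do not come from these notes.

```python
import re

# Transition table for an automaton accepting: b, a, one or more further a's, then !
# States are 0..4; state 0 is the start state and state 4 is the only accepting state.
TRANSITIONS = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,   # loop: any number of extra a's
    (3, "!"): 4,
}
ACCEPTING = {4}

def dfa_accepts(string):
    """Run the automaton over the string; reject on a missing transition."""
    state = 0
    for symbol in string:
        key = (state, symbol)
        if key not in TRANSITIONS:
            return False
        state = TRANSITIONS[key]
    return state in ACCEPTING

def regex_accepts(string):
    """Same language, written as a regular expression."""
    return re.fullmatch(r"baa+!", string) is not None

for s in ["baa!", "baaaa!", "ba!", "baa"]:
    print(s, dfa_accepts(s), regex_accepts(s))
```

Both functions should agree on every input, which is the point of the card: each formalism can express exactly the regular languages.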
3 standard solutions to the problem of non-determinism in finite-state automata?
primitive operations of a regular expression?
operations in regular languages?
closure known as?
kleene star
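A short sketch of the Kleene star, assuming Python's `re` module. The helper `in_language` is a made-up name. The star means "zero or more repetitions", so the empty string is always in the resulting language.

```python
import re

def in_language(pattern, string):
    """True if the whole string belongs to the language of the pattern."""
    return re.fullmatch(pattern, string) is not None

# (ab)* is the closure of the one-string language {"ab"}.
pattern = r"(ab)*"

for s in ["", "ab", "abab", "aba"]:
    print(repr(s), in_language(pattern, s))
```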
process steps in NLP?
What are orthographic rules?
Orthographic rules tell us that English words ending in -y are pluralized by changing the -y to -i- and adding an -es.
What are morphological rules?
Morphological rules tell us that fish has a null plural, and that the plural of goose is formed by changing the vowel.
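The two kinds of rules can be combined in a toy pluralizer. This is only a sketch under stated assumptions: the lexicon `IRREGULAR_PLURALS` holds just the two nouns named on the card, the function name `pluralize` is hypothetical, and restricting the -y rule to words with a consonant before the -y (so "boy" becomes "boys") is an added assumption that the card does not state.

```python
# Morphological knowledge: a deliberately tiny, hypothetical lexicon of irregular plurals.
IRREGULAR_PLURALS = {"fish": "fish", "goose": "geese"}

VOWELS = "aeiou"

def pluralize(noun):
    """Toy English pluralizer: irregular lookup, then the -y spelling rule, then -s."""
    # Morphological rule: irregular forms are looked up, not computed.
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    # Orthographic rule: change -y to -i- and add -es.
    # Assumption: applies only when a consonant precedes the -y.
    if len(noun) > 1 and noun.endswith("y") and noun[-2] not in VOWELS:
        return noun[:-1] + "ies"
    # Default: add -s (other spelling rules, such as -es after -s or -x, are not handled).
    return noun + "s"

for word in ["fly", "fish", "goose", "boy", "dog"]:
    print(word, pluralize(word))
```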