A phrase consists of two or more words that form a constituent and function as a single unit
in the syntax of a sentence. A sentence is a group of words put together, following the
grammatical rules of syntax, to express a complete thought. There are four types of sentence:
Simple Sentence, Compound Sentence, Complex Sentence, and Complex-Compound Sentence. This
research focuses only on the Simple Sentence.
2.2. Natural Language Processing
Natural Language Processing (NLP) is a part of computer science, Artificial Intelligence
(AI), and linguistics that deals with the interaction between humans and computers, so that
the computer is able to understand natural human language.
Four NLP components are used in this research: Part-of-Speech Tagging (POS Tagging), Named
Entity Recognition, Constituency Parsing, and Dependency Parsing.
2.2.1. Part-of-Speech Tagging
Part-of-Speech Tagging (POS Tagging) is the process of assigning a grammatical category to
each word in a natural language sentence. POS Tagging can also provide syntactic or
morphological information about the words of a sentence.
This research uses the Indonesian-language POS tagger iPOSTagger, developed by Alfan
Farizki Wicaksono and Ayu Purwarianti. iPOSTagger plays an important role in this research,
especially in classifying and tagging the words of a sentence.
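As a minimal illustration of what a POS tagger produces, the sketch below assigns a tag to each
word by dictionary lookup. It is not iPOSTagger: the lexicon, the tag names (NNP, VB, NN, UNK),
and the function name `tag_sentence` are assumptions made only for this example.

```python
# Toy lexicon mapping lowercase Indonesian words to POS tags.
# Hypothetical data for illustration; not the iPOSTagger model or tagset.
TOY_LEXICON = {
    "budi": "NNP",     # proper noun
    "membaca": "VB",   # verb ("reads")
    "buku": "NN",      # common noun ("book")
}


def tag_sentence(sentence, lexicon=TOY_LEXICON, unknown_tag="UNK"):
    """Return a list of (word, tag) pairs for a whitespace-tokenized sentence."""
    tagged = []
    for word in sentence.split():
        # Look the word up case-insensitively; unseen words get the unknown tag.
        tag = lexicon.get(word.lower(), unknown_tag)
        tagged.append((word, tag))
    return tagged


print(tag_sentence("Budi membaca buku"))
# [('Budi', 'NNP'), ('membaca', 'VB'), ('buku', 'NN')]
```

A real tagger resolves ambiguous and unseen words from context, which a plain lookup cannot do;
the sketch only shows the input and output shape of the tagging step.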
2.2.2. Named Entity Recognition
Named Entity Recognition (NER) is a component of information extraction that detects
and classifies the named entities in a text. NER is generally used to detect the names of
people, places, and organizations in a document.
This research uses LingPipe NER for English, which can be used for this research. Other
names are detected with some additional categorization rules.
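The idea of supplementing a recognizer with categorization rules can be sketched as follows.
This is not the LingPipe API and not the rule set used in this research: the gazetteer, the cue
words, and the function name `classify_entities` are hypothetical and serve only to show how a
lookup list and a few context rules can assign categories to names.

```python
# Hypothetical gazetteer and cue words; not part of LingPipe.
GAZETTEER = {
    "Jakarta": "LOCATION",
    "Bandung": "LOCATION",
}
PERSON_TITLES = {"Bapak", "Ibu"}      # honorifics that precede a person name
ORG_PREFIXES = {"PT", "Universitas"}  # words that often start an organization name


def classify_entities(tokens):
    """Return (token, label) pairs for the capitalized tokens that match a rule."""
    entities = []
    for i, token in enumerate(tokens):
        if not token[:1].isupper():
            continue  # only capitalized tokens are considered name candidates
        previous = tokens[i - 1] if i > 0 else None
        if token in GAZETTEER:
            entities.append((token, GAZETTEER[token]))
        elif previous in PERSON_TITLES:
            entities.append((token, "PERSON"))
        elif previous in ORG_PREFIXES:
            entities.append((token, "ORGANIZATION"))
    return entities


tokens = "Bapak Budi bekerja di PT Telkom di Jakarta".split()
print(classify_entities(tokens))
# [('Budi', 'PERSON'), ('Telkom', 'ORGANIZATION'), ('Jakarta', 'LOCATION')]
```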
2.2.3. Constituency Parsing
Constituency Parsing, or Phrase Structure Grammar parsing, is used in NLP to parse a
sentence. The parser decomposes a sentence to produce a Constituency Tree, also called a
Phrase Structure Grammar Tree (Tree Grammar Pattern). A constituency parser uses grammar
rules to generate the Constituency Tree of a sentence, which then serves as a model of the
sentence's grammar pattern.
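To make the role of the grammar rules concrete, the following sketch builds a Constituency Tree
for a simple sentence with a small top-down, backtracking parser. The grammar, the lexicon, the
labels, and all function names are assumptions for illustration only; this is a toy parser and
not the parser used in this research.

```python
# Toy phrase structure rules and lexicon (hypothetical, for illustration only).
# The rules must not be left-recursive, or this top-down parser would not terminate.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["NNP"], ["NN"]],
    "VP": [["VB", "NP"], ["VB"]],
}
LEXICON = {"Budi": "NNP", "membaca": "VB", "buku": "NN"}


def parse(symbol, words, start):
    """Yield (tree, next_position) for each way `symbol` can match from `start`."""
    if symbol not in GRAMMAR:
        # Word-level tag: it matches exactly one word with that tag.
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for rule in GRAMMAR[symbol]:
        for children, end in match_sequence(rule, words, start):
            yield (symbol, *children), end


def match_sequence(symbols, words, start):
    """Yield (list_of_trees, next_position) for a sequence of symbols."""
    if not symbols:
        yield [], start
        return
    for tree, middle in parse(symbols[0], words, start):
        for trees, end in match_sequence(symbols[1:], words, middle):
            yield [tree] + trees, end


def constituency_tree(sentence):
    """Return the first tree rooted in S that covers every word, or None."""
    words = sentence.split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None


def to_brackets(tree):
    """Render a nested-tuple tree in bracketed notation."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return f"({label} {children[0]})"
    return "(" + label + " " + " ".join(to_brackets(c) for c in children) + ")"


print(to_brackets(constituency_tree("Budi membaca buku")))
# (S (NP (NNP Budi)) (VP (VB membaca) (NP (NN buku))))
```

The nested tuples mirror the tree: each constituent is a label followed by its children, and
each word sits under its word-level tag.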
Constituency Parsing is not developed in this research. Instead, this research uses the
constituency parser of the Stanford Parser to generate a Constituency Tree. The Stanford
Parser is an English parser. Because Indonesian is similar to English in terms of sentence
structure, the Stanford Parser can be used.
2.2.4. Dependency Parsing
Dependency Parsing generates a structure that describes the dependency relations between
the components of a sentence, in which one component is the head and the other is the
dependent. The head, also called the governor, acts as the determinant for its partner,
the dependent, which is also called the modifier.
Dependency Parsing is performed by mapping the Constituent Structure to a Dependency
Structure, because the input to this parser is the constituent-based sentence produced by
the previous process.
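One common way to map a constituent structure to a dependency structure is to pick a head child
for every constituent and attach the lexical heads of the remaining children to it. The sketch
below illustrates that idea under stated assumptions: the head rules, the nested-tuple tree
format, and the function names are hypothetical and are not necessarily the mapping used in
this research.

```python
# Hypothetical head rules: for each phrase label, child labels in order of preference.
HEAD_RULES = {
    "S": ["VP", "NP"],
    "VP": ["VB", "VP"],
    "NP": ["NN", "NNP", "NP"],
}


def is_leaf(tree):
    """A leaf is a (tag, word) pair."""
    return len(tree) == 2 and isinstance(tree[1], str)


def head_child(tree):
    """Return the child that the head rules select; fall back to the leftmost child."""
    label, *children = tree
    for wanted in HEAD_RULES.get(label, []):
        for child in children:
            if child[0] == wanted:
                return child
    return children[0]


def lexical_head(tree):
    """Follow head children down to a word."""
    if is_leaf(tree):
        return tree[1]
    return lexical_head(head_child(tree))


def to_dependencies(tree):
    """Return (head_word, dependent_word) pairs for the whole tree.

    Words are identified by their string, so a sentence that repeats a word
    would need word positions instead; this is omitted to keep the sketch short.
    """
    if is_leaf(tree):
        return []
    dependencies = []
    head = head_child(tree)
    head_word = lexical_head(head)
    for child in tree[1:]:
        if child is not head:
            dependencies.append((head_word, lexical_head(child)))
        dependencies.extend(to_dependencies(child))
    return dependencies


tree = ("S",
        ("NP", ("NNP", "Budi")),
        ("VP", ("VB", "membaca"), ("NP", ("NN", "buku"))))
print(lexical_head(tree))     # membaca
print(to_dependencies(tree))  # [('membaca', 'Budi'), ('membaca', 'buku')]
```

In this example the verb becomes the root of the sentence, and both noun phrases depend on it.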
2.3. Natural Question-Guided Search