[Figure: Part-of-Speech Tagging, Named Entity Recognition, Constituency Parsing, Dependency Parsing]

A phrase consists of two or more words that form a constituent and function as a single unit in the syntax of a sentence. A sentence is a group of words put together to express a complete thought, following the grammatical rules of syntax. There are four types of sentence: simple, compound, complex, and compound-complex. This research focuses only on simple sentences.

2.2. Natural Language Processing

Natural Language Processing (NLP) is a field of computer science, Artificial Intelligence (AI), and linguistics that deals with the interaction between humans and computers, so that the computer is able to understand natural human language. Four NLP tasks are used in this research: Part-of-Speech Tagging (POS Tagging), Named Entity Recognition, Constituency Parsing, and Dependency Parsing.

2.2.1. Part-Of-Speech Tagging

Part-of-Speech Tagging (POS Tagging) is the process of assigning a word class to each word in a natural-language sentence according to the grammar. POS Tagging can also provide syntactic or morphological information about the words in a sentence. This research uses the Indonesian-language POS tagger iPOSTagger, developed by Alfan Farizki Wicaksono and Ayu Purwarianti. iPOSTagger plays an important role in this research, especially in classifying and tagging the words of a sentence.
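To make the idea of tagging concrete, the following is a minimal sketch of a lexicon-lookup tagger. It does not reproduce iPOSTagger or its tag set; the lexicon, the tag names, and the function `pos_tag` are assumptions made only for illustration.

```python
# Toy lexicon-based POS tagger.
# The lexicon and tag names are hypothetical; they are NOT taken from iPOSTagger.
LEXICON = {
    "saya": "PRP",    # "I"
    "membaca": "VB",  # "read"
    "buku": "NN",     # "book"
}


def pos_tag(sentence, default_tag="NN"):
    """Return (word, tag) pairs; words missing from the lexicon get default_tag."""
    return [(word, LEXICON.get(word.lower(), default_tag))
            for word in sentence.split()]


print(pos_tag("Saya membaca buku"))
# [('Saya', 'PRP'), ('membaca', 'VB'), ('buku', 'NN')]
```

A real tagger also has to resolve words that can belong to more than one class, which a plain lookup table cannot do.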

2.2.2. Named Entity Recognition

Named Entity Recognition (NER) is a component of information extraction that detects and classifies the named entities in a text. NER is generally used to detect the names of people, places, and organizations in a document. This research uses LingPipe NER, which is built for English but can still be used in this research. Other kinds of names are detected with some additional categorization rules.
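The combination of a recognizer with additional rules can be sketched as follows. The text does not specify the rules, so the cue words, the categories, and the function `categorize` are hypothetical, and the `base` dictionary is only a stand-in for recognizer output, not the LingPipe API.

```python
# Hypothetical cue-word rules: a cue word followed by a capitalized token
# suggests a category for that token. These rules are assumptions.
CUE_RULES = {
    "pulau": "LOCATION",    # "island"
    "sungai": "LOCATION",   # "river"
    "bapak": "PERSON",      # "Mr."
}


def categorize(tokens, base_entities):
    """Merge entities from a base recognizer with cue-word rules.

    base_entities maps a token index to a category and stands in for
    the output of an NER tool; entries already present are never overwritten.
    """
    result = dict(base_entities)
    for i in range(1, len(tokens)):
        cue = tokens[i - 1].lower()
        if i not in result and cue in CUE_RULES and tokens[i][:1].isupper():
            result[i] = CUE_RULES[cue]
    return result


tokens = "Budi tinggal di Pulau Bali".split()  # "Budi lives on Bali Island"
base = {0: "PERSON"}                           # assumed recognizer output
print(categorize(tokens, base))
# {0: 'PERSON', 4: 'LOCATION'}
```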

2.2.3. Constituency Parsing

Constituency Parsing, or Phrase Structure Grammar parsing, is used in NLP to parse a sentence. The parser decomposes the sentence into a Constituency Tree (also called a Phrase Structure Grammar Tree, or Tree Grammar Pattern). A constituency parser applies grammar rules to generate the Constituency Tree of a sentence, which then serves as a model of its grammar pattern. No constituency parser is developed in this research; instead, the Stanford Parser is used to generate the Constituency Tree. The Stanford Parser is an English parser, but because Indonesian is similar to English in terms of sentence structure, it can be used here.
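A Constituency Tree is commonly written as a bracketed string. The sketch below reads such a string into nested tuples and recovers the words at its leaves. It only illustrates the tree representation and does not call the Stanford Parser; the example tree, its labels, and the helper names `parse_tree` and `leaves` are assumptions.

```python
def parse_tree(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())      # nested constituent
            else:
                children.append(tokens[pos])  # word at a leaf
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return node()


def leaves(tree):
    """Return the words of the tree from left to right."""
    _, children = tree
    words = []
    for child in children:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


# Hand-written example tree for "Saya membaca buku" ("I read a book").
tree = parse_tree("(S (NP (PRP Saya)) (VP (VB membaca) (NP (NN buku))))")
print(tree[0], leaves(tree))
# S ['Saya', 'membaca', 'buku']
```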

2.2.4. Dependency Parsing

Dependency Parsing generates a structure that describes the dependencies between the components of a sentence, in which one component is the head and the other is the dependent. The head, also called the governor, determines its partner; the dependent is also called the modifier. Dependency Parsing is done here by mapping from constituent structure to dependency structure, because the input to this parser is the constituent-based sentence produced by the previous process.
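One common way to map a constituent structure to dependencies is to pick a head child for every phrase and attach the heads of the other children to it. The sketch below follows that idea under stated assumptions: the head-rule table, the example tree, and the function `to_dependencies` are hypothetical and are not the mapping rules used in this research.

```python
# Hypothetical head rules: for each phrase label, the child labels to try,
# in order of preference, when choosing the head child.
HEAD_RULES = {"S": ["VP"], "VP": ["VB"], "NP": ["NN", "PRP"]}


def to_dependencies(tree, deps):
    """Return the head word of tree and append (head, dependent) pairs to deps."""
    label, children = tree
    if isinstance(children[0], str):      # preterminal: (tag, [word])
        return children[0]
    heads = [to_dependencies(child, deps) for child in children]
    labels = [child[0] for child in children]
    head_index = 0                        # fall back to the first child
    for wanted in HEAD_RULES.get(label, []):
        if wanted in labels:
            head_index = labels.index(wanted)
            break
    for i, word in enumerate(heads):
        if i != head_index:
            deps.append((heads[head_index], word))
    return heads[head_index]


# Constituency tree for "Saya membaca buku" ("I read a book"), as nested tuples.
tree = ("S", [("NP", [("PRP", ["Saya"])]),
              ("VP", [("VB", ["membaca"]), ("NP", [("NN", ["buku"])])])])
deps = []
root = to_dependencies(tree, deps)
print(root, deps)
# membaca [('membaca', 'buku'), ('membaca', 'Saya')]
```

The sketch identifies words by their surface form, so a practical version would use token positions instead to keep repeated words apart.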

2.3. Natural Question-Guided Search