Research Method Research Setting Data Source

Figure 3.1 British National Corpus Website

Figure 3.1 showed the appearance of the British National Corpus (BNC) site. The window contained a "look up" field. To find expressions collected in the BNC, the researcher typed a keyword into this field. Each search returned 50 randomly selected expressions containing the keyword. The BNC attached a code to each expression, which enabled the researcher to trace the source of the data. Figure 3.2 showed the result of a keyword search.

Figure 3.2 The Result of BNC Cluster Sampling

In Figure 3.2, there were 50 expressions selected randomly by the computerized system. Even when the researcher typed the same keyword, the results differed every time the keyword was entered in the "look up" field. The Corpus of Contemporary American English (COCA) used a different system.

Each computerized corpus had its own strengths and weaknesses. Although the BNC provided a code that enabled the researcher to trace the source, the data was selected randomly, so the researcher could not know whether an expression had already been selected in an earlier search. COCA, meanwhile, provided 100 expressions for each keyword search and excluded expressions that had been selected before. Its weakness was that it did not provide a code for each expression.

Figure 3.3 Corpus of Contemporary American English Website

Figure 3.3 showed the appearance of the COCA site. It contained a "word" field where the keyword was typed, along with several features for restricting the results of the keyword search. An example of the search results was presented in Figure 3.4.

Figure 3.4 The Result of COCA Cluster Sampling

Figure 3.4 showed an example of a keyword search result. COCA divided the results into pages in order to keep the data organized. Each page contained 100 expressions. Because the data was kept in an organized way, recording the same expression twice could be avoided.
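
The difference between the two retrieval behaviours described above can be illustrated with a minimal sketch. It is not the actual interface of either corpus site. The function names, the `(code, text)` record layout, and the sample pool are all hypothetical and were chosen only for illustration. The sketch assumes that each BNC-style hit carries a source code, and it uses that code to filter out expressions that recur across random batches of 50. The COCA-style function simply serves fixed pages of 100, so no filtering is needed.

```python
import random


def random_batch(pool, size, rng):
    """BNC-style lookup (hypothetical): every query draws a fresh random
    sample, so the same expression may come back in a later query."""
    return rng.sample(pool, min(size, len(pool)))


def paged_batch(pool, page, size):
    """COCA-style lookup (hypothetical): results are served in fixed,
    non-overlapping pages, so an expression is never returned twice."""
    start = page * size
    return pool[start:start + size]


def collect_unique(batches):
    """Merge batches, keeping only the first occurrence of each source code."""
    seen = set()
    unique = []
    for batch in batches:
        for code, text in batch:
            if code not in seen:
                seen.add(code)
                unique.append((code, text))
    return unique


if __name__ == "__main__":
    # Invented pool of 200 hits; codes and texts are placeholders.
    pool = [(f"SRC{i:03d}", f"expression {i}") for i in range(200)]
    rng = random.Random(0)

    bnc_like = [random_batch(pool, 50, rng) for _ in range(3)]
    print("unique after three random batches:", len(collect_unique(bnc_like)))

    coca_like = [paged_batch(pool, page, 100) for page in range(2)]
    print("unique after two pages:", len(collect_unique(coca_like)))
```

Under these assumptions, the source code is what makes duplicate detection possible for the randomly sampled batches, whereas paging makes duplicates impossible in the first place.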