hemispheres. In the posting, meanwhile, the author also mentions Bandung, Asian and Indonesian. The findings and discussion section of this paper describes how the use of pronouns in this posting can
give rise to such identity assumptions. A corpus approach is employed here, as it has proven useful in discourse analysis, as described by Conrad and Biber (2008) and Sinclair (2006). Some of the
relevant literature is reviewed in section 2.
2. LITERATURE REVIEW
Electronic and online media have become a center of attention because they reveal patterns of interaction in a community. Online media are themselves one of the objects of discourse studies
[CITATION Mat05]. The Internet is a facility through which people can connect, reducing the constraints of time and space. Netizens (internet citizens, i.e. Internet users) can connect with other
netizens in different time zones and on different continents without having to travel and meet them in person. Information can be shared via a website. However, not everyone is capable of designing
a website, and a web hosting service costs money. A blog (weblog) is a more user-friendly place where people can share information without having to design a website. It already comes with a web
publishing tool, so no coding or scripting knowledge is required. Some of its web functionalities are developed to support interactive communication. The social function of a blog is better known than
its informative function, as readers may also respond to a posting.
About Pronouns
Language is often related to identity [CITATION Ide09] by means of community. However, language use can differ from person to person. Identity must therefore also be related to the discourse
created by an individual, in this case in the medium of blogging. One of the aspects that builds discourse coherence is the pronoun.
Besides their grammatical function in building cohesion and coherence, as described by Halliday and Hasan (1976), the relation between pronouns and the background of speakers had already drawn
the attention of Brown and Gilman (1960), in what is better known as the T-V (tu and vous) relation. The use of pronouns in specific discourses has also received serious attention, for example the
use of exclusive and inclusive pronouns in academic discourse, as described by Harwood (2005). Recently, the use of pronouns has become more essential in discourse analysis, as it shows how authors
position themselves on one side against another, as described by Coleman and Ross (2010). For this reason, pronoun reference is of crucial importance.
As a pronoun replaces a noun, its reference should ideally be recoverable. The reference can be made anaphorically (the referent precedes the pronoun) or cataphorically (the referent follows the
pronoun). In computational linguistics, resolving anaphoric and cataphoric pronouns is still quite a challenge, as described in [CITATION Wol04]. Interestingly, the reference can also be made
exophorically, that is, to something beyond the text.
As for the latter, exophoric reference, it is quite difficult to recover the referent from the text alone, without understanding the background of the writer. Even with a qualitative
approach, multiple interpretations are still likely. Recovering this kind of pronoun reference to social reality [CITATION Sch08] is not as easy as it seems. How readers
understand the text and how they decide on the exophoric reference is largely affected by this background knowledge.
3. METHODOLOGY
Research Data and Digital Text Processing
Linguistic data can consist of speech or written texts. These days, the data are digitized. In terms of data preservation and maintenance, the digital format has some advantages over
the manual format. Archiving linguistic data in physical form requires enormous time, space and effort, not to mention the maintenance cost. In terms of processing, the digital format
gives researchers significant support for tasks such as data retrieval, classification and extraction.
Digital data processing received a considerable amount of criticism in the beginning [CITATION McE12], addressed particularly to the Brown corpus by Chomsky around the 1960s.
At that time, the criticism was sound, as technological support was still limited. Most of the
processing stages still had to be done manually. These days, with the advancement of technology, the analysis of linguistic data can be performed, to some extent,
automatically by text processing software. In the next subsection, I describe the nature of the data as digital text and the corpus processing software that I used to process the data.
Research Procedure
The data in this paper are obtained from a blog posting, ‘city of pigs’. The nature of a blog as online social media allows internet users to freely read the blog and to leave comments as well. The
data are processed with AntConc [CITATION Law06], a freely available corpus processing software. The posting, in HTML format, is converted to .txt format, a format that AntConc can process.
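The conversion from HTML to plain text can be sketched as follows. This is only a minimal illustration using Python's standard library, not the tool actually used for this paper, which is not specified here. The class name `TextExtractor`, the function `html_to_text` and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []          # text fragments in document order
        self._skip_depth = 0     # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "\n".join(parser.parts)


# Invented sample markup, not taken from the blog posting analysed here.
sample = ("<html><head><style>p{}</style></head><body>"
          "<p>We love our city.</p><script>var x = 1;</script>"
          "<p>They &amp; us</p></body></html>")
plain_text = html_to_text(sample)
```

The resulting string can then be saved as a UTF-8 .txt file and loaded into the corpus software. A real blog page would also contain navigation and comment markup, which may need to be removed by hand.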
The digital text is processed as a raw corpus, which means that no annotation is performed. The first stage obtains a general description of the corpus; at this stage, the description is
quantitative. The second stage aims to understand the specific distribution of pronouns, the linguistic devices that became one of the sources of controversy. The analysis is still quantitative. The
third stage identifies the pronoun references. Here the analysis is qualitative, using the concordance function to examine each pronoun keyword in context. Concordance
is a very basic function in corpus linguistics that presents keywords in context so that users can analyze the context [CITATION Ado061]. These three stages are the building blocks that
support my argument, presented in section 4 of this paper, that the sense of in-group identity is strong in this posting.
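The three stages can be illustrated with a short Python sketch. It is not AntConc's implementation, and its tokenization will not necessarily match AntConc's counts. The pronoun list, the function names and the sample sentence are assumptions made for illustration only.

```python
import re
from collections import Counter

# Assumed pronoun list for illustration; the paper does not give its own list.
PRONOUNS = {"i", "me", "my", "we", "us", "our", "you", "your",
            "he", "him", "his", "she", "her", "it", "its",
            "they", "them", "their"}


def tokenize(text):
    """Split text into lowercase word tokens, keeping word-internal apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def describe(tokens):
    """Stage 1: general quantitative description (token and type counts)."""
    return {"tokens": len(tokens), "types": len(set(tokens))}


def pronoun_distribution(tokens):
    """Stage 2: frequency of each pronoun in the corpus."""
    return Counter(t for t in tokens if t in PRONOUNS)


def concordance(tokens, keyword, width=3):
    """Stage 3: keyword-in-context lines as (left, keyword, right) tuples."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample sentence, not taken from the blog posting analysed here.
sample = "We love our city. They do not know us, but we know them."
tokens = tokenize(sample)
summary = describe(tokens)
pronouns = pronoun_distribution(tokens)
kwic = concordance(tokens, "we", width=2)
```

The first two functions return counts only. Deciding what each "we" or "they" refers to still requires reading the concordance lines, which is the qualitative part of the analysis.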
4. FINDINGS AND DISCUSSION