4.4 Case Study (Simple)

1. Counting word occurrences in a document file:

- Create the document file to be tested (for example):

    nidos@master:/usr/local/hadoop$ cd
    nidos@master:~$ cd /home/nidos/Desktop/
    nidos@master:~/Desktop$ mkdir data
    nidos@master:~/Desktop/data$ >> a.txt
    nidos@master:~/Desktop/data$ gedit a.txt

Source Code 4.31 Creating the Test Document File

Figure 4.69 View of the Test Document

- Create the file “WordCount.java”:

    nidos@master:~/Desktop/data$ cd /usr/local/hadoop
    nidos@master:/usr/local/hadoop$ >> WordCount.java
    nidos@master:/usr/local/hadoop$ gedit WordCount.java
    nidos@master:/usr/local/hadoop$ ls
    bin  include  libexec      logs        README.txt  share
    etc  lib      LICENSE.txt  NOTICE.txt  sbin        WordCount.java

Source Code 4.32 The WordCount.java File

- Prepare the *.java file (for example WordCount.java, Part 1 of 2) to be compiled into a *.jar:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Mapper: emits (word, 1) for every token in each input line.
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {

Source Code 4.33 File *.java Part 1

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          // Split the line on whitespace and emit each token with a count of 1.
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }

Source Code 4.34 File *.java Part 1 Cont
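To see what the mapper loop above produces for one line of text, the following is a minimal stand-alone sketch that needs no Hadoop installation. The class name `TokenizeSketch`, the method `mapLine`, and the sample sentence are hypothetical and are not part of WordCount.java; only the `StringTokenizer` loop mirrors the listing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-alone sketch (no Hadoop needed): mimics the loop inside
// TokenizerMapper.map() for a single line of input.
class TokenizeSketch {

  // Returns the pairs the mapper would emit for one line, written as "word<TAB>1".
  static List<String> mapLine(String line) {
    List<String> emitted = new ArrayList<>();
    // Default delimiters are space, tab, newline, carriage return and form feed.
    StringTokenizer itr = new StringTokenizer(line);
    while (itr.hasMoreTokens()) {
      emitted.add(itr.nextToken() + "\t1");
    }
    return emitted;
  }

  public static void main(String[] args) {
    // The sample line is made up for illustration.
    for (String pair : mapLine("Hadoop counts words, hadoop counts")) {
      System.out.println(pair);
    }
  }
}
```

Because the tokenizer splits on whitespace only, the token "words," keeps its comma, and "Hadoop" and "hadoop" are counted as different words. If that is not wanted, the text would have to be normalized before tokenizing, which the listing above does not do.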

- Prepare the *.java file (for example WordCount.java, Part 2 of 2) to be compiled into a *.jar:

      // Reducer: sums all counts received for the same word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {

Source Code 4.35 File *.java Part 2

        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        // The reducer is reused as a combiner to pre-sum counts on the map side.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // args[0] is the HDFS input path, args[1] the HDFS output path.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Source Code 4.36 File *.java Part 2 Cont

- The file “WordCount.java”:

Figure 4.70 View of the WordCount.java File
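The listing registers IntSumReducer both as the combiner and as the reducer. The sketch below illustrates, in memory and without Hadoop, why that works for a sum: partial counts computed per input split can be added up again without changing the totals. The class `CombinerSketch`, its method names, and the two sample splits are assumptions made for illustration, not part of WordCount.java.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical in-memory sketch (no Hadoop needed) of the map -> combine -> reduce flow.
class CombinerSketch {

  // Map plus local combine for one input split: word -> partial count.
  static Map<String, Integer> mapAndCombine(List<String> lines) {
    Map<String, Integer> partial = new TreeMap<>();
    for (String line : lines) {
      StringTokenizer itr = new StringTokenizer(line);
      while (itr.hasMoreTokens()) {
        partial.merge(itr.nextToken(), 1, Integer::sum);
      }
    }
    return partial;
  }

  // Reduce: sums the partial counts for each word, as IntSumReducer does.
  static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
    Map<String, Integer> total = new TreeMap<>();
    for (Map<String, Integer> partial : partials) {
      partial.forEach((word, count) -> total.merge(word, count, Integer::sum));
    }
    return total;
  }

  public static void main(String[] args) {
    // Two made-up splits standing in for two input files.
    Map<String, Integer> splitA = mapAndCombine(Arrays.asList("big data", "big cluster"));
    Map<String, Integer> splitB = mapAndCombine(Arrays.asList("data node"));
    System.out.println(reduce(Arrays.asList(splitA, splitB)));
  }
}
```

Running the sketch prints `{big=2, cluster=1, data=2, node=1}`, the same totals that counting all three lines in one pass would give. An operation that is not associative and commutative (an average, for instance) could not reuse its reducer as a combiner in this way.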

- Compile WordCount.java into a *.jar:
- Do the following:

    nidos@master:/usr/local/hadoop$ bin/hdfs com.sun.tools.javac.Main WordCount.java
    nidos@master:/usr/local/hadoop$

Figure 4.71 View of WordCount.java in the Folder

- Result; package the compiled class files into wc.jar:

    nidos@master:/usr/local/hadoop$ jar cf wc.jar WordCount*.class

Figure 4.72 Result of nidos@master:/usr/local/hadoop$ jar cf wc.jar WordCount*.class

- Copy the file /home/nidos/Desktop/data/a.txt to /user/nidos/wordcount/input and run the word count process on the document file:

When using the hdfs command, use dfs; when using the hadoop command, use fs.

    nidos@master:/usr/local/hadoop$ bin/hdfs dfs -copyFromLocal /home/nidos/Desktop/data/a.txt /user/nidos/wordcount/input

If the output folder already exists, it is better to write to a different output folder, for example “output2”.

    nidos@master:/usr/local/hadoop$ bin/hadoop jar wc.jar WordCount /user/nidos/wordcount/input/a.txt /user/nidos/wordcount/output
    nidos@master:/usr/local/hadoop$ bin/hdfs dfs -ls /user/nidos/wordcount/output
    Found 2 items
    -rw-r--r--   3 nidos supergroup          0 2016-12-05 08:29 /user/nidos/wordcount/output/_SUCCESS
    -rw-r--r--   3 nidos supergroup       1189 2016-12-05 08:29 /user/nidos/wordcount/output/part-r-00000
    nidos@master:/usr/local/hadoop$ bin/hdfs dfs -cat /user/nidos/wordcount/output/part*

Source Code 4.37 Running the Word Count Process

    nidos@master:/usr/local/hadoop$ bin/hdfs dfs -cat /user/nidos/wordcount/output/part*

Figure 4.73 Output of nidos@master:/usr/local/hadoop$ bin/hdfs dfs -cat /user/nidos/wordcount/output/part*
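The job above does not set an output format, so the part-r-00000 file should be in Hadoop's default text output layout: one line per word, with the word and its count separated by a tab. Assuming that layout, the following sketch shows how such lines could be read back into a map for further processing. The class `OutputParseSketch` and the sample lines are hypothetical; the real contents depend on a.txt.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: parses "word<TAB>count" lines such as those printed by
// bin/hdfs dfs -cat .../part* into a map, keeping the order of the file.
class OutputParseSketch {

  static Map<String, Integer> parse(List<String> lines) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String line : lines) {
      int tab = line.indexOf('\t');
      if (tab < 0) {
        continue; // skip blank or malformed lines
      }
      String word = line.substring(0, tab);
      int count = Integer.parseInt(line.substring(tab + 1).trim());
      counts.put(word, count);
    }
    return counts;
  }

  public static void main(String[] args) {
    // Made-up sample lines; in practice they would come from the job output.
    List<String> sample = Arrays.asList("Hadoop\t1", "counts\t2", "hadoop\t1");
    parse(sample).forEach((word, count) -> System.out.println(word + " -> " + count));
  }
}
```

Splitting at the first tab is safe here because the mapper tokenizes on whitespace, so a word emitted by this job cannot itself contain a tab.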

Figure 4.74 Browse Directory in Firefox

Figure 4.75 Browse Directory in Firefox

Figure 4.76 File Information in Firefox

- Prepare a file, for example b.txt, copy /home/nidos/Desktop/data/b.txt to /user/nidos/wordcount/input, and run the word count process on the document file:
- Do the following:

    nidos@master:/usr/local/hadoop$ bin/hdfs dfs -copyFromLocal /home/nidos/Desktop/data/b.txt /user/nidos/wordcount/input

Run the JAR for word count on a single file in a folder (for example the file b.txt):

    nidos@master:/usr/local/hadoop$ bin/hadoop jar wc.jar WordCount /user/nidos/wordcount/input/b.txt /user/nidos/wordcount/output2

Or run the JAR for word count on all files in a folder (the files a.txt and b.txt):

    nidos@master:/usr/local/hadoop$ bin/hadoop jar wc.jar WordCount /user/nidos/wordcount/input/ /user/nidos/wordcount/output2

How to delete an HDFS folder (for example, delete the folder /user/nidos/wordcount/output):

    nidos@master:/usr/local/hadoop$ hadoop fs -rm -r -f /user/nidos/wordcount/output
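A MapReduce job normally refuses to start when its output folder already exists, which is why this section either writes to “output2” or deletes the old folder first. As a small illustration of the first option, the sketch below picks the first unused name in the sequence output, output2, output3, and so on. The class `OutputNameSketch`, the method `nextFree`, and the use of an in-memory set in place of a real HDFS lookup are all assumptions; in an actual driver the existence check would more plausibly go through Hadoop's FileSystem API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical helper: chooses an output folder name that is not taken yet.
class OutputNameSketch {

  // Returns base if it is free, otherwise base2, base3, ... (first free one).
  static String nextFree(String base, Predicate<String> exists) {
    if (!exists.test(base)) {
      return base;
    }
    int n = 2;
    while (exists.test(base + n)) {
      n++;
    }
    return base + n;
  }

  public static void main(String[] args) {
    // Stand-in for HDFS: pretend only the first output folder exists.
    Set<String> existing = new HashSet<>(Arrays.asList("/user/nidos/wordcount/output"));
    System.out.println(nextFree("/user/nidos/wordcount/output", existing::contains));
  }
}
```

With the set shown, the sketch prints `/user/nidos/wordcount/output2`, matching the folder name used in the commands above. Deleting the old folder with `-rm -r -f` remains the simpler choice when the previous results are no longer needed.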