Editor + GUI for Spark Java / Spark Scala / PySpark

8.1 Editor + GUI for Spark Java / Spark Scala / PySpark

With an editor plus a GUI, the hope is that developers will find it easier to write code from scratch in an IDE environment, and that implementing algorithms to solve any kind of problem becomes faster, more comfortable, and more professional.

Figure 8.1 Get Eclipse OXYGEN

Java, Scala, Python, R, etc. are just a few of the available programming languages, each with its own strengths and limitations. Choose wisely which programming language you will use for Big Data analytics, according to your style as a developer.

Figure 8.2 The Java / Scala / Python / R languages

8.1.1 Install Sublime Text

- Type the following commands:

sudo add-apt-repository ppa:webupd8team/sublime-text-3
sudo apt-get update
sudo apt-get install sublime-text-installer
sudo ln -s /usr/lib/sublime-text-3/sublime_text /usr/local/bin/sublime

8.1.2 Eclipse + Spark Standalone (Java EE)

- Link to the Spark Standalone word count code: https://goo.gl/DNMsNG

If the error “Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration” appears, replace the code

SparkSession spark = SparkSession.builder().appName("JavaWordCount").getOrCreate();

with

SparkSession spark = SparkSession.builder()
    .appName("JavaWordCount")
    .config("spark.master", "local[*]")
    .getOrCreate();

8.1.3 Eclipse + Spark + Scala IDE + Maven

- Install the Scala IDE in Eclipse. Before opening Eclipse, type the following commands:

nidos@Master:~$ su hduser
hduser@Master:/home/nidos$ cd
hduser@Master:~$ sudo chmod 777 -R /home/hduser/eclipse-workspace/
hduser@Master:~$ sudo chmod 777 -R /home/hduser/eclipse

Click Help, then choose “Install New Software”:

In “Work with”, enter “http://download.scala-ide.org/sdk/lithium/e47/scala212/stable/site”, then click Add

Enter a name, e.g. “Scala IDE”, then click OK

Click Select All, then click Next

Select the accept option, then click Finish

Wait a while until the installation finishes

Click “Install Anyway”

Click “Restart Now”

Open the Scala perspective: click Other

The Scala IDE has been installed successfully

After clicking Open

Check File → New

- Exercise 1: “HelloScala.scala”. Check File → New and, for example, try creating a “Scala Object” named “HelloScala”

package com.nidos.myscala

object HelloScala {
  def main(args: Array[String]) {
    println("Hello my Scala")
  }
}

Run it using the Run Configuration shown above.

- Exercise 2: a Scala Spark project with Maven

Right-click in the “Package Explorer” → New → Project

Choose “Maven Project”, then click Next

Click Next

Fill in the fields, for example as follows, then click Finish

Wait a while

Right-click “mysparkexample”, choose Configure, then click “Add Scala Nature”

The result of “Add Scala Nature”

Right-click “mysparkexample”, then choose Properties

Click “Java Build Path”, then click the Source tab

Click “Add Folder”, click “main”, then click “Create New Folder”

Enter the folder name, e.g. “scala”, then click Next

Click Add

Enter “**/*.scala”, then click OK

Click Finish

Click OK

Click “Apply and Close”

In the “Package Explorer”, “src/main/scala” now appears under the “mysparkexample” project

In the “mysparkexample” project, right-click “src/main/scala”, then click “Package”

Enter a name, e.g. “com.nidos.mysparkexample”, then click Finish

The package now appears

Right-click the package “com.nidos.mysparkexample”, then click “Scala Object”

Enter a name, e.g. “WordCount”, then click Finish. Code link: https://goo.gl/ootdZN

Create the main method: type “main”, then press Ctrl+Space

Configure the “pom.xml” file for Spark: add the dependencies below, after line 17

<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>2.10.4</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.4.1</version>
</dependency>

Wait a while until it finishes

Check the Spark auto-format

When running, do not choose “Java Application”

or the error “Error: Could not find or load main class com.nidos.spark.mysparkexample.WordCount” will appear

Switch to “Scala Application”

If the error “Error: Could not find or load main class com.nidos.spark.mysparkexample.WordCount” persists, try adding the line “package com.nidos.mysparkexample” at the top of the file, then run it again as a “Scala Application”

Set the argument “hdfs://localhost:9000/user/hduser/wordcount/input/input3.txt”, then click Run

The project runs successfully :D

8.1.4 Eclipse + Spark + Scala IDE + SBT

- Setting up a Spark dev environment using SBT and Eclipse. Type the following commands (http://www.scala-sbt.org/download.html):

nidos@Master:~$ echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list

nidos@Master:~$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823

Type the following commands:
nidos@Master:~$ sudo apt-get update
nidos@Master:~$ sudo apt-get install sbt

The sbt installation is complete

- For example, create an SBT project named “SparkSVM”. Type the following commands (code link: https://goo.gl/omA1ks):

nidos@Master:~$ cd ./eclipse-workspace
nidos@Master:~/eclipse-workspace$ mkdir SparkSVM
nidos@Master:~/eclipse-workspace$ cd ./SparkSVM/
nidos@Master:~/eclipse-workspace/SparkSVM$ mkdir -p src/{main,test}/{java,resources,scala}
nidos@Master:~/eclipse-workspace/SparkSVM$ mkdir lib project target
nidos@Master:~/eclipse-workspace/SparkSVM$

Check the folder structure, e.g. with Sublime. Type the following command:
nidos@Master:~/eclipse-workspace/SparkSVM$ subl .

Suppose we have already downloaded the reference project “Spark_kernel_svm-master” from the following link: https://goo.gl/j3dWL4

Type the following, then check it in Sublime:
nidos@Master:~/eclipse-workspace/SparkSVM$ cp /home/nidos/Downloads/Spark_kernel_svm-master/build.sbt ./

Do not forget to change the name setting to name := "SparkSVM"

Create a plugins.sbt file in the “SparkSVM/project” folder. Type the following commands, then check in Sublime:
nidos@Master:~/eclipse-workspace/SparkSVM$ cd project/
nidos@Master:~/eclipse-workspace/SparkSVM/project$ touch plugins.sbt

Put the line addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "5.2.4") into the plugins.sbt file, then save

Check the folder structure again. Type the following commands:
nidos@Master:~/eclipse-workspace/SparkSVM/project$ cd ..
nidos@Master:~/eclipse-workspace/SparkSVM$ find

Run SBT. Type the following commands:
nidos@Master:~/eclipse-workspace/SparkSVM$ ls
nidos@Master:~/eclipse-workspace/SparkSVM$ sbt

Wait a while

The dependency files from SBT have been created successfully

Click File, then click Open Tab

To check “.classpath” and the other generated files, type the following:
nidos@Master:~$ cd ./eclipse-workspace/SparkSVM/
nidos@Master:~/eclipse-workspace/SparkSVM$ ls
build.sbt  lib  project  src  target
nidos@Master:~/eclipse-workspace/SparkSVM$ ls -a
.  ..  build.sbt  .classpath  lib  project  .project  .settings  src  target
nidos@Master:~/eclipse-workspace/SparkSVM$

Open Eclipse, click File, then click Import

Choose General, click “Existing Projects into Workspace”, then click Next

Click Browse

Browse to the folder “/home/nidos/eclipse-workspace”, select “SparkSVM”, then click OK

Click Finish

The “SparkSVM” project is now ready for the program code to be copied in from the reference project

Copy the following three code files from the reference project into the project

Prepare the dataset, e.g. “iris3.txt”, in a folder such as “/home/nidos/eclipse-workspace/SparkSVM”

Run the “SparkSVM” project by right-clicking “main.scala”, choosing “Run As”, then clicking “Scala Application”. If “result.txt” does not appear yet:

Set up the “SparkSVM” run from the program code itself, by filling in args(0) directly

In main.scala, replace the following code:

if (args.length != 1) {
  println("Usage: /path/to/spark/bin/spark-submit --packages amplab:spark-indexedrdd:0.1 " +
    "target/scala-2.10/spark-kernel-svm_2.10-1.0.jar <data file>")
  sys.exit(1)
}
val logFile = "README.md" // Should be some file on your system
//val conf = new SparkConf().setAppName("KernelSVM Test")

with:

val args = Array.fill(1)("")
val logFile = "README.md" // Should be some file on your system
val conf = new SparkConf()
conf.setAppName("SparkSVM")
conf.setMaster("local[*]")
val sc = new SparkContext(conf)
args(0) = "file:///home/nidos/eclipse-workspace/SparkSVM/iris3.txt"

Run the “SparkSVM” project again by right-clicking “main.scala”, choosing “Run As”, then clicking “Scala Application”

The “SparkSVM” project runs successfully

The output of the “SparkSVM” run, the file “result.txt”, has also appeared

The contents of “result.txt”

8.1.5 Eclipse + PySpark + PyDev

- Setting up Eclipse + PySpark + PyDev. Follow these steps:

Click Help, then choose “Install New Software”:

In “Work with”, enter “http://www.pydev.org/updates”, then press Enter

Click Select All, then click Next

Click Next

Choose “I accept ..”, then click Finish

Wait a while for “Installing Software..”

Click “Install Anyway”

Click “Restart Now”

Open the PyDev perspective: click Other

PyDev has been installed successfully

After clicking Open, click File → New

Type:
hduser@Master:~$ sudo gedit ~/.bashrc

Make sure your “.bashrc” file already contains:

..
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export JRE_HOME=/usr/lib/jvm/java-8-oracle/jre
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"
export HADOOP_CLASSPATH=/usr/lib/jvm/java-8-oracle/lib/tools.jar

export SPARK_HOME=/home/hduser/spark-2.2.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
export PATH=$PATH:$SPARK_HOME/bin/pyspark
export XDG_RUNTIME_DIR=""

# Add the PySpark classes to the Python path:
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH

export MAHOUT_HOME=/usr/local/mahout
export PATH=$PATH:$MAHOUT_HOME/bin
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop

# added by Anaconda2 4.4.0 installer
export PATH="/home/hduser/anaconda/bin:$PATH"

- Exercise 1: “HelloPySparkOnEclipse”. Check File → New and, for example, try creating a “PySpark On Eclipse” module named “HelloPySparkOnEclipse”

Click “Click here to configure an interpreter not listed”

Click “Quick Auto-Config”, then click “Libraries”

Click “New Folder”

Choose the python folder in the Spark home, e.g. “/home/hduser/spark-2.2.0-bin-hadoop2.7/python”, then click “OK”

The python folder from the Spark home has been added; now click “New Egg/Zip(s)”

Go to the directory “/home/hduser/spark-2.2.0-bin-hadoop2.7/python/lib”, change the filter from “*.egg” to “*.zip”, then click OK.

Choose “/home/hduser/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip”, then click OK

The file “/home/hduser/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip” has been added successfully

Click Apply

Click the “Environment” tab, then click New

Enter the Name “SPARK_HOME” and the Value “/home/hduser/spark-2.2.0-bin-hadoop2.7”, then click OK

“SPARK_HOME” with the Value “/home/hduser/spark-2.2.0-bin-hadoop2.7” has been added successfully

Enter the Name “PYSPARK_SUBMIT_ARGS” and the Value “--master local[*] --queue PyDevSpark2.2.0 pyspark-shell”, then click OK

Enter the Name “SPARK_CONF_DIR” and the Value “/home/hduser/spark-2.2.0-bin-hadoop2.7/conf”, then click OK


Click Apply

Click Finish

Create a source folder

Right-click “src” → New → choose PyDev Module

Give it a name, e.g. “wordc”, then click Finish

Enter the code, for example from “https://goo.gl/Fu6geJ” (a sketch of such a script is shown below)
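The exact contents of the linked file may differ; as a reference point, here is a minimal PySpark word count sketch in the same spirit. The input path and the SparkContext settings are assumptions, not taken from the linked file:

# wordc.py-style sketch (illustrative; adjust the input path to your setup)
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("wordc").setMaster("local[*]")
sc = SparkContext(conf=conf)

# Read a text file from HDFS
lines = sc.textFile("hdfs://localhost:9000/user/hduser/wordcount/input/input.txt")

# Split lines into words, pair each word with 1, then sum the counts per word
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

for word, count in counts.collect():
    print("%s %d" % (word, count))

sc.stop()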

Right-click “wordc.py”, then choose “Python Run”

If the following error appears, start your Hadoop services first:
hduser@Master:~$ start-all.sh

Then right-click “wordc.py” again and choose “Python Run”

The file “wordc.py” runs successfully
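If you prefer not to configure SPARK_HOME and PYTHONPATH through the IDE dialogs, the same environment can also be set at the top of the script, before pyspark is imported. A minimal sketch, assuming the Spark paths used throughout this chapter:

# Set the Spark environment programmatically (paths are assumptions)
import os
import sys

os.environ["SPARK_HOME"] = "/home/hduser/spark-2.2.0-bin-hadoop2.7"
sys.path.insert(0, "/home/hduser/spark-2.2.0-bin-hadoop2.7/python")
sys.path.insert(0, "/home/hduser/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip")

# Import only after the paths are set
from pyspark import SparkContext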

8.1.6 PySpark + PyCharm

- Setting up a PySpark dev environment using PyCharm. Follow these steps:

Download PyCharm and extract it, e.g. into the folder /usr/local/

Before running PyCharm, start Hadoop first. Type:
hduser@Master:/home/nidos$ start-all.sh

Then run pyspark, and only after that run PyCharm. Type:
hduser@Master:/home/nidos$ pyspark

Run PyCharm. Type:
hduser@Master:~$ cd /usr/local/pycharm/bin/
hduser@Master:/usr/local/pycharm/bin$ ls
format.sh  fsnotifier  fsnotifier64  fsnotifier-arm  idea.properties  inspect.sh  log.xml  printenv.py  pycharm.png  pycharm.sh  pycharm.vmoptions  pycharm64.vmoptions  restart.py
hduser@Master:/usr/local/pycharm/bin$ ./pycharm.sh

Click Create New Project

Enter the project name in Location, e.g. “/home/hduser/PycharmProjects/pySpark”, then click Create

The “pySpark” project view

In the “pySpark” project, create a file, e.g. “WordCount.py”

If an error appears

Edit the run configuration (“Edit Configurations”)

In Environment Variables, press the plus button, then add SPARK_HOME, e.g. “/home/hduser/spark-2.2.0-bin-hadoop2.7”

and PYTHONPATH, e.g. “/home/hduser/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip:/home/hduser/spark-2.2.0-bin-hadoop2.7/python”

Configuring the Environment Variables can also be done by copying the following lines and clicking the paste icon (then click OK, click Apply, click OK):

SPARK_HOME=/home/hduser/spark-2.2.0-bin-hadoop2.7
PYTHONPATH=/home/hduser/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip:/home/hduser/spark-2.2.0-bin-hadoop2.7/python

Type:
hduser@Master:~$ sudo gedit ~/.bashrc

Make sure your “.bashrc” file already contains:

..
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export JRE_HOME=/usr/lib/jvm/java-8-oracle/jre
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"
export HADOOP_CLASSPATH=/usr/lib/jvm/java-8-oracle/lib/tools.jar

export SPARK_HOME=/home/hduser/spark-2.2.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
export PATH=$PATH:$SPARK_HOME/bin/pyspark
export XDG_RUNTIME_DIR=""

# Add the PySpark classes to the Python path:
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH

export MAHOUT_HOME=/usr/local/mahout
export PATH=$PATH:$MAHOUT_HOME/bin
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop

# added by Anaconda2 4.4.0 installer
export PATH="/home/hduser/anaconda/bin:$PATH"
..

The file “wordc.py” runs successfully; code link: “https://goo.gl/Fu6geJ” or “https://goo.gl/QZiLyX”

To hide the “Setting default log level to "WARN".” message, go to the Spark directory:
hduser@Master:~$ cd ./spark-2.2.0-bin-hadoop2.7/
hduser@Master:~/spark-2.2.0-bin-hadoop2.7$ cd ./conf/
hduser@Master:~/spark-2.2.0-bin-hadoop2.7/conf$ ls
hduser@Master:~/spark-2.2.0-bin-hadoop2.7/conf$ sudo cp ./log4j.properties.template ./log4j.properties

Replace the entire contents of “log4j.properties” with the file from the link “https://goo.gl/GiWCfy”. Restart pySpark by pressing Ctrl+D or typing quit() and pressing Enter

Then run the *.py file again

The “Setting default log level to "WARN"” message has been hidden successfully
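Alternatively, the log level can be lowered from inside the script itself, without editing log4j.properties. A small sketch:

# Lower Spark's log level programmatically, for this application only
from pyspark import SparkContext

sc = SparkContext("local[*]", "quiet-app")
# Valid levels include ALL, DEBUG, INFO, WARN, ERROR, OFF
sc.setLogLevel("ERROR")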

How to copy Environment Variables, 1 of 3. Suppose you have just created a new file named “hdfs_wordcount.py”; its Environment Variables are usually not yet complete, as in the following figure

How to copy Environment Variables, 2 of 3. The solution: fill them in by copying, e.g. from the Environment Variables of “wordc.py”: click it, select the configuration block, then click the Copy icon

How to copy Environment Variables, 3 of 3. Then click “hdfs_wordcount.py” again, choose Environment Variables, click the Paste icon, click OK, click Apply, click OK

The streaming code hdfs_wordcount.py runs successfully. Code link: “https://goo.gl/vY6f4E”

Because this code is a streaming job, it continually looks for new files to process with the word count logic at every interval (e.g. every second)

The streaming process: at every interval the code runs a word count whenever new data arrives in HDFS, e.g. under /user/hduser/wordcount/input. In the green part there is no processing yet, because no new data has arrived at that HDFS path

The contents of “/user/hduser/wordcount/input” in HDFS, still without any new incoming data (still empty)

Suppose we have set ssc.awaitTerminationOrTimeout(60*10) so that the streaming process runs for 10 minutes; then run the streaming code hdfs_wordcount.py (a sketch of such a job is shown below)
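The linked file remains the authoritative version; as an illustration of the shape of such a job, here is a minimal PySpark streaming word count over an HDFS directory. The batch interval and the paths are assumptions:

# hdfs_wordcount.py-style sketch (illustrative; paths and interval are assumptions)
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext("local[*]", "HDFSWordCount")
ssc = StreamingContext(sc, 1)  # 1-second batch interval

# Watch an HDFS directory; every new file becomes a batch of lines
lines = ssc.textFileStream("hdfs://localhost:9000/user/hduser/wordcount/input")
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))
counts.pprint()  # print each batch's counts to the console

ssc.start()
ssc.awaitTerminationOrTimeout(60 * 10)  # run for at most 10 minutes
ssc.stop()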

While the code is running, to trigger the word count, put any text file into /user/hduser/wordcount/input in HDFS, e.g. the files “input.txt” and “input2.txt” from the following link: “https://goo.gl/6d7CWQ”

While the code is running, to trigger the word count, upload the file to /user/hduser/wordcount/input in HDFS: in Hue, click Upload, then click “Select files”

Click Open

The Hue view after clicking Open

Quickly switch back to PyCharm; the word count results for “input.txt” will appear

For example, run the word count again for the second file: upload it to /user/hduser/wordcount/input in HDFS via Hue (click Upload, then click “Select files”)

Click Open

The Hue view after clicking Open

Quickly switch back to PyCharm; the word count results for “input2.txt” will appear

Second streaming word count. Code link: “https://goo.gl/cnGuHo”

Copy all the files into the project in PyCharm, e.g. a project named “pySparkWordCount”

The result of copying all the files into the “pySparkWordCount” project in PyCharm.

1. Run “streaming.py”, then 2. run “file.py” in the terminal

1. “streaming.py” is now active; 2. run “file.py” in the terminal: type as shown below, then press Enter

1. “streaming.py” is active and has processed the word count, and 2. “file.py” is active in the terminal, simulating the creation of the streaming files log{}.txt (a sketch of such a simulator is shown below)
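The actual file.py ships with the linked code; as a rough illustration, here is a minimal simulator that writes a numbered log{}.txt file every few seconds into the directory watched by the streaming job. The directory, the interval, and the file contents are assumptions:

# file.py-style sketch (illustrative; the real file.py is in the linked code)
import time

# Directory watched by the streaming job; adjust to your setup,
# and make sure it exists before running
target_dir = "/home/hduser/PycharmProjects/pySparkWordCount/input"

for i in range(10):
    # Each new file becomes one streaming batch
    with open("{}/log{}.txt".format(target_dir, i), "w") as f:
        f.write("hello spark streaming hello world\n")
    time.sleep(3)  # wait before producing the next file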

Third streaming word count (pySpark Streaming (HDFS file) & PyCharm). Code link: “”

Create a new project in PyCharm, e.g. named “pySparkWordCountHDFS”

Or create a package under PycharmProjects, e.g. a package named “pySparkWordCountHDFS”

Delete the “__init__.py” file

Copy the files from “pySparkWordCountLocal” to “pySparkWordCountHDFS”

The result of copying all the files into the project in PyCharm (here “pySparkWordCountHDFS”).

1. Run “streaming.py”, then 2. run “file.py” in the terminal

1. “streaming.py” is now active; 2. run “file.py” in the terminal: type as shown below, then press Enter

1. “streaming.py” is active and has processed the word count, and 2. “file.py” is active in the terminal, simulating the creation of the streaming files log{}.txt

- pySpark Naive Bayes (from scratch) & PyCharm

Code link: “https://goo.gl/i9Cn5v”. Create a new package in PyCharm, e.g. named “pySparkNB”, and put the files “train_pos.txt”, “train_neg.txt”, “test_pos.txt”, “test_neg.txt” into HDFS at hdfs://localhost:9000/user/hduser/NB_files

The files “train_pos.txt”, “train_neg.txt”, “test_pos.txt”, “test_neg.txt” at hdfs://localhost:9000/user/hduser/NB_files, viewed in Hue (a sketch of the core computation is shown below)
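The linked code is the full version; as a rough illustration of what a from-scratch Naive Bayes on RDDs looks like, here is a compact two-class sketch. The tokenization, the Laplace smoothing, and the use of the positive test file are assumptions, not a summary of the linked implementation:

# From-scratch two-class Naive Bayes sketch (illustrative only)
import math
from pyspark import SparkContext

sc = SparkContext("local[*]", "pySparkNB")
base = "hdfs://localhost:9000/user/hduser/NB_files"

def word_counts(path):
    # (word, count) pairs for one class's training file
    return (sc.textFile(path)
              .flatMap(lambda line: line.lower().split())
              .map(lambda w: (w, 1))
              .reduceByKey(lambda a, b: a + b))

pos = dict(word_counts(base + "/train_pos.txt").collect())
neg = dict(word_counts(base + "/train_neg.txt").collect())
vocab_size = len(set(pos) | set(neg))
n_pos, n_neg = sum(pos.values()), sum(neg.values())

def log_prob(doc, counts, total):
    # Laplace-smoothed log-likelihood of a document under one class
    return sum(math.log((counts.get(w, 0) + 1.0) / (total + vocab_size))
               for w in doc.lower().split())

# Classify each line of the positive test file
for doc in sc.textFile(base + "/test_pos.txt").collect():
    label = "pos" if log_prob(doc, pos, n_pos) > log_prob(doc, neg, n_neg) else "neg"
    print("%s %s" % (label, doc[:40]))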

8.1.7 IntelliJ IDEA + SBT

- Setting up Scala Spark + SBT Dev Environment using IntelliJ IDEA

Download IntelliJ IDEA and extract it, e.g. into the folder /usr/local/

The downloaded and extracted folder

hduser@Master:~$ sudo chown hduser:hadoop -R /usr/local/idea-IC
hduser@Master:~$ sudo chmod 777 -R /usr/local/idea-IC

Edit the .bashrc file and add “export IBUS_ENABLE_SYNC_MODE=1”:
hduser@Master:~$ sudo subl ~/.bashrc
or use:
hduser@Master:~$ sudo gedit ~/.bashrc

hduser@Master:~$ source ~/.bashrc

Run IntelliJ IDEA:
hduser@Master:~$ cd /usr/local/idea-IC/bin/
hduser@Master:/usr/local/idea-IC/bin$ ./idea.sh
For example, choose “Do not import settings”, then click OK

Choose a theme

Check “Create a desktop ..” and “For all users ..”

Create Launcher Script

Click Next

Click Install Scala

Click Install and Enable -> Optional (it is best not to install IdeaVim)

Click “Start using IntelliJ IDEA”, 1 of 2

Click “Start using IntelliJ IDEA”, 2 of 2

Click “Create New Project”

Click Scala, choose SBT, then click Next

Enter a name, e.g. “MYSVMnBPPGD”

Since SBT is used, it is best to uncheck Sources under Scala, then click Finish

Click File, then choose “Project Structure...”

Click Modules; under “mysvmnbppgd”, click the “Dependencies” tab and set it as follows

Click Modules; under “mysvmnbppgd-build”, click the “Dependencies” tab, set it as follows, click Apply, then click OK

The view in IntelliJ IDEA

The view in the Linux file explorer (Nautilus)

Type the following so the folder is not locked:
hduser@Master:~$ cd ./ideaProject
hduser@Master:~/ideaProject$ sudo chmod 777 -R ./

Download the SVM program code from: https://goo.gl/TVMZGn (already includes the polynomial kernel) or https://goo.gl/ttW5c9 (without the polynomial kernel yet)

Copy the build.sbt file from the folder “/home/nidos/Download/BPPGD-master” to “/home/hduser/ideaProject/MYSVMnBPPGD”

Copy all the files from the folder “/home/nidos/Download/BPPGD-master/src/main/scala” to “/home/hduser/ideaProject/MYSVMnBPPGD/src/main/scala”

Click Replace

Then type again:
hduser@Master:~/ideaProject$ sudo chmod 777 -R ./

Go to “Modules”; under “mysvmnbppgd”, click “Dependencies”, then click the red “-” sign. Click Apply, then click OK

Go to “Project Structure”, then click Libraries

Click the red “-” sign, then click OK

Click Apply, then click OK

Then type again:
hduser@Master:~/ideaProject$ sudo chmod 777 -R ./

Open the “build.sbt” file in IntelliJ IDEA

Click “Enable auto-import”

Wait a while until all the dependency files have finished downloading

If an error like the following appears:

Error: Error while importing SBT project:
...
[error] public: unable to get resource for amplab#spark-indexedrdd;0.4.0: res=https://repo1.maven.org/maven2/amplab/spark-indexedrdd/0.4.0/spark-indexedrdd-0.4.0.pom: java.net.UnknownHostException: repo1.maven.org
[error] Spark Packages Repo: unable to get resource for amplab#spark-indexedrdd;0.4.0: res=http://dl.bintray.com/spark-packages/maven/amplab/spark-indexedrdd/0.4.0/spark-indexedrdd-0.4.0.pom: java.net.UnknownHostException: dl.bintray.com
[error] Repo at github.com/ankurdave/maven-repo: unable to get resource for amplab#spark-indexedrdd;0.4.0: res=https://raw.githubusercontent.com/ankurdave/maven-repo/master/amplab/spark-indexedrdd/0.4.0/spark-indexedrdd-0.4.0.pom: java.net.UnknownHostException: raw.githubusercontent.com
[error] unresolved dependency: com.ankurdave#part_2.10;0.1: Resolution failed several times for dependency: com.ankurdave#part_2.10;0.1 {compile=[default(compile)]}:
[error] public: unable to get resource for com/ankurdave#part_2.10;0.1: res=https://repo1.maven.org/maven2/com/ankurdave/part_2.10/0.1/part_2.10-0.1.pom: java.net.UnknownHostException: repo1.maven.org
[error] Spark Packages Repo: unable to get resource for com/ankurdave#part_2.10;0.1: res=http://dl.bintray.com/spark-packages/maven/com/ankurdave/part_2.10/0.1/part_2.10-0.1.pom: java.net.UnknownHostException: dl.bintray.com
[error] Repo at github.com/ankurdave/maven-repo: unable to get resource for com/ankurdave#part_2.10;0.1: res=https://raw.githubusercontent.com/ankurdave/maven-repo/master/com/ankurdave/part_2.10/0.1/part_2.10-0.1.pom: java.net.UnknownHostException: raw.githubusercontent.com
[error] unresolved dependency: org.scalatest#scalatest_2.11;2.2.4: Resolution failed several times for dependency: org.scalatest#scalatest_2.11;2.2.4 {test=[default(compile)]}:
[error] public: unable to get resource for org/scalatest#scalatest_2.11;2.2.4: res=https://repo1.maven.org/maven2/org/scalatest/scalatest_2.11/2.2.4/scalatest_2.11-2.2.4.pom: java.net.UnknownHostException: repo1.maven.org
[error] Spark Packages Repo: unable to get resource for org/scalatest#scalatest_2.11;2.2.4: res=http://dl.bintray.com/spark-packages/maven/org/scalatest/scalatest_2.11/2.2.4/scalatest_2.11-2.2.4.pom: java.net.UnknownHostException: dl.bintray.com
[error] Repo at github.com/ankurdave/maven-repo: unable to get resource for org/scalatest#scalatest_2.11;2.2.4: res=https://raw.githubusercontent.com/ankurdave/maven-repo/master/org/scalatest/scalatest_2.11/2.2.4/scalatest_2.11-2.2.4.pom: java.net.UnknownHostException: raw.githubusercontent.com
[error] unresolved dependency: org.scalacheck#scalacheck_2.11;1.12.2: Resolution failed several times for dependency: org.scalacheck#scalacheck_2.11;1.12.2 {test=[default(compile)]}:
[error] public: unable to get resource for org/scalacheck#scalacheck_2.11;1.12.2: res=https://repo1.maven.org/maven2/org/scalacheck/scalacheck_2.11/1.12.2/scalacheck_2.11-1.12.2.pom: java.net.UnknownHostException: repo1.maven.org
[error] Spark Packages Repo: unable to get resource for org/scalacheck#scalacheck_2.11;1.12.2: res=http://dl.bintray.com/spark-packages/maven/org/scalacheck/scalacheck_2.11/1.12.2/scalacheck_2.11-1.12.2.pom: java.net.UnknownHostException: dl.bintray.com
[error] Repo at github.com/ankurdave/maven-repo: unable to get resource for org/scalacheck#scalacheck_2.11;1.12.2: res=https://raw.githubusercontent.com/ankurdave/maven-repo/master/org/scalacheck/scalacheck_2.11/1.12.2/scalacheck_2.11-1.12.2.pom: java.net.UnknownHostException: raw.githubusercontent.com
[error] Total time: 9 s, completed Dec 16, 2017 1:27:20 PM
See the complete log in file:/home/hduser/.IdeaIC2017.2/system/log/sbt.last.log

The solution is to check your internet connection, e.g. by clicking “Wired connection 1”

All the dependency files from build.sbt have been downloaded successfully; ignore the warnings in the log

To run the program code, open main.scala, right-click, then choose “TestKernelSVM”

The program code builds successfully, and a “spark-submit ..” hint is shown in case you want to run it from the terminal:

Usage: /path/to/spark/bin/spark-submit --packages amplab:spark-indexedrdd:0.4.0 target/scala-2.11/ppackubuntu_2.11-1.0.jar <data file>

For the “VALIDATION” type, the file “main.scala” uses args(0), args(1), .., args(6):

- args(0): type
- args(1): trainingfile: path of the training set in libsvm format
- args(2): lambda: regularization term
- args(3): sigma: kernel parameter
- args(4): iterations: number of iterations
- args(5): outputfile: log file
- args(6): numfeatures: number of variables of the dataset

Example arguments:
VALIDATION file:///home/hduser/ideaProject/MYSVMnBPPGD/iris3.txt 0.8 1.0 10 result.txt 4

For the “TEST” type, the file “main.scala” uses args(0), args(1), .., args(6), args(7):

- args(0): type
- args(1): trainingfile: path of the training set in libsvm format
- args(2): lambda: regularization term
- args(3): sigma: kernel parameter
- args(4): iterations: number of iterations
- args(5): outputfile: log file
- args(6): numfeatures: number of variables of the dataset
- args(7): testingfile: path of the testing set in libsvm format

Example arguments:
args(0) = "TEST"
args(1) = "file:///home/hduser/ideaProject/MYSVMnBPPGD/iris3.txt"
args(2) = "0.8"
args(3) = "1.0"
args(4) = "20"
args(5) = "result.txt"
args(6) = "4"
args(7) = "file:///home/hduser/ideaProject/MYSVMnBPPGD/iristest3.txt"

How to set arguments in IntelliJ IDEA: click Run, then choose “Edit Configurations...”

How to set arguments in IntelliJ IDEA: in “Program arguments”, enter, for example, the following, then click OK, click Apply, click OK

VALIDATION file:///home/hduser/ideaProject/MYSVMnBPPGD/iris3.txt 0.8 1.0 10 result.txt 4

Run the program code again: right-click “main.scala”, then choose Run “TestKernelSVM”

The result of running the program code again

result.txt contains:
Training time: 10 Accuracy: 1.0 AUC: 1.0
Training time: 5 Accuracy: 1.0 AUC: 1.0
Training time: 4 Accuracy: 1.0 AUC: 1.0
Training time: 3 Accuracy: 1.0 AUC: 1.0
Training time: 2 Accuracy: 1.0 AUC: 1.0
Mean_Accuracy: 1.0 Mean_AUC: 1.0

The project view

Running without arguments (TEST type). Make sure the arguments in “Run → Edit Configurations.. → Program arguments” are empty

Then in main.scala, below the line “def main(args: Array[String]) {”, add the following code:

// for the TEST type
// val args = Array("", "", "", "", "", "", "", "")
// or:
val args = Array.fill(8)("")

Then set the arguments above the line “val action = args(0)”, for example:
args(0) = "TEST"
args(1) = "file:///home/hduser/ideaProject/MYSVMnBPPGD/iris3.txt"
args(2) = "0.8"
args(3) = "1.0"
args(4) = "20"
args(5) = "resultTest.txt"
args(6) = "4"
args(7) = "file:///home/hduser/ideaProject/MYSVMnBPPGD/iristest3.txt"

The result of running without arguments (TEST type)

8.1.8 Configuration & Error/Bug Solutions

- Ubuntu Desktop does not load, or removing the red minus icon

Type the following commands (press Ctrl+Alt+F1/F2/../F6, or go to /usr/share/applications/Xterm):
sudo apt-get install gnome-panel
sudo mv ~/.Xauthority ~/.Xauthority.backup
sudo apt-get install unity-tweak-tool
unity-tweak-tool --reset-unity

To restore a missing terminal (use xterm):
sudo apt-get remove gnome-terminal
sudo apt-get install gnome-terminal

nidos@Master:~$ sudo rm /var/cache/apt/archives/*.*
nidos@Master:~$ sudo rm -R /var/lib/dpkg/info
nidos@Master:~$ cd /var/lib/dpkg/
nidos@Master:/var/lib/dpkg$ sudo mkdir info
nidos@Master:~$ sudo apt-get clean

cat -n /etc/apt/sources.list
ls -la /etc/apt/sources.list.d
tail -v -n +1 /etc/apt/sources.list.d/*
sudo apt-get update
sudo apt-get upgrade
sudo apt-get --reinstall install python3-minimal

- How to create an Eclipse icon in Ubuntu

Type the following commands:
hduser@Master:~$ sudo mkdir ~/.local/share/applications
hduser@Master:~$ sudo chmod 777 -R ~/.local/share/applications
hduser@Master:~$ sudo gedit ~/.local/share/applications/opt_eclipse.desktop

[Desktop Entry]
Type=Application
Name=Eclipse
Comment=Eclipse Integrated Development Environment
# adjust the two paths below to where you installed Eclipse
Icon=/home/hduser/eclipse/jee-...
Exec=/home/hduser/eclipse/jee-...
Terminal=false
Categories=Development;IDE;Java;
StartupWMClass=Eclipse

Type the following commands:
hduser@Master:~$ sudo chmod 777 -R ~/.local/share/applications
hduser@Master:~$ sudo nautilus ~/.local/share/applications

Copy the Eclipse icon, then paste it on the Desktop

The result after pasting on the Desktop. Do not forget to type the following:
hduser@Master:~$ cd /usr/local/hadoop
hduser@Master:/usr/local/hadoop$ bin/hadoop fs -chmod -R 777 /

*So that hduser's HDFS can also be used by other users, e.g. nidos; then, when Eclipse is run from nidos's Desktop, the data processing results can still be stored in hduser's HDFS.

Try running it from the Desktop

- Solution if an error appears while configuring the Python interpreter in PyCharm

Try typing the following:
nidos@Master:~$ sudo apt-get install python-setuptools
nidos@Master:~$ sudo apt-get install python-pip python-dev build-essential
nidos@Master:~$ sudo pip install setuptools --upgrade