8.1 Editor + GUI for Spark Java / Spark Scala / PySpark
With an editor plus a GUI, the goal of these two parts is to make it easier for developers to write code from scratch in an IDE environment, and to make implementing algorithms for any kind of problem faster, more comfortable, and more professional.
Figure 8.1 Get Eclipse OXYGEN
Java, Scala, Python, R, and so on are just a few of the available programming languages, each with its own strengths and limitations. Choose wisely which programming language you will use for Big Data analytics, according to your own style as a developer.
Figure 8.2 The Java / Scala / Python / R languages
8.1.1 Installing Sublime Text
- Type the following commands:
sudo add-apt-repository ppa:webupd8team/sublime-text-3
sudo apt-get update
sudo apt-get install sublime-text-installer
sudo ln -s /usr/lib/sublime-text-3/sublime_text /usr/local/bin/sublime
8.1.2 Eclipse + Spark Standalone (Java EE)
- Link to the Spark Standalone word-count code: https://goo.gl/DNMsNG
If the error “Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration” appears, replace the code

SparkSession spark = SparkSession.builder().appName("JavaWordCount")
    .getOrCreate();

with

SparkSession spark = SparkSession.builder()
    .appName("JavaWordCount")
    .config("spark.master", "local[*]")
    .getOrCreate();
8.1.3 Eclipse + Spark + Scala IDE + Maven
- Install the Scala IDE in Eclipse. Before opening Eclipse, type the following commands:
nidos@Master:~$ su hduser
hduser@Master:/home/nidos$ cd
hduser@Master:~$ sudo chmod 777 -R /home/hduser/eclipse-workspace/
hduser@Master:~$ sudo chmod 777 -R /home/hduser/eclipse
Click Help, then choose “Install New Software”:
In “Work with”, enter “http://download.scala-ide.org/sdk/lithium/e47/scala212/stable/site”, then click Add
Enter a name, e.g. “Scala IDE”, then click OK
Click Select All, then click Next
Select accept, then click Finish
Wait a few moments until the installation completes
Click “Install Anyway”
Click “Restart Now”
Open the Scala perspective: click Other
The Scala IDE has been installed successfully
After clicking Open
Check File → New
- Exercise 1: “HelloScala.scala”. Via File → New, try creating a “Scala Object” named “HelloScala”

package com.nidos.myscala

object HelloScala {
  def main(args: Array[String]) {
    println("Hello my Scala")
  }
}

Run it using the Run Configuration above.
- Exercise 2: a Scala Spark project with Maven
Right-click in “Package Explorer” → New → Project
Choose “Maven Project”, then click Next
Click Next
Fill in the fields, for example as follows, then click Finish
Wait a few moments
Right-click “mysparkexample”, choose Configure, then click “Add Scala Nature”
The result of “Add Scala Nature”
Right-click “mysparkexample”, choose Properties
Click “Java Build Path”, then the Source tab
Click “Add Folder”, click “main”, then click “Create New Folder”
Enter the folder name, e.g. “scala”, then click Next
Click Add
Enter “**/*.scala”, then click OK
Click Finish
Click OK
Click “Apply and Close”
In “Package Explorer”, in the “mysparkexample” project, “src/main/scala” has now appeared
In the “mysparkexample” project, right-click “src/main/scala”, then click “Package”
Enter a name, e.g. “com.nidos.mysparkexample”, then click Finish
The package has appeared
Right-click the package “com.nidos.mysparkexample”, then click “Scala Object”
Enter a name, e.g. “WordCount”, then click Finish. Code link: https://goo.gl/ootdZN
Create the main method: type “main”, then press Ctrl+Space
Configure the “pom.xml” file for Spark: add the dependencies below after line 17

<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>2.10.4</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.4.1</version>
</dependency>
Wait a few moments until it finishes
Check Spark's auto-formatting
When running, do not choose “Java Application”,
or the error “Error: Could not find or load main class com.nidos.spark.mysparkexample.WordCount” will appear
Switch to “Scala Application”
If the error “Error: Could not find or load main class com.nidos.spark.mysparkexample.WordCount” persists, try adding the line “package com.nidos.mysparkexample” and run it directly as a “Scala Application”
Set the argument to “hdfs://localhost:9000/user/hduser/wordcount/input/input3.txt”, then click Run
The project runs successfully :D
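For reference, the word-count logic that this project distributes over Spark can be sketched in plain Python, with no cluster needed; this is only an illustration of the map/reduce-by-key idea, not the code at the link above, and the sample text is made up:

```python
# Minimal sketch of the word-count logic that Spark's map/reduceByKey
# steps implement, in plain Python.
def word_count(lines):
    counts = {}
    for line in lines:
        for word in line.split():                    # "map": one token at a time
            counts[word] = counts.get(word, 0) + 1   # "reduceByKey": sum per word
    return counts

if __name__ == "__main__":
    sample = ["hello spark hello", "spark scala"]
    print(word_count(sample))
```

Spark performs exactly these two steps, but partitions the input lines across executors and merges the per-partition counts.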
8.1.4 Eclipse + Spark + Scala IDE + SBT
- Setting up a Spark dev environment using SBT and Eclipse. Type the following commands (http://www.scala-sbt.org/download.html):
nidos@Master:~$ echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
nidos@Master:~$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823
Type the following commands:
nidos@Master:~$ sudo apt-get update
nidos@Master:~$ sudo apt-get install sbt
The sbt installation is complete
- For example, create an SBT project named “SparkSVM”. Type the following commands (code link: https://goo.gl/omA1ks):
nidos@Master:~$ cd ./eclipse-workspace
nidos@Master:~/eclipse-workspace$ mkdir SparkSVM
nidos@Master:~/eclipse-workspace$ cd ./SparkSVM/
nidos@Master:~/eclipse-workspace/SparkSVM$ mkdir -p src/{main,test}/{java,resources,scala}
nidos@Master:~/eclipse-workspace/SparkSVM$ mkdir lib project target
Check the folder structure, e.g. with Sublime. Type the following command:
nidos@Master:~/eclipse-workspace/SparkSVM$ subl .
Suppose we have already downloaded the reference project “Spark_kernel_svm-master” from the following link: https://goo.gl/j3dWL4
Type the following, then check it in Sublime (don't forget to change the name to name := "SparkSVM"):
nidos@Master:~/eclipse-workspace/SparkSVM$ cp /home/nidos/Downloads/Spark_kernel_svm-master/build.sbt ./
Create a plugins.sbt file in the “SparkSVM/project” folder. Type the following, then check it in Sublime:
nidos@Master:~/eclipse-workspace/SparkSVM$ cd project/
nidos@Master:~/eclipse-workspace/SparkSVM/project$ touch plugins.sbt
Put the line addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "5.2.4") into plugins.sbt, then save it
Check the folder structure again. Type:
nidos@Master:~/eclipse-workspace/SparkSVM/project$ cd ..
nidos@Master:~/eclipse-workspace/SparkSVM$ find
Run SBT. Type:
nidos@Master:~/eclipse-workspace/SparkSVM$ ls
nidos@Master:~/eclipse-workspace/SparkSVM$ sbt
Wait a few moments
The SBT dependency files have been created
Click File, then click Open Tab
To check “.classpath” and the other generated files, type:
nidos@Master:~$ cd ./eclipse-workspace/SparkSVM/
nidos@Master:~/eclipse-workspace/SparkSVM$ ls
build.sbt  lib  project  src  target
nidos@Master:~/eclipse-workspace/SparkSVM$ ls -a
.  ..  build.sbt  .classpath  lib  project  .project  .settings  src  target
Open Eclipse, click File, then click Import
Choose General, click “Existing Projects into Workspace”, then click Next
Click Browse
Browse to the folder “/home/nidos/eclipse-workspace”, select “SparkSVM”, then click OK
Click Finish
The “SparkSVM” project is now ready to receive the program code from the reference project
Copy the following 3 code files from the reference project into the project
Prepare the dataset, e.g. “iris3.txt”, in a folder such as “/home/nidos/eclipse-workspace/SparkSVM”
Run the “SparkSVM” project by right-clicking “main.scala”, choosing “Run As”, and clicking “2 Scala Application”; at this point “result.txt” has not appeared yet
To set the run arguments of “SparkSVM” in code, fill in args(0) directly
In main.scala, replace the following code:

if (args.length != 1) {
  println("Usage: /path/to/spark/bin/spark-submit --packages amplab:spark-indexedrdd:0.1 " +
    "target/scala-2.10/spark-kernel-svm_2.10-1.0.jar <data file>")
  sys.exit(1)
}
val logFile = "README.md" // Should be some file on your system
//val conf = new SparkConf().setAppName("KernelSVM Test")

with:

val args = Array.fill(1)("")
val logFile = "README.md" // Should be some file on your system
val conf = new SparkConf()
conf.setAppName("SparkSVM")
conf.setMaster("local[*]")
val sc = new SparkContext(conf)
args(0) = "file:///home/nidos/eclipse-workspace/SparkSVM/iris3.txt"
Run the “SparkSVM” project again: right-click “main.scala”, choose “Run As”, then click “2 Scala Application”
The “SparkSVM” project runs successfully
The output of the run, the file “result.txt”, has also appeared
Contents of “result.txt”
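Since the reference project is a kernel SVM (“Spark_kernel_svm-master”), it helps to recall what the kernel functions involved compute. A minimal sketch in plain Python; the degree and coef0 values are illustrative defaults and need not match the project's actual code:

```python
def dot(x, y):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    """Linear kernel: K(x, y) = x . y"""
    return dot(x, y)

def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: K(x, y) = (x . y + coef0) ** degree"""
    return (dot(x, y) + coef0) ** degree
```

A kernel SVM replaces every dot product between samples with one of these functions, which lets it learn non-linear decision boundaries.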
8.1.5 Eclipse + PySpark + PyDev
- Setting up Eclipse + PySpark + PyDev. Follow these steps:
Click Help, then choose “Install New Software”:
In “Work with”, enter “http://www.pydev.org/updates”, then press Enter
Click Select All, then click Next
Click Next
Choose “I accept ..”, then click Finish
Wait a few moments for “Installing Software..”
Click “Install Anyway”
Click “Restart Now”
Open the PyDev perspective: click Other
PyDev has been installed successfully
After clicking Open, click File → New
Type: hduser@Master:~$ sudo gedit ~/.bashrc
Make sure your “.bashrc” file already contains:
..
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export JRE_HOME=/usr/lib/jvm/java-8-oracle/jre
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"
export HADOOP_CLASSPATH=/usr/lib/jvm/java-8-oracle/lib/tools.jar

export SPARK_HOME=/home/hduser/spark-2.2.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
export PATH=$PATH:$SPARK_HOME/bin/pyspark
export XDG_RUNTIME_DIR=""

# Add the PySpark classes to the Python path:
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH

export MAHOUT_HOME=/usr/local/mahout
export PATH=$PATH:$MAHOUT_HOME/bin
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop

# added by Anaconda2 4.4.0 installer
export PATH="/home/hduser/anaconda/bin:$PATH"
- Exercise 1: “HelloPySparkOnEclipse”. Via File → New, try creating “PySpark on Eclipse” with the name “HelloPySparkOnEclipse”
Click “Click here to configure an interpreter not listed”
Click “Quick Auto-Config”, then click “Libraries”
Click “New Folder”
Choose the Python folder inside the Spark home, e.g. “/home/hduser/spark-2.2.0-bin-hadoop2.7/python”, then click OK
The Python folder from the Spark home has been added; now click “New Egg/Zip(s)”
Go to the directory “/home/hduser/spark-2.2.0-bin-hadoop2.7/python/lib”, change the file filter from “*.egg” to “*.zip”, then click OK.
Choose “/home/hduser/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip”, then click OK
The file “/home/hduser/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip” has been added
Click Apply
Click the “Environment” tab, then click New
Enter Name “SPARK_HOME” with Value “/home/hduser/spark-2.2.0-bin-hadoop2.7”, then click OK
“SPARK_HOME” with Value “/home/hduser/spark-2.2.0-bin-hadoop2.7” has been added
Enter Name “PYSPARK_SUBMIT_ARGS” with Value “--master local[*] --queue PyDevSpark2.2.0 pyspark-shell”, then click OK
Enter Name “SPARK_CONF_DIR” with Value “/home/hduser/spark-2.2.0-bin-hadoop2.7/conf”, then click OK
Click Apply
Click Finish
Create a source folder
Right-click “src” → New → PyDev Module
Give it a name, e.g. “wordc”, then click Finish
Enter the code, e.g. from “https://goo.gl/Fu6geJ”
Right-click “wordc.py”, then choose “Python Run”
If the following error appears, start your Hadoop first:
hduser@Master:~$ start-all.sh
Then right-click “wordc.py” again and choose “Python Run”
The file “wordc.py” runs successfully
8.1.6 PySpark + PyCharm
- Setting up a PySpark dev environment using PyCharm. Follow these steps:
Download PyCharm and extract it, e.g. into the folder /usr/local/
Before running PyCharm, start Hadoop first. Type:
hduser@Master:/home/nidos$ start-all.sh
Then run pyspark, and only after that run PyCharm. Type:
hduser@Master:/home/nidos$ pyspark
Run PyCharm. Type:
hduser@Master:~$ cd /usr/local/pycharm/bin/
hduser@Master:/usr/local/pycharm/bin$ ls
format.sh  fsnotifier  fsnotifier64  fsnotifier-arm  idea.properties  inspect.sh  log.xml  printenv.py  pycharm.png  pycharm.sh  pycharm.vmoptions  pycharm64.vmoptions  restart.py
hduser@Master:/usr/local/pycharm/bin$ ./pycharm.sh
Click “Create New Project”
Enter the project name in Location, e.g. “/home/hduser/PycharmProjects/pySpark”, then click Create
The “pySpark” project view
In the “pySpark” project, create e.g. the file “WordCount.py”
If an error appears
Edit the run configuration
In Environment Variables, press the plus button, then add the path SPARK_HOME, e.g. “/home/hduser/spark-2.2.0-bin-hadoop2.7”,
and PYTHONPATH, e.g. “/home/hduser/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip:/home/hduser/spark-2.2.0-bin-hadoop2.7/python”
Alternatively, configure the Environment Variables by copying the following, then clicking the paste icon (then click OK, click Apply, click OK):
SPARK_HOME=/home/hduser/spark-2.2.0-bin-hadoop2.7
PYTHONPATH=/home/hduser/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip:/home/hduser/spark-2.2.0-bin-hadoop2.7/python
Type: hduser@Master:~$ sudo gedit ~/.bashrc
Make sure your “.bashrc” file already contains:
..
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export JRE_HOME=/usr/lib/jvm/java-8-oracle/jre
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"
export HADOOP_CLASSPATH=/usr/lib/jvm/java-8-oracle/lib/tools.jar

export SPARK_HOME=/home/hduser/spark-2.2.0-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
export PATH=$PATH:$SPARK_HOME/bin/pyspark
export XDG_RUNTIME_DIR=""

# Add the PySpark classes to the Python path:
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH

export MAHOUT_HOME=/usr/local/mahout
export PATH=$PATH:$MAHOUT_HOME/bin
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop

# added by Anaconda2 4.4.0 installer
export PATH="/home/hduser/anaconda/bin:$PATH"
..
The file “wordc.py” runs successfully. Code links: “https://goo.gl/Fu6geJ” or “https://goo.gl/QZiLyX”
To hide the “Setting default log level to "WARN".” message, go into the Spark directory:
hduser@Master:~$ cd ./spark-2.2.0-bin-hadoop2.7/
hduser@Master:~/spark-2.2.0-bin-hadoop2.7$ cd ./conf/
hduser@Master:~/spark-2.2.0-bin-hadoop2.7/conf$ ls
hduser@Master:~/spark-2.2.0-bin-hadoop2.7/conf$ sudo cp ./log4j.properties.template ./log4j.properties
Replace the entire contents of “log4j.properties” with the file from the link “https://goo.gl/GiWCfy”. Restart pyspark by pressing Ctrl+D or typing quit() and pressing Enter
Then run the *.py file again
The “Setting default log level to "WARN"” message is now hidden
Copying Environment Variables, 1 of 3. Suppose you have just created a new file named “hdfs_wordcount.py”; its Environment Variables are usually not yet complete, as in the following figure
Copying Environment Variables, 2 of 3. As a solution, you can fill them in by copying: click the file “wordc.py”, select its Environment Variables configuration, then click the Copy icon
Copying Environment Variables, 3 of 3. Then click the file “hdfs_wordcount.py”, choose Environment Variables, click the Paste icon, click OK, click Apply, click OK
The streaming code hdfs_wordcount.py runs successfully. Code link: “https://goo.gl/vY6f4E”
Because this code is a streaming job, it keeps looking for new files to word-count at every interval (e.g. every second)
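The behaviour just described, periodically checking a directory for new files and word-counting only the new arrivals, can be sketched in plain Python. This is a simplified stand-in for what Spark Streaming's textFileStream does under the hood, not the actual hdfs_wordcount.py code, and it reads a local directory rather than HDFS:

```python
import os

def poll_new_files(directory, seen):
    """Return the paths of files that have appeared since the last poll.

    `seen` is a mutable set that remembers file names across polls,
    playing the role of Spark Streaming's bookkeeping of processed files.
    """
    current = set(os.listdir(directory))
    new_names = sorted(current - seen)
    seen |= current
    return [os.path.join(directory, name) for name in new_names]

def count_words_in(paths):
    """Word-count only the newly arrived files (one micro-batch)."""
    counts = {}
    for path in paths:
        with open(path, encoding="utf-8") as f:
            for word in f.read().split():
                counts[word] = counts.get(word, 0) + 1
    return counts
```

A driver loop would call poll_new_files once per interval (e.g. every second) and pass the result to count_words_in, mirroring one micro-batch of the streaming job.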
Streaming process: at every interval the code runs the word count whenever new data arrives in HDFS, e.g. at /user/hduser/wordcount/input. In the green part there is no processing yet, because no new data has arrived at that HDFS path
Contents of “/user/hduser/wordcount/input” in HDFS: still empty, since no new data has arrived
Suppose we have set ssc.awaitTerminationOrTimeout(60*10) so that the streaming process runs for 10 minutes; then run the hdfs_wordcount.py streaming code
While the code is running, to trigger the word count, put any text file into /user/hduser/wordcount/input in HDFS, e.g. the files “input.txt” and “input2.txt” from the following link: https://goo.gl/6d7CWQ
While the code is running, to upload a text file to /user/hduser/wordcount/input in HDFS, in Hue click Upload, then click “Select files”
Click Open
The Hue view after clicking Open
Quickly switch back to PyCharm: the word-count result for the file “input.txt” appears
For example, to run the word count again on the second file at /user/hduser/wordcount/input in HDFS, in Hue click Upload, then click “Select files”
Click Open
The Hue view after clicking Open
Quickly switch back to PyCharm: the word-count result for the file “input2.txt” appears
Second streaming word count. Code link: “https://goo.gl/cnGuHo”
Copy all the files into the PyCharm project, e.g. a project named “pySparkWordCount”
Result of copying all the files into the PyCharm project “pySparkWordCount”.
1. Run “streaming.py”, then 2. run “file.py” in a terminal
1. “streaming.py” is now active; 2. run “file.py” in the terminal by typing as below, then pressing Enter
1. “streaming.py” is active and has processed the word count, and
2. “file.py” is active in the terminal, simulating the creation of streaming log{}.txt files
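The role of “file.py”, generating a stream of log{}.txt files for the word-count job to pick up, can be sketched like this; the output directory, file count, and delay are illustrative assumptions, since the linked code is not reproduced here:

```python
import os
import time

def generate_log_files(out_dir, n_files=3, delay_s=0.0):
    """Simulate a file stream by writing log0.txt .. log{n-1}.txt one by one."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for i in range(n_files):
        path = os.path.join(out_dir, "log{}.txt".format(i))
        with open(path, "w", encoding="utf-8") as f:
            f.write("line {} from simulated stream\n".format(i))
        written.append(path)
        time.sleep(delay_s)  # pause so the streaming job sees the files arrive over time
    return written
```

Running this against the directory that streaming.py watches makes the files appear as successive micro-batches.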
Third streaming word count (pySpark Streaming (HDFS file) & PyCharm). Code link: “”
Create a new project in PyCharm, e.g. named “pySparkWordCountHDFS”
Or create a package under PycharmProjects, e.g. named “pySparkWordCountHDFS”
Delete the file “__init__.py”
Copy the files from “pySparkWordCountLocal” to “pySparkWordCountHDFS”
As before: 1. run “streaming.py”, then 2. run “file.py” in a terminal; “streaming.py” becomes active and processes the word count, while “file.py” simulates the creation of streaming log{}.txt files
– pySpark Naive Bayes (from scratch) & PyCharm
Code link: “https://goo.gl/i9Cn5v”. Create a new package in PyCharm, e.g. named “pySparkNB”, and put the files “train_pos.txt”, “train_neg.txt”, “test_pos.txt”, and “test_neg.txt” into HDFS at hdfs://localhost:9000/user/hduser/NB_files
View of the files “train_pos.txt”, “train_neg.txt”, “test_pos.txt”, and “test_neg.txt” at hdfs://localhost:9000/user/hduser/NB_files in Hue
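Since the linked code is not reproduced here, the core of a from-scratch multinomial Naive Bayes for this pos/neg text task can be sketched in plain Python. Laplace smoothing and log probabilities are standard choices, though the actual pySparkNB code may differ; Spark would parallelize the token counting, while the model logic is the same:

```python
import math
from collections import Counter

def train_nb(docs_by_class):
    """Train multinomial Naive Bayes. docs_by_class: {label: [token lists]}."""
    model = {}
    total_docs = sum(len(docs) for docs in docs_by_class.values())
    vocab = {w for docs in docs_by_class.values() for d in docs for w in d}
    for label, docs in docs_by_class.items():
        counts = Counter(w for d in docs for w in d)   # per-class word counts
        model[label] = {
            "prior": math.log(len(docs) / total_docs), # log P(class)
            "counts": counts,
            "total": sum(counts.values()),
        }
    model["_vocab"] = len(vocab)
    return model

def predict_nb(model, tokens):
    """Pick the class with the highest log posterior for the token list."""
    best, best_score = None, float("-inf")
    v = model["_vocab"]
    for label, stats in model.items():
        if label == "_vocab":
            continue
        score = stats["prior"]
        for w in tokens:
            # Laplace (add-one) smoothing over the shared vocabulary
            score += math.log((stats["counts"][w] + 1) / (stats["total"] + v))
        if score > best_score:
            best, best_score = label, score
    return best
```

Training on the tokenized train_pos/train_neg files and classifying each test document with predict_nb gives the pos/neg labels.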
8.1.7 IntelliJ IDEA + SBT
- Setting up a Scala Spark + SBT dev environment using IntelliJ IDEA
Download IntelliJ IDEA and extract it, e.g. into the folder /usr/local/
View of the downloaded folder
hduser@Master:~$ sudo chown hduser:hadoop -R /usr/local/idea-IC
hduser@Master:~$ sudo chmod 777 -R /usr/local/idea-IC
Edit the .bashrc file and add “export IBUS_ENABLE_SYNC_MODE=1”:
hduser@Master:~$ sudo subl ~/.bashrc
or use:
hduser@Master:~$ sudo gedit ~/.bashrc
hduser@Master:~$ source ~/.bashrc
Run IntelliJ IDEA:
hduser@Master:~$ cd /usr/local/idea-IC/bin/
hduser@Master:/usr/local/idea-IC/bin$ ./idea.sh
For example choose “Do not import settings”, then click OK
Choose a theme
Check “Create a desktop ..” and “For all users ..”
Create the launcher script
Click Next
Click Install for Scala
Click “Install and Enable” → optional (it is best not to install IdeaVim)
Click “Start using IntelliJ IDEA”, 1 of 2
Click “Start using IntelliJ IDEA”, 2 of 2
Click “Create New Project”
Click Scala, choose SBT, then click Next
Enter a name, e.g. “MYSVMnBPPGD”
Since SBT is used, it is best to uncheck Sources under Scala, then click Finish
Click File, then choose “Project Structure...”
Click Modules; under “mysvmnbppgd”, click the “Dependencies” tab and set it as follows
Click Modules; under “mysvmnbppgd-build”, click the “Dependencies” tab, set it as follows, then click Apply and OK
The view in IntelliJ IDEA
The view in the Linux file explorer (Nautilus)
Type the following so that the folder is not locked:
hduser@Master:~$ cd ./ideaProject
hduser@Master:~/ideaProject$ sudo chmod 777 -R ./
Download the SVM program code from the link: https://goo.gl/TVMZGn (already includes the polynomial kernel) or https://goo.gl/ttW5c9 (without the polynomial kernel yet)
Copy the build.sbt file from the folder “/home/nidos/Download/BPPGD-master” to “/home/hduser/ideaProject/MYSVMnBPPGD”
Copy all the files from “/home/nidos/Download/BPPGD-master/src/main/scala” to “/home/hduser/ideaProject/MYSVMnBPPGD/src/main/scala”
Click Replace
Then type again:
hduser@Master:~/ideaProject$ sudo chmod 777 -R ./
Go to “Modules”; under “mysvmbppgd”, click “Dependencies”, then click the red “-” sign. Click Apply, then click OK
Go to “Project Structure”, then click Libraries
Click the red “-” sign, then click OK
Click Apply, then click OK
Then type again:
hduser@Master:~/ideaProject$ sudo chmod 777 -R ./
Open the “build.sbt” file in IntelliJ IDEA
Click “Enable auto-Import”
Wait a few moments until all the dependency files have finished downloading
If an error such as the following appears:
Error: Error while importing SBT project: ...
[error] unresolved dependency: amplab#spark-indexedrdd;0.4.0: Resolution failed several times for dependency: amplab#spark-indexedrdd;0.4.0 {compile=[default(compile)]}::
[error] public: unable to get resource for amplab#spark-indexedrdd;0.4.0: res=https://repo1.maven.org/maven2/amplab/spark-indexedrdd/0.4.0/spark-indexedrdd-0.4.0.pom: java.net.UnknownHostException: repo1.maven.org
[error] Spark Packages Repo: unable to get resource for amplab#spark-indexedrdd;0.4.0: res=http://dl.bintray.com/spark-packages/maven/amplab/spark-indexedrdd/0.4.0/spark-indexedrdd-0.4.0.pom: java.net.UnknownHostException: dl.bintray.com
[error] Repo at github.com/ankurdave/maven-repo: unable to get resource for amplab#spark-indexedrdd;0.4.0: res=https://raw.githubusercontent.com/ankurdave/maven-repo/master/amplab/spark-indexedrdd/0.4.0/spark-indexedrdd-0.4.0.pom: java.net.UnknownHostException: raw.githubusercontent.com
[error] unresolved dependency: com.ankurdave#part_2.10;0.1: Resolution failed several times for dependency: com.ankurdave#part_2.10;0.1 {compile=[default(compile)]}::
[error] public: unable to get resource for com/ankurdave#part_2.10;0.1: res=https://repo1.maven.org/maven2/com/ankurdave/part_2.10/0.1/part_2.10-0.1.pom: java.net.UnknownHostException: repo1.maven.org
[error] Spark Packages Repo: unable to get resource for com/ankurdave#part_2.10;0.1: res=http://dl.bintray.com/spark-packages/maven/com/ankurdave/part_2.10/0.1/part_2.10-0.1.pom: java.net.UnknownHostException: dl.bintray.com
[error] Repo at github.com/ankurdave/maven-repo: unable to get resource for com/ankurdave#part_2.10;0.1: res=https://raw.githubusercontent.com/ankurdave/maven-repo/master/com/ankurdave/part_2.10/0.1/part_2.10-0.1.pom: java.net.UnknownHostException: raw.githubusercontent.com
[error] unresolved dependency: org.scalatest#scalatest_2.11;2.2.4: Resolution failed several times for dependency: org.scalatest#scalatest_2.11;2.2.4 {test=[default(compile)]}::
[error] public: unable to get resource for org/scalatest#scalatest_2.11;2.2.4: res=https://repo1.maven.org/maven2/org/scalatest/scalatest_2.11/2.2.4/scalatest_2.11-2.2.4.pom: java.net.UnknownHostException: repo1.maven.org
[error] Spark Packages Repo: unable to get resource for org/scalatest#scalatest_2.11;2.2.4: res=http://dl.bintray.com/spark-packages/maven/org/scalatest/scalatest_2.11/2.2.4/scalatest_2.11-2.2.4.pom: java.net.UnknownHostException: dl.bintray.com
[error] Repo at github.com/ankurdave/maven-repo: unable to get resource for org/scalatest#scalatest_2.11;2.2.4: res=https://raw.githubusercontent.com/ankurdave/maven-repo/master/org/scalatest/scalatest_2.11/2.2.4/scalatest_2.11-2.2.4.pom: java.net.UnknownHostException: raw.githubusercontent.com
[error] unresolved dependency: org.scalacheck#scalacheck_2.11;1.12.2: Resolution failed several times for dependency: org.scalacheck#scalacheck_2.11;1.12.2 {test=[default(compile)]}::
[error] public: unable to get resource for org/scalacheck#scalacheck_2.11;1.12.2: res=https://repo1.maven.org/maven2/org/scalacheck/scalacheck_2.11/1.12.2/scalacheck_2.11-1.12.2.pom: java.net.UnknownHostException: repo1.maven.org
[error] Spark Packages Repo: unable to get resource for org/scalacheck#scalacheck_2.11;1.12.2: res=http://dl.bintray.com/spark-packages/maven/org/scalacheck/scalacheck_2.11/1.12.2/scalacheck_2.11-1.12.2.pom: java.net.UnknownHostException: dl.bintray.com
[error] Repo at github.com/ankurdave/maven-repo: unable to get resource for org/scalacheck#scalacheck_2.11;1.12.2: res=https://raw.githubusercontent.com/ankurdave/maven-repo/master/org/scalacheck/scalacheck_2.11/1.12.2/scalacheck_2.11-1.12.2.pom: java.net.UnknownHostException: raw.githubusercontent.com
[error] Total time: 9 s, completed Dec 16, 2017 1:27:20 PM
See the complete log in file:/home/hduser/.IdeaIC2017.2/system/log/sbt.last.log
The solution is to check your internet connection, e.g. by clicking “Wired connection 1”
All the dependency files from build.sbt have been downloaded successfully; ignore the warnings in the log
To run the program, open the file main.scala, right-click, and select “TestKernelSVM”
The program builds successfully, and a “spark-submit ..” hint is shown in case you want to run it from the Terminal
Usage: /path/to/spark/bin/spark-submit --packages amplab:spark-indexedrdd:0.4.0 target/scala-2.11/ppackubuntu_2.11-1.0.jar <data file>
For the “VALIDATION” type, the file “main.scala” uses args(0), args(1), .., args(6):
- args(0), type
- args(1), trainingfile: path of the training set in libsvm format
- args(2), lambda: regularization term
- args(3), sigma: kernel parameter
- args(4), iterations: number of iterations
- args(5), outputfile: log file
- args(6), numfeatures: number of variables of the dataset
Example of the arguments used: VALIDATION file:///home/hduser/ideaProject/MYSVMnBPPGD/iris3.txt 0.8 1.0 10 result.txt 4
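The same VALIDATION arguments can also be combined with the Usage hint above into a single spark-submit command. This is only a sketch: /path/to/spark is a placeholder for your own Spark installation directory, and the jar name is taken from the Usage hint, so adjust both to your build. The command is echoed here rather than executed, since running it requires a real Spark installation.

```shell
# Assemble the full spark-submit command for the VALIDATION run above.
# /path/to/spark is a placeholder; the jar name comes from the Usage hint.
JAR=target/scala-2.11/ppackubuntu_2.11-1.0.jar
ARGS="VALIDATION file:///home/hduser/ideaProject/MYSVMnBPPGD/iris3.txt 0.8 1.0 10 result.txt 4"
CMD="/path/to/spark/bin/spark-submit --packages amplab:spark-indexedrdd:0.4.0 $JAR $ARGS"

# Echoed rather than executed, since it needs a real Spark installation:
echo "$CMD"
```

Running the echoed command from the project directory should behave the same as Run “TestKernelSVM” in IntelliJ IDEA with those Program arguments.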
For the “TEST” type, the file “main.scala” uses args(0), args(1), .., args(7):
- args(0), type
- args(1), trainingfile: path of the training set in libsvm format
- args(2), lambda: regularization term
- args(3), sigma: kernel parameter
- args(4), iterations: number of iterations
- args(5), outputfile: log file
- args(6), numfeatures: number of variables of the dataset
- args(7), testingfile: path of the testing set in libsvm format
Example of the arguments used:
args(0)="TEST"
args(1)="file:///home/hduser/ideaProject/MYSVMnBPPGD/iris3.txt"
args(2)="0.8"
args(3)="1.0"
args(4)="20"
args(5)="result.txt"
args(6)="4"
args(7)="file:///home/hduser/ideaProject/MYSVMnBPPGD/iristest3.txt"
How to set the arguments in IntelliJ IDEA: click Run, select “Edit Configurations....”
How to set the arguments in IntelliJ IDEA: in “Program arguments”, enter, for example, the following, then click Apply, click OK
VALIDATION file:///home/hduser/ideaProject/MYSVMnBPPGD/iris3.txt 0.8 1.0 10 result.txt 4
Run the program again: right-click “main.scala”, select Run “TestKernelSVM”
Result of running the program again
result.txt contains:
Training time: 10 Accuracy: 1.0 AUC: 1.0
Training time: 5 Accuracy: 1.0 AUC: 1.0
Training time: 4 Accuracy: 1.0 AUC: 1.0
Training time: 3 Accuracy: 1.0 AUC: 1.0
Training time: 2 Accuracy: 1.0 AUC: 1.0
Mean_Accuracy: 1.0 Mean_AUC: 1.0
Project view
Running without arguments (TEST type). Make sure the arguments in “Run > Edit Configurations.. > Program arguments” have been cleared
Then in the file main.scala, below the line “def main(args: Array[String]) {”, add the following code
//for the TEST type
//val args = Array("","","","","","","","") or val args = Array.fill(8)("")
Then set the arguments above the line “val action = args(0)”, e.g.:
args(0)="TEST"
args(1)="file:///home/hduser/ideaProject/MYSVMnBPPGD/iris3.txt"
args(2)="0.8"
args(3)="1.0"
args(4)="20"
args(5)="resultTest.txt"
args(6)="4"
args(7)="file:///home/hduser/ideaProject/MYSVMnBPPGD/iristest3.txt"
Result of running without arguments (TEST type)
8.1.8 Configuration & Error/Bug Solutions
- Ubuntu Desktop does not load, or removing the red minus icon
Type the following commands,
Press Ctrl+Alt+F1/F2/../F6 or go to /usr/share/applications/Xterm, then:
sudo apt-get install gnome-panel
sudo mv ~/.Xauthority ~/.Xauthority.backup
sudo apt-get install unity-tweak-tool
unity-tweak-tool --reset-unity
To restore a missing terminal (use xterm):
sudo apt-get remove gnome-terminal
sudo apt-get install gnome-terminal
nidos@Master:~$ sudo rm /var/cache/apt/archives/*.*
nidos@Master:~$ sudo rm -R /var/lib/dpkg/info
nidos@Master:~$ cd /var/lib/dpkg/
nidos@Master:/var/lib/dpkg$ sudo mkdir info
nidos@Master:~$ sudo apt-get clean
cat -n /etc/apt/sources.list
ls -la /etc/apt/sources.list.d
tail -v -n +1 /etc/apt/sources.list.d/*
sudo apt-get update
sudo apt-get upgrade
sudo apt-get --reinstall install python3-minimal
- How to Create an Eclipse Icon in Ubuntu
Type the following commands:
hduser@Master:~$ sudo mkdir ~/.local/share/applications
hduser@Master:~$ sudo chmod 777 -R ~/.local/share/applications
hduser@Master:~$ sudo gedit ~/.local/share/applications/opt_eclipse.desktop
[Desktop Entry]
Type=Application
Name=Eclipse
Comment=Eclipse Integrated Development Environment
Icon=/home/hduser/eclipse/jee- (complete the path according to where you installed Eclipse)
Exec=/home/hduser/eclipse/jee- (complete the path according to where you installed Eclipse)
Terminal=false
Categories=Development;IDE;Java;
StartupWMClass=Eclipse
Type the following commands:
hduser@Master:~$ sudo chmod 777 -R ~/.local/share/applications
hduser@Master:~$ sudo nautilus ~/.local/share/applications
Copy the Eclipse icon, then paste it on the Desktop
After pasting on the Desktop, don't forget to type the following:
hduser@Master:~$ cd /usr/local/hadoop
hduser@Master:/usr/local/hadoop$ bin/hadoop fs -chmod -R 777 /
*So that hduser's HDFS can also be used by other users, e.g. nidos, so that when Eclipse is run from nidos's Desktop, the data-processing results can still be stored in hduser's HDFS.
Try running it from the Desktop
- Solution if an error appears when configuring the Python interpreter in PyCharm
Try typing the following:
nidos@Master:~$ sudo apt-get install python-setuptools
nidos@Master:~$ sudo apt-get install python-pip python-dev build-essential
nidos@Master:~$ sudo pip install setuptools --upgrade