APPLICATION OF LENGTH OF STUDY AND DEGREE OF
EXCELLENCE PREDICTION FOR INFORMATICS DEPARTMENT
STUDENTS OF UMS USING NAÏVE BAYES METHOD

Papers
Department of Informatics
Faculty of Communications and Informatics

By:

Muh Amin Nurrohmat
Yusuf Sulistyo Nugroho, S.T, M.Eng
DEPARTMENT OF INFORMATICS
FACULTY OF COMMUNICATIONS AND INFORMATICS
UNIVERSITAS MUHAMMADIYAH SURAKARTA
JUNE 2015

Muh Amin Nurrohmat, Yusuf Sulistyo Nugroho
Department of Informatics, Faculty of Communications and Informatics
Universitas Muhammadiyah Surakarta
Email : amin.nurrohmat@gmail.com

ABSTRACT
The Informatics department of Muhammadiyah University of Surakarta holds a large
amount of data on active students and graduates, and the data grows every year.
However, the department cannot yet manage the data well, so the larger the data
becomes, the less information is actually obtained from it. To solve this problem,
the data must be converted into information. This research discusses how to turn
the data into information using a data mining technique, the Naïve Bayes method.
The method is used to analyze the data, especially for pattern recognition,
predicting length of study, and predicting the degree of excellence. After
processing the data, the application displays a report, a summary report, and
suggestions. Based on the results, the application helps the Informatics
department find solutions and make decisions when setting policy, such as
deciding where the department should promote itself. Moreover, if the department
can recruit good students, it can improve its quality.

Keywords: data mining, degree of excellence, length of study, Naïve Bayes method

INTRODUCTION
The development of information technology is very rapid. Now and in the future,
information is a crucial element of life. Information technology generates large
amounts of data in education, the economy, industry, and other fields, which
increases the need for information. However, this need is not matched by the
presentation of sufficient information, so the large data cannot be used
optimally. Presenting the required information therefore calls for reprocessing
the data.

Data mining techniques are expected to reveal information that was previously
hidden in the data warehouse, turning the data into valuable information
(Huda, 2010).

The implementation of information technology in universities generates large
data, such as student data and graduation data, which grow every year. The data
include personal data, the number of graduates, and academic scores. Hidden
information can be found by exploring these data to improve the competitiveness
of the university and to support strategic decision-making.

Mauriza (2013), applying the Naïve Bayes method to a sample of 342 students,
found that 256 students, or 74.85% of the total sample, were predicted not to
graduate on time. This shows that many students take more than 8 semesters to
finish their studies. Therefore, the department needs to utilize the student
data and the graduation data to determine student graduation rates using data
mining techniques. That research still has a weakness: the data mining was
carried out with an existing application, WEKA.

Based on these problems, this research develops an application to help the
Informatics department. The application predicts the length of study and degree
of excellence of Informatics students of the Faculty of Communication and
Informatics, UMS, using data mining techniques. The selected technique is the
Naïve Bayes method. The writer merges the technique with student
data and graduation data. The writer hopes that data mining with the Naïve Bayes
method can uncover information about the length of study and degree of
excellence. The application helps the Informatics department find solutions and
policies to improve student achievement, so that students can complete their
studies on time.

LITERATURE REVIEW
Nugroho and Setyawan (2014) state in their research titled "Klasifikasi Masa
Studi Mahasiswa Fakultas Komunikasi Dan Informatika Universitas Muhammadiyah
Surakarta Menggunakan Algoritma C4.5" that there is a relationship between the
length-of-study classification and the data of FKI UMS students who have
graduated, found using the C4.5 decision tree algorithm. The writers took 341
records out of 2358, all from students who had already graduated. The attributes
used include school major, gender, school, the average number of credits per
semester, and the role as a lab assistant. The results show that the variable
with the highest influence on the length of study is the average credits per
semester. Moreover, the results indicate that the variable the faculty should
consider in order to obtain an effective length of study is the average use of
credits.

Setyawan (2014), in his research titled "Klasifikasi Prestasi Akademik Mahasiswa
FKI UMS Menggunakan Metode Decision Tree", aims to classify students' academic
achievement. Student achievement is obtained by data mining, which provides a
strategic plan for the Informatics department to find new students and to
improve students' academic achievement. This in turn increases the accreditation
score of the Informatics department. The writer chose 342 students as a sample.
The classification results show that the male students predicted to have a
satisfactory GPA are from the Surakarta residency. The classification results
predict that the students who are not from science
are the students who have a less satisfactory GPA. The classification results
predict that students who have a satisfactory GPA are those who come from
outside the Surakarta residency. Female students from the Surakarta residency
are predicted to have a satisfactory GPA. The classification results predict
that students who have a less satisfactory GPA are those majoring in science and
social studies. The classification results also predict that male students from
social studies who originate from the Surakarta residency, and female students
who come from outside the Surakarta residency, have a less than satisfactory
GPA. Almost all students majoring in science are predicted to have a
satisfactory GPA.

Mauriza (2013), in his research entitled "Implementasi Data Mining Untuk
Memprediksi Kelulusan Mahasiswa Fakultas Komunikasi dan Informatika UMS
Menggunakan Metode Naive Bayes", aims to predict the length of study in the
Faculty of Communication and Informatics, University of Muhammadiyah Surakarta,
using the Naïve Bayes method. The research also compared the methods with one
another. J48 is better than the others because of its ability to define and
classify each attribute to each class simply. The data consist of 342 students.
Using the Naïve Bayes method, the results show that only 86 students, or 25.15%,
will graduate on time, while 256 students, or 74.85%, will not graduate on time.
Based on the results, the faculty needs solutions to improve student
achievement, so that students can graduate on time with a satisfactory outcome.
This can help the faculty to increase its accreditation score.

According to Huda (2010), data mining is the mining or discovery of new
information by looking for patterns or specific rules. The information is
obtained from large data and is expected to improve the situation. By utilizing
student data and graduation data, it is expected to
generate information on student graduation rates through data mining techniques.
The graduation rate category is measured from the length of study and GPA. The
algorithm used is the Apriori algorithm. The information is presented as the
support and confidence of each graduation rate category. The result is an
application for obtaining useful information about student graduation rates with
data mining techniques.

RESEARCH METHOD
1. Data Mining Analysis
This research seeks the greatest probability of each attribute.
a. Collecting The Data
This research requires data from Informatics students, both those who have
graduated and those who have not. It uses all student data and graduation data.
1) Graduation Data
Graduation data, used as training data, are the data of students who have
graduated. The sample is student data from 2007 to 2010. The attributes of
graduation data are school major, gender, region, school, lab assistant, length
of study, and degree of excellence.
2) Student Data
Student data, used as testing data, are the data of students who are still
active. The writer takes the data randomly as a sample. The attributes of
student data are school major, gender, region, school, and lab assistant.
b. Data Needs
Determining the data needs is necessary to support the development of data
mining. Based on the graduation data attributes and student data attributes, the
data are divided into variables as follows:
1) School Major: Science, Social, and others.
2) Gender: Male and Female.
3) Region: This describes the student's home address. The students are divided based
on Indonesia's time zones.
4) School: This describes where the student attended senior high school. It is
divided based on Indonesia's time zones.
5) Lab Assistant: This answers the question "Was the student a lab assistant?"
6) Length of Study: This answers the question "Do the students graduate on
time?". Students graduate on time if they graduate in ≤ 4 years, and do not
graduate on time if they graduate in > 4 years.

c. Cleaning Data
This research needs to perform data cleansing so that the data are relevant to
the needs, and to avoid noise and inconsistent data in the graduation data
attributes and student data attributes when testing the system.

Table 1 Attribute List
Attribute         Content
School Major      IPA, IPS, and LAIN
Gender            PRIA and WANITA
Region            WIB, WITA, and WIT
School            WIB, WITA, and WIT
Lab Assistant     YA and TIDAK
Length of Study   TEPAT
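
The cleaning step can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes each record is a dictionary and keeps only the records whose values belong to the contents listed in Table 1 for the five attributes shared by the graduation data and the student data. The key names, the `ALLOWED` mapping, and the functions `normalize` and `clean_records` are hypothetical, and the sample rows are made up for illustration.

```python
# Hypothetical sketch of the data cleaning step; all names are illustrative only.

# Allowed contents per attribute, copied from Table 1 (shared attributes only).
ALLOWED = {
    "school_major": {"IPA", "IPS", "LAIN"},
    "gender": {"PRIA", "WANITA"},
    "region": {"WIB", "WITA", "WIT"},
    "school": {"WIB", "WITA", "WIT"},
    "lab_assistant": {"YA", "TIDAK"},
}


def normalize(value):
    """Trim whitespace and upper-case a raw value; a missing value becomes ''."""
    return "" if value is None else str(value).strip().upper()


def clean_records(records):
    """Split records into (clean, rejected) based on the allowed contents."""
    clean, rejected = [], []
    for record in records:
        normalized = {key: normalize(record.get(key)) for key in ALLOWED}
        if all(normalized[key] in ALLOWED[key] for key in ALLOWED):
            clean.append(normalized)   # consistent record, values normalized
        else:
            rejected.append(record)    # noisy or incomplete record, kept as-is
    return clean, rejected


if __name__ == "__main__":
    # Made-up rows for illustration; they are not data from the study.
    sample = [
        {"school_major": " ipa", "gender": "PRIA", "region": "WIB",
         "school": "WIB", "lab_assistant": "ya"},
        {"school_major": "IPS", "gender": "?", "region": "WITA",
         "school": "WITA", "lab_assistant": "TIDAK"},
    ]
    clean, rejected = clean_records(sample)
    print(len(clean), "kept,", len(rejected), "rejected")
```

Returning the rejected records separately, rather than silently dropping them, is a design choice of this sketch so that inconsistent entries can be inspected and corrected at the source.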

7) Degree of excellence: The graduation predicate is Cumlaude if GPA > 3.51, Very Satisfactory when 2.76