Comparative Analysis of the Clustering Process Using K-Means Clustering and K-Nearest Neighbor on Diabetes Mellitus


APPENDIX

DIABETES DATA SET

(700 Instances, 9 Attributes, 2 Classes)

Source: UCI Machine Learning Repository, specifically the PIMA Indians Diabetes Dataset.

No preg plas pres skin insu mass pedi age class
1 6 148 72 35 0 33,6 0,627 50 tested_positive
2 1 85 66 29 0 26,6 0,351 31 tested_negative
3 8 183 64 0 0 23,3 0,672 32 tested_positive
4 1 89 66 23 94 28,1 0,167 21 tested_negative
5 0 137 40 35 168 43,1 2,288 33 tested_positive
6 5 116 74 0 0 25,6 0,201 30 tested_negative
7 3 78 50 32 88 31 0,248 26 tested_positive
8 10 115 0 0 0 35,3 0,134 29 tested_negative
9 2 197 70 45 543 30,5 0,158 53 tested_positive
10 8 125 96 0 0 0 0,232 54 tested_positive
11 4 110 92 0 0 37,6 0,191 30 tested_negative
12 10 168 74 0 0 38 0,537 34 tested_positive
13 10 139 80 0 0 27,1 1,441 57 tested_negative
14 1 189 60 23 846 30,1 0,398 59 tested_positive
15 5 166 72 19 175 25,8 0,587 51 tested_positive
16 7 100 0 0 0 30 0,484 32 tested_positive
17 0 118 84 47 230 45,8 0,551 31 tested_positive
18 7 107 74 0 0 29,6 0,254 31 tested_positive
19 1 103 30 38 83 43,3 0,183 33 tested_negative
20 1 115 70 30 96 34,6 0,529 32 tested_positive
21 3 126 88 41 235 39,3 0,704 27 tested_negative
22 8 99 84 0 0 35,4 0,388 50 tested_negative
23 7 196 90 0 0 39,8 0,451 41 tested_positive
24 9 119 80 35 0 29 0,263 29 tested_positive
25 11 143 94 33 146 36,6 0,254 51 tested_positive
26 10 125 70 26 115 31,1 0,205 41 tested_positive
27 7 147 76 0 0 39,4 0,257 43 tested_positive
28 1 97 66 15 140 23,2 0,487 22 tested_negative
29 13 145 82 19 110 22,2 0,245 57 tested_negative
30 5 117 92 0 0 34,1 0,337 38 tested_negative


31 5 109 75 26 0 36 0,546 60 tested_negative
32 3 158 76 36 245 31,6 0,851 28 tested_positive
33 3 88 58 11 54 24,8 0,267 22 tested_negative
34 6 92 92 0 0 19,9 0,188 28 tested_negative
35 10 122 78 31 0 27,6 0,512 45 tested_negative
36 4 103 60 33 192 24 0,966 33 tested_negative
37 11 138 76 0 0 33,2 0,42 35 tested_negative
38 9 102 76 37 0 32,9 0,665 46 tested_positive
39 2 90 68 42 0 38,2 0,503 27 tested_positive
40 4 111 72 47 207 37,1 1,39 56 tested_positive
41 3 180 64 25 70 34 0,271 26 tested_negative
42 7 133 84 0 0 40,2 0,696 37 tested_negative
43 7 106 92 18 0 22,7 0,235 48 tested_negative
44 9 171 110 24 240 45,4 0,721 54 tested_positive
45 7 159 64 0 0 27,4 0,294 40 tested_negative
46 0 180 66 39 0 42 1,893 25 tested_positive
47 1 146 56 0 0 29,7 0,564 29 tested_negative
48 2 71 70 27 0 28 0,586 22 tested_negative
49 7 103 66 32 0 39,1 0,344 31 tested_positive
50 7 105 0 0 0 0 0,305 24 tested_negative
51 1 103 80 11 82 19,4 0,491 22 tested_negative
52 1 101 50 15 36 24,2 0,526 26 tested_negative
53 5 88 66 21 23 24,4 0,342 30 tested_negative
54 8 176 90 34 300 33,7 0,467 58 tested_positive
55 7 150 66 42 342 34,7 0,718 42 tested_negative
56 1 73 50 10 0 23 0,248 21 tested_negative
57 7 187 68 39 304 37,7 0,254 41 tested_positive
58 0 100 88 60 110 46,8 0,962 31 tested_negative
59 0 146 82 0 0 40,5 1,781 44 tested_negative
60 0 105 64 41 142 41,5 0,173 22 tested_negative
61 2 84 0 0 0 0 0,304 21 tested_negative
62 8 133 72 0 0 32,9 0,27 39 tested_positive
63 5 44 62 0 0 25 0,587 36 tested_negative
64 2 141 58 34 128 25,4 0,699 24 tested_negative
65 7 114 66 0 0 32,8 0,258 42 tested_positive
66 5 99 74 27 0 29 0,203 32 tested_negative
67 0 109 88 30 0 32,5 0,855 38 tested_positive
68 2 109 92 0 0 42,7 0,845 54 tested_negative
69 1 95 66 13 38 19,6 0,334 25 tested_negative
70 4 146 85 27 100 28,9 0,189 27 tested_negative


71 2 100 66 20 90 32,9 0,867 28 tested_positive
72 5 139 64 35 140 28,6 0,411 26 tested_negative
73 13 126 90 0 0 43,4 0,583 42 tested_positive
74 4 129 86 20 270 35,1 0,231 23 tested_negative
75 1 79 75 30 0 32 0,396 22 tested_negative
76 1 0 48 20 0 24,7 0,14 22 tested_negative
77 7 62 78 0 0 32,6 0,391 41 tested_negative
78 5 95 72 33 0 37,7 0,37 27 tested_negative
79 0 131 0 0 0 43,2 0,27 26 tested_positive
80 2 112 66 22 0 25 0,307 24 tested_negative
81 3 113 44 13 0 22,4 0,14 22 tested_negative
82 2 74 0 0 0 0 0,102 22 tested_negative
83 7 83 78 26 71 29,3 0,767 36 tested_negative
84 0 101 65 28 0 24,6 0,237 22 tested_negative
85 5 137 108 0 0 48,8 0,227 37 tested_positive
86 2 110 74 29 125 32,4 0,698 27 tested_negative
87 13 106 72 54 0 36,6 0,178 45 tested_negative
88 2 100 68 25 71 38,5 0,324 26 tested_negative
89 15 136 70 32 110 37,1 0,153 43 tested_positive
90 1 107 68 19 0 26,5 0,165 24 tested_negative
91 1 80 55 0 0 19,1 0,258 21 tested_negative
92 4 123 80 15 176 32 0,443 34 tested_negative
93 7 81 78 40 48 46,7 0,261 42 tested_negative
94 4 134 72 0 0 23,8 0,277 60 tested_positive
95 2 142 82 18 64 24,7 0,761 21 tested_negative
96 6 144 72 27 228 33,9 0,255 40 tested_negative
97 2 92 62 28 0 31,6 0,13 24 tested_negative
98 1 71 48 18 76 20,4 0,323 22 tested_negative
99 6 93 50 30 64 28,7 0,356 23 tested_negative
100 1 122 90 51 220 49,7 0,325 31 tested_positive
101 1 163 72 0 0 39 1,222 33 tested_positive
102 1 151 60 0 0 26,1 0,179 22 tested_negative
103 0 125 96 0 0 22,5 0,262 21 tested_negative
104 1 81 72 18 40 26,6 0,283 24 tested_negative
105 2 85 65 0 0 39,6 0,93 27 tested_negative
106 1 126 56 29 152 28,7 0,801 21 tested_negative
107 1 96 122 0 0 22,4 0,207 27 tested_negative
108 4 144 58 28 140 29,5 0,287 37 tested_negative
109 3 83 58 31 18 34,3 0,336 25 tested_negative
110 0 95 85 25 36 37,4 0,247 24 tested_positive


111 3 171 72 33 135 33,3 0,199 24 tested_positive
112 8 155 62 26 495 34 0,543 46 tested_positive
113 1 89 76 34 37 31,2 0,192 23 tested_negative
114 4 76 62 0 0 34 0,391 25 tested_negative
115 7 160 54 32 175 30,5 0,588 39 tested_positive
116 4 146 92 0 0 31,2 0,539 61 tested_positive
117 5 124 74 0 0 34 0,22 38 tested_positive
118 5 78 48 0 0 33,7 0,654 25 tested_negative
119 4 97 60 23 0 28,2 0,443 22 tested_negative
120 4 99 76 15 51 23,2 0,223 21 tested_negative
121 0 162 76 56 100 53,2 0,759 25 tested_positive
122 6 111 64 39 0 34,2 0,26 24 tested_negative
123 2 107 74 30 100 33,6 0,404 23 tested_negative
124 5 132 80 0 0 26,8 0,186 69 tested_negative
125 0 113 76 0 0 33,3 0,278 23 tested_positive
126 1 88 30 42 99 55 0,496 26 tested_positive
127 3 120 70 30 135 42,9 0,452 30 tested_negative
128 1 118 58 36 94 33,3 0,261 23 tested_negative
129 1 117 88 24 145 34,5 0,403 40 tested_positive
130 0 105 84 0 0 27,9 0,741 62 tested_positive
131 4 173 70 14 168 29,7 0,361 33 tested_positive
132 9 122 56 0 0 33,3 1,114 33 tested_positive
133 3 170 64 37 225 34,5 0,356 30 tested_positive
134 8 84 74 31 0 38,3 0,457 39 tested_negative
135 2 96 68 13 49 21,1 0,647 26 tested_negative
136 2 125 60 20 140 33,8 0,088 31 tested_negative
137 0 100 70 26 50 30,8 0,597 21 tested_negative
138 0 93 60 25 92 28,7 0,532 22 tested_negative
139 0 129 80 0 0 31,2 0,703 29 tested_negative
140 5 105 72 29 325 36,9 0,159 28 tested_negative
141 3 128 78 0 0 21,1 0,268 55 tested_negative
142 5 106 82 30 0 39,5 0,286 38 tested_negative
143 2 108 52 26 63 32,5 0,318 22 tested_negative
144 10 108 66 0 0 32,4 0,272 42 tested_positive
145 4 154 62 31 284 32,8 0,237 23 tested_negative
146 0 102 75 23 0 0 0,572 21 tested_negative
147 9 57 80 37 0 32,8 0,096 41 tested_negative
148 2 106 64 35 119 30,5 1,4 34 tested_negative
149 5 147 78 0 0 33,7 0,218 65 tested_negative
150 2 90 70 17 0 27,3 0,085 22 tested_negative


151 1 136 74 50 204 37,4 0,399 24 tested_negative
152 4 114 65 0 0 21,9 0,432 37 tested_negative
153 9 156 86 28 155 34,3 1,189 42 tested_positive
154 1 153 82 42 485 40,6 0,687 23 tested_negative
155 8 188 78 0 0 47,9 0,137 43 tested_positive
156 7 152 88 44 0 50 0,337 36 tested_positive
157 2 99 52 15 94 24,6 0,637 21 tested_negative
158 1 109 56 21 135 25,2 0,833 23 tested_negative
159 2 88 74 19 53 29 0,229 22 tested_negative
160 17 163 72 41 114 40,9 0,817 47 tested_positive
161 4 151 90 38 0 29,7 0,294 36 tested_negative
162 7 102 74 40 105 37,2 0,204 45 tested_negative
163 0 114 80 34 285 44,2 0,167 27 tested_negative
164 2 100 64 23 0 29,7 0,368 21 tested_negative
165 0 131 88 0 0 31,6 0,743 32 tested_positive
166 6 104 74 18 156 29,9 0,722 41 tested_positive
167 3 148 66 25 0 32,5 0,256 22 tested_negative
168 4 120 68 0 0 29,6 0,709 34 tested_negative
169 4 110 66 0 0 31,9 0,471 29 tested_negative
170 3 111 90 12 78 28,4 0,495 29 tested_negative
171 6 102 82 0 0 30,8 0,18 36 tested_positive
172 6 134 70 23 130 35,4 0,542 29 tested_positive
173 2 87 0 23 0 28,9 0,773 25 tested_negative
174 1 79 60 42 48 43,5 0,678 23 tested_negative
175 2 75 64 24 55 29,7 0,37 33 tested_negative
176 8 179 72 42 130 32,7 0,719 36 tested_positive
177 6 85 78 0 0 31,2 0,382 42 tested_negative
178 0 129 110 46 130 67,1 0,319 26 tested_positive
179 5 143 78 0 0 45 0,19 47 tested_negative
180 5 130 82 0 0 39,1 0,956 37 tested_positive
181 6 87 80 0 0 23,2 0,084 32 tested_negative
182 0 119 64 18 92 34,9 0,725 23 tested_negative
183 1 0 74 20 23 27,7 0,299 21 tested_negative
184 5 73 60 0 0 26,8 0,268 27 tested_negative
185 4 141 74 0 0 27,6 0,244 40 tested_negative
186 7 194 68 28 0 35,9 0,745 41 tested_positive
187 8 181 68 36 495 30,1 0,615 60 tested_positive
188 1 128 98 41 58 32 1,321 33 tested_positive
189 8 109 76 39 114 27,9 0,64 31 tested_positive
190 5 139 80 35 160 31,6 0,361 25 tested_positive


191 3 111 62 0 0 22,6 0,142 21 tested_negative
192 9 123 70 44 94 33,1 0,374 40 tested_negative
193 7 159 66 0 0 30,4 0,383 36 tested_positive
194 11 135 0 0 0 52,3 0,578 40 tested_positive
195 8 85 55 20 0 24,4 0,136 42 tested_negative
196 5 158 84 41 210 39,4 0,395 29 tested_positive
197 1 105 58 0 0 24,3 0,187 21 tested_negative
198 3 107 62 13 48 22,9 0,678 23 tested_positive
199 4 109 64 44 99 34,8 0,905 26 tested_positive
200 4 148 60 27 318 30,9 0,15 29 tested_positive
201 0 113 80 16 0 31 0,874 21 tested_negative
202 1 138 82 0 0 40,1 0,236 28 tested_negative
203 0 108 68 20 0 27,3 0,787 32 tested_negative
204 2 99 70 16 44 20,4 0,235 27 tested_negative
205 6 103 72 32 190 37,7 0,324 55 tested_negative
206 5 111 72 28 0 23,9 0,407 27 tested_negative
207 8 196 76 29 280 37,5 0,605 57 tested_positive
208 5 162 104 0 0 37,7 0,151 52 tested_positive
209 1 96 64 27 87 33,2 0,289 21 tested_negative
210 7 184 84 33 0 35,5 0,355 41 tested_positive
211 2 81 60 22 0 27,7 0,29 25 tested_negative
212 0 147 85 54 0 42,8 0,375 24 tested_negative
213 7 179 95 31 0 34,2 0,164 60 tested_negative
214 0 140 65 26 130 42,6 0,431 24 tested_positive
215 9 112 82 32 175 34,2 0,26 36 tested_positive
216 12 151 70 40 271 41,8 0,742 38 tested_positive
217 5 109 62 41 129 35,8 0,514 25 tested_positive
218 6 125 68 30 120 30 0,464 32 tested_negative
219 5 85 74 22 0 29 1,224 32 tested_positive
220 5 112 66 0 0 37,8 0,261 41 tested_positive
221 0 177 60 29 478 34,6 1,072 21 tested_positive
222 2 158 90 0 0 31,6 0,805 66 tested_positive
223 7 119 0 0 0 25,2 0,209 37 tested_negative
224 7 142 60 33 190 28,8 0,687 61 tested_negative
225 1 100 66 15 56 23,6 0,666 26 tested_negative
226 1 87 78 27 32 34,6 0,101 22 tested_negative
227 0 101 76 0 0 35,7 0,198 26 tested_negative
228 3 162 52 38 0 37,2 0,652 24 tested_positive
229 4 197 70 39 744 36,7 2,329 31 tested_negative
230 0 117 80 31 53 45,2 0,089 24 tested_negative


231 4 142 86 0 0 44 0,645 22 tested_positive
232 6 134 80 37 370 46,2 0,238 46 tested_positive
233 1 79 80 25 37 25,4 0,583 22 tested_negative
234 4 122 68 0 0 35 0,394 29 tested_negative
235 3 74 68 28 45 29,7 0,293 23 tested_negative
236 4 171 72 0 0 43,6 0,479 26 tested_positive
237 7 181 84 21 192 35,9 0,586 51 tested_positive
238 0 179 90 27 0 44,1 0,686 23 tested_positive
239 9 164 84 21 0 30,8 0,831 32 tested_positive
240 0 104 76 0 0 18,4 0,582 27 tested_negative
241 1 91 64 24 0 29,2 0,192 21 tested_negative
242 4 91 70 32 88 33,1 0,446 22 tested_negative
243 3 139 54 0 0 25,6 0,402 22 tested_positive
244 6 119 50 22 176 27,1 1,318 33 tested_positive
245 2 146 76 35 194 38,2 0,329 29 tested_negative
246 9 184 85 15 0 30 1,213 49 tested_positive
247 10 122 68 0 0 31,2 0,258 41 tested_negative
248 0 165 90 33 680 52,3 0,427 23 tested_negative
249 9 124 70 33 402 35,4 0,282 34 tested_negative
250 1 111 86 19 0 30,1 0,143 23 tested_negative
251 9 106 52 0 0 31,2 0,38 42 tested_negative
252 2 129 84 0 0 28 0,284 27 tested_negative
253 2 90 80 14 55 24,4 0,249 24 tested_negative
254 0 86 68 32 0 35,8 0,238 25 tested_negative
255 12 92 62 7 258 27,6 0,926 44 tested_positive
256 1 113 64 35 0 33,6 0,543 21 tested_positive
257 3 111 56 39 0 30,1 0,557 30 tested_negative
258 2 114 68 22 0 28,7 0,092 25 tested_negative
259 1 193 50 16 375 25,9 0,655 24 tested_negative
260 11 155 76 28 150 33,3 1,353 51 tested_positive
261 3 191 68 15 130 30,9 0,299 34 tested_negative
262 3 141 0 0 0 30 0,761 27 tested_positive
263 4 95 70 32 0 32,1 0,612 24 tested_negative
264 3 142 80 15 0 32,4 0,2 63 tested_negative
265 4 123 62 0 0 32 0,226 35 tested_positive
266 5 96 74 18 67 33,6 0,997 43 tested_negative
267 0 138 0 0 0 36,3 0,933 25 tested_positive
268 2 128 64 42 0 40 1,101 24 tested_negative
269 0 102 52 0 0 25,1 0,078 21 tested_negative


271 10 101 86 37 0 45,6 1,136 38 tested_positive
272 2 108 62 32 56 25,2 0,128 21 tested_negative
273 3 122 78 0 0 23 0,254 40 tested_negative
274 1 71 78 50 45 33,2 0,422 21 tested_negative
275 13 106 70 0 0 34,2 0,251 52 tested_negative
276 2 100 70 52 57 40,5 0,677 25 tested_negative
277 7 106 60 24 0 26,5 0,296 29 tested_positive
278 0 104 64 23 116 27,8 0,454 23 tested_negative
279 5 114 74 0 0 24,9 0,744 57 tested_negative
280 2 108 62 10 278 25,3 0,881 22 tested_negative
281 0 146 70 0 0 37,9 0,334 28 tested_positive
282 10 129 76 28 122 35,9 0,28 39 tested_negative
283 7 133 88 15 155 32,4 0,262 37 tested_negative
284 7 161 86 0 0 30,4 0,165 47 tested_positive
285 2 108 80 0 0 27 0,259 52 tested_positive
286 7 136 74 26 135 26 0,647 51 tested_negative
287 5 155 84 44 545 38,7 0,619 34 tested_negative
288 1 119 86 39 220 45,6 0,808 29 tested_positive
289 4 96 56 17 49 20,8 0,34 26 tested_negative
290 5 108 72 43 75 36,1 0,263 33 tested_negative
291 0 78 88 29 40 36,9 0,434 21 tested_negative
292 0 107 62 30 74 36,6 0,757 25 tested_positive
293 2 128 78 37 182 43,3 1,224 31 tested_positive
294 1 128 48 45 194 40,5 0,613 24 tested_positive
295 0 161 50 0 0 21,9 0,254 65 tested_negative
296 6 151 62 31 120 35,5 0,692 28 tested_negative
297 2 146 70 38 360 28 0,337 29 tested_positive
298 0 126 84 29 215 30,7 0,52 24 tested_negative
299 14 100 78 25 184 36,6 0,412 46 tested_positive
300 8 112 72 0 0 23,6 0,84 58 tested_negative
301 0 167 0 0 0 32,3 0,839 30 tested_positive
302 2 144 58 33 135 31,6 0,422 25 tested_positive
303 5 77 82 41 42 35,8 0,156 35 tested_negative
304 5 115 98 0 0 52,9 0,209 28 tested_positive
305 3 150 76 0 0 21 0,207 37 tested_negative
306 2 120 76 37 105 39,7 0,215 29 tested_negative
307 10 161 68 23 132 25,5 0,326 47 tested_positive
308 0 137 68 14 148 24,8 0,143 21 tested_negative
309 0 128 68 19 180 30,5 1,391 25 tested_positive
310 2 124 68 28 205 32,9 0,875 30 tested_positive


311 6 80 66 30 0 26,2 0,313 41 tested_negative
312 0 106 70 37 148 39,4 0,605 22 tested_negative
313 2 155 74 17 96 26,6 0,433 27 tested_positive
314 3 113 50 10 85 29,5 0,626 25 tested_negative
315 7 109 80 31 0 35,9 1,127 43 tested_positive
316 2 112 68 22 94 34,1 0,315 26 tested_negative
317 3 99 80 11 64 19,3 0,284 30 tested_negative
318 3 182 74 0 0 30,5 0,345 29 tested_positive
319 3 115 66 39 140 38,1 0,15 28 tested_negative
320 6 194 78 0 0 23,5 0,129 59 tested_positive
321 4 129 60 12 231 27,5 0,527 31 tested_negative
322 3 112 74 30 0 31,6 0,197 25 tested_positive
323 0 124 70 20 0 27,4 0,254 36 tested_positive
324 13 152 90 33 29 26,8 0,731 43 tested_positive
325 2 112 75 32 0 35,7 0,148 21 tested_negative
326 1 157 72 21 168 25,6 0,123 24 tested_negative
327 1 122 64 32 156 35,1 0,692 30 tested_positive
328 10 179 70 0 0 35,1 0,2 37 tested_negative
329 2 102 86 36 120 45,5 0,127 23 tested_positive
330 6 105 70 32 68 30,8 0,122 37 tested_negative
331 8 118 72 19 0 23,1 1,476 46 tested_negative
332 2 87 58 16 52 32,7 0,166 25 tested_negative
333 1 180 0 0 0 43,3 0,282 41 tested_positive
334 12 106 80 0 0 23,6 0,137 44 tested_negative
335 1 95 60 18 58 23,9 0,26 22 tested_negative
336 0 165 76 43 255 47,9 0,259 26 tested_negative
337 0 117 0 0 0 33,8 0,932 44 tested_negative
338 5 115 76 0 0 31,2 0,343 44 tested_positive
339 9 152 78 34 171 34,2 0,893 33 tested_positive
340 7 178 84 0 0 39,9 0,331 41 tested_positive
341 1 130 70 13 105 25,9 0,472 22 tested_negative
342 1 95 74 21 73 25,9 0,673 36 tested_negative
343 1 0 68 35 0 32 0,389 22 tested_negative
344 5 122 86 0 0 34,7 0,29 33 tested_negative
345 8 95 72 0 0 36,8 0,485 57 tested_negative
346 8 126 88 36 108 38,5 0,349 49 tested_negative
347 1 139 46 19 83 28,7 0,654 22 tested_negative
348 3 116 0 0 0 23,5 0,187 23 tested_negative
349 3 99 62 19 74 21,8 0,279 26 tested_negative


351 4 92 80 0 0 42,2 0,237 29 tested_negative
352 4 137 84 0 0 31,2 0,252 30 tested_negative
353 3 61 82 28 0 34,4 0,243 46 tested_negative
354 1 90 62 12 43 27,2 0,58 24 tested_negative
355 3 90 78 0 0 42,7 0,559 21 tested_negative
356 9 165 88 0 0 30,4 0,302 49 tested_positive
357 1 125 50 40 167 33,3 0,962 28 tested_positive
358 13 129 0 30 0 39,9 0,569 44 tested_positive
359 12 88 74 40 54 35,3 0,378 48 tested_negative
360 1 196 76 36 249 36,5 0,875 29 tested_positive
361 5 189 64 33 325 31,2 0,583 29 tested_positive
362 5 158 70 0 0 29,8 0,207 63 tested_negative
363 5 103 108 37 0 39,2 0,305 65 tested_negative
364 4 146 78 0 0 38,5 0,52 67 tested_positive
365 4 147 74 25 293 34,9 0,385 30 tested_negative
366 5 99 54 28 83 34 0,499 30 tested_negative
367 6 124 72 0 0 27,6 0,368 29 tested_positive
368 0 101 64 17 0 21 0,252 21 tested_negative
369 3 81 86 16 66 27,5 0,306 22 tested_negative
370 1 133 102 28 140 32,8 0,234 45 tested_positive
371 3 173 82 48 465 38,4 2,137 25 tested_positive
372 0 118 64 23 89 0 1,731 21 tested_negative
373 0 84 64 22 66 35,8 0,545 21 tested_negative
374 2 105 58 40 94 34,9 0,225 25 tested_negative
375 2 122 52 43 158 36,2 0,816 28 tested_negative
376 12 140 82 43 325 39,2 0,528 58 tested_positive
377 0 98 82 15 84 25,2 0,299 22 tested_negative
378 1 87 60 37 75 37,2 0,509 22 tested_negative
379 4 156 75 0 0 48,3 0,238 32 tested_positive
380 0 93 100 39 72 43,4 1,021 35 tested_negative
381 1 107 72 30 82 30,8 0,821 24 tested_negative
382 0 105 68 22 0 20 0,236 22 tested_negative
383 1 109 60 8 182 25,4 0,947 21 tested_negative
384 1 90 62 18 59 25,1 1,268 25 tested_negative
385 1 125 70 24 110 24,3 0,221 25 tested_negative
386 1 119 54 13 50 22,3 0,205 24 tested_negative
387 5 116 74 29 0 32,3 0,66 35 tested_positive
388 8 105 100 36 0 43,3 0,239 45 tested_positive
389 5 144 82 26 285 32 0,452 58 tested_positive
390 3 100 68 23 81 31,6 0,949 28 tested_negative


391 1 100 66 29 196 32 0,444 42 tested_negative
392 5 166 76 0 0 45,7 0,34 27 tested_positive
393 1 131 64 14 415 23,7 0,389 21 tested_negative
394 4 116 72 12 87 22,1 0,463 37 tested_negative
395 4 158 78 0 0 32,9 0,803 31 tested_positive
396 2 127 58 24 275 27,7 1,6 25 tested_negative
397 3 96 56 34 115 24,7 0,944 39 tested_negative
398 0 131 66 40 0 34,3 0,196 22 tested_positive
399 3 82 70 0 0 21,1 0,389 25 tested_negative
400 3 193 70 31 0 34,9 0,241 25 tested_positive
401 4 95 64 0 0 32 0,161 31 tested_positive
402 6 137 61 0 0 24,2 0,151 55 tested_negative
403 5 136 84 41 88 35 0,286 35 tested_positive
404 9 72 78 25 0 31,6 0,28 38 tested_negative
405 5 168 64 0 0 32,9 0,135 41 tested_positive
406 2 123 48 32 165 42,1 0,52 26 tested_negative
407 4 115 72 0 0 28,9 0,376 46 tested_positive
408 0 101 62 0 0 21,9 0,336 25 tested_negative
409 8 197 74 0 0 25,9 1,191 39 tested_positive
410 1 172 68 49 579 42,4 0,702 28 tested_positive
411 6 102 90 39 0 35,7 0,674 28 tested_negative
412 1 112 72 30 176 34,4 0,528 25 tested_negative
413 1 143 84 23 310 42,4 1,076 22 tested_negative
414 1 143 74 22 61 26,2 0,256 21 tested_negative
415 0 138 60 35 167 34,6 0,534 21 tested_positive
416 3 173 84 33 474 35,7 0,258 22 tested_positive
417 1 97 68 21 0 27,2 1,095 22 tested_negative
418 4 144 82 32 0 38,5 0,554 37 tested_positive
419 1 83 68 0 0 18,2 0,624 27 tested_negative
420 3 129 64 29 115 26,4 0,219 28 tested_positive
421 1 119 88 41 170 45,3 0,507 26 tested_negative
422 2 94 68 18 76 26 0,561 21 tested_negative
423 0 102 64 46 78 40,6 0,496 21 tested_negative
424 2 115 64 22 0 30,8 0,421 21 tested_negative
425 8 151 78 32 210 42,9 0,516 36 tested_positive
426 4 184 78 39 277 37 0,264 31 tested_positive
427 0 94 0 0 0 0 0,256 25 tested_negative
428 1 181 64 30 180 34,1 0,328 38 tested_positive
429 0 135 94 46 145 40,6 0,284 26 tested_negative
430 1 95 82 25 180 35 0,233 43 tested_positive


431 2 99 0 0 0 22,2 0,108 23 tested_negative
432 3 89 74 16 85 30,4 0,551 38 tested_negative
433 1 80 74 11 60 30 0,527 22 tested_negative
434 2 139 75 0 0 25,6 0,167 29 tested_negative
435 1 90 68 8 0 24,5 1,138 36 tested_negative
436 0 141 0 0 0 42,4 0,205 29 tested_positive
437 12 140 85 33 0 37,4 0,244 41 tested_negative
438 5 147 75 0 0 29,9 0,434 28 tested_negative
439 1 97 70 15 0 18,2 0,147 21 tested_negative
440 6 107 88 0 0 36,8 0,727 31 tested_negative
441 0 189 104 25 0 34,3 0,435 41 tested_positive
442 2 83 66 23 50 32,2 0,497 22 tested_negative
443 4 117 64 27 120 33,2 0,23 24 tested_negative
444 8 108 70 0 0 30,5 0,955 33 tested_positive
445 4 117 62 12 0 29,7 0,38 30 tested_positive
446 0 180 78 63 14 59,4 2,42 25 tested_positive
447 1 100 72 12 70 25,3 0,658 28 tested_negative
448 0 95 80 45 92 36,5 0,33 26 tested_negative
449 0 104 64 37 64 33,6 0,51 22 tested_positive
450 0 120 74 18 63 30,5 0,285 26 tested_negative
451 1 82 64 13 95 21,2 0,415 23 tested_negative
452 2 134 70 0 0 28,9 0,542 23 tested_positive
453 0 91 68 32 210 39,9 0,381 25 tested_negative
454 2 119 0 0 0 19,6 0,832 72 tested_negative
455 2 100 54 28 105 37,8 0,498 24 tested_negative
456 14 175 62 30 0 33,6 0,212 38 tested_positive
457 1 135 54 0 0 26,7 0,687 62 tested_negative
458 5 86 68 28 71 30,2 0,364 24 tested_negative
459 10 148 84 48 237 37,6 1,001 51 tested_positive
460 9 134 74 33 60 25,9 0,46 81 tested_negative
461 9 120 72 22 56 20,8 0,733 48 tested_negative
462 1 71 62 0 0 21,8 0,416 26 tested_negative
463 8 74 70 40 49 35,3 0,705 39 tested_negative
464 5 88 78 30 0 27,6 0,258 37 tested_negative
465 10 115 98 0 0 24 1,022 34 tested_negative
466 0 124 56 13 105 21,8 0,452 21 tested_negative
467 0 74 52 10 36 27,8 0,269 22 tested_negative
468 0 97 64 36 100 36,8 0,6 25 tested_negative
469 8 120 0 0 0 30 0,183 38 tested_positive


471 1 144 82 40 0 41,3 0,607 28 tested_negative
472 0 137 70 38 0 33,2 0,17 22 tested_negative
473 0 119 66 27 0 38,8 0,259 22 tested_negative
474 7 136 90 0 0 29,9 0,21 50 tested_negative
475 4 114 64 0 0 28,9 0,126 24 tested_negative
476 0 137 84 27 0 27,3 0,231 59 tested_negative
477 2 105 80 45 191 33,7 0,711 29 tested_positive
478 7 114 76 17 110 23,8 0,466 31 tested_negative
479 8 126 74 38 75 25,9 0,162 39 tested_negative
480 4 132 86 31 0 28 0,419 63 tested_negative
481 3 158 70 30 328 35,5 0,344 35 tested_positive
482 0 123 88 37 0 35,2 0,197 29 tested_negative
483 4 85 58 22 49 27,8 0,306 28 tested_negative
484 0 84 82 31 125 38,2 0,233 23 tested_negative
485 0 145 0 0 0 44,2 0,63 31 tested_positive
486 0 135 68 42 250 42,3 0,365 24 tested_positive
487 1 139 62 41 480 40,7 0,536 21 tested_negative
488 0 173 78 32 265 46,5 1,159 58 tested_negative
489 4 99 72 17 0 25,6 0,294 28 tested_negative
490 8 194 80 0 0 26,1 0,551 67 tested_negative
491 2 83 65 28 66 36,8 0,629 24 tested_negative
492 2 89 90 30 0 33,5 0,292 42 tested_negative
493 4 99 68 38 0 32,8 0,145 33 tested_negative
494 4 125 70 18 122 28,9 1,144 45 tested_positive
495 3 80 0 0 0 0 0,174 22 tested_negative
496 6 166 74 0 0 26,6 0,304 66 tested_negative
497 5 110 68 0 0 26 0,292 30 tested_negative
498 2 81 72 15 76 30,1 0,547 25 tested_negative
499 7 195 70 33 145 25,1 0,163 55 tested_positive
500 6 154 74 32 193 29,3 0,839 39 tested_negative
501 2 117 90 19 71 25,2 0,313 21 tested_negative
502 3 84 72 32 0 37,2 0,267 28 tested_negative
503 6 0 68 41 0 39 0,727 41 tested_positive
504 7 94 64 25 79 33,3 0,738 41 tested_negative
505 3 96 78 39 0 37,3 0,238 40 tested_negative
506 10 75 82 0 0 33,3 0,263 38 tested_negative
507 0 180 90 26 90 36,5 0,314 35 tested_positive
508 1 130 60 23 170 28,6 0,692 21 tested_negative
509 2 84 50 23 76 30,4 0,968 21 tested_negative


511 12 84 72 31 0 29,7 0,297 46 tested_positive
512 0 139 62 17 210 22,1 0,207 21 tested_negative
513 9 91 68 0 0 24,2 0,2 58 tested_negative
514 2 91 62 0 0 27,3 0,525 22 tested_negative
515 3 99 54 19 86 25,6 0,154 24 tested_negative
516 3 163 70 18 105 31,6 0,268 28 tested_positive
517 9 145 88 34 165 30,3 0,771 53 tested_positive
518 7 125 86 0 0 37,6 0,304 51 tested_negative
519 13 76 60 0 0 32,8 0,18 41 tested_negative
520 6 129 90 7 326 19,6 0,582 60 tested_negative
521 2 68 70 32 66 25 0,187 25 tested_negative
522 3 124 80 33 130 33,2 0,305 26 tested_negative
523 6 114 0 0 0 0 0,189 26 tested_negative
524 9 130 70 0 0 34,2 0,652 45 tested_positive
525 3 125 58 0 0 31,6 0,151 24 tested_negative
526 3 87 60 18 0 21,8 0,444 21 tested_negative
527 1 97 64 19 82 18,2 0,299 21 tested_negative
528 3 116 74 15 105 26,3 0,107 24 tested_negative
529 0 117 66 31 188 30,8 0,493 22 tested_negative
530 0 111 65 0 0 24,6 0,66 31 tested_negative
531 2 122 60 18 106 29,8 0,717 22 tested_negative
532 0 107 76 0 0 45,3 0,686 24 tested_negative
533 1 86 66 52 65 41,3 0,917 29 tested_negative
534 6 91 0 0 0 29,8 0,501 31 tested_negative
535 1 77 56 30 56 33,3 1,251 24 tested_negative
536 4 132 0 0 0 32,9 0,302 23 tested_positive
537 0 105 90 0 0 29,6 0,197 46 tested_negative
538 0 57 60 0 0 21,7 0,735 67 tested_negative
539 0 127 80 37 210 36,3 0,804 23 tested_negative
540 3 129 92 49 155 36,4 0,968 32 tested_positive
541 8 100 74 40 215 39,4 0,661 43 tested_positive
542 3 128 72 25 190 32,4 0,549 27 tested_positive
543 10 90 85 32 0 34,9 0,825 56 tested_positive
544 4 84 90 23 56 39,5 0,159 25 tested_negative
545 1 88 78 29 76 32 0,365 29 tested_negative
546 8 186 90 35 225 34,5 0,423 37 tested_positive
547 5 187 76 27 207 43,6 1,034 53 tested_positive
548 4 131 68 21 166 33,1 0,16 28 tested_negative
549 1 164 82 43 67 32,8 0,341 50 tested_negative
550 4 189 110 31 0 28,5 0,68 37 tested_negative


551 1 116 70 28 0 27,4 0,204 21 tested_negative
552 3 84 68 30 106 31,9 0,591 25 tested_negative
553 6 114 88 0 0 27,8 0,247 66 tested_negative
554 1 88 62 24 44 29,9 0,422 23 tested_negative
555 1 84 64 23 115 36,9 0,471 28 tested_negative
556 7 124 70 33 215 25,5 0,161 37 tested_negative
557 1 97 70 40 0 38,1 0,218 30 tested_negative
558 8 110 76 0 0 27,8 0,237 58 tested_negative
559 11 103 68 40 0 46,2 0,126 42 tested_negative
560 11 85 74 0 0 30,1 0,3 35 tested_negative
561 6 125 76 0 0 33,8 0,121 54 tested_positive
562 0 198 66 32 274 41,3 0,502 28 tested_positive
563 1 87 68 34 77 37,6 0,401 24 tested_negative
564 6 99 60 19 54 26,9 0,497 32 tested_negative
565 0 91 80 0 0 32,4 0,601 27 tested_negative
566 2 95 54 14 88 26,1 0,748 22 tested_negative
567 1 99 72 30 18 38,6 0,412 21 tested_negative
568 6 92 62 32 126 32 0,085 46 tested_negative
569 4 154 72 29 126 31,3 0,338 37 tested_negative
570 0 121 66 30 165 34,3 0,203 33 tested_positive
571 3 78 70 0 0 32,5 0,27 39 tested_negative
572 2 130 96 0 0 22,6 0,268 21 tested_negative
573 3 111 58 31 44 29,5 0,43 22 tested_negative
574 2 98 60 17 120 34,7 0,198 22 tested_negative
575 1 143 86 30 330 30,1 0,892 23 tested_negative
576 1 119 44 47 63 35,5 0,28 25 tested_negative
577 6 108 44 20 130 24 0,813 35 tested_negative
578 2 118 80 0 0 42,9 0,693 21 tested_positive
579 10 133 68 0 0 27 0,245 36 tested_negative
580 2 197 70 99 0 34,7 0,575 62 tested_positive
581 0 151 90 46 0 42,1 0,371 21 tested_positive
582 6 109 60 27 0 25 0,206 27 tested_negative
583 12 121 78 17 0 26,5 0,259 62 tested_negative
584 8 100 76 0 0 38,7 0,19 42 tested_negative
585 8 124 76 24 600 28,7 0,687 52 tested_positive
586 1 93 56 11 0 22,5 0,417 22 tested_negative
587 8 143 66 0 0 34,9 0,129 41 tested_positive
588 6 103 66 0 0 24,3 0,249 29 tested_negative
589 3 176 86 27 156 33,3 1,154 52 tested_positive


591 11 111 84 40 0 46,8 0,925 45 tested_positive
592 2 112 78 50 140 39,4 0,175 24 tested_negative
593 3 132 80 0 0 34,4 0,402 44 tested_positive
594 2 82 52 22 115 28,5 1,699 25 tested_negative
595 6 123 72 45 230 33,6 0,733 34 tested_negative
596 0 188 82 14 185 32 0,682 22 tested_positive
597 0 67 76 0 0 45,3 0,194 46 tested_negative
598 1 89 24 19 25 27,8 0,559 21 tested_negative
599 1 173 74 0 0 36,8 0,088 38 tested_positive
600 1 109 38 18 120 23,1 0,407 26 tested_negative
601 1 108 88 19 0 27,1 0,4 24 tested_negative
602 6 96 0 0 0 23,7 0,19 28 tested_negative
603 1 124 74 36 0 27,8 0,1 30 tested_negative
604 7 150 78 29 126 35,2 0,692 54 tested_positive
605 4 183 0 0 0 28,4 0,212 36 tested_positive
606 1 124 60 32 0 35,8 0,514 21 tested_negative
607 1 181 78 42 293 40 1,258 22 tested_positive
608 1 92 62 25 41 19,5 0,482 25 tested_negative
609 0 152 82 39 272 41,5 0,27 27 tested_negative
610 1 111 62 13 182 24 0,138 23 tested_negative
611 3 106 54 21 158 30,9 0,292 24 tested_negative
612 3 174 58 22 194 32,9 0,593 36 tested_positive
613 7 168 88 42 321 38,2 0,787 40 tested_positive
614 6 105 80 28 0 32,5 0,878 26 tested_negative
615 11 138 74 26 144 36,1 0,557 50 tested_positive
616 3 106 72 0 0 25,8 0,207 27 tested_negative
617 6 117 96 0 0 28,7 0,157 30 tested_negative
618 2 68 62 13 15 20,1 0,257 23 tested_negative
619 9 112 82 24 0 28,2 1,282 50 tested_positive
620 0 119 0 0 0 32,4 0,141 24 tested_positive
621 2 112 86 42 160 38,4 0,246 28 tested_negative
622 2 92 76 20 0 24,2 1,698 28 tested_negative
623 6 183 94 0 0 40,8 1,461 45 tested_negative
624 0 94 70 27 115 43,5 0,347 21 tested_negative
625 2 108 64 0 0 30,8 0,158 21 tested_negative
626 4 90 88 47 54 37,7 0,362 29 tested_negative
627 0 125 68 0 0 24,7 0,206 21 tested_negative
628 0 132 78 0 0 32,4 0,393 21 tested_negative
629 5 128 80 0 0 34,6 0,144 45 tested_negative
630 4 94 65 22 0 24,7 0,148 21 tested_negative


631 7 114 64 0 0 27,4 0,732 34 tested_positive
632 0 102 78 40 90 34,5 0,238 24 tested_negative
633 2 111 60 0 0 26,2 0,343 23 tested_negative
634 1 128 82 17 183 27,5 0,115 22 tested_negative
635 10 92 62 0 0 25,9 0,167 31 tested_negative
636 13 104 72 0 0 31,2 0,465 38 tested_positive
637 5 104 74 0 0 28,8 0,153 48 tested_negative
638 2 94 76 18 66 31,6 0,649 23 tested_negative
639 7 97 76 32 91 40,9 0,871 32 tested_positive
640 1 100 74 12 46 19,5 0,149 28 tested_negative
641 0 102 86 17 105 29,3 0,695 27 tested_negative
642 4 128 70 0 0 34,3 0,303 24 tested_negative
643 6 147 80 0 0 29,5 0,178 50 tested_positive
644 4 90 0 0 0 28 0,61 31 tested_negative
645 3 103 72 30 152 27,6 0,73 27 tested_negative
646 2 157 74 35 440 39,4 0,134 30 tested_negative
647 1 167 74 17 144 23,4 0,447 33 tested_positive
648 0 179 50 36 159 37,8 0,455 22 tested_positive
649 11 136 84 35 130 28,3 0,26 42 tested_positive
650 0 107 60 25 0 26,4 0,133 23 tested_negative
651 1 91 54 25 100 25,2 0,234 23 tested_negative
652 1 117 60 23 106 33,8 0,466 27 tested_negative
653 5 123 74 40 77 34,1 0,269 28 tested_negative
654 2 120 54 0 0 26,8 0,455 27 tested_negative
655 1 106 70 28 135 34,2 0,142 22 tested_negative
656 2 155 52 27 540 38,7 0,24 25 tested_positive
657 2 101 58 35 90 21,8 0,155 22 tested_negative
658 1 120 80 48 200 38,9 1,162 41 tested_negative
659 11 127 106 0 0 39 0,19 51 tested_negative
660 3 80 82 31 70 34,2 1,292 27 tested_positive
661 10 162 84 0 0 27,7 0,182 54 tested_negative
662 1 199 76 43 0 42,9 1,394 22 tested_positive
663 8 167 106 46 231 37,6 0,165 43 tested_positive
664 9 145 80 46 130 37,9 0,637 40 tested_positive
665 6 115 60 39 0 33,7 0,245 40 tested_positive
666 1 112 80 45 132 34,8 0,217 24 tested_negative
667 4 145 82 18 0 32,5 0,235 70 tested_positive
668 10 111 70 27 0 27,5 0,141 40 tested_positive
669 6 98 58 33 190 34 0,43 43 tested_negative


671 6 165 68 26 168 33,6 0,631 49 tested_negative 672 1 99 58 10 0 25,4 0,551 21 tested_negative 673 10 68 106 23 49 35,5 0,285 47 tested_negative 674 3 123 100 35 240 57,3 0,88 22 tested_negative

675 8 91 82 0 0 35,6 0,587 68 tested_negative

676 6 195 70 0 0 30,9 0,328 31 tested_positive

677 9 156 86 0 0 24,8 0,23 53 tested_positive

678 0 93 60 0 0 35,3 0,263 25 tested_negative

679 3 121 52 0 0 36 0,127 25 tested_positive

680 2 101 58 17 265 24,2 0,614 23 tested_negative 681 2 56 56 28 45 24,2 0,332 22 tested_negative 682 0 162 76 36 0 49,6 0,364 26 tested_positive 683 0 95 64 39 105 44,6 0,366 22 tested_negative 684 4 125 80 0 0 32,3 0,536 27 tested_positive

685 5 136 82 0 0 0 0,64 69 tested_negative

686 2 129 74 26 205 33,2 0,591 25 tested_negative 687 3 130 64 0 0 23,1 0,314 22 tested_negative 688 1 107 50 19 0 28,3 0,181 29 tested_negative 689 1 140 74 26 180 24,1 0,828 23 tested_negative 690 1 144 82 46 180 46,1 0,335 46 tested_positive 691 8 107 80 0 0 24,6 0,856 34 tested_negative 692 13 158 114 0 0 42,3 0,257 44 tested_positive 693 2 121 70 32 95 39,1 0,886 23 tested_negative 694 7 129 68 49 125 38,5 0,439 43 tested_positive

695 2 90 60 0 0 23,5 0,191 25 tested_negative

696 7 142 90 24 480 30,4 0,128 43 tested_positive 697 3 169 74 19 125 29,9 0,268 31 tested_positive

698 0 99 0 0 0 25 0,253 22 tested_negative

699 4 127 88 11 155 34,5 0,598 28 tested_negative 700 4 118 70 0 0 44,5 0,904 26 tested_negative


PROGRAM LISTING

<div id="page_content">

<p class="uk-text">Penentuan Klasifikasi Penyakit Diabetes dengan K-Means dan KNN</p>

</div> </div> </div> </div> </div> </div> </div> <?php

include 'koneksi.php'; error_reporting(0);

$page = "index.php?page=k_means"; $sec = "10";

$time = microtime(); $time = explode(' ', $time); $time = $time[1] + $time[0]; $start = $time;

$sql_cek_iterasi = mysql_query("SELECT * FROM iterasi"); if(mysql_num_rows($sql_cek_iterasi) == 0){

$sql_akumulasi_jarak = mysql_query("SELECT * FROM akumulasi_jarak"); $array_akumulasi_jarak = mysql_fetch_array($sql_akumulasi_jarak);


$iterasi_tampil=1; } else {

$sql_akhir = mysql_query("SELECT * FROM iterasi ORDER BY id_iterasi DESC LIMIT 1");

$array_akhir = mysql_fetch_array($sql_akhir); $iterasi = $array_akhir['iterasi'] + 1;

$iterasi_tampil=$iterasi;

mysql_query("INSERT INTO iterasi SET iterasi='$iterasi'") or die(mysql_error()); }

$delete_jarak = mysql_query("DELETE FROM jarak");

$delete_akumulasi_jarak = mysql_query("DELETE FROM akumulasi_jarak"); //menentukan jumlah euclide distance

$sql_acak = mysql_query("SELECT * FROM nilai_acak ORDER BY id_nilai_acak ASC"); $i=1;

while($array_nilai_acak = mysql_fetch_array($sql_acak)){ $k = $i-1;

$sql_sampel = mysql_query("SELECT * FROM sampel ORDER BY id_sampel ASC"); while($array_sampel = mysql_fetch_array($sql_sampel)){

$cluser = sqrt(pow(($array_sampel['sampel_1'] - $array_nilai_acak['x1']),2) + pow(($array_sampel['sampel_2'] - $array_nilai_acak['x2']),2)

+ pow(($array_sampel['sampel_3'] - $array_nilai_acak['x3']),2) + pow(($array_sampel['sampel_4'] - $array_nilai_acak['x4']),2)

+ pow(($array_sampel['sampel_5'] - $array_nilai_acak['x5']),2) + pow(($array_sampel['sampel_6'] - $array_nilai_acak['x6']),2)

+ pow(($array_sampel['sampel_7'] - $array_nilai_acak['x7']),2) + pow(($array_sampel['sampel_8'] - $array_nilai_acak['x8']),2)) ;

$insert_jarak = mysql_query("INSERT INTO jarak VALUES ('','$array_sampel[id_sampel]','$cluser','$k')");

}

$i++; }

//menentukan jumlah euclide distanc


$sql_sampel = mysql_query("SELECT a.*,b.* FROM sampel a, jarak b where a.id_sampel=b.id_sampel ORDER BY a.id_sampel ASC");

while($array_sampel = mysql_fetch_array($sql_sampel)){

$sql_max_jarak = mysql_query("SELECT MIN(hasil) AS hasil_max, urutan, id_sampel FROM jarak WHERE id_sampel='$array_sampel[id_sampel]'");

$array_max_jarak = mysql_fetch_array($sql_max_jarak); if($array_max_jarak['hasil_max'] == $array_sampel['hasil']){

$insert_jarak = mysql_query("INSERT INTO akumulasi_jarak VALUES ('','$array_sampel[id_sampel]','$array_max_jarak[hasil_max]',$array_sampel[urutan])"); $insert_jarak = mysql_query("INSERT INTO akumulasi_jarak2 VALUES ('','$array_sampel[id_sampel]','$array_max_jarak[hasil_max]',$array_sampel[urutan])"); }

}

// Euclidean distance from each sample to the centroids
// centroid update step

$sql_num_sampel1 = mysql_query("SELECT a.*,b.* FROM sampel a, akumulasi_jarak b where a.id_sampel=b.id_sampel and b.urutan=0 ORDER BY a.id_sampel ASC");

$num_sampel1 = mysql_num_rows($sql_num_sampel1);

$sql_sampel_urutan_1 = mysql_query("SELECT SUM(a.sampel_1) AS nilai_1,SUM(a.sampel_2) AS nilai_2,

SUM(a.sampel_3) AS nilai_3,SUM(a.sampel_4) AS nilai_4,SUM(a.sampel_5) AS nilai_5,SUM(a.sampel_6) AS nilai_6,SUM(a.sampel_7) AS nilai_7

,SUM(a.sampel_8) AS nilai_8 FROM sampel a, akumulasi_jarak b where a.id_sampel=b.id_sampel and b.urutan=0 ORDER BY a.id_sampel ASC");

$array_sampel_urutan_1 = mysql_fetch_array($sql_sampel_urutan_1) or die(mysql_error()); $sampel1_urutan_1 = $array_sampel_urutan_1['nilai_1'] / $num_sampel1;

$sampel2_urutan_1 = $array_sampel_urutan_1['nilai_2'] / $num_sampel1; $sampel3_urutan_1 = $array_sampel_urutan_1['nilai_3'] / $num_sampel1; $sampel4_urutan_1 = $array_sampel_urutan_1['nilai_4'] / $num_sampel1; $sampel5_urutan_1= $array_sampel_urutan_1['nilai_5'] / $num_sampel1; $sampel6_urutan_1 = $array_sampel_urutan_1['nilai_6'] / $num_sampel1; $sampel7_urutan_1 = $array_sampel_urutan_1['nilai_7'] / $num_sampel1; $sampel8_urutan_1 = $array_sampel_urutan_1['nilai_8'] / $num_sampel1;


$sql_num_sampel2 = mysql_query("SELECT a.*,b.* FROM sampel a, akumulasi_jarak b where a.id_sampel=b.id_sampel and b.urutan=1 ORDER BY a.id_sampel ASC");

$num_sampel2 = mysql_num_rows($sql_num_sampel2);

$sql_sampel_urutan_2 = mysql_query("SELECT SUM(a.sampel_1) AS nilai_1,SUM(a.sampel_2) AS nilai_2,

SUM(a.sampel_3) AS nilai_3,SUM(a.sampel_4) AS nilai_4,SUM(a.sampel_5) AS nilai_5,SUM(a.sampel_6) AS nilai_6,SUM(a.sampel_7) AS nilai_7

,SUM(a.sampel_8) AS nilai_8 FROM sampel a, akumulasi_jarak b where a.id_sampel=b.id_sampel and b.urutan=1 ORDER BY a.id_sampel ASC");

$array_sampel_urutan_2 = mysql_fetch_array($sql_sampel_urutan_2) or die(mysql_error()); $sampel1_urutan_2 = $array_sampel_urutan_2['nilai_1'] / $num_sampel2;

$sampel2_urutan_2 = $array_sampel_urutan_2['nilai_2'] / $num_sampel2; $sampel3_urutan_2 = $array_sampel_urutan_2['nilai_3'] / $num_sampel2; $sampel4_urutan_2 = $array_sampel_urutan_2['nilai_4'] / $num_sampel2; $sampel5_urutan_2= $array_sampel_urutan_2['nilai_5'] / $num_sampel2; $sampel6_urutan_2 = $array_sampel_urutan_2['nilai_6'] / $num_sampel2; $sampel7_urutan_2 = $array_sampel_urutan_2['nilai_7'] / $num_sampel2; $sampel8_urutan_2 = $array_sampel_urutan_2['nilai_8'] / $num_sampel2; $delete_nilai_acak = mysql_query("DELETE FROM nilai_acak");

$update_1 = mysql_query("INSERT INTO nilai_acak VALUES ('','$sampel1_urutan_1','$sampel2_urutan_1',

'$sampel3_urutan_1','$sampel4_urutan_1','$sampel5_urutan_1','$sampel6_urutan_1','$sampel7_urutan_1','$sampel8_urutan_1')");

$update_1 = mysql_query("INSERT INTO nilai_acak VALUES ('','$sampel1_urutan_2','$sampel2_urutan_2',

'$sampel3_urutan_2','$sampel4_urutan_2','$sampel5_urutan_2','$sampel6_urutan_2','$sampel7_urutan_2','$sampel8_urutan_2')");

$sql_pilihan = mysql_query("SELECT * FROM akumulasi_jarak ORDER BY id_akumulasi_jarak ASC");

$array_pilihan = mysql_fetch_array($sql_pilihan);

$id_akumulasi_jarak = $array_pilihan['id_akumulasi_jarak'];

$sql_cek = mysql_query("SELECT * FROM akumulasi_jarak WHERE id_sampel='$array_pilihan[id_sampel]' and id_akumulasi_jarak='$id_akumulasi_jarak'"); $array_cek = mysql_fetch_array($sql_pilihan);


// centroid update step
// measure the elapsed time
$time = microtime();

$time = explode(' ', $time); $time = $time[1] + $time[0]; $finish = $time;

$total_time = round(($finish - $start), 4);

$insert_waktu = mysql_query("INSERT INTO waktu VALUES ('','$total_time')") or die(mysql_error());

//mencari waktu penyelesaian ?>

<h3 class="heading_b uk-margin-bottom">Algoritma K-Means Iterasi <?php echo $iterasi_tampil;?></h3>

<tr> <th>Hasil</th> <th>Urutan</th> </tr> </thead> <tfoot> <tr> <th>Hasil</th> <th>Urutan</th> </tr> </tfoot> <tbody> <?php


$sql_adult = mysql_query("select a.*,b.* from sampel a, akumulasi_jarak b where a.id_sampel=b.id_sampel

order by urutan asc "); while($array_adult = mysql_fetch_array($sql_adult)){ ?>

<tr>

</tr>

<?php } ?> </tbody> </table> </div> </div> </div> </div> <?php

$sql_cek_akumulasi_jarak = mysql_query("SELECT * FROM akumulasi_jarak ORDER BY id_sampel ASC");

$array_cek_akumulasi_jarak = mysql_fetch_array($sql_cek_akumulasi_jarak); $cluster_pilihan = $array_cek_akumulasi_jarak['cluster_pilihan'];

$id_sampel_pilihan = $array_cek_akumulasi_jarak['id_sampel'];

$sql_num_akumulasi = mysql_query("SELECT * FROM akumulasi_jarak2 WHERE cluster_pilihan='$cluster_pilihan' and id_sampel='$id_sampel_pilihan'");

$num = mysql_num_rows($sql_num_akumulasi); if($num > 2){

echo "<meta http-equiv='refresh' content='0; url=index.php?page=hasil_kmeans'>";

}?> </body> <?php

include 'koneksi.php'; $time = microtime(); $time = explode(' ', $time);


$time = $time[1] + $time[0]; $start = $time;

$k = $_GET['nilai_k'];

$delete_jarak = mysql_query("DELETE FROM jarak_knn"); $delete_jarak = mysql_query("DELETE FROM akumulasi_knn"); $delete_jarak = mysql_query("DELETE FROM cek_knn");

$sql_acak = mysql_query("SELECT * FROM testing ORDER BY id_testing DESC"); $i=1;

// compute the similarity function for the KNN algorithm
while($array_nilai_acak = mysql_fetch_array($sql_acak)){

$sql_sampel = mysql_query("SELECT * FROM sampel ORDER BY id_sampel ASC"); while($array_sampel = mysql_fetch_array($sql_sampel)){

$cluser = sqrt(pow(($array_sampel['sampel_1'] - $array_nilai_acak['x1']),2) + pow(($array_sampel['sampel_2'] - $array_nilai_acak['x2']),2)

+ pow(($array_sampel['sampel_3'] - $array_nilai_acak['x3']),2) + pow(($array_sampel['sampel_4'] - $array_nilai_acak['x4']),2)

+ pow(($array_sampel['sampel_5'] - $array_nilai_acak['x5']),2) + pow(($array_sampel['sampel_6'] - $array_nilai_acak['x6']),2)

+ pow(($array_sampel['sampel_7'] - $array_nilai_acak['x7']),2) + pow(($array_sampel['sampel_8'] - $array_nilai_acak['x8']),2)) ;

$insert_jarak = mysql_query("INSERT INTO jarak_knn VALUES ('','$array_sampel[id_sampel]','$cluser')") or die(mysql_error());

}

$i++; }

//proses perhitungan similiarity funtion pada algoritma KNN //nilai similiarity function pada sample

$sql_num = mysql_query("SELECT * FROM jarak_knn"); $sisa = mysql_num_rows($sql_num) - $k;

$sql_limit_atas = mysql_query("SELECT * FROM jarak_knn ORDER BY nilai DESC LIMIT $k") or die(mysql_error());

while($array_limit_atas = mysql_fetch_array($sql_limit_atas)){

$insert = mysql_query("INSERT INTO akumulasi_knn VALUES ('','$array_limit_atas[id_sampel]','Positif')");


}

$sql_limit_bawah = mysql_query("SELECT * FROM jarak_knn ORDER BY nilai ASC LIMIT $sisa");

while($array_limit_bawah = mysql_fetch_array($sql_limit_bawah)){

$insert = mysql_query("INSERT INTO akumulasi_knn VALUES ('','$array_limit_bawah[id_sampel]','Negatif')");

}

//nilai similiarity function pada sample //menghitung waktu eksekusi

$time = microtime(); $time = explode(' ', $time); $time = $time[1] + $time[0]; $finish = $time;

$total_time = round(($finish - $start), 4); // measure the execution time
?>

<h3 class="heading_b uk-margin-bottom">Algoritma KNN</h3> <div class="md-card uk-margin-medium-bottom">

<tr>

<th>Jumlahkehamilan(preg)</th>

<th>Konsentrasi plasma glukosadalam 2 jam (plas)</th> <th>Tekanandarahdiastolik (mm Hg) (pres)</th>

<th>Ketebalankulittricep (mm) (skin)</th>

<th>Serum insulin selama 2 jam (mu U/ml) (insu)</th>

<th>Index beratbadan(beratdalam kg / (tinggidalam meter)^2) (mass)</th>

<th>Fungsipedigree diabetes (pedi)</th> <th>Umur(years) (age)</th> <th>Hasil</th>

</tr> </thead> <tfoot> <tr>

<th>Jumlahkehamilan(preg)</th>

<th>Konsentrasi plasma glukosadalam 2 jam (plas)</th> <th>Tekanandarahdiastolik (mm Hg) (pres)</th>

<th>Ketebalankulittricep (mm) (skin)</th>

<th>Serum insulin selama 2 jam (mu U/ml) (insu)</th>

<th>Index beratbadan(beratdalam kg / (tinggidalam meter)^2) (mass)</th>

<th>Fungsipedigree diabetes (pedi)</th> <th>Umur(years) (age)</th>

<th>Hasil</th> </tr> </tfoot> <tbody> <?php

$sql_adult = mysql_query("select a.*,b.* from sampel a, akumulasi_knn b where a.id_sampel=b.id_sampel

order by id_akumulasi_knn asc "); while($array_adult = mysql_fetch_array($sql_adult)){ ?>

<tr>


<?php } ?> </tbody> </table>

<a href="index.php?page=akumulasi_knn">Akumulasi</a> <br /><br /> <?php

//menghitung tingkat akurasi

$sql_cek_knn = mysql_query("SELECT a.*, b.* FROM sampel a, akumulasi_knn b WHERE a.id_sampel=b.id_sampel");

while($array_cek_knn = mysql_fetch_array($sql_cek_knn)){

$id_sampel = $array_cek_knn['id_sampel']; if($array_cek_knn['class'] == 'tested_positive'){

$nilai = 'Positif'; }

else if($array_cek_knn['class'] == 'tested_negative'){ $nilai = 'Negatif';

}

if($nilai == $array_cek_knn['hasil']){

$ketepatan = 1;

} else {

$ketepatan = 0; }

$insert_ketepatan =

mysql_query("INSERT INTO cek_knn VALUES ('','$id_sampel','$ketepatan')"); }

$select_total = mysql_query("SELECT COUNT(*) AS jum_sampel FROM sampel");

$array_total = mysql_fetch_array($select_total);

$select = mysql_query("SELECT COUNT(*) AS jum_ketepatan FROM cek_knn WHERE ketepatan = '1'");

$array = mysql_fetch_array($select);

$ketepatan = ($array['jum_ketepatan'] / $array_total['jum_sampel']) * 100;

echo "Akurasi dari Algoritma ini adalah " . round($ketepatan , 3) . " %";

?> </div>

</div> </div> </div>

<h3 class="heading_b uk-margin-bottom">Hasil K-Means</h3> <div class="md-card">

Hasil K-Means dengan Hasil Diabetes Positif adalah : <?php

$sql_adult = mysql_query("select a.*,b.* from sampel a, akumulasi_jarak b where a.id_sampel=b.id_sampel

and b.urutan='1'

order by a.id_sampel asc "); $string_positif = "";

while($array_adult = mysql_fetch_array($sql_adult)){

$string_positif .= "Sampel " . $array_adult['id_sampel'] . ', '; }

echo substr($string_positif, 0, strlen($string_positif) - 2); ?>

</div> </div>


</div> </div>

Hasil K-Means dengan Hasil Diabetes Negatif adalah : <?php

$sql_adult = mysql_query("select a.*,b.* from sampel a, akumulasi_jarak b where a.id_sampel=b.id_sampel

and b.urutan='0'

order by a.id_sampel asc "); $string = "";

while($array_adult = mysql_fetch_array($sql_adult)){ $string .= "Sampel " . $array_adult['id_sampel'] . ', '; }

echo substr($string, 0, strlen($string) - 2); ?>

</div> </div> </div> </div> </div> </div>

<h3 class="heading_b uk-margin-bottom">Hasil KNN</h3> <div class="md-card">

Hasil KNN dengan Hasil Diabetes Positif adalah : <?php


$sql_adult = mysql_query("select a.*,b.* from sampel a, akumulasi_knn b where a.id_sampel=b.id_sampel

and b.hasil='Positif'

order by id_akumulasi_knn asc "); $string_positif = "";

while($array_adult = mysql_fetch_array($sql_adult)){

$string_positif .= "Sampel " . $array_adult['id_sampel'] . ', '; }

echo substr($string_positif, 0, strlen($string_positif) - 2); ?>

</div> </div> </div> </div>

Hasil KNN dengan Hasil Diabetes Negatif adalah : <?php

$sql_adult = mysql_query("select a.*,b.* from sampel a, akumulasi_knn b where a.id_sampel=b.id_sampel

and b.hasil='Negatif'

order by a.id_sampel asc "); $string = "";

while($array_adult = mysql_fetch_array($sql_adult)){ $string .= "Sampel " . $array_adult['id_sampel'] . ', '; }

echo substr($string, 0, strlen($string) - 2); ?>

</div> </div> </div> </div>


DAFTAR PUSTAKA

[AGU07] Agusta, Y. 2007. K-Means-Penerapan, Permasalahan dan Metode Terkait. Denpasar, Bali: Jurnal Sistem dan Informatika Vol.3, pp : 47-60.

[BUD13] Budiman, I. 2012. Data Clustering Menggunakan Metodologi CRISP- DM Untuk Pengenalan Pola Proporsi Pelaksanaan Tridharma. Tesis. Universitas Diponegoro.

[HUL13] Huliman.2013. Analisis Akurasi Algoritma Pohon Keputusan dan K-Nearest Neighbor (KNN).Tesis.Universitas Sumatera Utara.

[LAR05] Larose Daniel,T .2005. Discovering knowledge in data : an introduction to data mining , John Wiley & Sons, Inc.

[MIR08] Mirza, M. 2008. Mengenal Diabetes Melitus. Kata Hati. Yogyakarta.

[NUG11] Nugraheni, Y. 2011. Data Mining dengan Metode Fuzzy Untuk Customer Relationship Management (CRM) pada Perusahaan Retail. Universitas Udayana.

[NUR11] Nurjayanti B. 2011. Identifikasi shorea menggunakan K-Nearest Neighbor berdasarkan karakteristik morfologi daun. Skripsi. Institut Pertanian Bogor.

[ONG13] Ong, J. O. 2013. Implementasi Algoritma K-Means Clustering Untuk Menentukan Strategi Marketing President University(12):10-20.

[OSC13] Oscar Ong, J. 2013. Implementasi Algoritma k-means Clustering untuk Menentukan Strategi Marketing President University. Jurnal Ilmiah Teknik Industri. vol. 12, no. 1, pp. 10-13.

[PAU12] Paulanda, Z. 2012. Model Profil Mahasiswa Yang Potensisal Drop Out Menggunakan Teknik Kernel-K-Mean Clustering Dan Decision Tree. Tesis. Universitas Sumatera Utara. 2013.

[RIS08] Rismawan, T & Kusumadewi, S. 2008. Aplikasi K-Means Untuk Pengelompokkan Mahasiswa Berdasarkan Nilai Body Mass Index (BMI) & Ukuran Kerangka, SNATI. Yogyakarta.

[SAN07] Santosa, B. 2007. Data Mining : Teknik Pemanfaatan Data untuk Keperluan Bisnis, Teori dan Aplikasi. Graha Ilmu. Yogyakarta.

[SOE04] Soegondo, S, dkk, 2004. Penatalaksanaan Diabetes Mellitus Terpadu. FKUI. Jakarta.

[SOR11] Soraya, Y. 2011. Perbandingan Kinerja Metode Single Linkage, Metode Complete Linkage dan Metode K-Means dalam Analisis Cluster. Universitas


[UTA10] Utami, D. D. P & Sutikno. 2010. Pengelompokan Zona Musim (ZOM) Dengan Fuzzy K-Means Clustering.

[VER09] Vercellis, Carlo. 2009. Business Intelligence: Data Mining and Optimization for Decision Making. United Kingdom: John Wiley & Sons Ltd.

[ZAR13] Zarlis, M., Sitompul, O.S., Sawaluddin, Effendi, S., Sihombing, P. & Nababan, E.B.2013. Pedoman Penulisan Tesis. FasilkomTI. Universitas Sumatera Utara.

[MIR14] Mirkes EM. 2011. KNN and potential energy.

http://www.math.le.ac.uk/people/ag153/homepage/KNN/KNN3.html. (6 September 2014).


CHAPTER 3

SYSTEM ANALYSIS AND DESIGN

3.1 Training Data Collection

The PIMA Indians Dataset was obtained from the National Institute of Diabetes and Digestive and Kidney Diseases. It was first used by Smith, J. W., Everhart, J. E., Dickson, W. C., Knowler, W. C., & Johannes, R. S. in 1988 in a study that predicted whether a sample showed indications of diabetes mellitus, that is, a study forecasting diabetes mellitus in a population in Phoenix, Arizona, USA. The dataset consists of 9 columns (8 attributes and 1 class), so this study needs several pre-processing steps to turn the raw data into data that is ready for training. The steps are as follows:

1. Design the input and output data that will be used as research data.

2. Split the research data into two parts: training data and testing data. The training data is used to observe the pattern recognition (memorization) process, while the testing data is used to observe how well the algorithms recognize patterns in samples that K-Nearest Neighbor and K-Means have not learned before.

The PIMA Indians dataset contains several data types (integer, float, numeric, and Boolean), so each column has its own characteristics, such as its mean, its distribution, and its maximum and minimum values. Knowing the characteristics of each parameter helps in processing the input data, because it allows the samples to be filtered into those that are suitable for processing and those that should be removed. Table 3.1 lists the characteristics of each column in the PIMA Indians dataset:

Table 3.1. Characteristics of each column in the PIMA Indians dataset

No   Attribute                                                  Data type   Min      Mean    Max    Standard deviation
1    Number of pregnancies (preg)                               Integer     0        3.8     17     3.4
2    Plasma glucose concentration at 2 hours (plas)             Integer     0        120.9   199    32.0
3    Diastolic blood pressure (mm Hg) (pres)                    Integer     0        69.1    122    19.4
4    Triceps skin fold thickness (mm) (skin)                    Integer     0        20.5    99     16.0
5    2-hour serum insulin (mu U/ml) (insu)                      Integer     0        79.8    846    115.2
6    Body mass index (weight in kg / (height in m)^2) (mass)    Integer     0        32.0    67.1   7.9
7    Diabetes pedigree function (pedi)                          Integer     0.0780   0.5     2.42   0.3
8    Age (years) (age)                                          Integer     21       33.2    81     11.8
9    Class (0 or 1)                                             Boolean     0        0.34    1      -

In this study, the attribute names of the PIMA Indians dataset are replaced by the following variables:

1. Number of pregnancies (preg) becomes x₁
2. Plasma glucose concentration at 2 hours (plas) becomes x₂
3. Diastolic blood pressure (mm Hg) (pres) becomes x₃
4. Triceps skin fold thickness (mm) (skin) becomes x₄
5. 2-hour serum insulin (insu) becomes x₅
6. Body mass index (mass) becomes x₆
7. Diabetes pedigree function (pedi) becomes x₇
8. Age (years) (age) becomes x₈

3.2. Training Process of the K-Means Clustering Algorithm

The K-Means Clustering algorithm has two main stages: computing the mean of the samples and shifting the centroids toward the majority of the training samples. This chapter describes each process in the K-Means Clustering algorithm.


3.2.1. Determining the number of clusters and the centroid values

The number of clusters and centroids usually depends on the problem being solved. Because this study deals with diabetes mellitus, only two centroids are needed to represent two clusters: positive (has diabetes mellitus) and negative (does not have diabetes mellitus). These also correspond to the targets in the training data: 0 (negative) and 1 (positive).

After the number of clusters is fixed, the next step is to determine the centroid values. The centroid values are chosen at random to avoid a bottleneck caused by centroids that are too close together or too far apart, which would prevent the centroids from being moved in the next step or make that step take too long. The randomly generated centroid values used in this study are shown in Table 3.2.

Table 3.2. Centroid values used

Centroid     x₁   x₂    x₃   x₄   x₅    x₆   x₇     x₈   Target
Centroid 1   1    87    78   27   32    34   0.1    22   0
Centroid 2   5    187   76   27   207   43   1.03   53   1

Some of the samples used in the calculations in this study are shown in Table 3.3.

Table 3.3. Sample values

Sample       x₁   x₂    x₃   x₄   x₅    x₆      x₇      x₈   Target
Sample 1     6    148   72   35   0     33.6    0.627   50   1
Sample 2     1    85    66   29   0     26.60   0.351   31   0
Sample 3     8    183   64   0    0     23.30   0.672   32   1
Sample 4     1    89    66   23   94    28.10   0.167   21   0
Sample 5     0    137   40   35   168   43.10   2.288   33   1
Sample ...   ...  ...   ...  ...  ...   ...     ...     ...  ...
Sample 450   0    120   74   18   63    30.5    0.285   26   0
Sample 451   1    82    64   13   95    21.20   0.415   23   0
Sample 452   2    134   70   0    0     28.90   0.542   23   1
Sample 453   0    91    68   32   210   39.90   0.381   25   0
Sample ...   ...  ...   ...  ...  ...   ...     ...     ...  ...
Sample 698   0    99    0    0    0     25      0.253   22   0
Sample 699   4    127   88   11   155   34.50   0.598   28   0
Sample 700   4    118   70   0    0     44.5    0.904   26   0

3.2.2. Determining the Euclidean distance

The second step in training the K-Means Clustering algorithm is to determine the Euclidean distance for each training sample. K-Means Clustering can use several distance functions, such as Hamming distance and Manhattan distance. In this study the author chose Euclidean distance on the grounds that it is simple to compute and has a short running time while still giving reasonably accurate results compared with other distance functions such as Hamming distance and Manhattan distance.

With Euclidean distance, the value obtained is the distance between a training sample and each of the centroids. In K-Means Clustering, the smallest Euclidean distance indicates that the sample belongs to the nearest centroid, which is the centroid that is subsequently shifted. The rest of this section explains how the distance function is computed in K-Means Clustering with Euclidean distance as the distance function.


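The distances between the samples in Table 3.3 and the centroids in Table 3.2 are computed as shown below. The reported values are the sums of squared attribute differences, that is, the Euclidean distance before the square root is taken. Omitting the root does not change which centroid is nearest.

a. Distance from sample 1 to the center of cluster 1:

\[ d(x_{1}, c_{1}) = (6-1)^2 + (148-87)^2 + (72-78)^2 + (35-27)^2 + (0-32)^2 + (33.6-34)^2 + (0.627-0.1)^2 + (50-22)^2 = 5654.43 \]

b. Distance from sample 1 to the center of cluster 2:

\[ d(x_{1}, c_{2}) = (6-5)^2 + (148-187)^2 + (72-76)^2 + (35-27)^2 + (0-207)^2 + (33.6-43)^2 + (0.627-1.03)^2 + (50-53)^2 = 44548.52 \]

c. Distance from sample 450 to the center of cluster 1:

\[ d(x_{450}, c_{1}) = (0-1)^2 + (120-87)^2 + (74-78)^2 + (18-27)^2 + (63-32)^2 + (30.5-34)^2 + (0.285-0.1)^2 + (26-22)^2 = 2176.284 \]

d. Distance from sample 450 to the center of cluster 2:

\[ d(x_{450}, c_{2}) = (0-5)^2 + (120-187)^2 + (74-76)^2 + (18-27)^2 + (63-207)^2 + (30.5-43)^2 + (0.285-1.03)^2 + (26-53)^2 = 26220.81 \]

e. Distance from sample 700 to the center of cluster 1:

\[ d(x_{700}, c_{1}) = (4-1)^2 + (118-87)^2 + (70-78)^2 + (0-27)^2 + (0-32)^2 + (44.5-34)^2 + (0.904-0.1)^2 + (26-22)^2 = 2913.896 \]

f. Distance from sample 700 to the center of cluster 2:

\[ d(x_{700}, c_{2}) = (4-5)^2 + (118-187)^2 + (70-76)^2 + (0-27)^2 + (0-207)^2 + (44.5-43)^2 + (0.904-1.03)^2 + (26-53)^2 = 49107.27 \]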
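The six computations above can be checked with a short Python sketch. It is only an illustration, not the thesis program: the names `CENTROIDS`, `SAMPLES`, and `squared_distance` are assumptions introduced here, and the function returns the sum of squared differences because that is the quantity the reported values correspond to.

```python
# Centroids from Table 3.2 and three samples from Table 3.3.
# All names in this sketch are hypothetical; they do not come from the thesis code.
CENTROIDS = {
    1: [1, 87, 78, 27, 32, 34, 0.1, 22],
    2: [5, 187, 76, 27, 207, 43, 1.03, 53],
}
SAMPLES = {
    1: [6, 148, 72, 35, 0, 33.6, 0.627, 50],
    450: [0, 120, 74, 18, 63, 30.5, 0.285, 26],
    700: [4, 118, 70, 0, 0, 44.5, 0.904, 26],
}


def squared_distance(x, c):
    """Sum of squared attribute differences (Euclidean distance without the root)."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


for sample_id, x in SAMPLES.items():
    for centroid_id, c in CENTROIDS.items():
        print(sample_id, centroid_id, round(squared_distance(x, c), 3))
```

Taking `math.sqrt` of each result would give the Euclidean distance proper; the ordering of the two centroids for each sample stays the same either way.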

Table 3.4. Distance of each sample to the centroids

Sample       Distance to centroid 1   Distance to centroid 2
Sample 1     5654.43                  44548.52
Sample 2     1311.823                 54126.42
Sample 3     11428.82                 44576.22
Sample 4     4043.814                 23751.75
Sample 5     22713.6                  5807.593
Sample ...   ...                      ...
Sample 450   2176.284                 26220.81
Sample 451   4550.939                 25300.62
Sample 452   4054.205                 47531.05
Sample 453   31869.89                 10133.03
Sample ...   ...                      ...
Sample 698   8063.023                 58408.6
Sample 699   17130.5                  7402.437
Sample 700   2913.896                 49107.27

For each sample x_n, the smaller of the two values marks the nearest centroid.

Table 3.4 shows that samples 5, 453, and 699 have their smallest distance to centroid 2, while samples 1, 2, 3, 4, 450, 451, 452, 698, and 700 have their smallest distance to centroid 1.
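The assignment step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it uses only the twelve samples shown in Table 3.3, and the names `nearest_centroid`, `assignment`, and `clusters` are hypothetical.

```python
# Centroids from Table 3.2 (variable names are hypothetical).
CENTROIDS = {
    1: [1, 87, 78, 27, 32, 34, 0.1, 22],
    2: [5, 187, 76, 27, 207, 43, 1.03, 53],
}
# The twelve samples shown in Table 3.3, keyed by sample number.
SAMPLES = {
    1: [6, 148, 72, 35, 0, 33.6, 0.627, 50],
    2: [1, 85, 66, 29, 0, 26.6, 0.351, 31],
    3: [8, 183, 64, 0, 0, 23.3, 0.672, 32],
    4: [1, 89, 66, 23, 94, 28.1, 0.167, 21],
    5: [0, 137, 40, 35, 168, 43.1, 2.288, 33],
    450: [0, 120, 74, 18, 63, 30.5, 0.285, 26],
    451: [1, 82, 64, 13, 95, 21.2, 0.415, 23],
    452: [2, 134, 70, 0, 0, 28.9, 0.542, 23],
    453: [0, 91, 68, 32, 210, 39.9, 0.381, 25],
    698: [0, 99, 0, 0, 0, 25, 0.253, 22],
    699: [4, 127, 88, 11, 155, 34.5, 0.598, 28],
    700: [4, 118, 70, 0, 0, 44.5, 0.904, 26],
}


def squared_distance(x, c):
    """Sum of squared attribute differences between a sample and a centroid."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def nearest_centroid(x, centroids):
    """Return the id of the centroid with the smallest distance to x."""
    return min(centroids, key=lambda cid: squared_distance(x, centroids[cid]))


# Map each sample to its nearest centroid, then group sample ids per centroid.
assignment = {sid: nearest_centroid(x, CENTROIDS) for sid, x in SAMPLES.items()}
clusters = {
    cid: sorted(sid for sid, assigned in assignment.items() if assigned == cid)
    for cid in CENTROIDS
}
print(clusters)
```

The grouping it produces can be compared directly with the membership stated in the text for Table 3.4.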

3.2.4. Centroid update process

The next process in K-Means Clustering is the centroid update. This process is iterative, so it is repeated based on the result of the previous pass. The update starts by computing the mean of the samples assigned to each centroid. Each centroid is then moved to the mean of all samples assigned to it. The centroid update is computed as follows:


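Let s_n denote the attribute vector of sample n in Table 3.3. Using the samples listed in Table 3.3 and the assignment in Table 3.4, the sum of the nine samples assigned to centroid 1 is

\[ s_{1} + s_{2} + s_{3} + s_{4} + s_{450} + s_{451} + s_{452} + s_{698} + s_{700} = [23 \;\; 1058 \;\; 546 \;\; 118 \;\; 252 \;\; 261.7 \;\; 4.216 \;\; 254] \]

\[ c_{1} = \tfrac{1}{9}\,[23 \;\; 1058 \;\; 546 \;\; 118 \;\; 252 \;\; 261.7 \;\; 4.216 \;\; 254] = [2.556 \;\; 117.556 \;\; 60.667 \;\; 13.111 \;\; 28 \;\; 29.078 \;\; 0.468 \;\; 28.222] \]

and the sum of the three samples assigned to centroid 2 is

\[ s_{5} + s_{453} + s_{699} = [4 \;\; 355 \;\; 196 \;\; 78 \;\; 533 \;\; 117.5 \;\; 3.267 \;\; 86] \]

\[ c_{2} = \tfrac{1}{3}\,[4 \;\; 355 \;\; 196 \;\; 78 \;\; 533 \;\; 117.5 \;\; 3.267 \;\; 86] = [1.333 \;\; 118.333 \;\; 65.333 \;\; 26 \;\; 177.667 \;\; 39.167 \;\; 1.089 \;\; 28.667] \]

Based on these calculations, the new centroid positions are shown in Table 3.5:

Table 3.5. Result of the centroid shift

                   Centroid 1                                               Centroid 2
Initial centroid   [1 87 78 27 32 34 0.1 22]                                [5 187 76 27 207 43 1.03 53]
New centroid       [2.556 117.556 60.667 13.111 28 29.078 0.468 28.222]     [1.333 118.333 65.333 26 177.667 39.167 1.089 28.667]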
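The centroid update can be sketched in Python as a column-wise mean over the samples assigned to each centroid. This is a minimal sketch under an explicit assumption: only the twelve samples shown in Table 3.3 are treated as cluster members, with the membership stated for Table 3.4. The names `CLUSTERS`, `mean_vector`, and `new_centroids` are hypothetical.

```python
# The twelve samples shown in Table 3.3, keyed by sample number.
SAMPLES = {
    1: [6, 148, 72, 35, 0, 33.6, 0.627, 50],
    2: [1, 85, 66, 29, 0, 26.6, 0.351, 31],
    3: [8, 183, 64, 0, 0, 23.3, 0.672, 32],
    4: [1, 89, 66, 23, 94, 28.1, 0.167, 21],
    5: [0, 137, 40, 35, 168, 43.1, 2.288, 33],
    450: [0, 120, 74, 18, 63, 30.5, 0.285, 26],
    451: [1, 82, 64, 13, 95, 21.2, 0.415, 23],
    452: [2, 134, 70, 0, 0, 28.9, 0.542, 23],
    453: [0, 91, 68, 32, 210, 39.9, 0.381, 25],
    698: [0, 99, 0, 0, 0, 25, 0.253, 22],
    699: [4, 127, 88, 11, 155, 34.5, 0.598, 28],
    700: [4, 118, 70, 0, 0, 44.5, 0.904, 26],
}
# Cluster membership as stated in the text following Table 3.4.
CLUSTERS = {
    1: [1, 2, 3, 4, 450, 451, 452, 698, 700],
    2: [5, 453, 699],
}


def mean_vector(vectors):
    """Column-wise mean of a list of equal-length attribute vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Each new centroid is the mean of the samples assigned to it.
new_centroids = {
    cid: mean_vector([SAMPLES[sid] for sid in members])
    for cid, members in CLUSTERS.items()
}
for cid, centroid in new_centroids.items():
    print(cid, [round(value, 3) for value in centroid])
```

In a full run the mean would be taken over all 700 samples, and the assignment and update steps would repeat until the assignments stop changing.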

3.3. Training process of the K-Nearest Neighbor algorithm

In practice, the training process of K-Nearest Neighbor consists of only 5 processes. It begins with a distance computation that uses Euclidean distance to measure the similarity between the training samples and the testing sample, and it ends with a grouping step that computes and applies a threshold.

3.3.1. Computing the similarity function in the K-Nearest Neighbor algorithm

The K-Nearest Neighbor algorithm uses the value of the similarity function as the basis for clustering. If a sample is similar to another sample, the two are likely to share the same target or to come from the same group. In this study the similarity function is computed with a radial basis function, because it is fairly simple to compute on a dataset whose attributes are mostly integers and gives reasonably accurate similarity values compared with other similarity functions such as Hamming distance and Manhattan distance. The following shows how the radial basis similarity function is computed with Euclidean distance for several samples:

With Euclidean distance, the value obtained is the distance between the testing sample and each training sample. In K-Nearest Neighbor, the smallest Euclidean distance indicates that the testing sample belongs with the nearest training sample. The rest of this section explains how the distance function is computed in K-Nearest Neighbor with Euclidean distance as the distance function.

The Euclidean distance computation in the K-Nearest Neighbor algorithm uses two sets of data:

- Testing data

The testing data used is shown in Table 3.6.

Table 3.6. Testing data used

Testing sample   x₁   x₂   x₃   x₄   x₅   x₆     x₇    x₈   Target
Sample           1    87   78   27   32   34.6   0.1   22   ?

- Training data

The training data used is shown in Table 3.7.

Table 3.7. Training data

Sample       x₁   x₂    x₃    x₄   x₅    x₆      x₇      x₈   Target
Sample 1     6    148   72    35   0     33.6    0.627   50   1
Sample 2     1    85    66    29   0     26.60   0.351   31   0
Sample 3     8    183   64    0    0     23.30   0.672   32   1
Sample 4     1    89    66    23   94    28.10   0.167   21   0
Sample 5     0    137   40    35   168   43.10   2.288   33   1
Sample 44    9    171   110   24   240   45.5    0.74    54   1
Sample 107   1    96    122   0    0     22.4    0.207   27   0
Sample 441   0    189   104   25   0     34.3    0.435   41   1
Sample 550   4    189   110   31   0     28.5    0.68    37   0
Sample 663   8    167   106   46   231   37.6    0.165   43   1
Sample 692   13   158   114   0    0     42.3    0.257   44   1

The Euclidean distance between the training samples in Table 3.7 and the testing data in Table 3.6 is computed as follows, using K=5:


The Euclidean distances for these samples are shown in Table 3.8:

Table 3.8. Euclidean distance for part of the training data

Sample       Euclidean distance
Sample 1     75.201
Sample 2     36.346
Sample 3     106.967
Sample 4     63.649
Sample 5     150.67
Sample 44    229.25
Sample 107   62.8
Sample 441   111.67
Sample 550   112.87
Sample 663   218.27
Sample 692   93.68
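The distances in Table 3.8 can be reproduced with a short Python sketch. It is only an illustration: it covers the first five training samples from Table 3.7, and the names `TEST_SAMPLE`, `TRAINING`, and `euclidean` are assumptions introduced here.

```python
import math

# Testing sample from Table 3.6 and the first five training samples from Table 3.7.
# All names in this sketch are hypothetical.
TEST_SAMPLE = [1, 87, 78, 27, 32, 34.6, 0.1, 22]
TRAINING = {
    1: [6, 148, 72, 35, 0, 33.6, 0.627, 50],
    2: [1, 85, 66, 29, 0, 26.6, 0.351, 31],
    3: [8, 183, 64, 0, 0, 23.3, 0.672, 32],
    4: [1, 89, 66, 23, 94, 28.1, 0.167, 21],
    5: [0, 137, 40, 35, 168, 43.1, 2.288, 33],
}


def euclidean(a, b):
    """Euclidean distance: square root of the sum of squared differences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


# Distance from every training sample to the single testing sample.
distances = {sid: euclidean(x, TEST_SAMPLE) for sid, x in TRAINING.items()}
for sid, value in distances.items():
    print(sid, round(value, 3))
```

Unlike the K-Means values in Table 3.4, the values in Table 3.8 include the square root.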

The samples are then ranked by Euclidean distance from Table 3.8, smallest first, so that the K=5 nearest samples can be identified, as shown in Table 3.9.

Table 3.9. Samples ranked by smallest Euclidean distance

Sample       Euclidean distance   Rank (nearest first)
Sample 1     75.201               4
Sample 2     36.346               1
Sample 3     106.967              6
Sample 4     63.649               3
Sample 5     150.67               9
Sample 44    229.25               11
Sample 107   62.8                 2
Sample 441   111.67               7
Sample 550   112.87               8
Sample 663   218.27               10
Sample 692   93.68                5

From the ranking in Table 3.9, the class labels of the nearest neighbors are then collected, as shown in Table 3.10. A 1 in the KNN column marks a sample that is among the 5 nearest.

Table 3.10. Class label Y

Sample       Euclidean distance   Rank (nearest first)   Target   KNN
Sample 1     75.201               4                      1        1
Sample 2     36.346               1                      0        1
Sample 3     106.967              6                      1
Sample 4     63.649               3                      0        1
Sample 5     150.67               9                      1
Sample 44    229.25               11                     1
Sample 107   62.8                 2                      0        1
Sample 441   111.67               7                      1
Sample 550   112.87               8                      0
Sample 663   218.27               10                     1
Sample 692   93.68                5                      1        1

From the class labels collected in Table 3.10, the majority category is then determined, as shown in Table 3.11.

Table 3.11. Final majority category result

Sample        Euclidean distance   Rank   Target   KNN
Sample 1       75.201               4     1        1
Sample 2       36.346               1     0        1
Sample 3      106.967               6     1
Sample 4       63.649               3     0        1
Sample 5      150.67                9     1
Sample 44     229.25               11     1
Sample 107     62.8                 2     0        1
Sample 441    111.67                7     1
Sample 550    112.87                8     0
Sample 663    218.27               10     1
Sample 692     93.68                5     1        1

As Table 3.11 shows, there are 11 training records. For a given test record, the decision is taken from its 5 nearest samples. Samples 1, 2, 4, 107, and 692 are closer to the test record than the other samples. Three of these five (samples 2, 4, and 107) are negative and two (samples 1 and 692) are positive, so the majority is negative. The test record is therefore placed in the same cluster as sample 2.
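The majority vote over the five nearest samples can be sketched in Python as below. The target values are taken from Table 3.11 (1 = positive, 0 = negative). The helper `majority_label` is a hypothetical name, and this sketch does not treat ties specially.

```python
from collections import Counter

# Target labels of the five nearest samples in Table 3.11
# (1 = tested positive, 0 = tested negative).
nearest_targets = {
    "Sample 2": 0,
    "Sample 107": 0,
    "Sample 4": 0,
    "Sample 1": 1,
    "Sample 692": 1,
}


def majority_label(labels):
    """Return the most common label; ties are not handled specially here."""
    return Counter(labels).most_common(1)[0][0]


predicted = majority_label(nearest_targets.values())
print("positive" if predicted == 1 else "negative")
```

Choosing an odd K, as is done here with K = 5, avoids ties when there are only two classes.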


COMPARATIVE ANALYSIS OF THE CLUSTERING PROCESS USING K-MEANS CLUSTERING AND K-NEAREST NEIGHBOR FOR DIABETES MELLITUS

ABSTRACT

Classification is one of the roles of data mining. Many algorithms can be used to turn input into the desired output, so the performance of each algorithm must be considered. The purpose of this study is to analyze and compare the performance of K-Nearest Neighbor and K-Means Clustering in terms of accuracy and running time. The data set comes from the UCI Machine Learning Repository, namely the PIMA Indians Diabetes Dataset. The comparative analysis shows that K-Means Clustering, with an accuracy of 67.143%, is more accurate than K-Nearest Neighbor, with an accuracy of 64.286%, in the testing process on this data set. K-Nearest Neighbor, however, is considerably faster: its testing time is 0.2492 seconds, compared with 12.1285 seconds for K-Means Clustering.

Keywords: classification, dataset, K-Means Clustering, K-Nearest Neighbor, running time, accuracy.


TABLE OF CONTENTS

Approval
Declaration
Acknowledgements
Abstract (Indonesian)
Abstract (English)
Table of Contents
List of Tables
List of Figures
List of Appendices

CHAPTER 1 INTRODUCTION
1.1 Background
1.2 Problem Statement
1.3 Scope of the Problem
1.4 Research Objectives
1.5 Research Benefits
1.6 Research Methodology
1.7 Organization of the Thesis

CHAPTER 2 LITERATURE REVIEW
2.1 Data Mining
2.2 The Data Mining Process
2.3 Data Clustering
2.4 Clustering
2.4.1 K-Means Clustering
2.4.1.1 The K-Means Clustering Algorithm
2.4.2 k-Nearest Neighbor
2.4.2.1 The k-Nearest Neighbor Algorithm
2.5 Euclidean Distance
2.6 Centroids
2.7 Dataset
2.8 Diabetes Mellitus
2.8.1 Definition of Diabetes Mellitus
2.8.2 Determinants of Diabetes Mellitus

CHAPTER 3 SYSTEM ANALYSIS AND DESIGN
3.1 Training Data Collection
3.2 Training Process of the K-Means Clustering Algorithm
3.3 Training Process of the k-Nearest Neighbor Algorithm
3.3.1 Computing the Similarity Function in the k-Nearest Neighbor Algorithm
3.4 Table Structure
3.5 System Design
3.5.1 Context Diagram
3.5.2 Data Flow Diagram
3.5.3 Data Flow Diagram Level 2
3.5.4 Entity Relationship Diagram
3.6 Flowchart
3.7 User Interface Design
3.7.1 Admin Interface Design: Data Input
3.7.2 Admin Interface Design: Output

CHAPTER 4 IMPLEMENTATION AND TESTING
4.1 Definition of System Implementation
4.2 Main Components of System Implementation
4.2.1 Hardware
4.2.2 Software
4.2.3 Human Element (Brainware)
4.3 Program Screens
4.3.1 Data Import Screen
4.3.2 Diabetes Sample Data Page
4.3.3 Attribute Data Input Page
4.3.4 Attribute Data Page
4.3.5 Random Centroid Value Input Page
4.3.6 K-Means Clustering Results Page
4.3.6.1 K-Means Clustering Results Page with Negative and Positive Diabetes Results
4.3.7 New Data and Limit Value Input Page
4.3.8 KNN Clustering Results Page
4.3.8.1 KNN Clustering Results Page with Negative and Positive Diabetes Results

CHAPTER 5 CONCLUSIONS AND SUGGESTIONS
5.1 Conclusions
5.2 Suggestions


LIST OF TABLES

Table 3.1 Characteristics of each column in the PIMA Indians dataset
Table 3.2 Centroid values to be used
Table 3.3 Sample values
Table 3.4 Euclidean distance of the samples to the centroids
Table 3.5 Centroid shift results
Table 3.6 Test data values to be used
Table 3.7 Test data values
Table 3.8 Similarity function values for the samples
Table 3.9 Objects ordered by smallest Euclidean distance
Table 3.10 Class labels Y
Table 3.11 Final majority category result
Table 3.12 The akumulasi_jarak table
Table 3.13 The akumulasi_jarak2 table
Table 3.14 The akumulasi_knn table
Table 3.15 The atribut table
Table 3.16 The iterasi table
Table 3.17 The jarak table
Table 3.18 The jarak_knn table
Table 3.19 The nilai_acak table
Table 3.20 The nilai_acak2 table
Table 3.21 The sampel table


LIST OF FIGURES

Figure 2.1 KDD stages in data mining
Figure 2.2 Flowchart of the k-Means Clustering algorithm
Figure 2.3 Illustration of patient case proximity
Figure 2.4 Flowchart of the k-nearest neighbor algorithm
Figure 3.1 Context diagram
Figure 3.2 Data flow diagram
Figure 3.3 DFD Level 2: attribute data processing
Figure 3.4 DFD Level 2: sample clustering process with KNN
Figure 3.5 DFD Level 2: clustering process with K-Means
Figure 3.6 Entity relationship diagram
Figure 3.7 Menu flowchart
Figure 3.8 K-Means flowchart
Figure 3.9 KNN flowchart
Figure 3.10 Diabetes sample data import
Figure 3.11 Design of the new data and limit value input
Figure 3.12 Design of the centroid 1 and centroid 2 data input
Figure 3.13 Design of the attribute form
Figure 3.14 Design of the diabetes sample output
Figure 3.15 Design of the attribute data output
Figure 3.16 Design of the KNN screen
Figure 3.17 Design of the KNN clustering results screen (positive and negative)
Figure 3.18 Design of the K-Means screen
Figure 3.19 Design of the K-Means clustering results screen (positive and negative)
Figure 4.1 Data import screen
Figure 4.2 Diabetes sample data page
Figure 4.3 Attribute data input
Figure 4.4 Weight data input
Figure 4.5 Random centroid value input page
Figure 4.6 K-Means clustering results page
Figure 4.7 K-Means clustering results page with negative and positive diabetes results
Figure 4.8 New data and limit value input page
Figure 4.9 KNN clustering results page
Figure 4.10 KNN clustering results page with positive and negative diabetes results


LIST OF APPENDICES

A Dataset Table
B Program Listing
