PDSI

Bulletin of Informatics and Data Science

Stroke is a disease with high mortality and disability rates that requires early detection. The main challenges in classifying this disease are class imbalance and the large number of irrelevant features in the dataset. This study proposes combining a Support Vector Machine (SVM) with Information Gain feature selection and data balancing via the Synthetic Minority Over-sampling Technique (SMOTE) to improve classification accuracy. The dataset consists of 5,110 records with 10 variables and 1 label. Feature selection was performed with three threshold values (0.04, 0.01, and 0.0005), and SVM classification was tested with three kernels: Linear, RBF, and Polynomial. The data were split into training and test sets using k-fold cross-validation with k=10, and the models were evaluated using a confusion matrix. The best results were obtained with the RBF kernel, with parameters Cost=100 and Gamma=5, at an Information Gain threshold of 0.0005, reaching an accuracy of 90.51%. These results indicate that the combination of techniques can identify the variables that most affect SVM classification in detecting stroke.
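The threshold-based feature selection described above can be illustrated with a minimal sketch. It assumes discrete feature values, a toy made-up dataset, and a "keep if gain >= threshold" rule; the function names, the column names, and the data are hypothetical, and the study's actual implementation (including how continuous variables were discretized) is not stated in the source.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - conditional


def select_features(columns, labels, threshold):
    """Return the gain of each feature whose information gain is at least `threshold` (assumed rule)."""
    gains = {name: information_gain(values, labels) for name, values in columns.items()}
    return {name: gain for name, gain in gains.items() if gain >= threshold}


# Toy, made-up data for illustration only (not taken from the study's dataset).
labels = [1, 1, 0, 0, 0, 0, 0, 0]
columns = {
    "hypertension": [1, 1, 0, 0, 0, 0, 1, 0],  # partly informative about the label
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],         # label distribution is identical in both groups
}

selected = select_features(columns, labels, threshold=0.04)
print(sorted(selected))
```

Lowering the threshold (for example to 0.0005, the smallest value tested in the study) keeps more features, because fewer of them fall below the cut-off.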

The study showed that combining Information Gain for feature selection with SMOTE for data balancing improved the performance of the Support Vector Machine (SVM) model in stroke classification. The RBF kernel with Cost = 100 and Gamma = 5 gave the best performance, with the highest accuracy of 90.51%. The Polynomial kernel also performed reasonably well (83.04%), while the Linear kernel gave lower results. Thus, proper feature selection, data balancing, and parameter tuning are essential for building an accurate and effective stroke detection model.
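To make the role of Gamma concrete, here is a minimal sketch of the RBF kernel under the common libsvm-style parameterization, K(x, y) = exp(-gamma * ||x - y||^2). This form is an assumption, since the source does not name the library or state the formula, and the input vectors below are made up. The Cost parameter, which penalizes margin violations during training, is not illustrated here.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Made-up, already-scaled feature vectors for illustration only.
reference = [0.2, 0.5]
same = rbf_kernel(reference, [0.2, 0.5], gamma=5)  # identical points
near = rbf_kernel(reference, [0.3, 0.5], gamma=5)  # small distance
far = rbf_kernel(reference, [1.0, 0.5], gamma=5)   # larger distance

print(same, near, far)
```

With a larger Gamma, similarity falls off faster with distance, so each training point influences a smaller neighborhood and the decision boundary can become more flexible.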

Based on the background, methods, results, limitations, and suggestions for further research, several directions for future work can be developed. First, future research could explore ensemble learning methods, such as Random Forest or Gradient Boosting, to combine the strengths of multiple models and improve stroke prediction accuracy. Second, research could focus on developing models that integrate data from multiple sources, such as clinical, genetic, and lifestyle data, to give a more comprehensive picture of stroke risk. Third, research could investigate deep learning techniques, such as convolutional neural networks (CNN) or recurrent neural networks (RNN), to automatically extract relevant features from medical data such as MRI images or electronic health records, thereby improving the accuracy and efficiency of stroke detection. Combining these three directions is expected to yield stroke prediction models that are more accurate, comprehensive, and efficient, which could ultimately support more effective stroke prevention and treatment.
