Optimization of the Support Vector Machine (SVM) Algorithm Using Gain Ratio Feature Selection for Sentiment Analysis

Mochamad Amzah Yamin, Kusnadi Kusnadi, Luhur Bayuaji

Abstract


Easy internet access has driven growth in the number of social media users in Indonesia, and X (formerly Twitter) is among the most widely used platforms. Users frequently post opinions that trigger debate and discussion, which makes the platform a useful source for studying trending public sentiment. Support Vector Machine (SVM) is a common choice of algorithm for sentiment analysis. However, SVM accuracy suffers when the dataset contains a large number of similar words: text data for sentiment analysis is typically high-dimensional, so feature selection is needed to improve SVM performance. This research aims to improve SVM accuracy using Gain Ratio feature selection. The object of the research is a dataset on the 2017 DKI Jakarta election obtained from GitHub. The results show that Gain Ratio feature selection increases SVM accuracy. With a gain ratio weight threshold of > 0.0001 (1732 features), accuracy rises from 61.63% to 71.51%, a gain of 9.88 percentage points. With a threshold of > 0.002 (518 features), accuracy rises from 61.63% to 62.79%, a gain of 1.16 percentage points. Gain Ratio also produced better accuracy than the feature selection method it was compared against: 56.40% for that method versus 71.51% for Gain Ratio at weights > 0.0001. These findings indicate that Gain Ratio feature selection can improve the accuracy of SVM in sentiment analysis. Social media practitioners can use this technique to gain more accurate insights from user data. Further research can focus on developing sentiment analysis algorithms with more sophisticated feature selection techniques for applications across social media platforms.
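
The thresholding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how terms were weighted before scoring, so the binary term-presence encoding, the function names (`entropy`, `gain_ratio`, `select_features`), and the four toy documents are all assumptions made for illustration. Gain ratio is computed here in its standard form, as information gain divided by split information.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Gain ratio of one discrete feature: information gain / split information."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / total * entropy(subset)
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A feature with a single value carries no split information; score it 0.
    return info_gain / split_info if split_info > 0 else 0.0


def select_features(docs, labels, threshold):
    """Keep terms whose gain ratio (on binary term presence) exceeds the threshold."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for tokens in tokenized for tok in tokens})
    selected = {}
    for term in vocab:
        presence = [int(term in tokens) for tokens in tokenized]  # assumed encoding
        score = gain_ratio(presence, labels)
        if score > threshold:
            selected[term] = score
    return selected


# Hypothetical toy data, not taken from the paper's dataset.
docs = [
    "bagus sekali calon ini",
    "buruk sekali calon ini",
    "calon ini bagus",
    "calon ini buruk",
]
labels = ["pos", "neg", "pos", "neg"]

selected = select_features(docs, labels, threshold=0.0001)
print(selected)
```

In this toy run, only terms that separate the two classes pass the 0.0001 threshold, while terms that appear in every document or equally often in both classes score zero. The retained terms would then form the reduced feature set on which the SVM classifier is trained.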






DOI: https://doi.org/10.35314/isi.v9i1.4197





Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.



Copyright of Jurnal Inovtek Polbeng - Seri Informatika (ISSN: 2527-9866)


