Toxic Comment Detection Using BERT (Deteksi Komentar Toxic Menggunakan BERT)

Authors

  • Rama Nittia Khadifa, Universitas Pamulang
  • Robi Farhan Anshori, Universitas Pamulang
  • Humaidah, Universitas Pamulang
  • Abdul Ridho Ramadhan, Universitas Pamulang
  • Perani Rosyani, Universitas Pamulang

Keywords:

Toxic Comment Detection, DistilBERT, Multi-label Classification, Natural Language Processing, Automated Content Moderation, Transfer Learning, Bias in AI

Abstract

The rapid growth of social media has been accompanied by a rise in toxic comments, which can damage the digital environment and users' mental health. This study develops a multi-label toxic comment classification model based on the DistilBERT architecture, modified with a custom classification head to improve accuracy and efficiency. The dataset is a modified version of the Jigsaw Toxic Comment Classification dataset, with the classes rebalanced to a 50:50 ratio. The model was fine-tuned for 2 epochs with optimal hyperparameters. Evaluation shows that the model reaches a ROC-AUC of 0.9016, a precision of 71.01%, a recall of 65.52%, and an F1-score of 0.6781. With up to 40% fewer parameters than BERT-base, the model is efficient and well suited to deployment in semi-automated content moderation systems. The study also identifies challenges in detecting rare and context-dependent categories, and offers recommendations for mitigating bias and improving model generalization in future work.
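The abstract does not say how the 50:50 balance was obtained or which averaging scheme the reported precision, recall, and F1 use. As a minimal sketch under those assumptions, the Python below balances a multi-label dataset by random undersampling and computes micro-averaged metrics over all (comment, label) pairs. The label names follow the public Jigsaw dataset; the function names `balance_50_50` and `micro_prf`, and the rule that a comment is "toxic" when any label is set, are illustrative choices and not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple

# Label names as used in the public Jigsaw Toxic Comment Classification data.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

Example = Tuple[str, List[int]]  # (comment text, one 0/1 flag per label)


def balance_50_50(examples: Sequence[Example], seed: int = 0) -> List[Example]:
    """Undersample the larger group so toxic and clean comments are equal in number.

    Assumption (not from the paper): a comment is toxic if any label is 1.
    """
    toxic = [ex for ex in examples if any(ex[1])]
    clean = [ex for ex in examples if not any(ex[1])]
    n = min(len(toxic), len(clean))
    rng = random.Random(seed)
    balanced = rng.sample(toxic, n) + rng.sample(clean, n)
    rng.shuffle(balanced)
    return balanced


def micro_prf(
    y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 for multi-label 0/1 predictions."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Tiny made-up dataset, for illustration only.
    data = [
        ("you are awful", [1, 0, 0, 0, 1, 0]),
        ("thanks for the edit", [0, 0, 0, 0, 0, 0]),
        ("nice article", [0, 0, 0, 0, 0, 0]),
        ("see the talk page", [0, 0, 0, 0, 0, 0]),
    ]
    balanced = balance_50_50(data, seed=42)
    print(len(balanced), "examples after balancing")

    y_true = [labels for _, labels in data]
    y_pred = [[1, 0, 0, 0, 0, 0], [0] * 6, [0, 0, 1, 0, 0, 0], [0] * 6]
    print(micro_prf(y_true, y_pred))
```

For a single precision and recall pair, F1 is their harmonic mean; applied to the reported 71.01% and 65.52% this gives roughly 0.6815. The reported 0.6781 is slightly lower, which would be consistent with averaging per label rather than over all pairs, although the abstract does not state which scheme was used.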

Published

2025-12-26

How to Cite

Khadifa, R. N., Anshori, R. F., Humaidah, Ramadhan, A. R., & Rosyani, P. (2025). Deteksi Komentar Toxic Menggunakan BERT. AI Dan SPK : Jurnal Artificial Intelligent Dan Sistem Penunjang Keputusan, 3(2), 239–246. Retrieved from https://jurnalmahasiswa.com/index.php/aidanspk/article/view/3452
