Cyberbullying Detection Using BERT and Bi-LSTM

Authors

  • Fidya Farasalsabila, Magister Informatika, Universitas Amikom Yogyakarta
  • Ema Utami, Magister Informatika, Universitas Amikom Yogyakarta
  • Hanafi Hanafi, Magister Informatika, Universitas Amikom Yogyakarta

DOI:

https://doi.org/10.34151/jurtek.v17i1.4636

Keywords:

BERT, Bi-LSTM, Cyberbullying, Deep Learning

Abstract

Cyberbullying is not a new phenomenon: it existed before the advent of social networks. Its impact is wide-ranging and includes effects on a person's mental and physiological condition, such as sadness, anxiety, and depression. The main objective of this research is to develop an effective cyberbullying detection system using natural language processing techniques. The method applies BERT (Bidirectional Encoder Representations from Transformers) and Bi-LSTM (Bidirectional Long Short-Term Memory) models as a deep learning approach to analyze text and detect cyberbullying behavior. This approach allows the system to model complex language context and capture patterns that traditional methods may struggle to identify. Testing was carried out on a dataset of various types of Indonesian-language texts containing acts of cyberbullying. The results show that the combination of BERT and Bi-LSTM achieves an accuracy of 90% and is able to identify variations of cyberbullying. This research contributes to efforts to protect individuals from the negative impacts of cyberbullying through the development of an adaptive detection system.
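
For readers unfamiliar with the accuracy figure quoted in the abstract, the following is a minimal sketch of how accuracy is typically computed for a binary classifier. The function names (`accuracy`, `confusion_counts`) and the label lists are hypothetical and chosen only for illustration; they are not the authors' evaluation code or data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    counts = Counter()
    for t, p in zip(y_true, y_pred):
        if p == positive:
            counts["tp" if t == positive else "fp"] += 1
        else:
            counts["fn" if t == positive else "tn"] += 1
    return counts


# Hypothetical toy labels (1 = cyberbullying, 0 = not cyberbullying).
# These are NOT results or data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

print(accuracy(y_true, y_pred))
print(dict(confusion_counts(y_true, y_pred)))
```

Accuracy alone does not show which class the errors fall in, so the confusion counts are a useful companion when the classes are imbalanced.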

Published

2024-05-04

How to Cite

Farasalsabila, F., Utami, E., & Hanafi, H. (2024). Deteksi Cyberbullying Menggunakan BERT dan Bi-LSTM. Jurnal Teknologi, 17(1), 1–6. https://doi.org/10.34151/jurtek.v17i1.4636