
Jurnal Ilmu Komputer dan Informatika

Social media platforms such as Twitter strongly influence public opinion, which makes sentiment analysis of tweet data important. Traditional techniques, however, struggle with the nuances and complexities of informal social media text. This research addresses these challenges by comparing a non-optimized BERT (Bidirectional Encoder Representations from Transformers) model with a BERT model optimized through Fine-Tuning for sentiment analysis of Indonesian Twitter data using text mining methods. Following the CRISP-DM methodology, the study collects data by crawling Twitter with the keyword "biznet" and preprocesses it through case folding, cleaning, tokenization, normalization, and data augmentation. The dataset is then split into training, validation, and testing subsets for modeling and evaluation with the IndoBERT-base-p1 model, which was trained specifically for the Indonesian language. The results show that the Fine-Tuned BERT model significantly outperforms the non-optimized BERT, achieving 91% accuracy, 0.91 precision, 0.90 recall, and 0.91 F1-score on the test set. Fine-Tuning enables BERT to adapt to the characteristics of Twitter sentiment data, so that it better recognizes the language and context patterns associated with sentiment expressions. The optimized model is implemented as a web application for practical use. This research affirms the superiority of Fine-Tuned BERT for accurate sentiment analysis of Indonesian Twitter data and provides valuable insights for businesses, governments, and researchers who work with social media data.
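The preprocessing steps named in the abstract (case folding, cleaning, tokenization, normalization) can be illustrated with a minimal sketch. It is not the authors' implementation: the cleaning rules, the `SLANG_MAP` dictionary, and the `preprocess` function name are assumptions made for illustration, and whitespace splitting stands in for the subword tokenizer that an IndoBERT model would actually use.

```python
import re

# Hypothetical slang lexicon: the normalization dictionary used in the study
# is not given in the abstract, so these entries are illustrative only.
SLANG_MAP = {"gk": "tidak", "ga": "tidak", "bgt": "banget", "yg": "yang"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
MENTION_RE = re.compile(r"@\w+")                # user mentions
NON_ALPHA_RE = re.compile(r"[^a-z\s]")          # digits, punctuation, emoji


def preprocess(tweet):
    """Return a list of cleaned, normalized tokens for one tweet."""
    text = tweet.lower()                 # case folding
    text = URL_RE.sub(" ", text)         # cleaning: drop URLs
    text = MENTION_RE.sub(" ", text)     # cleaning: drop @mentions
    text = text.replace("#", " ")        # cleaning: keep the hashtag word only
    text = NON_ALPHA_RE.sub(" ", text)   # cleaning: drop remaining symbols
    tokens = text.split()                # tokenization (whitespace stand-in)
    return [SLANG_MAP.get(t, t) for t in tokens]  # normalization of slang


if __name__ == "__main__":
    sample = "@BiznetHome Internet gk stabil bgt!! https://t.co/abc #biznet"
    print(preprocess(sample))
```

In a fine-tuning pipeline, the cleaned text would typically be rejoined into a string and passed to the pretrained model's own tokenizer, so the whitespace split here only serves to make the normalization step visible.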

This research confirms the superiority of the Fine-Tuned BERT model for sentiment analysis on Indonesian Twitter data. The Fine-Tuning process successfully adapted the BERT model to the characteristics of Twitter sentiment data, resulting in improved recognition of language and context patterns. The results demonstrate that Fine-Tuned BERT outperforms the non-optimized BERT model, providing valuable insights for utilizing Twitter data in various fields.

Future research could explore the integration of additional text preprocessing techniques, such as stemming or lemmatization, to further refine the data and potentially improve model accuracy. Investigating the use of different BERT variants or exploring alternative transformer-based models could also lead to enhanced performance in Indonesian Twitter sentiment analysis. Furthermore, expanding the dataset with a wider range of keywords and incorporating contextual information, such as user demographics or network characteristics, could provide a more comprehensive understanding of sentiment expression on Twitter and enable the development of more nuanced and accurate sentiment analysis models. These advancements would contribute to a more robust and reliable system for analyzing public opinion and extracting valuable insights from social media data.
