Topic wise Segmentation based Hybrid Models for Sentiment Classification on Social Media Platforms

Albladi, Aish

Date

2025-07-29

Abstract

Sentiment analysis is a crucial task in natural language processing that enables the extraction of meaningful insights from textual data, particularly from dynamic platforms such as social media. This research explores the development and evaluation of hybrid transformer-based models for sentiment classification, emphasizing stacking configurations and topic-wise segmentation for improved accuracy on social media datasets. The transformer models BERT, RoBERTa, XLNet, DistilBERT, and Electra were employed individually and in hybrid configurations. Experiments on the Sentiment140 and IMDb datasets demonstrate that hybrid models, particularly Electra+BERT, achieve significantly higher accuracy and more robust classification performance, with test accuracies of 96.08% and 97.84%, respectively. The Latent Dirichlet Allocation (LDA) model was integrated for topic segmentation, enabling enhanced feature selection and contextual understanding. With LDA segmentation in place, experiments on the same datasets show Electra+BERT reaching test accuracies of 98.44% and 98.38%, respectively. LDA segmentation further improved sentiment classification by refining decision boundaries, reducing misclassifications, and enhancing contextual insights. The research extends to analyzing oppositional narratives on social media, distinguishing between conspiracy theories and critical narratives using fine-tuned RoBERTa and BERT models. These models achieved a Matthews correlation coefficient (MCC) of 0.8050 for binary classification and an overall accuracy of 95% for identifying narrative elements. This work highlights the potential of hybrid models and advanced segmentation techniques to address complex NLP tasks, offering applications in sentiment analysis, public opinion monitoring, and misinformation detection.
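
The abstract reports binary classification quality as an MCC score. As a reading aid, the following is a minimal sketch of how the Matthews correlation coefficient is computed from a binary confusion matrix. The function name `mcc` and the example counts are hypothetical and are not taken from the dissertation; the dissertation's own evaluation code may differ.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when a row or column of the matrix is empty.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for illustration only (not results from the dissertation).
print(round(mcc(tp=90, tn=85, fp=15, fn=10), 4))
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced.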

URI

https://etd.auburn.edu/handle/10415/9870