A Social Media Sentiment Analysis Using Machine Learning Approaches

Noor Salah Irzooqi Al-Agele
https://orcid.org/0009-0001-8568-1989
Didem KIVANÇ TÜRELİ

Abstract

Social media platforms such as Twitter are a major channel through which individuals express their opinions on a wide range of topics, which creates a need for sophisticated tools that can distinguish between negative and positive attitudes in textual content. This study evaluates how effectively machine learning algorithms classify the sentiment of text that people post or comment on Twitter, with the aim of identifying the models best suited to precise sentiment classification of social media data and of helping to bridge the research gap in sentiment analysis for an understudied domain. A set of machine learning algorithms was applied along with feature extraction methods, including TF-IDF and natural language processing (NLP) techniques. With an accuracy of 93%, the Random Forest (RF) model was the most effective of the models evaluated. Because of its capacity to generate accurate and dependable results on textual data, the RF model emerges in this study as the best choice for textual sentiment analysis.
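As an illustration of the TF-IDF feature extraction step named in the abstract, the following is a minimal sketch in plain Python. It assumes the textbook weighting (term frequency times the logarithm of inverse document frequency) and a whitespace tokenizer; the abstract does not state which variant or tokenizer the study used, and libraries such as scikit-learn apply smoothed and normalized variants. The function names and the example tweets are hypothetical and are not taken from the study's dataset.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed textbook definition:
      tf  = term count / document length
      idf = log(N / number of documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical example tweets, not taken from the study's dataset.
tweets = [
    "love this phone",
    "hate this phone",
    "love love the camera",
]
vectors = tfidf(tweets)
```

In this toy corpus, "hate" occurs in only one tweet and so receives a higher weight in that tweet than "phone", which occurs in two. Vectors of this kind would then be passed to a classifier such as Random Forest; that step is not shown here.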

Article Details

How to Cite
Salah Irzooqi Al-Agele, N., & KIVANÇ TÜRELİ, D. (2025). A Social Media Sentiment Analysis Using Machine Learning Approaches. Tikrit Journal of Pure Science, 30(4), 70–82. https://doi.org/10.25130/tjps.v30i4.1916
