A Social Media Sentiment Analysis Using Machine Learning Approaches
Abstract
Social media platforms such as Twitter are a major means for individuals to express their opinions on various topics, which creates a need for sophisticated tools that distinguish negative from positive attitudes in textual content. This study evaluates the efficacy of machine learning algorithms in analyzing the sentiment of text that people post or comment on Twitter, with the aim of identifying the models best suited to classifying sentiment precisely in social media data and thereby helping to bridge a research gap in sentiment analysis for an understudied domain. A set of machine learning algorithms was applied along with feature extraction methods, including TF-IDF and Natural Language Processing (NLP) techniques. With an accuracy of 93%, the Random Forest (RF) model was the most effective of the models tested. Because of its strong capacity to generate accurate and dependable results on textual data, the study finds RF to be the best choice among them for sentiment analysis of text.
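As a rough illustration of the TF-IDF feature extraction named in the abstract, the sketch below computes term weights for a few short posts using only the Python standard library. It assumes one common textbook variant (term frequency as count divided by document length, inverse document frequency as log(N / df)) and plain whitespace tokenization; the abstract does not state which variant or preprocessing the study used. The function name `tf_idf` and the example posts are hypothetical and are not taken from the study's data.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    Other variants (smoothing, normalization) are also in common use.
    """
    # Lowercase and split on whitespace; real pipelines usually do more.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy posts, invented for illustration only.
posts = [
    "great phone love it",
    "terrible phone hate it",
    "love the camera",
]
vectors = tf_idf(posts)
```

In this toy corpus, a word that appears in a single post ("great") receives a higher weight than one shared across posts ("phone"), which is the property that makes TF-IDF vectors useful as classifier inputs. A classifier such as Random Forest would then be trained on these vectors together with sentiment labels.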
Article Details

This work is licensed under a Creative Commons Attribution 4.0 International License.
Tikrit Journal of Pure Science is licensed under the Creative Commons Attribution 4.0 International License, which allows users to copy, create extracts, abstracts, and new works from the article, alter and revise the article, and make commercial use of the article (including reuse and/or resale of the article by commercial entities), provided the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, indicates if changes were made, and does not represent the licensor as endorsing the use made of the work. The authors hold the copyright for their published work on the Tikrit J. Pure Sci. website, while Tikrit J. Pure Sci. is responsible for the appropriate citation of their work, which is released under CC-BY-4.0, enabling the unrestricted use, distribution, and reproduction of an article in any medium, provided that the original work is properly cited.