Multi-scale Machine Learning Prediction of the Spread of Arabic Online Fake News

Authors

  • Fatima Aljwari, University of Jeddah, Jeddah, Saudi Arabia
  • Wahaj Alkaberi, University of Jeddah, Jeddah, Saudi Arabia
  • Areej Alshutayri, University of Jeddah, Jeddah, Saudi Arabia
  • Eman Aldhahri, University of Jeddah, Jeddah, Saudi Arabia
  • Nahla Aljojo, University of Jeddah, Jeddah, Saudi Arabia
  • Omar Abouola, University of Jeddah, Jeddah, Saudi Arabia

DOI:

https://doi.org/10.18662/po/13.1Sup1/411

Keywords:

Arabic fake news, machine learning, naive bayes, logistic regression, random forest, TF-IDF

Abstract

Many studies examine fake news from Arabic online sources, but few examine what makes that fake news spread. Left unaddressed, the threat grows until it becomes difficult to control. This paper therefore investigates how to predict the features that drive the spread of Arabic online fake news, using three machine learning classifiers: Naive Bayes, Logistic Regression, and Random Forest. The dataset consists of fabricated online news stories, from which features are extracted using Term Frequency-Inverse Document Frequency (TF-IDF). The partition used for testing and validating the predictions was chosen at random. All three classifiers were then applied to predict Arabic online fake news. The experimental results show that the Random Forest classifier outperformed the other two algorithms, reaching the highest accuracy on TF-IDF features at 86%. Logistic Regression reached 85% and Naive Bayes 84%, so all three performed well. These results suggest that TF-IDF features capture the most essential characteristics of the content of Arabic online fake news.
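The TF-IDF weighting named in the abstract can be illustrated with a minimal sketch. It assumes one common textbook variant (term frequency as count divided by document length, inverse document frequency as ln(N / df)); the abstract does not state which variant the authors used. The function name `tf_idf` and the English toy corpus are hypothetical and are not the paper's data.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    The paper's exact weighting scheme is not stated in the abstract.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus (whitespace-tokenized), for illustration only.
corpus = [
    "breaking miracle cure found".split(),
    "ministry announces new health policy".split(),
    "miracle cure shocks ministry".split(),
]
vectors = tf_idf(corpus)
```

In this toy corpus, a term that appears in only one document ("breaking") receives a higher weight than one shared across two documents ("miracle"). Vectors of this kind are what a classifier such as Naive Bayes, Logistic Regression, or Random Forest would take as input.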

Author Biographies

Fatima Aljwari, University of Jeddah, Jeddah, Saudi Arabia

Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia

Wahaj Alkaberi, University of Jeddah, Jeddah, Saudi Arabia

Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia

Areej Alshutayri, University of Jeddah, Jeddah, Saudi Arabia

Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia

Eman Aldhahri, University of Jeddah, Jeddah, Saudi Arabia

Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia

Nahla Aljojo, University of Jeddah, Jeddah, Saudi Arabia

College of Computer Science and Engineering, Information System and Technology Department, University of Jeddah, Jeddah, Saudi Arabia

Omar Abouola, University of Jeddah, Jeddah, Saudi Arabia

College of Computer Science and Engineering, Information System and Technology Department, University of Jeddah, Jeddah, Saudi Arabia

Published

2022-03-14

How to Cite

Aljwari, F., Alkaberi, W., Alshutayri, A., Aldhahri, E., Aljojo, N., & Abouola, O. (2022). Multi-scale Machine Learning Prediction of the Spread of Arabic Online Fake News. Postmodern Openings, 13(1 Sup1), 01-14. https://doi.org/10.18662/po/13.1Sup1/411

Section

Research Articles