An Artificial Intelligence Approach to Fake News Detection in the Context of the Morocco Earthquake

The catastrophic earthquake that struck Morocco on September 8, 2023, garnered significant media coverage, leading to the swift dissemination of information across various social media and online platforms. However, the heightened visibility also gave rise to a surge in fake news, presenting formidable challenges to the efficient distribution of accurate information crucial for effective crisis management. This paper introduces an approach to fake news detection that integrates natural language processing (NLP), bidirectional long short-term memory (Bi-LSTM), convolutional neural network (CNN), and hierarchical attention network (HAN) models within the context of this seismic event. Leveraging machine learning, deep learning, and data analysis techniques, we have devised a fake news detection model designed to identify and categorize misleading information. The amalgamation of these models is intended to enhance the accuracy and efficiency of our system, addressing the pressing need for reliable information amidst the chaos of a crisis.


Background of deep learning in fake news detection
Deep learning (DL) is a subfield of machine learning (ML) and artificial intelligence (AI), and it is increasingly being used to detect fake news. AI, a branch of computer science, aims to create intelligent machines that can simulate human intelligence. Deep learning, a method within AI, uses artificial neural networks to learn from data. Detecting fake news poses a complex challenge, requiring an understanding of the semantics and context of a text. Traditional fake news detection methods rely on predefined rules or statistical models, yet these approaches have limitations in terms of accuracy and coverage.
Deep learning presents a promising alternative for detecting fake news. Through artificial neural networks, it becomes possible to learn semantic and contextual representations of sentences and texts. These representations can then be used to classify news items based on their veracity. (15) However, detecting fake news using deep learning remains a challenge. Neural networks need to be trained on large and diverse datasets to detect fake news effectively. In addition, fake news can be difficult to detect because it can be subtle and spreads rapidly across social networks and online news sites.
Natural language processing (NLP), a branch of artificial intelligence, focuses on the use of natural language in interactions between people and computers. NLP has its roots in linguistics, computer science, and cognitive psychology, among other disciplines. Early NLP research was primarily rule-based, analyzing and interpreting natural language data using explicit rules and grammars. However, these methods relied on handcrafted rules and could not account for the complexity of natural language, which limited their usefulness. The popularity of statistical techniques in NLP increased in the 1980s and 1990s. To make NLP systems more precise and scalable, these approaches employed machine learning algorithms to automatically discern patterns and structures in natural language data. Two noteworthy developments were the first statistical machine translation systems and the introduction of the Hidden Markov Model (HMM). (7)
Data and Metadata. 2024; 3:.377
The advent of deep learning in the 2000s further advanced the field of NLP. Deep learning algorithms, such as neural networks, facilitate the automatic learning of complex representations of natural language data. This has led to significant progress in a wide range of NLP applications, including machine translation (8), speech recognition (11), and sentiment analysis. (9) Deep learning architectures, unsupervised learning methods, and the incorporation of multimodal data (e.g., text and images) are continuously evolving, keeping NLP a dynamic and rapidly advancing field. A diverse range of practical applications, such as chatbots for customer service and medical diagnosis tools, are also increasingly incorporating NLP. (10)

Related work
The field of fake news detection in the context of the September 8, 2023 earthquake in Morocco is continually evolving, with numerous approaches and technologies in development to identify and counter disinformation. This section gives a state-of-the-art overview of the main methods for detecting fake news related to this event, focusing on textual content analysis, one of the most commonly used approaches for detecting earthquake-related fake news. Researchers employ natural language processing (NLP) techniques to analyze the text of news articles (14), social media posts, and other sources of information. They search for indicators of fake news, such as unsupported claims or inconsistencies in the narrative. (16) The integration of convolutional neural networks (CNN), bidirectional long short-term memory (BiLSTM), and hierarchical attention networks (HAN) constitutes a dynamic research field with numerous applications in computer vision, natural language processing, and more. The following subsections give an overview of the current state of this approach.

Bi-LSTM
Bidirectional long short-term memory (Bi-LSTM) models are a type of recurrent neural network (RNN) architecture commonly used for sequential data processing, including tasks such as natural language processing (NLP). (20) The Bi-LSTM model extends the traditional LSTM model by considering information from both past and future steps of the input sequence. In a Bi-LSTM, the input sequence is processed in two directions: forward and backward. This means that the input sequence is fed into two separate LSTM networks. One processes the sequence in its original order (forward LSTM), and the other processes it in reverse order (backward LSTM). The forward and backward LSTM outputs are then concatenated at each time step.
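The two-direction processing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `rnn_step` is a toy scalar recurrent cell standing in for a full LSTM cell, and its weights `w_x` and `w_h` are arbitrary values chosen for the example, not parameters from this paper.

```python
import math

def rnn_step(x, h, w_x=0.5, w_h=0.8):
    """Toy scalar recurrent cell (an assumed stand-in for a full LSTM cell)."""
    return math.tanh(w_x * x + w_h * h)

def run_direction(xs, step=rnn_step):
    """Run the cell over xs in the given order and return one state per step."""
    h, states = 0.0, []
    for x in xs:
        h = step(x, h)
        states.append(h)
    return states

def bidirectional(xs, step=rnn_step):
    """Concatenate forward and backward states at each time step."""
    forward = run_direction(xs, step)
    # Process the reversed sequence, then flip the states back so that
    # backward[t] lines up with forward[t].
    backward = run_direction(xs[::-1], step)[::-1]
    return list(zip(forward, backward))

sequence = [1.0, -0.5, 0.25, 2.0]
outputs = bidirectional(sequence)  # one (forward, backward) pair per time step
```

At the first time step, the forward state has seen only the first input, while the backward state already summarizes the whole remainder of the sequence, which is the property that lets a Bi-LSTM use both past and future context.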

Cell State
Bi-LSTM cells feature several layers at each time step $t$, including an input layer $X_t$, an output layer $h_t$, and a hidden layer $h_{t-1}$. Every cell shares some states with other cells during training or parameter updates, as shown in the corresponding figure.
William Yang Wang et al. (18) propose a method for detecting fake information on social media using bidirectional LSTM neural networks. The authors constructed a dataset of 16,000 English tweets containing both true and false information and achieved an accuracy of 85% with their BiLSTM model.
Monti et al. (22) proposed a method to detect fake news on social networks using convolutional neural networks and BiLSTM networks. They compared their approach to other existing methods for disinformation detection and demonstrated that their model was more effective. They also employed a geometric deep learning method to enhance accuracy.
Jia et al. (23) proposed a method for detecting fake news using BiLSTM neural networks, considering the author's stance relative to the facts presented in the text. The authors used a dataset containing English news articles and demonstrated that their model was more effective than other approaches to disinformation detection.
Tan et al. (24) proposed a BiLSTM with an attention mechanism for detecting fake news. The model was trained on a dataset of news articles labeled as either real or fake and achieved state-of-the-art performance compared to other models on the same dataset. The study in (25) used a BiLSTM with a self-attention mechanism to detect fake news in social media by incorporating social context information. The model achieved state-of-the-art performance on a dataset of tweets labeled as either real or fake.
Zhang et al. (34) proposed an attention-based BiLSTM for detecting fake news that can capture both local and global dependencies in text. The model was evaluated on two datasets of news articles labeled as either real or fake and achieved state-of-the-art performance compared to other models on the same datasets. Another study proposed an attention-based BiLSTM for detecting rumors on social media using multitask learning, which involves jointly learning to classify rumors and identify the source of the rumor. The model achieved state-of-the-art performance on a dataset of tweets labeled as either rumors or non-rumors.
Feng et al. (28) proposed a BiLSTM with graph attention for detecting fake news in news articles. The model incorporates the relationships between entities in the text using a graph structure and achieved state-of-the-art performance on a dataset of news articles labeled as either real or fake.
Goyal et al. (32) proposed a model for fake news detection that combines BiLSTM with an attention mechanism and CNN. The proposed model achieved an F1 score of 0.94 on the LIAR dataset, outperforming several state-of-the-art models.
Wu et al. (29) proposed a model for fake news detection that combines BiLSTM with CNN and an attention mechanism. The proposed model achieved an F1 score of 0.91 on the LIAR dataset, outperforming several state-of-the-art models.
Panigrahi et al. (30) proposed a fake news detection model that integrates BiLSTM with CNN and an attention mechanism, achieving an F1 score of 0.89 on the LIAR dataset. Similarly, Qian et al. (31) proposed a model for fake news detection that combines BiLSTM with CNN and an attention mechanism. The proposed model achieved an F1 score of 0.83 on the LIAR dataset, outperforming several state-of-the-art models.
In a study by Hamidian et al. (33), the authors proposed a BiLSTM-based model that uses word embeddings and an attention mechanism to detect fake news. The model achieved an accuracy of 90% on a dataset of news articles.
In another study by Zhang et al. (34), the authors used a BiLSTM-CNN model for fake news detection. The model takes advantage of both the sequential and convolutional nature of the input data to extract relevant features. The authors evaluated their model on two benchmark datasets and achieved state-of-the-art results. Similarly, a study by Arora et al. (48) proposed a BiLSTM-based model that uses contextual embeddings and an attention mechanism for fake news detection. The authors evaluated their model on a dataset of news articles and achieved an accuracy of 92.4%.
In a recent study by Chen et al. (36), the authors proposed a multitask BiLSTM-based model for fake news detection. The model performs two tasks simultaneously: identifying the veracity of news articles and detecting their stance. The authors evaluated their model on a dataset of political news articles and achieved state-of-the-art results on both tasks.
Qiu et al. (37) describe a Bi-LSTM-based system for recognizing handwritten characters using the NIST dataset. The authors preprocess the data using image normalization and data augmentation techniques, and use a Bi-LSTM network to classify the characters. They achieve an accuracy of 97.3% on the NIST dataset. Sun et al. (38) describe a deep learning system for recognizing handwritten English words using the NIST dataset. The authors use a Bi-LSTM network to process the word images and achieve an accuracy of 93.6%. They also compare the performance of their system to other approaches, including traditional feature-based methods and other deep learning models.
Zhao et al. (39) describe a Bi-LSTM-based system for recognizing handwritten English words using the NIST dataset. The authors use a combination of data augmentation techniques, including rotation and translation, to increase the size of the training set. They achieve an accuracy of 95.5% on the NIST dataset and compare their approach to other deep learning models.
Conneau (40) compares the performance of convolutional neural networks (CNNs) and Bi-LSTM networks on a range of text classification tasks. The study finds that the CNN models outperform the Bi-LSTM models on most of the tasks, but the Bi-LSTM models perform better on certain tasks where capturing long-term dependencies in the text is important.
Kim et al. (41) propose a CNN model for sentence classification and compare its performance to traditional models such as bag-of-words and n-gram models. The authors find that the CNN model outperforms the other models on several benchmark datasets.
Zhang et al. (42) analyze the performance of CNNs, Bi-LSTM networks, and other models for sentence classification across a range of hyperparameters and input settings. The authors find that Bi-LSTM networks perform well on smaller datasets, while CNNs perform better on larger datasets. They also provide recommendations for optimizing the performance of both models.
Tang et al. (43) propose a gated recurrent neural network (GRU) model for sentiment classification and compare its performance to traditional models such as support vector machines and decision trees. The authors find that the GRU model outperforms the other models on several benchmark datasets.
These papers show that BiLSTM neural networks are a promising approach for fake news detection and fact checking. They can be used alone or in combination with other natural language processing techniques to improve the effectiveness of fake news detection.

CNN
Fake news detection is a critical task in the era of social media, where fake news can spread rapidly and have serious real-world consequences. CNNs have proven effective in detecting fake news due to their ability to automatically learn and extract relevant features from text data. CNN-based models have been successfully applied to various fake news detection tasks, such as identifying fake news articles, detecting fake news headlines, and detecting fake reviews. One of the primary challenges in fake news detection is the absence of large-scale annotated datasets.
To tackle this issue, some researchers have proposed leveraging transfer learning techniques to utilize pretrained language models, such as BERT or GPT, aiming to enhance the performance of CNN-based models. (45,46) Other researchers have concentrated on designing novel CNN architectures for fake news detection, as shown in figure 2. For example, Huang et al. proposed a dual-channel CNN model that incorporates both character-level and word-level embeddings to capture distinct levels of information in text. (47) Similarly, Qiu et al. introduced a hybrid CNN-LSTM model that combines the strengths of both architectures. (37) The hybrid model utilizes a CNN to extract local features from text and an LSTM to capture the sequential dependencies between words.
Another important aspect of fake news detection is the identification of relevant features. Researchers have explored various approaches to feature selection, including part-of-speech (POS) tags (48), sentiment analysis (49), and named entity recognition (NER). These features can be incorporated into CNN-based models to improve their performance.
Some researchers have also directed their attention toward mitigating adversarial attacks on CNN-based fake news detection models. Adversarial attacks involve adding imperceptible perturbations to the input data to mislead the model. To counteract this challenge, researchers have suggested employing adversarial training (50) or integrating a hierarchical attention network into CNN-based models.
Overall, CNN-based models have shown promising results in fake news detection, and future research will likely focus on developing more robust models that can handle adversarial attacks and better capture the nuances of human language. For instance, Zhang et al. (51) introduced a CNN-based model employing word embeddings and convolutional layers for classifying news articles as either fake or real. The model demonstrated an accuracy of 92.8% on a dataset of news articles.
In a separate study, Yang et al. (52) presented a CNN-based model that incorporates both text and visual information to identify fake news headlines. This model achieved an F1 score of 0.794 on a dataset of fake and real news headlines. Moreover, Wang et al. (53) introduced a CNN-based model that integrates linguistic features and a hierarchical attention network to detect fake reviews. The model demonstrated an accuracy of 92.8% on a dataset comprising both fake and real reviews.
One of the popular approaches for fake news detection based on CNNs involves using a combination of convolutional and max pooling layers to capture local features and their interactions across different regions of the input text. For example, in the work of Kim (54), a CNN model was trained on a large dataset of news articles and social media posts to identify fake news. The model consisted of multiple convolutional and pooling layers, followed by a fully connected layer for classification. The results showed that the proposed CNN model outperformed other methods in detecting fake news.
Another approach for fake news detection based on CNNs involves using pretrained word embeddings to represent the input text and employing multiple filters of different sizes to capture different levels of granularity in the input text. Similarly, in the work of Ma et al. (55), a CNN model was trained on a dataset of news articles and tweets to detect fake news. The model used pretrained word embeddings and multiple filters of different sizes to capture local features and their interactions. The results showed that the proposed CNN model achieved high accuracy in detecting fake news.
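The combination of convolutional filters of different sizes with max pooling, as used in the CNN approaches above, can be illustrated with a minimal pure-Python sketch. The embeddings and kernel values below are made-up toy numbers, and the function names are hypothetical; a real model would learn the kernels and use a deep learning library.

```python
def conv1d_max(embeddings, kernel, bias=0.0):
    """Slide one filter over the token embeddings, apply ReLU, then max-pool over time."""
    width = len(kernel)
    activations = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        total = bias + sum(
            w * x
            for k_row, e_row in zip(kernel, window)
            for w, x in zip(k_row, e_row)
        )
        activations.append(max(0.0, total))  # ReLU
    # Max-over-time pooling; texts shorter than the filter yield 0.0.
    return max(activations) if activations else 0.0

def text_cnn_features(embeddings, kernels):
    """One pooled feature per filter; filters may have different widths."""
    return [conv1d_max(embeddings, kernel) for kernel in kernels]

# Five tokens with 2-dimensional toy embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [-1.0, 2.0]]
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],              # width-2 filter
    [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],  # width-3 filter
]
features = text_cnn_features(embeddings, kernels)  # [2.0, 4.0]
```

Each filter responds to a local pattern of a given length, and max pooling keeps only the strongest response, so the resulting feature vector has a fixed size regardless of the text length.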
Some researchers have also explored the use of hierarchical attention networks in CNNs for fake news detection. For example, in the work of Zhang et al. (56), a CNN model with attention was proposed to detect fake news from social media platforms. The attention mechanism assigned weights to different words in the input text based on their relevance to the fake news detection task. The results showed that the proposed CNN model outperformed other methods for detecting fake news.
Yang et al. (57) propose a multichannel CNN for fake news detection that combines character, word, and document embeddings. The authors conduct experiments on a large-scale dataset and show that their model outperforms several baselines, including traditional machine learning models and other neural network architectures.
Wang et al. (58) present a hybrid model that combines a CNN and a bidirectional LSTM for fake news detection. The CNN is used to extract features from word embeddings, and the bidirectional LSTM is used to capture contextual information. The authors evaluate their model on a benchmark dataset and show that it outperforms several baselines.
Zhang et al. (59) propose a CNN-based approach for fake news detection using microblog data. The authors use pretrained word embeddings and a CNN to learn local and global features from the text. They evaluate their model on a Chinese microblog dataset and show that it outperforms several baselines.
Ghosh et al. (60) explore the use of multimodal features and CNNs for fake news detection. The authors use both textual and visual features to train a CNN-based model. They evaluate their model on a benchmark dataset and show that incorporating visual features can improve performance.
Finally, Wang et al. (61) propose a hierarchical CNN with attention for fake news detection. The authors use a two-level CNN architecture to capture both local and global features from the text, and then apply attention to highlight important features. They evaluate their model on a benchmark dataset and show that it outperforms several baselines.

HAN
HAN is a type of deep learning model designed to handle hierarchical structures in text data. The HAN model consists of two levels of attention: word-level attention and sentence-level attention. The word-level attention layer assigns a different weight to each word in the sequence, based on its importance in the context of the news article. The sentence-level attention layer assigns a different weight to each sentence in the news article, based on its importance in determining whether the article is real or fake.
The Gated Recurrent Unit (GRU) incorporates gating units that regulate the information flow within the unit, similar to an LSTM unit. However, the GRU has fewer parameters because it lacks an output gate, resulting in a simpler structure than the LSTM. (63) A GRU layer is built around two gates: the reset gate $r_t$ determines how the new input is combined with the previous memory, while the update gate $z_t$ specifies the extent to which the previous memory is carried over into the current time step. Additionally, the term $\tilde{h}_t$ represents the candidate activation of the hidden state $h_t$.
Several recent studies have explored the use of HAN for fake news detection. Some of the notable studies are as follows.
Yang et al. (64) proposed a HAN-based model that combines word-level and sentence-level attention to capture the hierarchical structure of news articles. The model performed well on several benchmark datasets.
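For reference, the GRU layer described in this section is commonly written in the following form. The equations of the original text are not available here, so this is the standard textbook formulation rather than a reproduction of the authors' notation: the weight matrices $W$, $U$ and biases $b$ are assumed symbols, $\sigma$ is the logistic sigmoid, and $\odot$ is the element-wise product. Some references swap the roles of $z_t$ and $1 - z_t$ in the last line; the convention below matches the description of $z_t$ as the share of previous memory that is kept.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) \\
h_t &= z_t \odot h_{t-1} + \left(1 - z_t\right) \odot \tilde{h}_t
\end{aligned}
```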
Zhang et al. (65) proposed a HAN-based approach for fake news detection that combines HAN with convolutional neural networks (CNN). They used a dataset of labeled news articles and achieved an accuracy of 91.73%.
Li et al. (66) proposed a HAN-based approach for fake news detection that uses a novel feature fusion mechanism. They used a dataset of labeled news articles and achieved an accuracy of 95.7%.
Karami et al. (67) proposed a HAN-based approach for fake news detection that uses a transfer learning technique. They used a dataset of labeled news articles and achieved an accuracy of 96.15%.
In a more recent study published in IEEE Access, Wang et al. (68) proposed a HAN-based model that uses both textual and visual features to detect fake news in social media. The authors incorporated a convolutional neural network (CNN) to extract visual features from images, and a HAN to capture the hierarchical structure of the textual content. The proposed model achieved higher accuracy than several other baseline models.
Another study published in IEEE Access by Li et al. (69) proposed a HAN-based model that utilizes multitask learning to simultaneously detect fake news and distinguish between different types of fake news. The model performed well on several datasets, demonstrating the effectiveness of the proposed approach. In a study published in the Journal of Ambient Intelligence and Humanized Computing, Dai et al. (71) developed a HAN-based framework that incorporates external knowledge graphs for detecting fake news in news articles. The model achieved state-of-the-art performance on several benchmark datasets.
Kaur et al. (72) proposed a HAN model that incorporates both textual and visual features for detecting fake news in images. The model achieved high accuracy in detecting fake news images on the Snopes dataset.
Other studies have explored the use of HANs in combination with other techniques, such as graph convolutional networks (GCNs) and adversarial training. For instance, Chen et al. (73) proposed a HAN-GCN model that integrated graph information to achieve better performance in fake news detection. In a study by Chen et al. (73), a HAN model with adversarial training was shown to improve the robustness of fake news detection.
Other studies have explored the use of HANs specifically for identifying the linguistic features of fake news. For example, in a study published in the journal Applied Sciences, Gao et al. (74) developed a fake news detection model that used HANs to analyze the linguistic complexity of news articles. The model achieved an accuracy of 87.3% on a dataset of real and fake news articles.
In addition, in a study published in Information Processing and Management, Zhou et al. (70) proposed a HAN-based model that leverages external knowledge sources to improve fake news detection. Specifically, the authors incorporated knowledge from external sources such as word embeddings and social network structures into the HAN model. The proposed model outperformed several other baseline methods on two benchmark datasets.
Similarly, some studies have explored the use of HANs in combination with other types of data, such as social media data or user comments. For example, in a study published in the journal Social Network Analysis and Mining, Hu et al. (75) developed a fake news detection model that combined HANs with social network analysis techniques. The model achieved an accuracy of 83.9% on a dataset of fake news articles and user comments.
In summary, HANs have shown great potential in the detection of fake news and propaganda, particularly when combined with other machine learning techniques and multimodal data sources. The studies mentioned above provide important insights into the use of HANs for fake news detection and offer promising directions for future research in this area.

A comparative study of the advantages of CNN, BiLSTM, and HAN
Convolutional neural networks (CNNs), bidirectional LSTM networks (BiLSTMs), and hierarchical attention networks (HANs) are all deep learning techniques used to solve natural language processing problems such as text classification and machine translation. Each of these techniques has specific advantages (76), as described below.
CNN: CNNs are very effective at extracting features from structured data such as images, videos, or word sequences. They are also very fast and can process large amounts of data in a short time, making them suitable for tasks such as large-scale text classification. (89) The advantages of CNNs for text classification tasks are that they can be used to extract relevant features from words and sentences and are less likely to overfit a training dataset. (90)
BiLSTM: Bidirectional LSTM networks are very effective at processing sequences of data, such as sequences of words, and can retain information about the sequence before and after a given word. (91) This allows BiLSTMs to capture contextual information and better understand the meaning of words within a sentence or document. The advantages of BiLSTMs for text classification tasks are that they can understand the overall context of sentences and documents and are less likely to overfit a training dataset. (92)
HAN: Hierarchical attention networks are very effective at selecting the most important parts of a data sequence, such as the parts of a sentence that are most relevant to a text classification task. They can also be used to weight different parts of a sentence according to their importance, which can improve the accuracy of text classification. (93) The advantages of hierarchical attention networks for text classification tasks are that they can identify the most relevant parts of a document and that they are very flexible and can be used in combination with other deep learning techniques, such as BiLSTM and CNN. (94)
In summary, each deep learning technique has its specific advantages (77,78,79,80,81,82,83) for text classification tasks. CNNs are very fast and efficient at extracting relevant features from large amounts of data, while BiLSTMs can better understand the overall context of sentences and documents. HANs are very effective at identifying the most relevant parts of a document and can be used in combination with other deep learning techniques to improve the accuracy of text classification.

The proposed model and its architecture
In the proposed model, shown in figure 4, we suggest an approach for detecting fake news using a CNN, a BiLSTM, and a HAN. This combined architecture leverages the characteristics of the different techniques to enhance the performance of fake news detection.
Data Collection: our model relies on multisource data integration, drawing information from a variety of channels to create a comprehensive dataset. By aggregating data from social media platforms, news articles, blogs, and other online sources, the model captures the nuanced landscape of information dissemination. This diverse input exposes the model to a wide array of linguistic styles, which should help it develop a robust understanding of language nuances and patterns.
Text Preprocessing: a crucial step in the model's pipeline is meticulous text preprocessing. (84) Through text cleaning, tokenization, and stemming (85), the model filters out irrelevant information, noise, and inconsistencies within the dataset. This preprocessing not only enhances the model's efficiency but also contributes to its ability to discern genuine content from deceptive narratives.
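The cleaning, tokenization, and stemming steps named above might look like the following minimal sketch. It assumes English, ASCII-only text; `naive_stem` is a deliberately crude, hypothetical suffix stripper used only for illustration, and a real pipeline would more likely rely on an established stemmer such as Porter's.

```python
import re

def clean(text):
    """Lowercase the text and remove URLs, punctuation, and extra whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # drop punctuation and symbols
    return re.sub(r"\s+", " ", text).strip()

def naive_stem(token):
    """Crude suffix stripping (illustrative placeholder for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Clean the text, split it on whitespace, and stem each token."""
    return [naive_stem(token) for token in clean(text).split()]

tokens = preprocess("BREAKING: Buildings collapsed in Marrakech!! https://example.com/x")
# tokens == ['break', 'building', 'collaps', 'in', 'marrakech']
```

Because the character filter keeps only ASCII letters and digits, this sketch would discard Arabic or accented French text; a multilingual dataset would need a Unicode-aware cleaning step instead.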
Feature Extraction with NLP: our model extracts NLP features that encapsulate the semantic and syntactic richness of textual data, using the following techniques:
Bag of Words (BoW): we use the BoW representation to extract features from the text, creating a vector that reflects the frequency of each word. (86)
TF-IDF (Term Frequency-Inverse Document Frequency): this technique gives more weight to rare words, which may be more informative when they appear. (87)
Word Embeddings: pretrained models such as Word2Vec, GloVe, or FastText are employed to capture semantic relationships between words. (86)
Sentiment Analysis: NLP tools are used to assess the overall tone of the text, identifying emotions and polarities that could indicate potential misinformation. (88)
CNN Model for Feature Extraction: the CNN model is used to extract features from the news headlines or content. Pretrained embedding vectors (e.g., Word2Vec, GloVe) are used to represent words. Convolutional filters with different window sizes are applied to the embedding vectors to capture local patterns. The features extracted by the CNN are flattened and fed into the fusion layer.
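Two of the feature representations listed in this step, BoW and TF-IDF, can be illustrated with a short pure-Python sketch. The three toy documents are invented for the example, and the unsmoothed weighting tf × log(N / df) is one common variant chosen here as an assumption; libraries often apply additional smoothing or normalization.

```python
import math
from collections import Counter

def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary term occurs in the token list."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]

def tf_idf(documents):
    """documents: list of token lists. Returns (vocabulary, TF-IDF vectors)."""
    vocabulary = sorted({term for doc in documents for term in doc})
    n_docs = len(documents)
    doc_freq = {term: sum(term in doc for doc in documents) for term in vocabulary}
    # Unsmoothed inverse document frequency: rarer terms get larger weights.
    idf = {term: math.log(n_docs / doc_freq[term]) for term in vocabulary}
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        length = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append([counts[term] / length * idf[term] for term in vocabulary])
    return vocabulary, vectors

docs = [
    ["earthquake", "hits", "morocco"],
    ["fake", "earthquake", "video"],
    ["morocco", "rescue", "teams"],
]
vocabulary, vectors = tf_idf(docs)
bow = bag_of_words(docs[1], vocabulary)  # [1, 1, 0, 0, 0, 0, 1]
```

In the second document, "fake" appears in only one of the three documents and therefore receives a higher TF-IDF weight than "earthquake", which appears in two.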
BiLSTM Model for Sequence Modeling: the BiLSTM model is used to capture sequential and contextual relationships in the data. The flattened features from the CNN are transformed into input sequences for the BiLSTM. The BiLSTM learns from these sequences and captures long-term dependencies.
HAN Model for Hierarchical Attention: the HAN model is used to capture hierarchical relationships between words and sentences in the news. Output sequences from the BiLSTM serve as input for the HAN, which uses hierarchical attention to weigh the importance of words and sentences in the final classification.
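The two-level weighting performed by the HAN can be sketched as attention pooling applied first over words and then over sentences. This is a simplified illustration, not the architecture of this paper: the learned projection layer and the recurrent encoders between the two levels are omitted, and the context vectors and word states are hypothetical toy values.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(states, context):
    """Weight each state by its similarity to a context vector, then sum."""
    scores = [sum(c * math.tanh(h) for c, h in zip(context, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    pooled = [sum(w * state[d] for w, state in zip(weights, states)) for d in range(dim)]
    return pooled, weights

def han_encode(document, word_context, sentence_context):
    """document: list of sentences, each a list of word-state vectors."""
    # Word-level attention turns each sentence into a single vector.
    sentence_vectors = [attention_pool(words, word_context)[0] for words in document]
    # Sentence-level attention turns the sentence vectors into one document vector.
    return attention_pool(sentence_vectors, sentence_context)

document = [
    [[0.2, 0.1], [0.9, -0.3], [0.0, 0.5]],  # sentence 1: three word states
    [[-0.4, 0.8], [0.6, 0.6]],              # sentence 2: two word states
]
doc_vector, sentence_weights = han_encode(document, [1.0, 0.5], [0.5, 1.0])
```

The returned `sentence_weights` sum to one and indicate how much each sentence contributes to the document vector that is passed on to the classifier.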
Activation function: an output layer with a softmax activation function (5) is used to predict the probability of a news article being true or fake.
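As a small worked example of this output layer, the sketch below converts two raw scores into class probabilities with softmax and computes the corresponding negative log-likelihood for one labeled article. The logits and the class order (index 0 for real, index 1 for fake) are assumptions made for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to one."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

def negative_log_likelihood(probabilities, true_class):
    """Loss for one example: small when the true class gets high probability."""
    return -math.log(probabilities[true_class])

logits = [2.0, 0.0]  # hypothetical scores for the classes [real, fake]
probs = softmax(logits)
loss = negative_log_likelihood(probs, true_class=0)
```

Here the model assigns roughly 88% probability to the "real" class, so the loss for a truly real article is small; the same prediction for a fake article would be penalized much more heavily.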

DISCUSSION
The model is trained using gradient backpropagation with an appropriate loss function, such as the log-likelihood loss. Model evaluation is performed on the test set to assess its performance in terms of accuracy, recall, F1 score, and related metrics.
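The evaluation metrics mentioned above can be computed as in the following sketch. The label vectors are invented for illustration and are not results from this study; the positive class (label 1) is assumed to denote fake news.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 score for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

y_true = [1, 0, 1, 1, 0, 0]  # hypothetical ground truth (1 = fake)
y_pred = [1, 0, 0, 1, 1, 0]  # hypothetical model predictions
metrics = classification_metrics(y_true, y_pred)
```

With these toy labels there are two true positives, one false positive, and one false negative, so accuracy, precision, recall, and F1 all equal 2/3.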
Our fake news detection model, which combines a CNN, a BiLSTM, and a HAN, offers a promising design for identifying online fake news. An essential consideration is the practical application of this model in specific contexts and its real-world performance. In our forthcoming article, we intend to assess its effectiveness using a pertinent dataset of information related to the earthquake in Morocco.
The significance of such an application lies in the sensitive nature of information related to natural disasters. False information, rumors, and misleading details can not only create confusion but also endanger the lives of those affected by such events. Our aim is to assess how effectively our model can contribute to the rapid and accurate verification of information related to this specific earthquake. By using a large dataset comprising press articles, tweets, social media posts, and other sources, we will analyze the performance of our model in a crisis context. Our focus will particularly be on its ability to distinguish accurate information from false, identify misleading information, and provide reliable information to those affected.
However, we recognize the specific challenges associated with applying our model to a recent event, given the rapidly evolving nature of information, potential contradictions among sources, and the urgent need for accurate data. To address this, we plan to integrate real-time update mechanisms into our model, enabling it to adapt to new information and contextual changes.
In summary, applying our model to the dataset related to the earthquake in Morocco represents a crucial step in our research. It will showcase the practical value of our model in real-world situations and contribute to the ongoing battle against fake news during times of crisis. The results of this study will be presented in our upcoming article, which we look forward to sharing with the scientific community and the public.

Figure 4. Proposed fake news detection model based on multiple deep learning models