Mining Nuanced Weibo Sentiment with Hierarchical Graph Modeling and Self-Supervised Learning


1. Introduction

Weibo sentiment analysis [1,2,3] has become increasingly important, especially during the COVID-19 pandemic [4]. Platforms like Weibo have evolved into vital public spaces where people share opinions [5], express emotions, and foster community interactions. The surge in online engagement during this period highlighted Weibo as an invaluable resource for monitoring public sentiment in real-time. This has been particularly critical given the rise in mental health issues, such as anxiety, depression, and stress, that have accompanied prolonged periods of social isolation, health concerns, and economic uncertainty [6]. Accurate sentiment analysis of Weibo posts is essential to understanding collective moods, identifying emerging mental health risks, and providing timely insights to stakeholders such as government agencies, healthcare providers, and mental health professionals who rely on these analyses to measure societal emotional states and plan interventions as needed.
Yet, performing effective sentiment analysis on Weibo is complex and presents a series of challenges. Unlike text on traditional platforms, Weibo posts often use highly informal language, fragmented expressions, and stylistic features unique to digital communication [7]. These include sarcasm, regional slang, dialects, and abbreviated or implicit expressions driven by character limits, which restrict the context available for precise emotional interpretation [8]. Such nuances pose difficulties for sentiment analysis, as models struggle to recognize context-dependent emotions and attitudes [9,10,11]. For example, identifying sarcasm in a post or discerning nuanced feelings of frustration or irony becomes particularly challenging. Accurately interpreting these complex and sensitive emotional cues is crucial, as it may enable the early detection of individuals at risk for mental health issues, thereby contributing positively to public health and societal well-being [12,13].
Traditional Weibo sentiment analysis methods [2] largely rely on text-based models that use natural language processing (NLP) techniques [14,15,16] for sentiment classification. Although these NLP models have achieved notable success in detecting broad emotional categories (e.g., positive, negative, or neutral sentiments) [9,17,18], they often fall short of capturing the rich emotional texture present in Weibo posts. For instance, conventional models treat posts as simple sequences of text, neglecting the nuanced cues embedded in specific words or phrases that reveal the underlying emotions of the poster. In the example post, “Will Malaysia Airlines still deny it? What exactly are they hiding?”, the term “still” hints at ongoing frustration and distrust. Standard models, like BERT, may process this post as a single text sequence, potentially diluting the emphasis on the word “still” and missing the full emotional impact. This limitation underscores the need for advanced approaches that go beyond text processing, such as graph neural networks (GNNs), which can capture nuanced relationships between words and more accurately infer the sentiment expressed by the user [19].
Another significant limitation of existing sentiment analysis models is their dependency on a single loss function, typically a classification loss, to guide sentiment inference [20,21]. While this approach can work for simpler tasks, it constrains the model’s ability to capture complex and layered emotional states, as it reduces sentiment to predefined categories that may not fully encapsulate the diversity of human emotions. Single loss functions are insufficient for representing multi-faceted emotional content, as they ignore potential latent structures within emotional expressions [19,22]. In contrast, self-supervised learning offers a pathway to learn generalized emotional representations by leveraging large amounts of unlabeled data [23]. Self-supervised learning can reveal underlying patterns and structures within data, allowing models to infer emotions that go beyond traditional categories [24,25]. Without incorporating these self-supervised objectives, existing models struggle to generalize effectively beyond labeled datasets, limiting their ability to detect subtle emotional cues that might arise from complex linguistic and contextual nuances.

To address these limitations, we propose a novel graph-based framework that integrates self-supervised learning to significantly enhance Weibo sentiment analysis. Our approach introduces a unique “sentiment graph” structure that leverages both word-to-post and post-to-post connections. Unlike traditional models that treat text as isolated sequences, this sentiment graph forms a relational network where words, phrases, and entire posts are treated as interconnected nodes. These connections enable the model to capture fine-grained emotional cues and context-dependent meanings within Weibo posts, particularly those subtle cues that are often overlooked in sequence-based processing. The sentiment graph goes beyond conventional sentiment analysis by embedding two distinct types of relationships: semantic connections between words within individual posts and contextual connections between posts based on thematic or emotional similarity. This dual-level relational structure allows the model to understand not only the immediate sentiment expressed by a post but also how similar sentiments might manifest across different posts in nuanced ways. For instance, it can discern shifts in emotional tone across a user’s posts over time, detect recurring themes of frustration or distrust, and differentiate between subtle variations in sentiment that traditional text-based approaches might miss.

Beyond simply applying a graph neural network, our approach is further distinguished by the introduction of a novel gated mechanism within the graph, a key contribution that enhances the model’s ability to capture nuanced sentiment variations. This gating function, integrated within the sentiment graph, allows the model to dynamically filter sentiment signals from neighboring nodes based on their intensity and relevance. By selectively controlling the flow of information, this gated mechanism enhances sensitivity to subtle emotional cues, such as sarcasm or shifts in sentiment intensity, which are common in social media discourse. This enables the model to better differentiate and represent the complex emotional landscape within Weibo posts.

Further distinguishing our approach is the integration of self-supervised learning within the graph framework. We employ a novel self-supervised loss function that operates on the structure of the sentiment graph itself, enabling the model to learn nuanced representations of emotional relationships without relying solely on labeled data. This self-supervised objective is designed to capture latent emotional structures within the graph, such as implicit hierarchies or clusters of sentiment expressions, which can enhance the model’s ability to generalize beyond predefined sentiment categories. By dynamically adjusting to patterns uncovered within unlabeled data, our model can detect complex, layered emotional states that emerge from both the linguistic and relational context in Weibo posts. By integrating the sentiment graph with self-supervised learning, we improve Weibo sentiment classification by a large margin.

In summary, this paper makes four key contributions to advancing Weibo sentiment analysis. First, we introduce a novel sentiment graph framework that leverages the relational connections between words, phrases, and posts, enabling richer contextual understanding than sequence-based models. Second, our dual-level relational structure captures both semantic and contextual relationships, allowing the model to interpret nuanced sentiment patterns across posts. Third, we propose a gated mechanism within the graph framework, enabling the model to selectively adjust information flow based on sentiment intensity—a critical improvement for capturing the nuanced sentiment shifts in social media. Fourth, by incorporating a self-supervised learning objective tailored to the sentiment graph, our model learns complex emotional representations without heavy reliance on labeled data, making it adaptable to diverse, unlabeled social media contexts. Together, these innovations represent a significant step forward in accurately capturing the dynamic, multi-layered emotional landscape of social media platforms like Weibo.

4. Methodology

As shown in Figure 1, this section describes the steps in our proposed approach for Weibo sentiment classification, including the construction of the sentiment graph, feature transformation using FastText embeddings [44], and model training with a novel self-supervised loss. FastText is a word-representation method that maps words to dense embedding vectors, which we use to vectorize Weibo posts.

4.1. Construction of Sentiment Graph

To capture the nuanced sentiments in Weibo posts, we construct a sentiment graph G = (V, E), where V denotes the set of nodes and E denotes the set of edges. The nodes in V are of two types, “word nodes” and “post nodes”, connected through two types of edges: “word-to-post” and “post-to-post”.

For each post p ∈ V, we link a word node w ∈ V to p if w appears in p. Formally, we define an edge (w, p) ∈ E if w ∈ p. This inclusion-based link enables the model to associate specific words with posts, capturing the lexical structure of posts and the contribution of individual words to post sentiment.

To capture the semantic similarity between posts, we define edges between post nodes based on content similarity. Given the FastText embeddings e_{p_i} and e_{p_j} of two posts p_i and p_j, we compute the cosine similarity:

sim(p_i, p_j) = (e_{p_i} · e_{p_j}) / (‖e_{p_i}‖ ‖e_{p_j}‖).

If sim(p_i, p_j) > 0.6, we establish an edge (p_i, p_j) ∈ E. This thresholded connection allows the model to capture relational context across similar posts, reflecting trends and contextual sentiment relationships.

The resulting graph G captures both lexical content through word-to-post edges and contextual similarity through post-to-post edges, forming a hybrid structure that models both individual post composition and broader relational patterns.
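The construction above can be sketched in a few lines of plain Python. This is a minimal illustration under our own naming (`build_sentiment_graph` and the toy vectors are not from the paper); in practice the post vectors would come from FastText embeddings as described in Section 4.2.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors, guarding against zero norms.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_sentiment_graph(posts, post_embs, threshold=0.6):
    """posts: list of token lists; post_embs: one embedding vector per post.
    Returns word nodes, word-to-post edges, and thresholded post-to-post edges."""
    word_nodes = sorted({w for toks in posts for w in toks})
    # Word-to-post edge (w, p) whenever word w appears in post p.
    word_to_post = [(w, i) for i, toks in enumerate(posts) for w in set(toks)]
    # Post-to-post edge when cosine similarity exceeds the 0.6 threshold.
    post_to_post = [
        (i, j)
        for i in range(len(posts))
        for j in range(i + 1, len(posts))
        if cosine(post_embs[i], post_embs[j]) > threshold
    ]
    return word_nodes, word_to_post, post_to_post
```

The threshold of 0.6 follows the paper; edges are stored once per unordered pair since similarity is symmetric.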

4.2. Feature Transformation Using FastText Embeddings

To leverage the sentiment graph G, we transform each node in V into an embedding that captures sentiment information via a graph neural network (GNN).

We initialize word nodes and post nodes with FastText embeddings. Let x_w ∈ ℝ^d represent the initial embedding of a word node w, and x_p ∈ ℝ^d represent the initial embedding of a post node p. FastText’s subword-based embeddings are particularly effective for informal language and slang, which are common in Weibo posts, by capturing morphological nuances within words and enhancing the model’s understanding of variations in sentiment expression.
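As a concrete sketch, post-node features can be initialized by mean-pooling the FastText vectors of their constituent words. The pooling choice is our assumption (the paper specifies only that nodes are initialized from FastText), and `word_vecs` stands in for a pretrained embedding table.

```python
def init_node_embeddings(posts, word_vecs, dim):
    """posts: list of token lists; word_vecs: word -> FastText vector.
    Word nodes keep their word vectors; each post node is the mean of
    its words' vectors (an assumed pooling, zero vector if no word is known)."""
    word_emb = dict(word_vecs)
    post_emb = []
    for toks in posts:
        vecs = [word_vecs[w] for w in toks if w in word_vecs]
        if vecs:
            # Component-wise mean over the post's word vectors.
            post_emb.append([sum(c) / len(vecs) for c in zip(*vecs)])
        else:
            post_emb.append([0.0] * dim)
    return word_emb, post_emb
```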

Using a graph neural network (GNN) with a novel gated feature transformation mechanism, we refine the initial embeddings to learn nuanced, context-sensitive representations of Weibo sentiment. For each node v ∈ V, the GNN updates its embedding by aggregating information from neighboring nodes while using a gating mechanism, which we introduce as a key contribution, to selectively control the flow of sentiment information. Let h_v^{(k)} denote the embedding of node v at the k-th layer. The update rule at each layer k is defined as

h_v^{(k+1)} = σ( Σ_{u ∈ N(v)} g_{vu}^{(k)} α_{vu} W^{(k)} h_u^{(k)} ),

where we have the following:

  • N(v) denotes the neighbors of v;

  • α_{vu} represents attention weights that capture the importance of neighboring nodes;

  • W^{(k)} is a learnable weight matrix;

  • g_{vu}^{(k)} = sigmoid( w_g^{(k)} · h_u^{(k)} ) is our proposed learnable gating function that dynamically controls the influence of each neighboring node based on its sentiment features;

  • σ(·) is a non-linear activation function.

The gate g_{vu}^{(k)} acts as a sentiment-sensitive filter, strengthening or weakening the information flow according to the sentiment intensity or nuance present in each node’s content. This iterative process produces a final embedding h_p for each post node p, capturing structural and word-level nuances as well as the sentiment-specific context of interactions. The gating mechanism enables the GNN to handle the rich sentiment variance in Weibo data more effectively, leading to more accurate sentiment representation and analysis.
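One layer of this gated update can be sketched in plain Python. ReLU as σ, the toy shapes, and dense loops (rather than sparse message passing) are our assumptions for readability; the gate and attention terms follow the update rule above.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, h):
    # W is a list of rows; returns W @ h.
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gated_layer(H, neighbors, W, w_g, alpha):
    """One gated update: h_v' = ReLU( sum_{u in N(v)} g_vu * alpha_vu * W h_u ),
    with the scalar gate g_vu = sigmoid(w_g . h_u)."""
    H_new = []
    for v in range(len(H)):
        agg = [0.0] * len(W)
        for u in neighbors.get(v, []):
            g = sigmoid(sum(a * b for a, b in zip(w_g, H[u])))  # gate in (0, 1)
            Wh = matvec(W, H[u])
            agg = [a + g * alpha[(v, u)] * x for a, x in zip(agg, Wh)]
        H_new.append([max(x, 0.0) for x in agg])                # ReLU activation
    return H_new
```

Stacking several such layers and reading off the post-node rows yields the final embeddings h_p used by the classifier.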

The final embedding h_p of each post node p serves as input to the sentiment classifier. The classifier maps h_p to a sentiment label y ∈ {positive, negative, neutral}, allowing us to predict sentiment based on the aggregated emotional context of each post.
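A linear-plus-softmax head is one standard choice for this mapping; the paper does not pin down the classifier form, so the sketch below is an assumption (weights `W`, bias `b`, and the label order are illustrative).

```python
import math

def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def classify(h_p, W, b, labels=("positive", "negative", "neutral")):
    """Map the final post embedding h_p to a sentiment label via a
    linear layer followed by softmax (an assumed head, not the paper's exact one)."""
    logits = [sum(w * h for w, h in zip(row, h_p)) + bi for row, bi in zip(W, b)]
    probs = softmax(logits)
    return labels[probs.index(max(probs))], probs
```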

4.3. Model Training with Additional Self-Supervised Loss

To enhance the model’s generalization ability, we introduce a novel self-supervised loss that reconstructs the affinity structure of the sentiment graph G, complementing the primary classification loss.

Our self-supervised objective encourages the model to predict the affinity between nodes based on their embeddings, thereby reinforcing the learned graph structure. For each pair of connected nodes (u, v) ∈ E, we define a reconstruction loss:

L_self-supervised = Σ_{(u,v) ∈ E} ( sim(h_u, h_v) − A_{uv} )²,

where sim(h_u, h_v) is the cosine similarity between embeddings h_u and h_v, and A_{uv} is a binary indicator for the existence of an edge between u and v. This loss encourages nodes with an edge to have similar embeddings, while non-connected nodes remain less similar.

The final objective combines the classification loss L class with the self-supervised loss L self-supervised :

L = L_class + λ · L_self-supervised,

where λ is a hyperparameter that balances the two objectives. This combined loss encourages the model to learn both the sentiment classification task and the inherent relational structure within the sentiment graph, ultimately improving its ability to capture nuanced emotions in Weibo posts.
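The two objectives combine as a weighted sum. A minimal sketch, assuming node pairs and the λ value are illustrative (in training, negative pairs with A = 0 would typically be sampled alongside the graph's edges):

```python
import math

def cos_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def self_supervised_loss(H, pairs, A):
    """Squared gap between embedding similarity and the adjacency
    indicator A[(u, v)], summed over the given node pairs."""
    return sum((cos_sim(H[u], H[v]) - A[(u, v)]) ** 2 for (u, v) in pairs)

def total_loss(l_class, l_self, lam):
    # L = L_class + lambda * L_self-supervised
    return l_class + lam * l_self
```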

This methodology enables our model to effectively learn from the unique relational patterns in Weibo data, providing a robust approach to sentiment classification in social media contexts.

5. Experiment

5.1. Datasets

The data for the experiment were sourced from three primary datasets. We show the statistics in Table 1 and Figure 2 (for Dataset 1).

Dataset 1, sourced from the “SMP2020 Weibo Emotion Classification Technology Evaluation”, contains COVID-19-related Weibo posts categorized into six emotional labels: “neutral”, “happy”, “angry”, “sad”, “fear”, and “surprise”. For this study, the dataset was simplified to two sentiment categories: “Positive” (comprising “happy” and “surprise” emotions) with 4620 posts, and “Negative” (including “angry”, “sad”, and “fear” emotions) with 2526 posts. The “neutral” posts were removed to refine the focus on binary sentiment classification and improve sentiment analysis reliability. In total, the processed Dataset 1 includes 8606 training samples, 2000 validation samples, and 3000 test samples.

Dataset 2 is derived from the “weibo_senti_100k” dataset, featuring 119,984 labeled Sina Weibo comments—59,993 positive and 59,991 negative. This dataset provides straightforward sentiment labels, supporting precise sentiment classification tasks and allowing us to evaluate the model’s performance and generalization on Weibo comment data.

5.2. Example of Data

We show several examples in Table 2:

In the following Weibo review analysis table, we explore a diverse set of posts categorized by sentiment (labeled 0 for negative/neutral and 1 for positive). This sample reflects the various emotional tones and expressions often found on social media platforms.

  • Sentiment Label Distribution: The table predominantly features posts labeled “0”, representing neutral or negative sentiments, with a smaller portion labeled “1” for positive sentiments. This distribution illustrates the range of emotions present in social media, from frustration and disappointment to gratitude and joy.

  • Role of Emoticons and Social Media Language: Emoticons like [Tears], [Dizzy], [Haha], and [Playful] are integral to sentiment interpretation, as these symbols often convey emotions more directly than words alone. This underscores the unique challenge of sentiment analysis on platforms like Weibo, where textual and visual elements combine to express sentiment.

  • Contextual Complexity of Posts: Some posts, such as ID 62050, mention specific events or contexts (e.g., “More negative news about CMB”), making sentiment difficult to assess without background information. Additionally, comments on shared content (e.g., ID 81472) add complexity, as accurate sentiment analysis requires understanding both the primary text and the referenced content.

  • Direct vs. Indirect Sentiment Expression: Positive posts labeled “1” often express clear sentiments, such as gratitude or well wishes (e.g., IDs 7777 and 6598). In contrast, posts labeled “0” show negative sentiments, including frustration and confusion, as seen in IDs 100399 and 82398, where users express dissatisfaction with experiences like getting lost or facing unclear airline policies.

This analysis highlights the nuanced challenges in Weibo sentiment analysis, where accurate assessment requires interpreting not only direct emotional cues but also contextual references and emoticons.

5.3. Environment

The system configuration and tool settings used in the experiment are shown in Table 3. The experiments were conducted on an Ubuntu 18.04 operating system, using PyCharm as the development environment and Python 3.11 as the programming language. A laptop equipped with an RTX 3090 Super GPU was utilized to accelerate model training and inference, benefiting from enhanced GPU computing power to optimize processing speed.

5.4. Data Preprocessing

This section illustrates the flow of the primary preprocessing steps used in our experiments.

1. Text Cleaning: To remove noise from the text, we applied regular expressions to filter out HTML tags, irrelevant URLs, emoticons, and extraneous letters and numbers, retaining only the essential text content.

2. Word Segmentation: We employed the Jieba word segmentation library, a versatile tool for Chinese word segmentation. Jieba offers several segmentation modes: precise, full, and search engine modes. It also supports custom dictionaries, enabling tailored segmentation for domain-specific vocabularies.

3. Stopword Removal: Stopwords are commonly removed in text preprocessing, as they contribute little meaningful information. We used the Harbin Institute of Technology stopword list as our primary source and performed secondary filtering with the Baidu stopword list to enhance text quality.

4. Numerical Conversion: We used a Tokenizer to transform the cleaned text data into numerical format, making it compatible with model processing.
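The four preprocessing steps can be sketched end to end. To keep the sketch self-contained, the segmenter is passed as a callable (`jieba.lcut` in the paper's setup) and the two-character stopword set stands in for the HIT and Baidu stopword lists; the regex patterns are our own approximations of the cleaning rules.

```python
import re

# Stand-in for the HIT/Baidu stopword lists used in the paper.
STOPWORDS = {"的", "了", "是"}

def clean(text):
    text = re.sub(r"<[^>]+>", "", text)       # HTML tags
    text = re.sub(r"https?://\S+", "", text)  # URLs
    text = re.sub(r"\[[^\]]*\]", "", text)    # bracketed Weibo emoticons
    text = re.sub(r"[A-Za-z0-9]+", "", text)  # extraneous letters/numbers
    return text.strip()

def preprocess(texts, segment):
    """segment: a tokenizer callable, e.g. jieba.lcut.
    Cleans, segments, drops stopwords, and maps tokens to integer ids."""
    vocab = {}
    ids = []
    for t in texts:
        toks = [w for w in segment(clean(t)) if w not in STOPWORDS]
        ids.append([vocab.setdefault(w, len(vocab) + 1) for w in toks])
    return ids, vocab
```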

5.5. Experimental Parameters

For model training, the Embedding layer was used to encode the input integer sequences into dense vector representations. The graph neural network architecture included 128 hidden units with a dropout rate of 0.5. We set the batch size to 16 and the learning rate to 0.0001.

5.6. Baseline Models

We evaluate our model’s performance against several established baselines in text classification and sentiment analysis:

  • Word2Vec: A neural network model that learns word embeddings by predicting context words (skip-gram) or target words (CBOW), effectively capturing semantic relationships between words.

  • FastText: An extension of Word2Vec that represents each word as a collection of character n-grams, enabling the model to capture subword information and improve performance with rare or misspelled words through enhanced morphological understanding.

  • K-Nearest Neighbors (KNN): A non-parametric, distance-based classification method that assigns labels based on the majority label among the k nearest neighbors, often utilizing word embeddings or document vectors to measure proximity in text data.

  • Convolutional Neural Networks (CNNs): Apply 1D convolutional layers on word embeddings to extract local n-gram features, which are aggregated via pooling layers to produce fixed-size vectors for classification tasks.

  • Long Short-Term Memory (LSTM): A recurrent neural network (RNN) variant designed for sequential data, employing gates to regulate information flow and effectively capturing long-term dependencies in text, which is particularly useful for sentiment analysis.

  • CNN-BiLSTM: Combines CNNs to extract local patterns with a bidirectional LSTM (BiLSTM) to capture contextual information from both past and future tokens, leveraging the strengths of both architectures for improved performance.

  • Gated Recurrent Unit (GRU): A streamlined alternative to LSTM with fewer gates and no separate cell state, providing efficient training and effective modeling of sequential dependencies, especially for shorter text sequences.

  • Dual-Channel Graph [40]: Utilizes attention mechanisms to capture syntactic structures and multi-aspect sentiment dependencies, improving the model’s interpretation of complex, multi-faceted sentences.
  • Knowledge-Enhanced Graph [39]: Incorporates external sentiment vocabularies to enrich aspect-based sentiment analysis by enhancing the model’s understanding of sentiment-heavy words and phrases, representing the current state of the art.
  • Gaussian Similarity Modeling (GSM) [41]: Employs Gaussian similarity metrics to enhance GNN performance by improving the representation of node relationships.
  • Cross-Channel Graph Information Bottleneck (CCGIB) [42]: Leverages cross-channel graph information to improve GNN performance by effectively balancing node information and graph sparsity.

5.7. Classification Results

The results presented in Table 4 highlight the performance of various sentiment classification algorithms across two distinct datasets. Each model’s efficacy is measured through accuracy, precision, and F1 scores, which together provide a well-rounded understanding of each model’s classification ability. The models range from simpler approaches, like K-Nearest Neighbors (KNN), to more complex neural network-based models, including CNN, LSTM, GRU, and several graph-enhanced architectures. The performance metrics indicate a clear trend: more sophisticated architectures generally achieve higher accuracy and F1 scores, particularly on Dataset 2, which may suggest that this dataset has characteristics that benefit from advanced, high-capacity models.

Among the traditional word embedding models, FastText outperforms Word2Vec by a notable margin, achieving a 4% higher accuracy on Dataset 1 and a 3% increase on Dataset 2. This improvement in performance highlights the benefits of subword-level information in FastText, which likely aids in capturing finer nuances in sentiment classification. KNN, as expected, yields the lowest scores among all algorithms, underscoring its limitations in handling complex sentiment data compared to deep learning models. Notably, the CNN and LSTM models both demonstrate strong performance, with accuracy and F1 scores surpassing 90% on both datasets, showing the advantages of capturing spatial and sequential information in text data, respectively.

The dual-architecture models, such as CNN-BiLSTM and GRU, demonstrate incremental improvements over their single-architecture counterparts. These models leverage the strengths of both convolutional and recurrent neural networks, leading to better contextual understanding and feature extraction. For example, the CNN-BiLSTM model achieves 93.08% accuracy on Dataset 1 and 98.16% on Dataset 2, demonstrating its ability to capture intricate sentiment features effectively across diverse datasets.

The most advanced models, including the Dual-Channel Graph and Knowledge-Enhanced Graph architectures, exhibit the highest accuracy and F1 scores. The Dual-Channel Graph model reaches 96.20% accuracy on Dataset 1 and 98.85% on Dataset 2, while the Knowledge-Enhanced Graph model slightly surpasses it, achieving 96.75% and 99.00% accuracy on Datasets 1 and 2, respectively. These models capitalize on graph-based techniques that enhance text representations by capturing complex relationships and contextual dependencies. The Knowledge-Enhanced Graph model, with added semantic information, demonstrates a superior ability to generalize across both datasets, which is particularly advantageous for sentiment classification tasks with nuanced expressions.

Our model, which combines both dual-channel and Knowledge-Enhanced Graph techniques, sets a new performance benchmark with 97.51% accuracy on Dataset 1 and an impressive 99.56% on Dataset 2. This model achieves the highest precision and F1 scores as well, indicating a robust capacity to handle both positive and negative sentiments accurately. Its enhanced architecture likely allows for the extraction of more comprehensive sentiment features, leading to superior generalization. The consistently high performance across both datasets suggests that this model could be well suited for real-world applications in sentiment analysis, where data diversity and complexity often challenge simpler models.

In summary, these results affirm that complex, graph-based models provide the best performance for sentiment classification, with each incremental architectural enhancement translating to measurable improvements in classification metrics. The higher performance of “Our Model” indicates the effectiveness of combining dual-channel and knowledge-enhanced techniques, especially for nuanced sentiment classification tasks, and underscores the potential of these advanced architectures in real-world applications.

5.8. Ablation

The ablation study in Table 5 highlights the impact of removing specific components from our model architecture on performance across two datasets. For Dataset 1, removing the Post-to-Post Link results in a slight drop in F1 score to 95.5%, compared to 96.12% with the full model, indicating a marginal reduction in the model’s overall predictive power. Similarly, removing the Word-to-Post Link also lowers the F1 score slightly to 95.6%. The exclusion of self-supervised loss has a comparable effect, with an F1 score of 95.8%. However, our complete model achieves the highest accuracy (97.51%) and F1 score (96.12%) for Dataset 1, underscoring the contributions of all components to model performance. The results also indicate the importance of using the gating mechanism to improve sentiment classification.

For Dataset 2, the pattern is consistent, with the complete model outperforming ablated versions. The model without the Post-to-Post Link shows an F1 score of 98.3%, which is slightly below the full model’s F1 score of 98.82%. Removing the Word-to-Post Link or self-supervised loss also leads to similar minor drops in the F1 score, suggesting that each component contributes incrementally to model effectiveness. Finally, removing the gating mechanism also causes performance drop. The full model achieves the highest performance across all metrics, emphasizing the importance of each component in enhancing accuracy, precision, and F1 scores.

5.9. Sensitivity Check

Recall that we use a weight λ to balance the two losses:

L = L_class + λ · L_self-supervised

In the sensitivity analysis shown in Figure 3, we explore how variations in the weight λ impact the performance of our model relative to a fixed reference, the Knowledge-Enhanced Graph (static). The static graph serves purely as a point of comparison and does not respond to changes in λ; it provides a baseline against which to evaluate the performance fluctuations of our model as λ is adjusted.

The chart shows two lines: the flat blue line representing the Knowledge-Enhanced Graph (static), and the dynamic orange dashed line representing our model. As λ is varied along the x-axis, our model exhibits significant changes in its output, indicating sensitivity to this parameter. The performance of our model initially rises, peaking at a certain λ value before gradually declining. This pattern suggests that our model is optimized for a specific range of λ values, where it performs most effectively, and that its performance diminishes when λ moves beyond this optimal range.

Despite the sensitivity of our model to the λ parameter, it consistently outperforms the Knowledge-Enhanced Graph (static) across all tested λ values. This finding is significant because it highlights our model’s ability to adapt to parameter variations while maintaining an edge over the static baseline. The reference line, though stable, is effectively outpaced by our model at every point, demonstrating the latter’s flexibility and superior performance.

5.10. Training and Validation Losses

Figure 4 demonstrates the consistent and smooth training of our method as indicated by the gradual decline in training loss over time. Unlike many models that tend to overfit quickly, our method shows a minimal gap between the training and validation losses, even after extended epochs. The validation loss remains stable and does not exhibit the sharp upward spikes typical of overfitting, reflecting the robustness of our approach. This suggests that our model generalizes well across both training and validation data, making it less prone to overfitting and capable of maintaining performance over time.

6. Visualization of Attention Maps

The visualization of attention score maps (Figure 5) reveals key insights into the model’s mechanisms for interpreting sentiment across different types of review content. By examining the distribution of attention scores in each sample, we can assess the model’s effectiveness in identifying sentiment-bearing elements within text.

Sample 1: Negative Sentiment In the first sample, “Too much! @Rexzhenghao: More negative news about CMB lately…”, high attention scores are assigned to sentiment-rich phrases such as “Too much!” and “negative news”. This distribution indicates the model’s focus on words that express strong sentiment, as these words likely inform the negative classification of the review. By emphasizing these emotionally charged terms, the model highlights its ability to prioritize critical phrases that contribute to an overall negative tone.

Sample 2: Mixed Sentiment The second sample, “A little tempted to join???? [Sneak smile] Still deciding on the time [Frustrated],” presents a more complex sentiment structure. Attention scores are distributed across phrases such as “tempted to join” and “still deciding”, reflecting a balance between positive curiosity and hesitation. The model appears to account for emoticons like “[Sneak smile]” and “[Frustrated]” in its score allocation, suggesting an understanding of these symbols as mood indicators. This nuanced spread of attention underscores the model’s capacity to capture ambivalence, an essential feature in sentiment analysis when dealing with mixed signals.

Sample 3: Positive Sentiment In the final sample, “[Great] Thanks to everyone supporting Juanwa’s sesame! [Love you]” attention scores are concentrated on overtly positive expressions like “Great”, “Thanks”, and “Love you”. These words carry strong positive connotations, which the model prioritizes in its interpretation. By focusing on these appreciative and affectionate terms, the attention mechanism successfully identifies signals of positive sentiment, thereby enhancing the accuracy of its sentiment prediction.

Overall, the attention score maps show a consistent pattern in which the model prioritizes words that convey emotional tone, especially those that are sentiment-laden or directly indicative of the review’s mood. This pattern suggests an effective alignment between attention distribution and the sentiment-bearing elements of the text. Such insights can be instrumental in refining the model’s attention mechanisms, ensuring a greater focus on sentiment-relevant words and improving sentiment analysis accuracy. These observations further highlight the potential of attention-based interpretability for validating and adjusting model behavior in sentiment prediction tasks.
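The prioritization described above can be made concrete with a minimal sketch: softmax-normalizing per-token attention logits and ranking tokens by weight. The logit values below are invented for illustration (loosely following Sample 3) and are not taken from the paper’s model:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of attention logits."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw attention logits for the tokens of Sample 3
tokens = ["[Great]", "Thanks", "to", "everyone",
          "supporting", "Juanwa's", "sesame", "[Love you]"]
logits = [2.1, 1.8, 0.2, 0.4, 0.9, 0.3, 0.3, 2.4]

weights = softmax(logits)
# Rank tokens by attention weight; sentiment-laden tokens dominate
top = sorted(zip(tokens, weights), key=lambda p: -p[1])[:3]
for tok, w in top:
    print(f"{tok}: {w:.2f}")
```

Under these assumed logits the top-weighted tokens are the emotionally charged ones ("[Love you]", "[Great]", "Thanks"), mirroring the qualitative pattern seen in the score maps.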

6.1. Time and Resource Analysis

The training process demonstrates remarkable efficiency, as evidenced by the consistent growth of the F1 score over time in Figure 6. Within just three hours, the model rises from an initial F1 score of 0.70 to a final value of 0.95. This rapid convergence highlights the effectiveness of the learning algorithm and its ability to optimize performance over a relatively short period. Such efficiency is crucial for iterative development, enabling researchers to test and deploy updates quickly without extensive computational delays. Additionally, the steady increase in performance metrics indicates robust generalization and the absence of significant overfitting during training.

One of the most notable strengths of the model is its resource-efficient architecture. The training process uses only 16 GB of memory, underlining its suitability for environments with limited computational resources. This efficiency is further amplified by the adoption of a mini-batch training strategy, which allows the model to handle large-scale datasets effectively without requiring excessive hardware. By dividing the data into manageable mini-batches, the model ensures that memory usage remains low while maintaining high performance. This design choice not only reduces the cost of infrastructure but also ensures scalability, making the model adaptable for a wide range of applications, from small-scale research projects to large-scale industrial tasks.
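The memory argument rests on a simple property of mini-batch training: only one slice of the dataset is resident at a time. A minimal sketch of such a batching loop (illustrative only, with a made-up batch size, not the paper’s training code):

```python
def minibatches(dataset, batch_size):
    """Yield successive slices so only one batch is in memory at a time."""
    for start in range(0, len(dataset), batch_size):
        yield dataset[start:start + batch_size]

# Hypothetical corpus of 10 posts, processed 4 at a time
posts = [f"post_{i}" for i in range(10)]
batches = list(minibatches(posts, 4))
print([len(b) for b in batches])  # batch sizes: [4, 4, 2]
```

Because the generator yields slices lazily, peak memory scales with the batch size rather than the corpus size, which is the property that keeps the reported footprint at 16 GB regardless of dataset scale.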

6.2. Applying to Other Social Media Platforms and Languages

The results of our sentiment analysis on Twitter data [45], in Table 6, demonstrate the robustness of our proposed model compared to existing approaches. Our model achieves the highest performance across all metrics, with an accuracy of 76.21%, precision of 74.32%, and an F1 score of 75.32%. These scores outperform established models such as the GRU model, Dual-Channel Graph, and Knowledge-Enhanced Graph, indicating the effectiveness of our enhancements in capturing nuanced sentiment patterns. Notably, the Knowledge-Enhanced Graph model and CCGIB, which also leverage external information and graph structures, performed similarly to our model but fell short, particularly in the F1 score, suggesting that our integration of domain-specific knowledge and refined feature representation offers a significant edge in classification tasks.

The promising performance of our model on Twitter sentiment analysis suggests its potential applicability to other social media platforms, such as Facebook, Instagram, and YouTube. These platforms often feature diverse linguistic styles, including short posts, comments, and hashtags, where the adaptability of our approach in capturing contextual and semantic nuances can prove valuable. Furthermore, extending our model to support multilingual sentiment analysis could enhance its utility for global applications, particularly in addressing sentiment dynamics in languages with limited annotated datasets. By leveraging transfer learning or cross-lingual embedding techniques, our model can be fine-tuned to analyze sentiment across languages, enabling insights into cultural and regional sentiment trends and fostering broader applications in marketing, policy-making, and social impact analysis.

7. Conclusions and Future Work

In this paper, we have presented a novel approach to Weibo sentiment analysis, addressing the unique linguistic and emotional complexities of social media discourse through a graph-based framework that integrates self-supervised learning. By leveraging a sentiment graph with relational structures and an innovative gated mechanism, our model captures the nuanced emotional cues that traditional sequence-based models often miss. Our approach enhances the ability to interpret multi-layered emotional expressions, making it particularly relevant for the real-time monitoring of public sentiment, especially during crises such as the COVID-19 pandemic. Through this framework, we demonstrate significant improvements in accurately identifying and interpreting the subtle emotional shifts and intense sentiment fluctuations that characterize Weibo posts. These advancements underscore the potential of our model to support applications in mental health, policy-making, and societal well-being by offering more reliable insights into collective moods and emerging emotional trends.

Future Work: Future research could expand upon our sentiment graph framework by incorporating multimodal data sources, such as images and videos, which are prevalent in Weibo posts and can enhance emotional interpretation. Additionally, the application of cross-lingual transfer learning to this framework may allow it to be adapted to other social media platforms with different languages and cultural nuances. Another promising direction is the refinement of the self-supervised loss function to further capture temporal dynamics, enabling the model to track sentiment changes within user posts over time. Finally, exploring privacy-preserving mechanisms to analyze social media sentiment while safeguarding user data could broaden the adoption of this technology in sensitive contexts like mental health and crisis response, making it a valuable tool for both researchers and practitioners.



Source: Chuyang Wang, www.mdpi.com