NHSJS Reports

Impact of Public Sentiment on the S&P500: A Literature Review

May 1, 2025

Abstract

This literature review analyzes the effect of public sentiment, as measured by sentiment analysis, on the S&P500 index. It outlines the transition from simple lexicon-based Natural Language Processing methods to modern Machine Learning and Artificial Intelligence techniques. The study examines the correlation between sentiment expressed in social media, news articles, and company filings and the movement of the S&P500. The review covers 30 papers on sentiment analysis and financial markets. The aim is to survey the techniques currently being developed in the field and to assess how effective their results are for financial trend forecasting. The review includes studies that use lexicon-based methods as well as state-of-the-art deep learning techniques, including LSTM, CNN, and BERT. For each study, the main results, the methods used, and the implications were extracted. Findings indicate that LLMs consistently outperform traditional methods by more than 5% in accuracy, with statistical significance established across diverse datasets and tasks. While acknowledging that LLMs have greater complexity and a larger number of parameters, this study ensures fair comparisons by analyzing performance relative to the capabilities of baseline models. The results highlight the value of integrating sentiment analysis into conventional models for financial return forecasting.

Introduction

Background

Public opinion analysis is now widely regarded as a useful way to track market trends. With the growing use of social media and online discussion, these platforms have become transparent records of public opinion and expectation. This literature review examines the effects of public sentiment on financial markets, focusing specifically on the S&P500. It analyzes one of the most popular indicators of the US stock market through studies that apply natural language processing (NLP) and sentiment analysis to social media posts, articles, and financial reports.

Approaches to sentiment analysis can be divided by technique into basic lexicon-based methods and advanced AI-based methods. Earlier approaches used simple look-up tables of word scores; they could not interpret the overall context and failed on language features such as sarcasm and irony. Newer approaches use machine learning and deep learning algorithms whose performance improves with training data. These models are better at comprehending context, identifying changes in sentiment, and producing more accurate results.

Modern approaches can enrich sentiment analysis with principles from behavioral finance. Understanding how investor beliefs are formed, and how behavioral patterns such as overconfidence, herd behavior, and loss aversion arise, has predictive value. For instance, if the market reacts differently to positive social news than to negative social news, researchers can take a deeper look at the underlying market dynamics.

Moreover, understanding investor sentiment and its effect on market volatility allows a finer-grained view of market trends. It sheds light on the correlation between sentiment indicators and market movements and thus supports more reliable forecasting models. Embedding behavioral finance can improve knowledge of the behavioral patterns and emotions that shape investors’ decisions, and so clarify the correlation between overall public attitude and market trends. Extending the analysis to other behaviors, such as overconfidence and herding, could therefore enhance financial forecasting models¹. This is further discussed in section 3.3. Behavioral finance extends traditional economic theories by incorporating psychological variables that shape the behavior of investors and of the market. Daniel Kahneman and Amos Tversky pioneered work on how cognitive biases distort financial behavior and market outcomes. Loss aversion is a psychological concept in which people are more motivated to avoid losses than to achieve equivalent gains. This tendency can lead to irrational decision-making in volatile markets. Knowledge of these behavioral patterns is necessary because they can affect not only individual investors but also the market as a whole. Incorporating behavioral finance perspectives into the analysis can therefore shed light on the correlation between investor psychology and market trends and on the financial decisions investors make.
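Loss aversion is often formalized with the prospect-theory value function associated with Kahneman and Tversky. The form below is the standard textbook statement, not something derived in this review, and the symbols x, α, β, and λ are introduced here purely for illustration.

```latex
v(x) =
\begin{cases}
x^{\alpha}, & x \ge 0 \\
-\lambda\,(-x)^{\beta}, & x < 0
\end{cases}
\qquad 0 < \alpha,\ \beta \le 1, \quad \lambda > 1
```

Here x is a gain or loss measured relative to a reference point. When α = β, the function gives v(-x) = -λ v(x) for every x > 0, so with λ > 1 a loss weighs more heavily than a gain of the same size. Tversky and Kahneman's 1992 estimates were roughly α = β = 0.88 and λ = 2.25.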

Significance of the Study

This research has both theoretical and practical significance. From a theoretical perspective, the study aims to contribute new insights into the relationship between public sentiment and financial market behavior. If correlations can be established between sentiment trends and the S&P500, they would provide empirical evidence relevant to the Efficient Market Hypothesis, which posits that stock prices reflect all available information. Practically, identifying sentiment trends has important applications for investors, financial analysts, and institutions. By analyzing sentiment correlations, practitioners may be able to use sentiment analysis as a complementary tool alongside traditional methods for predicting market movements and making investment decisions. Furthermore, the proposed relationship between sentiment and stock market movements may be used to support the herding model proposed by Alan Kirman in 1993. The proposed approach can also benefit public relations and marketing professionals looking to gauge consumer perceptions.

Scope and Limitations

This study covers sentiment analysis applied to publicly available, English-language social media posts and news articles that express sentiment towards the S&P500. The analysis relies mainly on publicly shared opinions, which gives a broad view of where public sentiment is heading. Nevertheless, some limitations need to be noted. First, using open data excludes private opinions, which can noticeably affect market sentiment. Second, the findings may not reflect the entire range of investor behavior, because underlying investor psychology and sentiment that is not publicly expressed are not included. Acknowledging these limits allows the study to place its findings more accurately within the wider landscape of financial analysis and to link them to broader financial theories.

Additionally, the findings of this study should be interpreted within the framework of established financial theories, most notably the Efficient Market Hypothesis (EMH) and rational finance. According to the EMH, all available information is reflected in asset prices, so sentiment should have little effect on stock prices in an efficient market. Behavioral finance, however, argues that psychological factors hinder market efficiency. Linking the observed sentiment trends to these more general theories both strengthens the study’s academic rigor and provides a more nuanced understanding of the mechanisms that influence financial markets.

Results

Overview of Sentiment Analysis Methods used to Gauge Public Sentiment

Samuel et al. (2020) sought to understand public opinion on COVID-19 by analyzing tweets². Their study was based on more than 2 million tweets posted from February to April 2020. Tokenization, stop-word removal, and stemming were applied as text preprocessing steps. The authors evaluated six machine learning classifiers: Naive Bayes, Logistic Regression, Decision Tree, SVM, KNN, and Neural Network. In classifying tweets related to COVID-19, the Neural Network achieved the best result, with an accuracy of 88%. Although the study offers useful information, the methodological limitations of its approach need to be critically examined.
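The preprocessing steps named above (tokenization, stop-word removal, stemming) can be sketched in a few lines of plain Python. This is a minimal illustration, not the pipeline used in the study: the stop-word list is a tiny hypothetical sample, and `crude_stem` is a stand-in for a real stemmer such as Porter's.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "on", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token):
    """Strip a few common suffixes, keeping at least three characters."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run tokenization, stop-word removal, and stemming in sequence."""
    return [crude_stem(t) for t in remove_stop_words(tokenize(text))]


print(preprocess("Investors feared the closing of markets"))
# ['investor', 'fear', 'clos', 'market']
```

The output also shows why stemming is lossy: "closing" becomes the non-word "clos", which is acceptable for counting features but not for display.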

For example, the high reported accuracy raises the question of possible sampling bias. The dataset depends on public tweets, which may not accurately portray the sentiment of the wider population, in part because it misses members of the public who do not participate in social media. Additionally, these results do not fully generalize to financial markets: sentiment specific to the COVID-19 context does not necessarily carry over to other market conditions or events.

Future research can better evaluate the robustness and applicability of sentiment analysis methods in financial contexts by addressing these methodological limitations. Such critical analysis increases the credibility of findings and of their implications for understanding how public opinion affects market behavior.

Alsaeedi et al. (2019) examined different methods for sentiment analysis of tweets³. The study used keywords related to top media from 2016. Several preprocessing techniques, including tokenization, stop-word removal, stemming, and lemmatization, were applied to the Twitter data. Five supervised machine learning classifiers (naïve Bayes, support vector machine, maximum entropy, logistic regression, and decision trees) were trained, and their performance was compared on accuracy, recall, and F1 score. SVM had the best overall accuracy at 81%, followed by naïve Bayes at 77%, with tweets classified as positive, negative, or neutral. Although the findings cannot be generalized directly to financial text, combining different feature models and classifiers may enhance the accuracy of sentiment analysis models for forecasting something as complex as the S&P500. Overall, this study demonstrated the effectiveness of different supervised learning strategies for Twitter sentiment analysis.
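Because the studies in this section are compared on accuracy, recall, and F1 score, it may help to see how those figures are computed for a three-class sentiment task. The sketch below uses made-up labels; the function name `classification_scores` is hypothetical and not taken from any of the cited papers.

```python
def classification_scores(y_true, y_pred, labels):
    """Return overall accuracy and per-label (precision, recall, F1)."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    per_label = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_label[label] = (precision, recall, f1)
    return accuracy, per_label


# Made-up gold labels and predictions for six tweets.
y_true = ["pos", "pos", "neg", "neu", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "neu", "pos", "pos"]
accuracy, scores = classification_scores(y_true, y_pred, ["pos", "neg", "neu"])
print(round(accuracy, 3), scores["neg"])
```

In this toy case the overall accuracy is 4/6, yet the negative class reaches only 0.5 on precision, recall, and F1, which is the kind of gap a single accuracy figure can hide.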

Wankhade et al. (2022) conducted a comprehensive review of the state of the art in sentiment analysis methods, applications, and challenges⁴. It covers traditional machine learning and deep learning techniques used for sentiment classification, including SVM, Naive Bayes, RNN, and CNN. The review also covers sentiment analysis terminology and the grouping of methods. Several application areas are surveyed, including consumer reviews, social media, health documentation, and news reports. The paper identifies open challenges such as handling sarcasm, metaphor, and irony, which matter more as their use increases. It provides an overview of the field to help guide future research and new applications.

Atteveldt et al. (2021) examined the accuracy of different sentiment analysis methods by comparing how well each predicted sentiment in UK newspaper text⁵. They applied several machine learning algorithms, crowd-coding by paid coders, and dictionary-based methods to the same data, and compared the resulting sentiment labels with manual coding by communication experts. The results showed that even the best performing algorithms did not match the accuracy of careful manual coding by experts. This provides an important check on the limitations of current automated sentiment analysis methods.

Elbagir et al. (2019) presented a method for sentiment analysis of Arabic tweets using the Natural Language Toolkit (NLTK) and the VADER sentiment dictionary⁶. Tweets about Sudanese politics were collected over time, preprocessed, and labeled with sentiments. NLTK was used to tokenize, stem, and part-of-speech tag the tweets. The Arabic-supported VADER assigned sentiment scores to words and sentences. A naive Bayes classifier trained on annotated tweets classified new tweets as positive, negative, or neutral. The authors found that VADER achieved its best performance at more than 80% accuracy. However, their system was untrained, which resulted in high inaccuracies. Furthermore, differences between languages add to the complexity of sentiment analysis: a model trained on a corpus in one language may not transfer to tweets written in a different alphabet. This study demonstrated an effective approach to Arabic sentiment analysis using freely available tools and dictionaries.

Year	Paper Title	Methods Applied	Dataset Review	Accuracy
2019	A study on sentiment analysis techniques of Twitter data	Naïve Bayes, Maximum Entropy, and SVM	Small dataset of opinions taken from internet, twitter sentiment analysis methods outperformed the rest due to the incorporation of different feature models and classifiers	67.37%
2022	A survey on sentiment analysis methods, applications, and challenges	Decision Tree, Linear Classification, Rule Based Classifier, K- Nearest Neighbor	Variety of data collected from social media, forums, Weblog, and commerce website	89.05%
2021	The validity of sentiment analysis: Comparing manual annotation, crowd-coding, dictionary approaches, and machine learning algorithms	CNN and SVM	Various sentiment measuring methods were used. Discrepancies in the true meaning of terms within the dataset which caused the result to be less accurate	84%
2019	Twitter sentiment analysis using natural language toolkit and VADER sentiment	NLTK and VADER	Small volume of data was used (Four million tweets made public by Stanford University) in an untrained system. VADER proved more effective as the model was relatively less complex so the processing was faster	60.02%
2020	Sentiment analysis based on deep learning: A comparative study	DL, ML, Neural Network, NLP	Social media dataset such as twitter. RNN model proved to be the most reliable due to its complexity however it also resulted in highest computational time	80%
2020	Incorporating stock prices and news sentiments for stock market prediction: A case of Hong Kong	LSTM	Stock market data	>50%
2020	Predicting stock market trends using machine learning algorithms via public sentiment and political situation analysis	RMSE and MAE	Finance and public sentiment from Twitter	Improves more than 20%
2019	A novel twitter sentiment analysis model with baseline correlation for financial market prediction with improved efficiency	FTSE, TSS, CEFD	Limited time series Twitter data samples based on stock market within a limited data tracking point	67.22%
2020	Covid-19 public sentiment insights and machine learning for tweets classification	Naïve Bayes, SVM, Logistic Regression	Public sentiments obtained from Twitter – dataset restricted to US. Both Naïve Bayes and Logistic performed accurately for short to medium length Tweets.	74%

Table 1: Literature Review Comparison and Summary Table

Table 1 presents the accuracy rates obtained by different approaches to sentiment analysis on different datasets. Techniques such as DL and neural networks perform better than basic ones such as Naïve Bayes: compare 80% in the 2020 deep learning study with 67.37% in the 2019 Naïve Bayes study. More complex samples and methods, combining survey and real-life data from social media, forums, weblogs, and commerce websites, are associated with higher accuracy in the 2022 survey (89.05%). Contextual information, timeliness of the data, and varied input types improve the results, underlining that the complexity of sentiment requires sophisticated models.

Moreover, it is important to note the factors that drive these differences, such as the size of the sample, the complexity of the features, and the amount of noise in the data. For instance, DL clearly outperforms conventional methods on large datasets with complex feature interactions, while simple methods are adequate under constraints such as small datasets. Deep learning models are much better suited to finding nonlinear dependencies in the data, which a linear approach may miss. Analyzing these factors systematically clarifies when advanced techniques are the more appropriate choice and strengthens the discussion of sentiment analysis methodologies. The caveats raised earlier for Samuel et al. (2020)⁷ apply here as well: a high reported accuracy on one tweet dataset does not by itself establish that a method generalizes to financial markets.

Lexicon-Based Sentiment Analysis

Bontà et al. (2019) provide a thorough review of dictionary-based methods for sentiment analysis and discuss their strengths and weaknesses⁸. The paper discusses three popular dictionaries: SentiWordNet, MPQA, and the NRC Emotion Lexicon. These resources have limitations when applied to analyzing sentiment in texts. In particular, dictionary-based methods typically miss the context and polarity of words, especially when the meaning depends on nearby words. As a result, they can misinterpret sentiment in nuanced cases such as sarcasm, irony, and idiomatic expressions. The static nature of a lexicon is a further limitation, since it may become outdated as language evolves.
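A toy scorer makes the context problem concrete. The lexicon below is a hypothetical handful of words with invented polarity values, not an excerpt from SentiWordNet, MPQA, or the NRC Emotion Lexicon; the point is only that summing word scores ignores negation and sarcasm.

```python
import re

# Hypothetical mini-lexicon with invented polarity scores.
LEXICON = {
    "good": 1.0, "great": 2.0, "gain": 1.0,
    "bad": -1.0, "loss": -1.0, "crash": -2.0,
}


def lexicon_score(text):
    """Sum the polarity of every known word, ignoring all context."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(LEXICON.get(t, 0.0) for t in tokens)


print(lexicon_score("Great quarter, earnings gain"))  # 3.0, plausibly positive
print(lexicon_score("This is not good"))               # 1.0, negation is missed
print(lexicon_score("Oh great, another crash"))        # 0.0, sarcasm is missed
```

The second and third sentences would be read as negative by a person, yet the look-up approach scores them as positive and neutral.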

Additionally, Pandey et al. (2019) developed a dictionary-based approach to sentiment analysis in film research⁹. The perceptual cues carried by words and phrases were analyzed using the Bing Liu dictionary. The work underlines the need to move from plain dictionary methods towards perceptual analysis approaches that address the limited scope of sentiment captured by dictionaries. Adopting perceptual frameworks makes sentiment analysis more dependable and deeper. However, lexicon-based analysis can introduce biases, such as overreliance on particular keyword lists within the dictionary, which is likely to underrepresent the full spectrum of sentiment expressions. This can produce a misleading analysis, especially if frequently occurring words carry strong polarity in the lexicon.

Machine Learning-Based Sentiment Analysis

Do et al. (2019) provided a comprehensive review of deep learning techniques for aspect-based sentiment analysis (ABSA)¹⁰. ABSA aims to determine sentiment towards specific entities or aspects within a text. The study compares neural architectures used for ABSA, such as CNN, RNN, attention mechanisms, and memory networks, and discusses their architecture, methodology, and evaluation on several benchmark datasets. The paper finds that memory networks achieved promising results by incorporating aspect information, while ensemble and multi-task models showed the most improvement over single models. The survey is a useful guide to recent advances in deep learning for ABSA and to research opportunities in modeling contextual semantics and implicit aspects.

Dang et al. (2020) carried out a comparative study of deep learning approaches for sentiment analysis, including CNN, RNN, LSTM, and BERT¹¹. The authors benchmark these models on Twitter datasets for sentiment classification and evaluate their accuracy, precision, recall, and F1 score. Reporting several measures gives a fuller picture of how accurate a model actually is. The CNN model recorded the best accuracy, up to 88%, while LSTM recorded the best results on the small dataset.

Yadav et al. (2020) presented a detailed literature review of deep learning architectures used in sentiment analysis¹². It gives an overview of state-of-the-art neural networks such as CNN, RNN, LSTM, and variants with attention mechanisms. The paper reviews the methods and network topologies employed to classify and quantify sentiment, and describes in detail the results of applying these deep models to benchmark datasets. Finally, it identifies sarcasm and implicit sentiment as major challenges in text sentiment analysis. In sum, the survey provides a comprehensive overview of recent representation learning techniques for mining subjective information and serves as a reference guide for researchers applying deep learning to sentiment analysis.

Yang et al. (2020) compared two approaches to sentiment analysis of Chinese e-commerce product reviews: a lexicon-based method using the HowNet sentiment lexicon and a deep learning LSTM model¹³. In the lexicon-based approach, reviews are segmented and scored by lexicon lookup. In the LSTM model, reviews are embedded and fed into LSTM layers for classification. Both methods are evaluated on a dataset of 2000 mobile phone reviews annotated for sentiment and compared on accuracy. The deep learning model performed better, with accuracy of over 84% compared with 78% for the lexicon. The study demonstrates that both techniques can be applied effectively to Chinese reviews and that neural models are useful for understanding contextual sentiment.

Studies Measuring the Impact of Public Sentiment on the Stock Market

Studies Using News Articles

Audrino et al. (2020) examined the impact of sentiment and attention measures on stock market volatility¹⁴. The study uses machine learning models to conduct sentiment analysis on social media and news data related to S&P500 companies. Attention is measured by the frequency of company mentions. The study analyzes the relationship between derived sentiment, attention, and realized stock volatility using Granger causality and vector autoregression models. Results show that negative sentiment has a significant causal impact in increasing volatility, while attention shows a weaker correlation. This research provides empirical evidence on how investor perceptions and attention, measured through alternative data sources, can influence financial volatility. It demonstrates the growing utility of natural language tools in quantitative finance.

Shapiro et al. (2022) developed a new approach to quantifying news sentiment using deep learning. It trains a bidirectional LSTM model on over 8 million news articles annotated with sentiment scores. The model learns contextual representations to measure the sentiment embedded in news articles. It generates both overall and issue-specific sentiment indices. The indices are shown to track various economic indicators and predict stock returns, an improvement over existing dictionary-based methods. The paper highlights the importance of contextual modeling for news-based sentiment analysis. It demonstrates how such data-driven sentiment measures can provide valuable signals for economic and financial forecasting models. The methodology presents a state-of-the-art framework for quantitative sentiment analysis of news texts.

Li et al. (2020) proposed a model that combines stock prices and news information to improve stock market forecasts¹⁵. It extracts sentiment scores from Chinese media articles about Hong Kong stocks using a lexicon-based approach. A recurrent neural network takes historical stock prices and news sentiment as input to predict the direction of stock movement. Experimental results on a large dataset of news and prices show that the proposed model outperforms price-only models. The study highlights the value of combining disparate data sources to increase predictive power. It provides empirical evidence on how sentiment analysis of financial news can support quantitative trading strategies and shows that neural models can be applied successfully to multi-source time series forecasting.

Maqsood et al. (2020) proposed a deep learning model that integrates global and local event sentiment for stock market forecasting¹⁶. It uses a BiLSTM network to analyze the sentiment of firm-specific and general news. Local news captures sentiment about the company, while global news provides general economic information. The model predicts stock price movements guided by both granular and broad sentiment. Analysis of large-scale news and financial datasets shows that the proposed dual-sentiment method outperforms the baseline in forecasting. The study uses deep neural networks (DNN) to exploit events at multiple levels from separate sources. It highlights the benefits of incorporating both micro and macro sentiment for improved stock forecasting.

Sentiment Analysis Using Social Media

Cruz et al. (2022) investigated the impact of economic sentiment on Twitter during the H1N1 and COVID-19 pandemics¹⁷. The study aggregates tweets about stocks and analyzes sentiment using semantics-based machine learning techniques. The relationship between sentiment indices and stock market movements is then examined. The results show that sentiment had a greater impact during the COVID-19 crisis than during H1N1, and that negative sentiment was associated with lower prices. This is one of the first studies to compare the effects of social media sentiment across epidemics. The research highlights the growing power of online opinion in investment decisions, especially under conditions of high uncertainty. It also demonstrates the usefulness of automated sentiment analysis for assessing investor sentiment.

Khan et al. (2020) proposed a machine learning model to predict the dynamics of the stock market based on the analysis of public sentiment and political events¹⁸. It extracts sentiment scores from Twitter data and identifies significant political events using NLP techniques. Different classifiers, including LSTM, SVM, and random forest, are trained on the input features to forecast stock index movements. The developed system is evaluated on US stock market data and shown to achieve over 70% accuracy. This research presents an innovative application that combines alternative unstructured data sources with machine learning for financial forecasting.

Guo et al. (2019) presented a new model for Twitter sentiment analysis that supports more efficient prediction of financial markets¹⁹. It introduces a baseline correlation method to reduce the impact of noise and sudden attitude changes in social media. Tweet sentiment is evaluated by its correlation with previous baseline sentiment data, which makes it possible to eliminate extreme values and track genuine changes in investor sentiment. Results on S&P500 stock data indicate that the new approach, based on an improved sentiment lexicon, yields more stable and accurate predictions than traditional sentiment analysis. The research offers corrections that extract more useful signals from less structured and diverse sources such as Twitter for prediction in asset management.
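The general idea of measuring sentiment against a baseline and damping extreme values can be illustrated as follows. This is not the method of Guo et al.; it is a simplified sketch under stated assumptions: the baseline is a trailing mean over a hypothetical `window` of prior days, and outliers are clipped to a hypothetical `clip` bound.

```python
from statistics import mean


def baseline_adjusted(scores, window=3, clip=1.0):
    """Subtract the trailing-window mean from each score and clip the result."""
    adjusted = []
    for i, score in enumerate(scores):
        history = scores[max(0, i - window):i]
        # Assumption: with no history, the first day is its own baseline.
        baseline = mean(history) if history else score
        deviation = score - baseline
        adjusted.append(max(-clip, min(clip, deviation)))
    return adjusted


# Made-up daily sentiment scores with one spike on the fourth day.
daily = [0.2, 0.4, 0.3, 3.0, 0.3]
print([round(v, 2) for v in baseline_adjusted(daily)])
```

The spike on the fourth day is capped at the clip bound rather than dominating the series, while the following day registers as a drop relative to a baseline that now includes the spike.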

Tan et al. (2021) investigated the effect of social media sentiment on international stock returns and trading volume. The study collects Twitter data related to 30 developed countries and measures sentiment using lexicon-based methods. Regression analyses are conducted to evaluate the relationship between sentiment indexes and daily stock market performance. The findings indicate that negative sentiment has a greater influence, significantly predicting next-day returns and increasing trading activity. This research provides empirical evidence that investor views expressed on social platforms contain information predictive of cross-border equity movements. It demonstrates the utility of alternative data sources for enhancing insights into global financial market linkages and trends driven by shifting investor emotions.
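A regression of next-day returns on a sentiment index can be sketched with a single-regressor ordinary least squares fit. This is a deliberately simplified illustration, not the specification used by Tan et al.: it has no control variables or significance tests, the data are made up, and the function names are hypothetical.

```python
def ols_slope_intercept(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def next_day_regression(sentiment, returns):
    """Regress the return on day t+1 on the sentiment index on day t."""
    return ols_slope_intercept(sentiment[:-1], returns[1:])


# Made-up series: five days of sentiment and same-day returns (in percent).
sentiment = [1.0, 2.0, 3.0, 4.0, 5.0]
returns = [0.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = next_day_regression(sentiment, returns)
print(slope, intercept)
```

The one-day shift between the two slices is what makes the fit predictive rather than contemporaneous: each sentiment value is paired with the return of the following day.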

Samuel et al. (2020) analyzed COVID-19 related tweets to gain insight into public sentiment during the pandemic²⁰. The study collects a large dataset of English tweets over two months and develops machine learning models to classify tweets by sentiment automatically. Popular hashtags, keywords, and geographic variations in sentiment are also explored. The results reveal high levels of concern, support, and care among tweeters. This research presents a timely study that uses social media and artificial intelligence to gauge worldwide public opinion and reactions to the global health crisis. The dataset and classification models can support future pandemic response by providing real-time monitoring of grassroots perceptions and information needs online.

Studies Using Combinational Methods

Location-Based Sentiment Analysis

Over the past few years, sentiment analysis has evolved from traditional lexicon-based methods to more sophisticated methods that take context into account. A comprehensive review of dictionary-based methods describes approaches built on emotion dictionaries, which assign polarity scores to words according to their emotional orientation. The designs of popular dictionaries such as SentiWordNet, MPQA, and the NRC Emotion Lexicon were discussed and their performance was presented.

Nevertheless, lexicon-based approaches model the context and polarity of words only to a limited degree, and so can misinterpret sentiment. In complex or nuanced situations the surrounding text strongly influences meaning, which makes methods that can account for this variability necessary.

Pandey et al. (2019) also provided a dictionary-based approach to sentiment analysis in film research, using the Bing Liu dictionary to analyze the perceptual cues of words and phrases²¹. This approach demonstrates how a perceptual framework can improve the accuracy and depth of sentiment analysis, and it shows the need to integrate alternative methodologies to overcome the shortcomings of conventional lexicon-based approaches.

Chousa et al. (2021) studied the impact of investor sentiment on the green bond market²². Using a dataset of over 800 green bond issues from 2012-2019, the authors measure weekly investor optimism and pessimism using a sentiment index developed from online surveys. A vector autoregression model is used to examine the impact of sentiment variables on green bond issuance rates and yields. The findings show that positive sentiment significantly increases issuance rates, whereas negative sentiment decreases issuance rates and increases yields. This study provides new insights into the behavioral factors affecting the rapidly growing sustainable finance market. It highlights the importance of monitoring sentiment dynamics to better understand green bond demand and price trends over time.

Applications in Finance

Financial Prediction

This research is aimed at financial applications, especially the prediction of stock price fluctuations. By conducting sentiment analysis and feeding sentiment scores together with price data into machine learning models, the models can be trained to reveal how sentiment depends on economic activity. Various internal tools that track market sentiment can then be built into trading strategies based on sentiment analysis.

For instance, an internal tool could display red to signify a ‘sell signal’ whenever negative sentiment is found and green to signify a ‘buy signal’ when positive sentiment is derived. Such a tool could continuously track trends, users’ reactions, and other real-time updates from social media, news and similar sources, update the sentiment value in real time, and give traders useful recommendations.
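As a minimal sketch of such a color-coded signal, assuming sentiment scores normalized to the range -1 to 1: the function name, the threshold values and the neutral ‘hold’ band are illustrative assumptions and are not taken from any reviewed tool.

```python
def sentiment_signal(score, buy_threshold=0.2, sell_threshold=-0.2):
    """Map a sentiment score in [-1, 1] to a (color, action) pair.

    The thresholds are hypothetical defaults chosen for illustration only.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    if score >= buy_threshold:
        return ("green", "buy")
    if score <= sell_threshold:
        return ("red", "sell")
    # Assumed neutral band: sentiment is too weak to signal either way.
    return ("grey", "hold")


# Example: three hypothetical real-time sentiment readings.
for reading in (0.6, -0.5, 0.05):
    print(reading, sentiment_signal(reading))
```

In practice the thresholds would need to be calibrated against historical data rather than fixed by hand.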

Practical Application Example

An internal tool could be created that tracks sentiment data in real time and gives traders useful information for active trading. For example, a tool such as JustLabs Trading (JLT) could change color to red to indicate when to “sell” based on negative sentiment, and to green to indicate when to “buy” based on positive sentiment²³. When trading various securities, investors can combine trading algorithms with sentiment analysis of prevailing emotions, using real-time sentiment to improve their trading techniques and financial results.

Furthermore, sentiment analysis technology can be applied in portfolio management: investment firms can use quantitative trading models that integrate sentiment scores into equity allocation, with predetermined parameters that signal a reduction in overall equity exposure when sentiment is negative. Portfolio recommendations may also benefit: during periods of high negative sentiment, such a model would suggest a defensive portfolio to reduce portfolio volatility, while during positive sentiment it would suggest a more aggressive portfolio to increase portfolio returns. This allows better and more timely investment decisions, provided the sentiment analysis has good forecasting ability. Additionally, by using methodical techniques like these, portfolio managers can also incorporate sentiment signals into their trading algorithms:

Sentiment-Driven Rebalancing: Modifying asset allocations by establishing predetermined sentiment thresholds. For example, the algorithm might raise holdings in defensive assets like gold or bonds if the general attitude of the market falls below a particular threshold.
Event-Triggered Trading: Automated short-selling or hedging techniques may be used to reduce possible losses in response to a sudden spike in the unfavorable sentiment surrounding a business. Further, integrating sentiment indicators into conventional quantitative tools could enhance decision-making, with the ability to detect correct entry and exit points in trading²⁴.
Sector Rotation Strategies: Tracking mood within a sector to reroute investments to industries that are seen favorably in uncertain times.
Real-Time News Sentiment Integration: Automatically rebalancing portfolios in response to news events and earnings announcements by using AI-driven sentiment scores.
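The sentiment-driven rebalancing rule in the list above can be sketched as follows. This is an illustrative example only: the asset names, the threshold of -0.3 and the 10-point shift are assumptions, not values reported in the reviewed studies.

```python
def rebalance(weights, sentiment, threshold=-0.3, shift=0.10,
              risky="equities", defensive="bonds"):
    """Return new portfolio weights given a market sentiment score.

    If sentiment falls below `threshold`, move up to `shift` of the
    portfolio from the risky asset to the defensive asset. All parameter
    values are hypothetical.
    """
    new_weights = dict(weights)  # leave the caller's allocation untouched
    if sentiment < threshold:
        moved = min(shift, new_weights[risky])
        new_weights[risky] -= moved
        new_weights[defensive] += moved
    return new_weights


portfolio = {"equities": 0.6, "bonds": 0.4}
print(rebalance(portfolio, sentiment=-0.5))  # sentiment below threshold
print(rebalance(portfolio, sentiment=0.1))   # allocation unchanged
```

A single fixed shift keeps the rule easy to audit; a production strategy would also have to account for transaction costs and position limits.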

Investor Behavior Analysis

Examining how emotions correlate with market metrics can provide behavioral insights. Studies have examined how emotions and voice are related to market processes during volatile times. Applying sentiment analysis to calls or voice messages would allow changes in tone and pitch to be detected and categorized into emotions. This may shed light on how investors’ emotions drive decision-making, with implications for policymaking during a recession, when emotions can exacerbate the downturn.

Crisis Monitoring

In unexpected situations, continuous monitoring of sentiment from alternative data can help manage problems. By detecting large changes in sentiment, financial institutions and governments can assess emerging financial risks in real time. Sentiment analysis can thus help identify structural threats or increased volatility arising from dramatic developments.
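One simple way to detect large changes in sentiment is a rolling z-score, sketched below. The window length, the threshold of three standard deviations and the sample data are assumptions made for illustration.

```python
from statistics import mean, stdev


def sentiment_alerts(series, window=5, z_threshold=3.0):
    """Flag points that deviate sharply from the preceding `window` values.

    Returns a list of (index, z_score) pairs. Window and threshold are
    hypothetical defaults, not values from the reviewed studies.
    """
    alerts = []
    for t in range(window, len(series)):
        history = series[t - window:t]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # a flat history gives no scale to compare against
        z = (series[t] - mu) / sigma
        if abs(z) >= z_threshold:
            alerts.append((t, z))
    return alerts


# Hypothetical daily sentiment index with a sudden negative shock on day 5.
daily_sentiment = [0.10, 0.12, 0.09, 0.11, 0.10, -0.60, 0.10]
print(sentiment_alerts(daily_sentiment))
```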

Portfolio Optimization

The inclusion of contrarian sentiment signals can improve the quality of portfolio allocations. By analyzing how sentiment changes over time, strategies can be developed to capture those changes. This makes portfolios less sensitive to overall market volatility, thereby improving risk-adjusted returns through counter-cyclical, contrarian strategies that trade against crowd sentiment.

Risk Management

Continuous monitoring of sentiment from new data types can help manage risk. Significant changes in mood can serve as early warning signs of systemic problems or impending change. This helps risk teams mitigate emerging threats by detecting changes in crowd perception before traditional indicators are triggered. Furthermore, the analysis algorithm may also predict the duration of the risk and the steps needed to mitigate it or reduce its impact.

Policymaking

In policymaking, a framework may be designed to incorporate real-time sentiment from social media and news articles, enabling the government to detect emerging issues of public concern and adjust fiscal or monetary policy accordingly. For instance, a rise in negative sentiment about unemployment may prompt calls for hiring programs or targeted stimulus. For policymakers, sentiment dashboards are a useful tool to monitor market and public opinion on trade deals, financial regulations, and economic policies. For instance:

Crisis Management: Early interventions, such as monetary easing or stimulus measures, may be prompted during economic downturns by surges in negative sentiment on inflation or interest rate hikes.
Regulatory Impact Analysis: News stories and social media sentiment trends can be used to determine how people are responding to new rules and modify policy accordingly.
Public Confidence Monitoring: Policymakers can evaluate the success of government communication tactics by examining changes in sentiment before and after significant announcements.

Emerging Markets

Examining sentiment in emerging markets can reveal local trends. Understanding the emotional triggers of a particular region provides clues to the local factors that drive sentiment there. This makes it possible to consider mechanisms suited to the micro-social undercurrents shaping the economies of developing countries, which differ from those of mature economies.

Discussion

The studies covered applied different sentiment analysis techniques, ranging from lexicon-based approaches to advanced machine learning algorithms, to measure public sentiment and its impact on the S&P500 index.

Sentiment analysis and its relevance in specific financial indicators

Working out the association between market sentiment and financial volatility is essential to predictive modelling and risk assessment. Empirical studies confirm the predictive power of sentiment, as well as the presence of time lags and directional influence.

Sentiment and market volatility correlation

To analyse sentiment in financial markets, natural language processing (NLP) techniques are generally used to extract sentiment from data sources such as news articles and financial reports. Negative sentiment and market volatility are generally positively related: volatility tends to increase when negative sentiment surges, as a result of higher uncertainty and risk aversion among investors. However, the causal claims made in the literature linking public sentiment metrics to volatility, including those related to the S&P500, are not sufficiently substantiated by existing research.

Establishing Causal Relationships

Granger causality tests are often used to determine whether shifts in sentiment can forecast future volatility. Vector autoregression (VAR) and vector error correction models (VECM) are applied to analyze the dynamic relationship between sentiment and financial measures. Using these methods improves volatility prediction and better explains how sentiment affects market swings. Future research should build more convincing empirical evidence for the causal connections between sentiment changes and market outcomes.

Recommendations for Future Research

Given the gaps identified, future studies should integrate datasets that combine sentiment scores with conventional financial indicators. Furthermore, studying the interactions between behavioral finance factors and sentiment-based analysis might reveal significant market dynamics. This will create a systematic approach to establishing causality, which will increase the robustness of the findings in this field.

Figure 1: Correlation between S&P500 index and sentiment index²⁵

Figure 1 compares changes in the sentiment index with changes in the S&P500 index, which gives a further understanding of the correlation between public sentiment metrics and market reactions such as S&P500 fluctuations. Over the period from 1982 to 2018, the fluctuations of the sentiment index in the figure are much larger than those of the S&P500 index.

To improve the interpretative framework, the methodology used in this kind of analysis should be more structured. With a more structured method, researchers can put more strength and value behind the correlation between market movement and sentiment data. This also involves using advanced statistical techniques, for example Granger causality tests, to determine whether any predictive relationship exists over time.

Additionally, integrating historical sentiment data with market events can supply a unified picture of the correlation between investor sentiment and the functioning of the stock markets. This methodological enhancement seeks to clarify the dynamic interplay between sentiment and market behavior, which, when added to the current analysis of the S&P500’s response to public sentiment, will contribute to the overall conclusion.

Sentiment Analysis Methods

A number of works performed lexicon-based sentiment analysis on financial news and social media texts to derive positive and negative sentiment scores. These approaches rely on fixed keyword lists, which are not very effective at capturing sentiment, particularly in global financial markets. For instance, the limitations of general-purpose lexicons make it very difficult to compare documents written in general language with those containing market and financial terminology.

Other advanced models, such as LSTM neural networks, are better suited to handling contextual sentiment. These models can learn from large datasets and are better at detecting changes in sentiment, since they can determine the sense in which words are being used. However, their complexity makes them slow, owing to heavy processing requirements, and they need a large amount of training data to achieve reliable results.

These two techniques, lexicons and deep learning, complement each other in an ensemble, since each has its own advantages and disadvantages. For instance, integrating a finance-related vocabulary with a deep learning algorithm improves the accuracy of sentiment analysis by informing the model of the specific financial context of the messages as well as the overall sentiment trends. The studies employing an ensemble approach reported higher accuracy and better performance in forecasting market direction.

To give a clearer picture, tables and graphs can summarize the trends and differences among sentiment analysis techniques. For example, Figure 2 below compares the accuracy of models such as RF, LSTM, CNN, DT, and FinBERT and shows that BERT gives the highest accuracy because of its self-sufficient nature and extensive amount of training data.

Not only does this provide a visual representation of the performance gap between different models, but it is also a quick reference for practitioners to see which techniques may be more suitable for their needs.

Figure 2: Comparison of BERT with LSTM and other models in stock market prediction based on textual data²⁶

In one comparative study, BERT achieves the highest accuracy due to its self-sustaining nature and its large database. However, it is important to note that sophisticated models such as LSTM, CNN and BERT, while yielding higher accuracy in predictive forecasts, lack explainability, so it is challenging to explain how their forecasts are arrived at. This lack of transparency can hamper trust and deployment in real applications. Furthermore, integrating intricate models may require considerable computing resources and programming skill, which many practitioners do not possess. The need for advanced knowledge of machine learning and data science therefore acts as a barrier for small firms or individual investors who may find sentiment analysis advantageous but lack the funds for these forms of analysis.

Advanced models such as LSTM, CNN, and BERT bring excellent accuracy to sentiment analysis, but at the price of interpretability. Often, these models are ‘black boxes’, which leaves users unable to understand how they make decisions. Their architectures are very complex, in some cases comprising hundreds of millions of parameters, which complicates the interpretative pipeline, and in many cases the results are unreliable when applied to real-world data.

As a specific example, the inability to understand the model’s reasoning leads to a breakdown of trust in clinical applications, where it is necessary not only to make a correct decision but also to understand the reason for it. For instance, in psychiatric applications stakeholders need clear explanations that give them satisfaction and confidence in the model’s reliability. Models can also overfit, that is, capture noise rather than underlying patterns, which can produce spurious results.

To increase usage of these models, their inner workings have to be explained using explainability techniques. For instance, frameworks that extract insights on how various inputs affect the prediction are one possibility. By enhancing the interpretability of these models, stakeholders can gain more confidence and trust across more use cases and make more informed decisions.

In summary, owing to their complexity, these models make more accurate predictions than lexicon-based models. This suggests that in future, advanced models will prove much more effective and accurate, though they will require high computational power.

Linking Sentiment Analysis to Financial Theory

Sentiment-Driven Mispricings and Market Anomalies

Extracting and measuring investor sentiment from news stories, social media, earnings reports, and other textual data sources is known as sentiment analysis in the financial markets. Research has shown that sentiment-driven trading can result in bubbles, excessive volatility, and momentum effects, among other pricing anomalies. While excessively negative sentiment can lead to extreme pessimism and subsequent reversals, positive sentiment can push asset prices above their intrinsic values, creating momentum effects. Rather than depending on fundamental analysis, investors frequently follow popular sentiment trends, which leads to herd-driven asset price bubbles and crashes. Furthermore, periods of strong sentiment are linked to higher trading volumes, whereas periods of low sentiment may cause price dislocations and market withdrawals.

Implications for the Efficient Market Hypothesis (EMH)

Sentiment research exposes systemic inefficiencies caused by investor psychology, which puts the strong-form EMH to the test. Though sentiment-driven mispricings imply that psychological variables play a significant role in asset valuation, weak-form and semi-strong forms of the EMH can account for some predictable patterns in stock prices. Due to limitations like risk aversion and short-selling prohibitions, rational arbitrageurs may find it difficult to rectify mispricings even if they are identified. According to the Adaptive Market Hypothesis (AMH), sentiment-induced inefficiencies may continue until arbitrage forces successfully offset them since market efficiency is dynamic and changes over time. Additionally, how quickly and accurately sentiment is digested and acted upon has an impact on market efficiency. Markets are rapidly integrating sentiment data due to the emergence of algorithmic trading and machine learning-based sentiment analysis; nevertheless, it is yet unclear whether these developments increase efficiency or worsen sentiment-driven biases. It is essential to comprehend these dynamics in order to improve financial models that use sentiment analysis and to create plans that lessen the negative impact that behavioral biases have on market efficiency.

Effect on Asset Price and Trading

The majority of the papers found that sentiment originating from news and social media has a material impact on financial markets. Bearish sentiment is frequently associated with declining share values, while bullish sentiment gives impetus to trading, especially in certain areas such as green bonds. The effect of sentiment also depends on the type of asset, the location, and the economic conditions of the country.

For instance, when COVID-19 began affecting the market, sentiment effects manifested as stronger negative correlations. This shows that sentiment analysis is useful during times of elevated uncertainty.

At the same time, a ‘simple’ technique may be preferable for several reasons to a more ‘complicated’ method that has only been shown to be more accurate under specific circumstances. For instance, straightforward models, such as logistic regression or decision trees, can outperform complex models when data is limited. Complex models are more prone to overfitting, as they may capture noise instead of the underlying data patterns.

Moreover, simpler models offer clearer interpretability, making it easier to explain decisions to stakeholders. This transparency enhances stakeholder confidence, especially in environments where understanding the rationale behind predictions is crucial.

In scenarios demanding quick decision-making, simpler approaches often yield faster results, which is essential for real-time applications. For example, in high-frequency trading or rapidly changing markets, the ability to quickly interpret and act on sentiment can be more valuable than the marginal gains in accuracy offered by complex models.

Additionally, simpler methods can be more robust in certain contexts, particularly when facing high dimensionality or when the underlying assumptions of the data align more closely with those of simpler models. For instance, when the data is well-understood and exhibits linear relationships, a logistic regression model may suffice and perform adequately without introducing unnecessary complexity.

Therefore, future work should emphasize a balanced comparison between machine learning techniques and simpler methods. This will highlight the conditions under which simpler models are not only sufficient but perhaps even preferable. Such discussions will provide valuable insights for practitioners, guiding them in the decision-making process regarding the appropriate methodology to apply based on their specific data context and analytical requirements²⁷.

Based on the current results, it is possible to conclude that machine learning models are much more accurate while, at the same time, more computationally costly and requiring large, labeled datasets for their training.

Recommendations and Critical Discussion

From the works analyzed above, the use of LSTM and ensemble methods is recommended for better accuracy. Yet it is important to acknowledge situations in which simple methods prevail over complex models. Simpler techniques such as logistic regression or decision trees can perform better, for instance, when data is limited or when speed is required. These methods are less inclined to overfit and are more interpretable to stakeholders, providing greater confidence in the results.

Additionally, simpler models tend to give faster predictions, which can be advantageous for real-time applications where decisions have to be taken quickly. Future work should therefore compare the performance and applicability of complex models with those of simpler techniques, to present the strengths and weaknesses of each effectively.

Textual description and interpretation of these models can be accompanied by graphical analysis, using diagrams that best illustrate the differences between trends in sentiment scores and the S&P500. The Volatility Index (VIX) and the S&P500 are plotted in Figure 3, which shows significant periods of inverse correlation. The VIX, referred to as the ‘fear index’, is a measure of expected volatility over the next 30 days, derived from the prices of S&P500 index options. As depicted in the figure, high levels of the VIX coincide with low levels of the S&P500 index, and vice versa. This inverse relationship underlines the need for sentiment analysis when analyzing markets to anticipate subsequent moves and assess investment risks.

Figure 3: Correlation between VIX and S&P500²⁸

Other diagnostic tools that could also help establish the effects of sentiment on market movements include regression coefficients and prediction error rates. They provide quantitative values that characterize the quality of the models and their effectiveness for financial forecasting. For instance, presenting such measures graphically deepens the analysis of how sentiment influences the financial market, as with the VIX and S&P500 in Figure 3.

Incorporating Other Factors

Some papers examined the use of other explanatory variables to enhance the results of sentiment analysis. Some combined sentiment scores with historical stock price data or key political news in neural network structures to improve predictions of the direction of an equity index. This indicated that combining alternative metrics with psychological factors could strengthen market forecasting ability. Additionally, a few investigations assessed how individual traits like risk tolerance might affect how sentiment translates into trading decisions. Comparative analysis also provided useful context, such as one paper that contrasted social media emotion impacts during the H1N1 and COVID-19 pandemics. This revealed that sentiment effects can differ substantially depending on the specific economic environment. For instance, earnings reports contain important details about a firm’s financial health, and combining them with sentiment analysis yields more detailed forecasts for a specific stock. Another example is that interest rates and government policies affect market conditions, and including such variables helps establish the context of sentiment data within the relevant economy.

In terms of added accuracy, the improvement can be substantial, and ranges have been documented in studies. For instance, models that combine sentiment with economic and historical factors have been reported to raise prediction accuracy by 10-20% relative to models that include only sentiment.

Emerging versus developed markets

To fully explain the differences in how sentiment analysis is employed in emerging versus developed markets, several factors relating to market behavior and sentiment interpretation need to be taken into account.

Emerging markets are generally volatile and have less developed financial networks than developed markets. Sentiment indicators in these markets can be strongly influenced by factors like political instability, the economy, and culture. For example, investor response to news can be extremely varied in emerging markets, where local sentiment can be driven by socio-political factors and generate more pronounced market reactions.

On the other hand, developed markets tend to have more sophisticated financial infrastructures and higher liquidity, so that investor behavior is more stable and predictable. Sentiment analysis proves its worth here because the historical datasets are broader. The availability of information, together with established trading patterns, allows market behavior to be understood in relation to events around the world.

In addition, sentiment indicators tend to be interpreted more traditionally in developed markets and less so in emerging markets, where sentiment can be more sensitive to news events and occasionally overreact. This indicates that sentiment analysis approaches must be customized to the differences between markets.

This is important to understand when applying sentiment analysis techniques and when building models that can learn from the diverse market dynamics of emerging versus developed markets²⁹.

Emerging Insights

The literature review highlighted several emerging insights. Specifically, the studies emphasized the increasing relevance of unconventional data sources, such as news articles, social media posts and search trends, for comprehending investor psychology and augmenting financial analyses. Conventional sources include structured data such as trading volumes, stock prices, earnings reports, and economic indicators like GDP and interest rates. Automated sentiment classification techniques were shown to extract valuable predictive signals from such unstructured text-based data. In addition, some papers reported novel analyses, such as being among the first examinations of how social platforms influence particular segments like green bonds. Similarly, cross-national studies provided fresh strategic views on behavioral patterns in developing countries. Finally, the ability to track public opinion on crises in real time showed that sentiment tracking could help fields other than finance.

Methods

Data Collection

This review comprises the results of studies from 2017 onwards on sentiment analysis of financial markets. Papers were identified through academic databases using keyword searches such as ‘sentiment analysis and stock markets and social media and financial impact’, and insights were derived from a comparison of the selected studies.

In addition, tweets and news articles were restricted to reputable sources, such as news channels and academic publications, to ensure the reliability of content. Relevance to financial sentiment, credibility of the source, and timeliness of the information were the factors considered for selection. This was done so that fake news was not included in the analysis.

Furthermore, the datasets used were representative of broader public sentiment in tweets and news. For example, tweets were collected from different geographical areas and from different sectors so that the analysis was not skewed towards any particular industry or location. The time period was also considered: sentiment data before and after major financial events were used to accurately capture the period in which sentiment fluctuated.

In summary, eligible papers were selected that quantitatively assessed relationships between sentiment and assets using regression or machine learning but were not completely algorithmic. Data was extracted on the methodology, key findings and implications from each. The results were then synthesized and compared based on common themes like the sentiment extraction technique, impact on prices/trading, incorporation of other factors, and emerging insights. This methodology aims to provide an overview and insights across this growing body of research.

Sentiment Analysis Models Evaluated

Sentiment analysis models appeared in the literature in many forms. Most of the lexicon-based methods employed dictionaries of positive and negative words to classify the sentiment of a text at the document level. More complex approaches included machine learning algorithms such as Naïve Bayes, SVM and LSTM neural networks, which were usually tested on pre-labeled tweets or news. Hybrid approaches used ensemble methods that pulled together the results from lexicon-based and deep learning-based models, as the two have their individual strengths. In papers assessing the different models, researchers compared benchmarks to find the best model for analyzing market behavior from social media or other sources that can be considered ‘alternative data’.

Sentiment Scores Computation

One of the significant steps in sentiment analysis is the quantification of sentiment. For this, many papers used a lexicon-based method that compares the text against dictionaries and assigns polarity values to the words and phrases in the text. These values were then summed to produce overall sentiment indicators for documents or periods of time, as the case required. A few other works used supervised machine learning, trained on annotated data, to classify sentence-level sentiment into labels with associated probabilities. Irrespective of the approach applied, the computed sentiment scores commonly ranged between -1 and 1, with negative and positive scores indicating a comparatively larger amount of pessimistic and optimistic sentiment, respectively. These quantitative scores were useful in analyzing relationships with the dependent financial measures.

Furthermore, lexicon-based methods can suffer from domain specificity: a word with a positive connotation in one context may not have one in another. This results in incorrect sentiment classification, especially in dynamic sectors such as finance. Such limitations need to be investigated in more detail to give deeper insight into the performance of lexicon-based methods in real-life contexts. In response to these challenges, solutions could incorporate higher-level approaches, especially machine learning, into sentiment analysis³⁰.
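The lexicon-based computation and the domain-specificity problem described above can be sketched as follows. Both word lists are tiny, hypothetical examples written for this illustration; they are not taken from SentiWordNet, the Bing Liu dictionary or any other published lexicon.

```python
import re

# Toy general-purpose lexicon (hypothetical polarity values).
GENERAL_LEXICON = {"strong": 1, "growth": 1, "gain": 1,
                   "weak": -1, "loss": -1, "liability": -1}

# Toy finance-adjusted lexicon: "liability" is treated as a neutral
# accounting term rather than a negative word (an illustrative assumption).
FINANCE_LEXICON = {**GENERAL_LEXICON, "liability": 0}


def sentiment_score(text, lexicon):
    """Average polarity of the words found in the lexicon, in [-1, 1].

    Returns 0.0 when no word of the text appears in the lexicon.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[token] for token in tokens if token in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


headline = "Strong growth despite higher tax liability"
score_general = sentiment_score(headline, GENERAL_LEXICON)
score_finance = sentiment_score(headline, FINANCE_LEXICON)
print(score_general, score_finance)
```

The same headline receives a lower score under the general-purpose list than under the finance-adjusted one, which is the kind of misclassification that domain-specific lexicons are meant to reduce.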

Correlating Sentiment Scores with S&P500 Movements

A number of the reviewed works compared computed positive and negative sentiment with fluctuations in the S&P500 index. Daily or weekly sentiment was gathered from social media posts, news articles, or search queries. These scores were then matched to S&P500 return data covering the same period. Regression and other analytical methods were used to test whether negative sentiment was high during market downturns, or whether downturns were more likely when negative sentiment was high.
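As a rough illustration of the matching and regression step, the sketch below pairs a daily sentiment series with index returns on the same dates (optionally shifting returns by a lag), then computes a Pearson correlation and a simple least-squares line. The function names, the dictionary-keyed-by-date layout, and the lag convention are assumptions for this example, not a method reported by any specific study.

```python
from math import sqrt

def align(sentiment_by_date, return_by_date, lag=0):
    """Pair sentiment on a date with the return `lag` observations later.

    Both inputs are dicts keyed by ISO date strings; only dates present in
    both are used. `lag` must be a non-negative integer.
    """
    dates = sorted(set(sentiment_by_date) & set(return_by_date))
    xs = [sentiment_by_date[dates[i]] for i in range(len(dates) - lag)]
    ys = [return_by_date[dates[i + lag]] for i in range(len(dates) - lag)]
    return xs, ys

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)

def ols_fit(x, y):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x)
    if vx == 0:
        raise ValueError("slope is undefined for a constant regressor")
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / vx
    return slope, my - slope * mx
```

With `lag=0` the comparison is contemporaneous; `lag=1` asks whether today's sentiment relates to the next observation's return, which is closer to a forecasting question.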

Evaluation of Sentiment Models

To evaluate the efficacy of sentiment models in forecasting S&P500 movements, several metrics and validation techniques are employed:

Accuracy: For a sentiment classification model, the most basic metric is accuracy, the percentage of predictions that are correct. It is computed by comparing the predicted sentiment against a labeled test set.

Precision and Recall: Precision is the ratio of correctly predicted positive instances to all instances predicted as positive, while recall is the ratio of correctly predicted positive instances to all actual positives. These metrics show the trade-off between the model’s ability to identify positive sentiment correctly and its ability to capture all positive-sentiment instances.

F1 Score: The F1 score combines precision and recall into a single number. It is their harmonic mean, that is, the reciprocal of the average of the reciprocals of precision and recall. A higher F1 score indicates a better model.

Confusion Matrix: The entries of a confusion matrix are the counts of true positives, true negatives, false positives, and false negatives, which give additional information about a sentiment model. This helps the modeler pinpoint where the model makes errors.

Cross-Validation: K-fold cross-validation is also frequently used to check the stability of the sentiment model. The dataset is partitioned into k subsets, and the model is trained k times, each time using one subset for testing and the remaining subsets for training. This helps ensure that the model’s measured performance does not depend on a single train-test split.

Real-time Testing: To check performance under live conditions, the model is applied to real-time data streams of news articles and social media posts. Model predictions are continuously compared against actual market movements to assess predictive ability.
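The classification metrics and the k-fold procedure listed above can be sketched in a few lines. The function names and the label value "pos" are assumptions chosen for this example; in practice, library implementations would normally be used.

```python
def confusion_counts(y_true, y_pred, positive="pos"):
    """Return (tp, fp, tn, fn) for a binary task with `positive` as the target class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def metrics(y_true, y_pred, positive="pos"):
    """Accuracy, precision, recall, and F1 (harmonic mean of precision and recall)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over n samples.

    No shuffling is applied; each sample appears in exactly one test fold.
    """
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size
```

For time-ordered market data, plain k-fold lets a model train on observations that come after its test fold, so a forward-chaining split, in which each test fold follows its training data, may be a more appropriate design.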

Conclusion

In summary, this literature review of the use of sentiment analysis in economics revealed a number of promising applications that can be explored through further research. Psychological indicators drawn from alternative data sources can feed into quantitative modeling and marketing strategies, but they must be carefully prepared. While sentiment analysis is an emerging field, its observed relationship with asset prices and volumes demonstrates the value of continuing to develop these techniques. The full potential of these interdisciplinary approaches can be realized by combining economics and psychology with larger and richer emotion data. There is also scope for developing emotion-based solutions tailored to unique circumstances in areas such as emerging markets.

Future Scope

The future scope for applying sentiment analysis to economics appears to be growing, as there is increasing recognition that emotional factors shape investment decisions. As data systems grow in size and geographic coverage, there are opportunities to build more nuanced predictive models, to probe them more deeply with tools such as deep learning, and to implement customized marketing strategies that support market growth. Regulators may also explore policy tools based on psychological approaches to promote stability.

Specific implementation strategies should be outlined so that the benefits of sentiment analysis carry over into practical use. For example, firms could develop quantitative trading models that incorporate sentiment scores alongside traditional financial metrics in portfolio management. If firms could establish clear thresholds for sentiment indicators, trading decisions could be automated; that is, portfolios could be adjusted according to sentiment trends.

In addition, combining textual sentiment analysis with other data types, such as visual or numerical market data, offers potential benefits. A simple example of such integration would be to add sentiment data to real-time stock prices or trading volumes for richer insights and better forecasting.

Recent innovations in sentiment detection: More recent developments, such as sentiment detection through emojis and sound cues, are novel and ripe for investigation. They could enable more complete capture of nuances that text alone may not communicate well. For example, sentiment analysis could benefit from additional context such as tone and emotion in spoken language.

Researchers can explore these innovations to expand the relevance and applicability of sentiment analysis in the financial domain, for the benefit of investors and policymakers alike.

References

P. Kumar, M. A. Islam, R. Pillai, & T. Sharif, Analysing the behavioural, psychological, and demographic determinants of financial decision making of household investors. Heliyon, 9(2), e13085 (2023). DOI: https://doi.org/10.1016/j.heliyon.2023.e13085
J. Samuel, G. M. N. Ali, M. M. Rahman, E. Esawi, & Y. Samuel, Covid-19 public sentiment insights and machine learning for tweets classification. Information, 11(6), 314 (2020). DOI: https://doi.org/10.3390/info11060314
A. Alsaeedi & M. Z. Khan, A study on sentiment analysis techniques of Twitter data. International Journal of Advanced Computer Science and Applications, 10(2), 361-374 (2019). Link: https://thesai.org/Publications/ViewPaper?Volume=10&Issue=2&Code=ijacsa&Serialo=48
M. Wankhade, A. C. S. Rao, & C. Kulkarni, A survey on sentiment analysis methods, applications, and challenges. Artificial Intelligence Review, 55(7), 5731-5780 (2022). DOI: https://doi.org/10.1007/s10462-02210144-1
W. Van Atteveldt, M. A. Van der Velden, & M. Boukes, The validity of sentiment analysis: Comparing manual annotation, crowd-coding, dictionary approaches, and machine learning algorithms. Communication Methods and Measures, 15(2), 121-140 (2021). DOI: https://doi.org/10.1080/19312458.2020.1869198
S. Elbagir & J. Yang, Twitter sentiment analysis using natural language toolkit and VADER sentiment. In Proceedings of the international multiconference of engineers and computer scientists (Vol. 122, No. 16) (2019, March). Link: https://www.iaeng.org/publication/IMECS2019/IMECS2019_pp12-16.pdf
J. Samuel, G. M. N. Ali, M. M. Rahman, E. Esawi, & Y. Samuel, Covid-19 public sentiment insights and machine learning for tweets classification. Information, 11(6), 314 (2020). DOI: https://doi.org/10.3390/info11060314
V. Bonta, N. Kumaresh, & N. Janardhan, A comprehensive study on lexicon-based approaches for sentiment analysis. Asian Journal of Computer Science and Technology, 8(S2), 1-6 (2019). DOI: https://doi.org/10.51983/ajcst-2019.8.S2.2037
M. Pandey, R. Williams, N. Jindal, & A. Batra, Sentiment analysis using lexicon-based approach. IITM Journal of Management and IT, 10(1), 68-76 (2019). Link: https://www.indianjournals.com/ijor.aspx?target=ijor:iitmjmit&volume=10&issue=1&article=012
H. H. Do, P. W. Prasad, A. Maag, & A. Alsadoon, Deep learning for aspect-based sentiment analysis: a comparative review. Expert Systems with Applications, 118, 272-299 (2019). DOI: https://doi.org/10.1016/j.eswa.2018.10.003
N. C. Dang, M. N. Moreno-García, & F. De la Prieta, Sentiment analysis based on deep learning: A comparative study. Electronics, 9(3), 483 (2020). DOI: https://doi.org/10.3390/electronics9030483
A. Yadav & D. K. Vishwakarma, Sentiment analysis using deep learning architectures: a review. Artificial Intelligence Review, 53(6), 4335-4385 (2020). DOI: https://doi.org/10.1007/s10462-019-09794-5. Link: https://sci-hub.se/downloads/2019-12-06/6e/10.1007@s10462-019-09794-5.pdf
L. Yang, Y. Li, J. Wang, & R. S. Sherratt, Sentiment analysis for E-commerce product reviews in Chinese based on sentiment lexicon and deep learning. IEEE Access, 8, 23522-23530 (2020). DOI: https://doi.org/10.1109/ACCESS.2020.2969854
F. Audrino, F. Sigrist, & D. Ballinari, The impact of sentiment and attention measures on stock market volatility. International Journal of Forecasting, 36(2), 334-357 (2020). DOI: https://doi.org/10.1016/j.ijforecast.2019.05.010
X. Li, P. Wu, & W. Wang, Incorporating stock prices and news sentiments for stock market prediction: A case of Hong Kong. Information Processing & Management, 57(5), 102212 (2020). DOI: https://doi.org/10.1016/j.ipm.2020.102212
H. Maqsood, I. Mehmood, M. Maqsood, M. Yasir, S. Afzal, F. Aadil, … & K. A. Muhammad, Local and global event sentiment based efficient stock exchange forecasting using deep learning. International Journal of Information Management, 50, 432-451 (2020). DOI: https://doi.org/10.1016/j.ijinfomgt.2019.07.011
D. Valle-Cruz, V. Fernandez-Cortez, A. López-Chau, & R. Sandoval-Almazán, Does Twitter affect stock market decisions? Financial sentiment analysis during pandemics: A comparative study of the H1N1 and the COVID-19 periods. Cognitive Computation, 14(1), 372-387 (2022). DOI: https://doi.org/10.1007/s12559-021-09819-8
W. Khan, U. Malik, M. A. Ghazanfar, M. A. Azam, K. H. Alyoubi, & A. S. Alfakeeh, Predicting stock market trends using machine learning algorithms via public sentiment and political situation analysis. Soft Computing, 24(15), 11019-11043 (2020). DOI: https://doi.org/10.1007/s00500-019-04347-y
X. Guo & J. Li, A novel twitter sentiment analysis model with baseline correlation for financial market prediction with improved efficiency. In 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS) (pp. 472-477) (2019, October). IEEE. DOI: https://doi.org/10.1109/SNAMS.2019.8931720
J. Samuel, G. M. N. Ali, M. M. Rahman, E. Esawi, & Y. Samuel, Covid-19 public sentiment insights and machine learning for tweets classification. Information, 11(6), 314 (2020). DOI: https://doi.org/10.3390/info11060314
M. Pandey, R. Williams, N. Jindal, & A. Batra, Sentiment analysis using lexicon-based approach. IITM Journal of Management and IT, 10(1), 68-76 (2019). Link: https://www.indianjournals.com/ijor.aspx?target=ijor:iitmjmit&volume=10&issue=1&article=012
J. Piñeiro-Chousa, M. Á. López-Cabarcos, J. Caby, & A. Šević, The influence of investor sentiment on the green bond market. Technological Forecasting and Social Change, 162, 120351 (2021). DOI: https://doi.org/10.1016/j.techfore.2020.120351
JustLabs. (2023, November 30). JustLabs. https://justlabs.org/
E. Georgiadou, S. Angelopoulos, & H. Drake, Big data analytics and international negotiations: Sentiment analysis of Brexit negotiating outcomes. International Journal of Information Management, 102048 (2019). DOI: https://doi.org/10.1016/j.ijinfomgt.2019.102048
D. Vidal-Tomás & S. Alfarano, An agent-based early warning indicator for financial market instability. 15(1), 49-87 (2020). DOI: https://doi.org/10.1007/s11403-019-00272-3
R. Tandon, Prediction of Stock Market Trends Based on Large Language Models. ITEGAM-Journal of Engineering and Technology for Industrial Applications (ITEGAM-JETIA), 11(9) (2024). Link: https://www.researchgate.net/publication/384017320_Prediction_of_Stock_Market_Trends_Based_on_Large_Language_Models
V. Hassija, V. Chamola, A. Mahapatra, A. Singal, D. Goel, K. Huang, S. Scardapane, I. Spinelli, M. Mahmud, & A. Hussain, Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence. Cognitive Computation, 16(1) (2023). DOI: https://doi.org/10.1007/s12559-023-10179-8
F. C. Dumiter, F. M. Turcaș, Ș. A. Nicoară, C. Bente, & M. Boiţă, The Impact of Sentiment Indices on the Stock Exchange—The Connections between Quantitative Sentiment Indicators, Technical Analysis, and Stock Market. Mathematics, 11(14), 3128 (2023). DOI: https://doi.org/10.3390/math11143128
J. Deeks, J. Higgins, & D. Altman, Chapter 10: Analysing data and undertaking meta-analyses. Cochrane Training (2023). Link: https://training.cochrane.org/handbook/current/chapter-10
P. Ventura & F. Cruz, Domain Specificity vs. Domain Generality: The Case of Faces and Words. Vision, 8(1), 1 (2024). DOI: https://doi.org/10.3390/vision8010001