Multilingual X/Twitter sentiment analysis of geopolitical risk using granger causality focusing on the Ukraine war and financial markets

Authors

  • John Burns University of St Andrews, Scotland, United Kingdom https://orcid.org/0000-0002-2325-0235
  • Tom Kelsey School of Computer Science, University of St Andrews, St Andrews, Scotland, United Kingdom https://orcid.org/0000-0002-8091-1458
  • Carl Donovan School of Mathematics and Statistics, University of St Andrews, St Andrews, Scotland, United Kingdom

DOI:

https://doi.org/10.29329/jsomer.23

Keywords:

X / Twitter, Ukraine War, Sentiment Analysis, Financial Market Analysis, Multilingual Analysis

Abstract

This paper investigates changes in financial assets and markets from December 1st, 2021, to April 30th, 2022, the opening phase of the Ukraine War. This period runs roughly from the prelude to the war in December 2021 to a few weeks after Russian troops withdrew from the Kyiv area on April 7th, 2022. We used the Goldstein 1992 Results Table to create Positive and Negative Geopolitical Risk bigrams (Goldstein, 1992). With these bigrams, we collected over 3.6 million tweets from the research period in seven languages (English, Spanish, French, Portuguese, Arabic, Japanese, and Korean) to capture the worldwide reaction to the Ukraine War. Using various sentiment analysis methods, we constructed a time series of daily changes in Geopolitical Risk sentiment and explored its relationship to 39 financial assets and markets at various time lags. Granger causality tests indicated that the geopolitical risk time series contained predictive information about changes in several of these assets and markets.
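As an illustration of the final step described in the abstract, the sketch below computes the F-statistic behind a Granger causality test between two daily series. It is a minimal, standard-library example under stated assumptions, not the authors' implementation: the data are synthetic, the names `ols_rss`, `granger_f`, `sentiment`, and `returns` are hypothetical, and the lag of 2 and series length of 150 are arbitrary choices. A p-value would additionally require the F distribution, which is omitted here.

```python
import random


def ols_rss(X, y):
    """Residual sum of squares of an OLS fit of y on the columns of X."""
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting (assumes X has full column rank)
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution
    beta = [0.0] * k
    for i in reversed(range(k)):
        beta[i] = (b[i] - sum(A[i][j] * beta[j] for j in range(i + 1, k))) / A[i][i]
    return sum(
        (yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
        for row, yi in zip(X, y)
    )


def granger_f(x, y, lag):
    """F-statistic for H0: lags of x add nothing to predicting y beyond y's own lags."""
    n = len(y)
    target = y[lag:]
    # Restricted model: intercept plus lags 1..lag of y
    restricted = [[1.0] + [y[t - i] for i in range(1, lag + 1)] for t in range(lag, n)]
    # Full model: restricted regressors plus lags 1..lag of x
    full = [
        row + [x[t - i] for i in range(1, lag + 1)]
        for row, t in zip(restricted, range(lag, n))
    ]
    rss_r = ols_rss(restricted, target)
    rss_f = ols_rss(full, target)
    dof = len(target) - (2 * lag + 1)  # observations minus full-model parameters
    return ((rss_r - rss_f) / lag) / (rss_f / dof)


# Synthetic data (an assumption for illustration): returns follow yesterday's sentiment.
random.seed(0)
sentiment = [random.gauss(0, 1) for _ in range(150)]
returns = [0.0] + [0.8 * sentiment[t - 1] + random.gauss(0, 0.5) for t in range(1, 150)]

f_forward = granger_f(sentiment, returns, lag=2)  # sentiment -> returns
f_reverse = granger_f(returns, sentiment, lag=2)  # returns -> sentiment
print(f"sentiment -> returns F = {f_forward:.2f}")
print(f"returns -> sentiment F = {f_reverse:.2f}")
```

A large forward statistic alongside a small reverse one is the pattern that suggests the first series carries predictive information about the second. In practice an established statistics package would normally be used, and both series would first be checked for stationarity.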

References

Research on 100 Million Tweets: What It Means for Your Social Media Strategy for Twitter. (2018) Vicinitas https://www.vicinitas.io/blog/twitter-social-media-strategy-2018-research-100-million-tweets#language.

Twarc. (2023) Twarc https://twarc-project.readthedocs.io/en/latest/.

World Map: Simple. (2022) Map Chart https://www.mapchart.net/world.html.

Abouzahra, M., & Tan, J. (2021) Twitter Vs. Zika—the Role of Social Media in Epidemic Outbreaks Surveillance. Health Policy and Technology, vol. 10, no. 1, pp. 174-81, doi:https://doi.org/10.1016/j.hlpt.2020.10.014.

Abraham, J., Higdon, D., Nelson, J., & Ibarra, J. (2018). Cryptocurrency Price Prediction Using Tweet Volumes and Sentiment Analysis. SMU Data Science Review, vol. 1, no. 3, https://scholar.smu.edu/datasciencereview/vol1/iss3/1.

Altig, D., Baker, S. R., Barrero, J. M., Bloom, N., Bunn, P., Chen, S., Davis, S. J., Leather, J., Meyer, B. H., Mihaylov, E., Mizen, P., Parker, N. B., Renault, T., Smietanka, P., & Thwaites, G. (2020). Economic Uncertainty before and During the Covid-19 Pandemic. Working Paper Series, National Bureau of Economic Research, doi:10.3386/w27418.

Amen, S. (2020) Political Market Making: Trading Financial Markets Using Thorfinn Political Indices. Data, Foreign Exchange, General, vol. 2023, Cuemacro, https://www.cuemacro.com/2020/06/26/political-market-making/.

Aroussi, R. (2023). Reliably Download Historical Market Data from Yahoo! Finance with Python. Ran Aroussi https://aroussi.com/post/python-yahoo-finance.

Association, National Sunflower (2023) World Supply & Disappearance. National Sunflower Association https://www.sunflowernsa.com/stats/world-supply/.

Augustop. (2019). Portuguese Tweets for Sentiment Analysis. Kaggle https://www.kaggle.com/datasets/augustop/portuguese-tweets-for-sentiment-analysis?select=TweetsWithTheme.csv.

Baker, S. R., Bloom, N., Davis, S. J., & Renault, T. (2021). Twitter-Derived Measures of Economic Uncertainty. Twitter-based Uncertainty Indices, Economic Policy Uncertainty, May 13th, 2021, pp. 1-14. general editor, Economic Policy Uncertainty, https://www.policyuncertainty.com/media/Twitter_Uncertainty_5_13_2021.pdf.

Baur, D. G., Hong, K., & Lee, A. D. (2018). Bitcoin: Medium of Exchange or Speculative Assets? Journal of International Financial Markets, Institutions and Money, vol. 54, pp. 177-89, doi:https://doi.org/10.1016/j.intfin.2017.12.004.

Beykikhoshk, A., Arandjelovic, O., Phung, D., & Venkatesh, S. (2015). Using Twitter to Learn About the Autism Community. Social Network Analysis and Mining, vol. 5, no. 1, doi:10.1007/s13278-015-0261-5.

Bollen, J., Mao, H., & Zeng, X. (2011). Twitter Mood Predicts the Stock Market. Journal of Computational Science, vol. 2, no. 1, pp. 1-8, doi:10.1016/j.jocs.2010.12.007.

Brady, W. J., Wills, J. A., Jost, J. T., & Van Bavel, J. J. (2017). Emotion Shapes the Diffusion of Moralized Content in Social Networks. Proceedings of the National Academy of Sciences, vol. 114, no. 28, pp. 7313-18, doi:10.1073/pnas.1618923114.

Burns, J. C. (2024). Automatic-GR GitHub. https://github.com/jb370/Automatic-GR

Caldara, D., & Iacoviello, M. (2022). Measuring Geopolitical Risk. American Economic Review, 112(4), 1194-225, doi:10.1257/aer.20191823.

Cambria, E. (2013). An Introduction to Concept-Level Sentiment Analysis. Advances in Soft Computing and Its Applications. MICAI 2013, Heidelberg, Berlin, doi:https://doi.org/10.1007/978-3-642-45111-9_41.

Cañete, J., Chaperon, G., Fuentes, R., Ho, J. H., Kang, H., & Pérez, J. (2020) BETO: Spanish BERT. ICLR 2020, https://github.com/dccuchile/beto?tab=readme-ov-file

CNBC. (2023). U.S. 2 Year Treasury. CNBC https://www.cnbc.com/quotes/US2Y.

Darkmap. (2016) japanese_sentiment/data, Github.com, https://github.com/Darkmap/japanese_sentiment/tree/master/data

Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. Google AI Language. doi:10.48550/arxiv.1810.04805.

Engelberg, J., & Parsons, C. A. (2009). The Causal Impact of Media in Financial Markets. Workshop in Behavioral Finance, Yale University, pp. 1-44. http://www.econ.yale.edu/~shiller/behfin/2009_11/engelberg-parsons.pdf.

Gamebusterz. (2017). xac, French-Sentiment-Analysis-Dataset, Github.com, https://github.com/gamebusterz/French-Sentiment-Analysis-Dataset/blob/master/xac

Gamebusterz. (2017). xaj, French-Sentiment-Analysis-Dataset, Github.com, https://github.com/gamebusterz/French-Sentiment-Analysis-Dataset/blob/master/xaj

Géron, A. (2019). Hands-on Machine Learning with Scikit-Learn and Tensorflow: Concepts, Tools, and Techniques to Build Intelligent Systems. O'Reilly

Gilbert, E., & Karahalios, K. (2010). Widespread Worry and the Stock Market. Proceedings of the International AAAI Conference on Web and Social Media, vol. 4, no. 1, 2010, pp. 58-65, doi:10.1609/icwsm.v4i1.14023.

GoldHub. (2023). Gold Spot Prices. GoldHub https://www.gold.org/goldhub/data/gold-prices.

Goldstein, J. S. (1992). A Conflict-Cooperation Scale for Weis Events Data. The Journal of Conflict Resolution, vol. 36, no. 2, pp. 369-85, https://www.jstor.org/stable/174480.

Granger, C. W. J. (1969). Investigating Causal Relations by Econometric Models and Cross-Spectral Methods. Econometrica, vol. 37, no. 3, p. 424, doi:10.2307/1912791.

Granger, C. W. J. (2003). Time Series Analysis, Cointegration, and Applications. Nobel Prize. https://www.nobelprize.org/uploads/2018/06/granger-lecture.pdf.

Hayes, A. (2023). What Is Price Stickiness? Definition, Triggers, and Example. ECONOMICS. Investopedia https://www.investopedia.com/terms/p/price_stickiness.asp#:~:text=%22Sticky%22%20is%20a%20general%20economics,that%20is%20resistant%20to%20change.

Hutto, C. J., & Gilbert, E. (2014). Vader: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text. International AAAI Conference on Weblogs and Social Media (ICWSM), http://eegilbert.org/papers/icwsm14.vader.hutto.pdf.

Inoue, G., Alhafni, B., Baimukan, N., Bouamor, H., & Habash, N. (2021). The Interplay of Variant, Size, and Task Type in Arabic Pre-Trained Language Models. Proceedings of the Sixth Arabic Natural Language Processing Workshop, Association for Computational Linguistics, https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-mix-sentiment.

Investing.com. (2023a). Germany 10-Year Bond Yield. Investing.com https://www.investing.com/rates-bonds/germany-10-year-bond-yield-historical-data.

Investing.com. (2023b). United States 2-Year Bond Yield. Investing.com https://www.investing.com/rates-bonds/u.s.-2-year-bond-yield-historical-data.

Kleinnijenhuis, J., Schultz, F., Oegema, D., & van Atteveldt, W. (2013). Financial News and Market Panics in the Age of High-Frequency Sentiment Trading Algorithms. Journalism, vol. 14, no. 2, pp. 271-91, doi:10.1177/1464884912468375.

Lee, S., Jang, H., Baik, Y., Park, S., & Shin, H. (2020). KR-BERT: a small-scale Korean-specific language model, arXiv, https://doi.org/10.48550/arXiv.2008.03979

Lhessani, S. (2023). Python: How to Get Live Market Data (Less Than 0.1-Second Lag) Medium https://towardsdatascience.com/python-how-to-get-live-market-data-less-than-0-1-second-lag-c85ee280ed93.

LiveCharts.co.uk. (2023) Live Charts - Crude Oil Chart. LiveCharts https://www.livecharts.co.uk/MarketCharts/crude.php.

Martin, L., Muller, B., Ortiz Suárez, P. J., Dupont, Y., Romary, L., de la Clergerie, É., Seddah, D., & Sagot, B. (2020). CamemBERT: a tasty French language model, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, https://aclanthology.org/2020.acl-main.645/

McClelland, C. (2006). World Event/Interaction Survey (WEIS) Project, 1966-1978. Inter-university Consortium for Political and Social Research [distributor], doi: 10.3886/ICPSR05211.v3

Monitor, Markets. (2023) Metals & Mining Overview. ETF.com https://www.etf.com/topics/metals-mining.

Niu, Z., Wang, C., & Zhang, H. (2023). Forecasting stock market volatility with various geopolitical risks categories: New evidence from machine learning models. International Review of Financial Analysis, vol. 89, October 2023, https://doi.org/10.1016/j.irfa.2023.102738

Nofsinger, J. R. (2005). Social Mood and Financial Economics. Journal of Behavioral Finance, 6(3), 144-60, https://doi.org/10.1207/s15427579jpfm0603_4

Pak, A., & Paroubek, P. (2010) Twitter as a Corpus for Sentiment Analysis and Opinion Mining. Seventh International Conference on Language Resources and Evaluation ({LREC}'10), European Language Resources Association (ELRA), http://www.lrec-conf.org/proceedings/lrec2010/pdf/385_Paper.pdf.

Park, L. (2015) Nsmc. GitHub.com https://github.com/e9t/nsmc.

Pearse, B. (2021) Human and Machine Translation: Both Alive and Kicking — and Here to Stay. Translation & Localization Blog. Smart Cat https://www.smartcat.com/blog/human-and-machine-translation-both-alive-and-kicking-and-here-to-stay/

Pop, C., Bozdog, D., Calugaru, A., & Georgescu, M. A. (2016). Chapter 7 - an Assessment of the Real Development Prospects of the EU 28 Frontier Equity Markets. Handbook of Frontier Markets, Academic Press, 2016, pp. 117-46, https://doi.org/10.1016/B978-0-12-803776-8.00007-0

Pota, M., Ventura, M., Fujita, H., & Esposito, M. (2021). Multilingual Evaluation of Pre-Processing for BERT-Based Sentiment Analysis of Tweets. Expert Systems with Applications, vol. 181, article 115119, https://doi.org/10.1016/j.eswa.2021.115119.

Prabhakaran, S. (2023) Granger Causality Test in Python. Machine Learning Plus https://www.machinelearningplus.com/time-series/granger-causality-test-in-python/.

Rajput, N. K., Grover, B. A., & Rathi, V. K. (2020) Word Frequency and Sentiment Analysis of Twitter Messages During Coronavirus Pandemic. Computing Research Repository (CoRR), vol. abs/2004.03925, https://doi.org/10.48550/arxiv.2004.03925.

Rognone, L., Hyde, S., & Zhang, S. S. (2020) News Sentiment in the Cryptocurrency Market: An Empirical Comparison with Forex. International Review of Financial Analysis, vol. 69, https://doi.org/10.1016/j.irfa.2020.101462.

Smith, A. (2023) 23 Essential Twitter Statistics to Guide Your Strategy in 2023. Sprout Blog. sproutsocial https://sproutsocial.com/insights/twitter-statistics/.

Souza, F., Nogueira, R., & Lotufo, R. (2020) BERTimbau: pretrained BERT models for Brazilian Portuguese, Intelligent Systems: 9th Brazilian Conference, BRACIS 2020, https://dl.acm.org/doi/10.1007/978-3-030-61377-8_28

Standards, National Institute of. (2023). Engineering Statistics Handbook: Stationarity. Process or Product Monitoring and Control. National Institute of Standards https://www.itl.nist.gov/div898/handbook/pmc/section4/pmc442.htm#:~:text=Stationarity%20can%20be%20defined%20in,no%20periodic%20fluctuations%20(seasonality).

Strycharz, J., Strauss, N., & Trilling, D. (2017) The Role of Media Coverage in Explaining Stock Market Fluctuations: Insights for Strategic Financial Communication. International Journal of Strategic Communication, vol. 12, pp. 1-19, https://doi.org/10.1080/1553118X.2017.1378220.

TASS 2020. (2020) Workshop on Semantic Analysis at Sepln 2020. TASS 2020, http://tass.sepln.org/2020/.

Tetlock, P. C. (2007). Giving Content to Investor Sentiment: The Role of Media in the Stock Market. The Journal of Finance, vol. 62, no. 3, pp. 1139-1168, https://doi.org/10.1111/j.1540-6261.2007.01232.x

Tetlock, P. C., Saar-Tsechansky, M., & Macskassy, S. (2008). More Than Words: Quantifying Language to Measure Firms' Fundamentals. The Journal of Finance, vol. 63, no. 3, pp. 1437-67, doi:10.1111/j.1540-6261.2008.01362.x.

Thurman, W., & Fisher, M. E. (1988). Chickens, Eggs, and Causality, or Which Came First? American Journal of Agricultural Economics, vol. 70, no. 2, pp. 237-38, https://webdoc.agsci.colostate.edu/koontz/arec-econ535/papers/thurman%20fisher%20(ajae%201988).pdf.

Tohoku-nlp (2020). Bert-base-japanese, Hugging Face, https://huggingface.co/tohoku-nlp/bert-base-japanese

Uhl, M. W. (2014). Reuters Sentiment and Stock Returns. Journal of Behavioral Finance, vol. 15, no. 4, pp. 287-98, https://doi.org/10.1080/15427560.2014.967852.

Wold, C. (2023). Top 3 Defense ETFs (PPA, XAR). Investopedia https://www.investopedia.com/news/top-3-defense-etfs-ppa-xar/.

Yilmazkuday, H. (2024). Geopolitical Risk and Stock Prices. Department of Economics, Florida International University, Working Paper 2407, https://economics.fiu.edu/research/working-papers/2024/2407.pdf

YuJeong, S., Yun, D. Y., Hwang, C., & Moon, S. J. (2021). Korean Sentiment Analysis Using Natural Network: Based on Ikea Review Data. International Journal of Internet, Broadcasting and Communication, vol. 13, no. 2, pp. 173-78, https://doi.org/10.7236/IJIBC.2021.13.2.173.

Zach. (2023). How to Perform a Granger-Causality Test in Python. statology https://www.statology.org/granger-causality-test-in-python/.

Zote, J. (2025). 45+ Twitter (X) stats to know in marketing in 2025. Sprout Blog. Sproutsocial https://sproutsocial.com/insights/twitter-statistics/

Published

02.06.2025

How to Cite

Burns, J., Kelsey, T., & Donovan, C. (2025). Multilingual X/Twitter sentiment analysis of geopolitical risk using granger causality focusing on the Ukraine war and financial markets. Journal of Social Media Research, 2(2), 122–138. https://doi.org/10.29329/jsomer.23

Section

Review Article