Multilingual X/Twitter sentiment analysis of geopolitical risk using Granger causality, focusing on the Ukraine war and financial markets
DOI: https://doi.org/10.29329/jsomer.23
Keywords: X / Twitter, Ukraine War, Sentiment Analysis, Financial Market Analysis, Multilingual Analysis
Abstract
This paper investigates changes in financial assets and markets from December 1st, 2021, to April 30th, 2022, during the start of the Ukraine War. These dates run roughly from the prelude to the war in December 2021 to a few weeks after Russian troops withdrew from the Kyiv area on April 7th, 2022. We used the Goldstein 1992 Results Table to create positive and negative geopolitical risk bigrams (Goldstein, 1992). With these bigrams, we collected over 3.6 million tweets during the research period in seven languages (English, Spanish, French, Portuguese, Arabic, Japanese, and Korean) to capture the worldwide reaction to the Ukraine War. Using various sentiment analysis methods, we constructed a daily time series of changes in geopolitical risk sentiment and explored its relationship to 39 financial assets and markets at various time lags. Granger causality tests indicated that the geopolitical risk time series contained predictive information about changes in several assets and markets.
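As an illustration of the kind of lagged test the abstract refers to, the following minimal sketch computes the F statistic of a bivariate Granger causality test. It is not the authors' pipeline: the data are synthetic, and the series names, the coefficient 0.8, the lag of 2, the sample length of 150 days, and the helper name granger_f_stat are all assumptions made for this example. Only NumPy is used.

```python
import numpy as np


def granger_f_stat(y, x, lag):
    """F statistic for H0: lagged x adds nothing to an AR(lag) model of y.

    Hypothetical helper for illustration; not taken from the paper.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(y) - lag                      # usable observations
    target = y[lag:]

    # Column k-1 holds the series shifted back by k days (k = 1..lag).
    y_lags = np.column_stack([y[lag - k:len(y) - k] for k in range(1, lag + 1)])
    x_lags = np.column_stack([x[lag - k:len(x) - k] for k in range(1, lag + 1)])
    const = np.ones((n, 1))

    restricted = np.hstack([const, y_lags])            # own lags only
    unrestricted = np.hstack([const, y_lags, x_lags])  # own lags + lags of x

    def ssr(design):
        # Ordinary least squares fit, then sum of squared residuals.
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ beta
        return float(resid @ resid)

    ssr_r, ssr_u = ssr(restricted), ssr(unrestricted)
    df_denom = n - unrestricted.shape[1]
    return ((ssr_r - ssr_u) / lag) / (ssr_u / df_denom)


# Synthetic example (assumed data): yesterday's sentiment drives today's return.
rng = np.random.default_rng(0)
sentiment = rng.normal(size=150)
noise = rng.normal(scale=0.5, size=150)
returns = np.empty(150)
returns[0] = noise[0]
returns[1:] = 0.8 * sentiment[:-1] + noise[1:]

f_forward = granger_f_stat(returns, sentiment, lag=2)  # sentiment -> returns
f_reverse = granger_f_stat(sentiment, returns, lag=2)  # returns -> sentiment
print(f"sentiment -> returns F = {f_forward:.2f}")
print(f"returns -> sentiment F = {f_reverse:.2f}")
```

A large F statistic in one direction and a small one in the other is the pattern that would suggest the sentiment series carries predictive information. A p-value would come from the F distribution with (lag, df_denom) degrees of freedom, and real series would normally be checked for stationarity before testing.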
References
Research on 100 Million Tweets: What It Means for Your Social Media Strategy for Twitter. (2018) Vicinitas https://www.vicinitas.io/blog/twitter-social-media-strategy-2018-research-100-million-tweets#language.
Twarc. (2023) Twarc https://twarc-project.readthedocs.io/en/latest/.
World Map: Simple. (2022) Map Chart https://www.mapchart.net/world.html.
Abouzahra, M., & Tan, J. (2021) Twitter Vs. Zika—the Role of Social Media in Epidemic Outbreaks Surveillance. Health Policy and Technology, vol. 10, no. 1, pp. 174-81, doi:https://doi.org/10.1016/j.hlpt.2020.10.014.
Abraham, J., Higdon, D., Nelson, J., & Ibarra, J. (2018) Cryptocurrency Price Prediction Using Tweet Volumes and Sentiment Analysis. SMU Data Science Review, vol. 1, no. 3, https://scholar.smu.edu/datasciencereview/vol1/iss3/1.
Altig, D., Baker, S. R., Barrero, J. M., Bloom, N., Bunn, P., Chen, S., Davis, S. J., Leather, J., Meyer, B. H., Mihaylov, E., Mizen, P., Parker, N. B., Renault, T., Smietanka, P., & Thwaites, G. (2020) Economic Uncertainty before and During the Covid-19 Pandemic. Working Paper Series, National Bureau of Economic Research, doi:10.3386/w27418.
Amen, S. (2020) Political Market Making: Trading Financial Markets Using Thorfinn Political Indices. Data, Foreign Exchange, General, vol. 2023, Cuemacro, https://www.cuemacro.com/2020/06/26/political-market-making/.
Aroussi, R. (2023) Reliably Download Historical Market Data from Yahoo! Finance with Python. Ran Aroussi https://aroussi.com/post/python-yahoo-finance.
National Sunflower Association. (2023) World Supply & Disappearance. National Sunflower Association https://www.sunflowernsa.com/stats/world-supply/.
Augustop. (2019) Portuguese Tweets for Sentiment Analysis. Kaggle https://www.kaggle.com/datasets/augustop/portuguese-tweets-for-sentiment-analysis?select=TweetsWithTheme.csv.
Baker, S. R., Bloom, N., Davis, S. J., & Renault, T. (2021) Twitter-Derived Measures of Economic Uncertainty. Twitter-based Uncertainty Indices, Economic Policy Uncertainty, May 13th, 2021, pp. 1-14. https://www.policyuncertainty.com/media/Twitter_Uncertainty_5_13_2021.pdf.
Baur, D. G., Hong, K., & Lee, A. D. (2018). Bitcoin: Medium of Exchange or Speculative Assets? Journal of International Financial Markets, Institutions and Money, vol. 54, pp. 177-89, doi:https://doi.org/10.1016/j.intfin.2017.12.004.
Beykikhoshk, A., Arandjelovic, O., Phung, D., & Venkatesh, S. (2015). Using Twitter to Learn About the Autism Community. Social Network Analysis and Mining, vol. 5, no. 1, doi:10.1007/s13278-015-0261-5.
Bollen, J., Mao, H., & Zeng, X. (2011). Twitter Mood Predicts the Stock Market. Journal of Computational Science, vol. 2, no. 1, pp. 1-8, doi:10.1016/j.jocs.2010.12.007.
Brady, W. J., Wills, J. A., Jost, J. T., & Van Bavel, J. J. (2017). Emotion Shapes the Diffusion of Moralized Content in Social Networks. Proceedings of the National Academy of Sciences, vol. 114, no. 28, pp. 7313-18, doi:10.1073/pnas.1618923114.
Burns, J. C. (2024). Automatic-GR GitHub. https://github.com/jb370/Automatic-GR
Caldara, D., & Iacoviello, M. (2022). Measuring Geopolitical Risk. American Economic Review, 112(4), 1194-225, doi:10.1257/aer.20191823.
Cambria, E. (2013). An Introduction to Concept-Level Sentiment Analysis. Advances in Soft Computing and Its Applications. MICAI 2013, Heidelberg, Berlin, doi:https://doi.org/10.1007/978-3-642-45111-9_41.
Cañete, J., Chaperon, G., Fuentes, R., Ho, J. H., Kang, H., & Pérez, J. (2020) BETO: Spanish BERT. ICLR 2020, https://github.com/dccuchile/beto?tab=readme-ov-file
CNBC. (2023). U.S. 2 Year Treasury. CNBC https://www.cnbc.com/quotes/US2Y.
Darkmap. (2016) japanese_sentiment/data, Github.com, https://github.com/Darkmap/japanese_sentiment/tree/master/data
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-Training of Deep Bidirectional Transformers for Language Understanding. Google AI Language. doi:10.48550/arxiv.1810.04805.
Engelberg, J., & Parsons, C. A. (2009). The Causal Impact of Media in Financial Markets. Workshop in Behavioral Finance, Yale University, pp. 1-44. http://www.econ.yale.edu/~shiller/behfin/2009_11/engelberg-parsons.pdf.
Gamebusterz. (2017) xac, French-Sentiment-Analysis-Dataset, Github.com, https://github.com/gamebusterz/French-Sentiment-Analysis-Dataset/blob/master/xac
Gamebusterz. (2017) xaj, French-Sentiment-Analysis-Dataset, Github.com, https://github.com/gamebusterz/French-Sentiment-Analysis-Dataset/blob/master/xaj
Géron, A. (2019). Hands-on Machine Learning with Scikit-Learn and Tensorflow: Concepts, Tools, and Techniques to Build Intelligent Systems. O'Reilly
Gilbert, E., & Karahalios, K. (2010). Widespread Worry and the Stock Market. Proceedings of the International AAAI Conference on Web and Social Media, vol. 4, no. 1, 2010, pp. 58-65, doi:10.1609/icwsm.v4i1.14023.
GoldHub. (2023). Gold Spot Prices. GoldHub https://www.gold.org/goldhub/data/gold-prices.
Goldstein, J. S. (1992). A Conflict-Cooperation Scale for Weis Events Data. The Journal of Conflict Resolution, vol. 36, no. 2, pp. 369-85, https://www.jstor.org/stable/174480.
Granger, C. W. J. (1969). Investigating Causal Relations by Econometric Models and Cross-Spectral Methods. Econometrica, vol. 37, no. 3, p. 424, doi:10.2307/1912791.
Granger, C. W. J. (2003). Time Series Analysis, Cointegration, and Applications. Nobel Prize. https://www.nobelprize.org/uploads/2018/06/granger-lecture.pdf.
Hayes, A. (2023). What Is Price Stickiness? Definition, Triggers, and Example. ECONOMICS. Investopedia https://www.investopedia.com/terms/p/price_stickiness.asp#:~:text=%22Sticky%22%20is%20a%20general%20economics,that%20is%20resistant%20to%20change.
Hutto, C. J., & Gilbert, E. (2014). Vader: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text. International AAAI Conference on Weblogs and Social Media (ICWSM), http://eegilbert.org/papers/icwsm14.vader.hutto.pdf.
Inoue, G., Alhafni, B., Baimukan, N., Bouamor, H., & Habash, N. (2021). The Interplay of Variant, Size, and Task Type in Arabic Pre-Trained Language Models. Proceedings of the Sixth Arabic Natural Language Processing Workshop, Association for Computational Linguistics, https://huggingface.co/CAMeL-Lab/bert-base-arabic-camelbert-mix-sentiment.
Investing.com. (2023a) Germany 10-Year Bond Yield. Investing.com https://www.investing.com/rates-bonds/germany-10-year-bond-yield-historical-data.
Investing.com. (2023b). United States 2-Year Bond Yield. Investing.com https://www.investing.com/rates-bonds/u.s.-2-year-bond-yield-historical-data.
Kleinnijenhuis, J., Schultz, F., Oegema, D., & van Atteveldt, W. (2013) Financial News and Market Panics in the Age of High-Frequency Sentiment Trading Algorithms. Journalism, vol. 14, no. 2, pp. 271-91, doi:10.1177/1464884912468375.
Lee, S., Jang, H., Baik, Y., Park, S., & Shin, H. (2020). KR-BERT: a small-scale Korean-specific language model, arXiv, https://doi.org/10.48550/arXiv.2008.03979
Lhessani, S. (2023). Python: How to Get Live Market Data (Less Than 0.1-Second Lag) Medium https://towardsdatascience.com/python-how-to-get-live-market-data-less-than-0-1-second-lag-c85ee280ed93.
LiveCharts.co.uk. (2023) Live Charts - Crude Oil Chart. LiveCharts https://www.livecharts.co.uk/MarketCharts/crude.php.
Martin, L., Muller, B., Ortiz Suárez, P. J., Dupont, Y., Romary, L., de la Clergerie, É., Seddah, D., & Sagot, B. (2020). CamemBERT: a Tasty French Language Model, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, https://aclanthology.org/2020.acl-main.645/
McClelland, C. (2006). World Event/Interaction Survey (WEIS) Project, 1966-1978. Inter-university Consortium for Political and Social Research [distributor], doi: 10.3886/ICPSR05211.v3
Monitor, Markets. (2023) Metals & Mining Overview. ETF.com https://www.etf.com/topics/metals-mining.
Niu, Z., Wang, C., & Zhang, H. (2023). Forecasting stock market volatility with various geopolitical risks categories: New evidence from machine learning models. International Review of Financial Analysis, vol. 89, October 2023, https://doi.org/10.1016/j.irfa.2023.102738
Nofsinger, J. R. (2005). Social Mood and Financial Economics. Journal of Behavioral Finance, 6(3), 144-60, https://doi.org/10.1207/s15427579jpfm0603_4
Pak, A., & Paroubek, P. (2010) Twitter as a Corpus for Sentiment Analysis and Opinion Mining. Seventh International Conference on Language Resources and Evaluation ({LREC}'10), European Language Resources Association (ELRA), http://www.lrec-conf.org/proceedings/lrec2010/pdf/385_Paper.pdf.
Park, L. (2015) Nsmc. GitHub.com https://github.com/e9t/nsmc.
Pearse, B. (2021) Human and Machine Translation: Both Alive and Kicking — and Here to Stay. Translation & Localization Blog. Smart Cat https://www.smartcat.com/blog/human-and-machine-translation-both-alive-and-kicking-and-here-to-stay/
Pop, C., Bozdog, D., Calugaru, A., & Georgescu, M. A. (2016) Chapter 7 - an Assessment of the Real Development Prospects of the Eu 28 Frontier Equity Markets. Handbook of Frontier Markets, Academic Press, 2016, pp. 117-46, https://doi.org/10.1016/B978-0-12-803776-8.00007-0
Pota, M., Ventura, M., Fujita, H., & Esposito, M. (2021) Multilingual Evaluation of Pre-Processing for Bert-Based Sentiment Analysis of Tweets. Expert Systems with Applications, vol. 181, p. 115-119, https://doi.org/10.1016/j.eswa.2021.115119.
Prabhakaran, S. (2023) Granger Causality Test in Python. Machine Learning Plus https://www.machinelearningplus.com/time-series/granger-causality-test-in-python/.
Rajput, N. K., Grover, B. A., & Rathi, V. K. (2020) Word Frequency and Sentiment Analysis of Twitter Messages During Coronavirus Pandemic. Computing Research Repository (CoRR), vol. abs/2004.03925, https://doi.org/10.48550/arxiv.2004.03925.
Rognone, L., Hyde, S., & Zhang, S. S. (2020) News Sentiment in the Cryptocurrency Market: An Empirical Comparison with Forex. International Review of Financial Analysis, vol. 69, https://doi.org/10.1016/j.irfa.2020.101462.
Smith, A. (2023) 23 Essential Twitter Statistics to Guide Your Strategy in 2023. Sprout Blog. sproutsocial https://sproutsocial.com/insights/twitter-statistics/.
Souza, F., Nogueira, R., & Lotufo, R. (2020) BERTimbau: pretrained BERT models for Brazilian Portuguese, Intelligent Systems: 9th Brazilian Conference, BRACIS 2020, https://dl.acm.org/doi/10.1007/978-3-030-61377-8_28
National Institute of Standards and Technology. (2023) Engineering Statistics Handbook: Stationarity. Process or Product Monitoring and Control. National Institute of Standards and Technology https://www.itl.nist.gov/div898/handbook/pmc/section4/pmc442.htm#:~:text=Stationarity%20can%20be%20defined%20in,no%20periodic%20fluctuations%20(seasonality).
Strycharz, J., Strauss, N., & Trilling, D. (2017) The Role of Media Coverage in Explaining Stock Market Fluctuations: Insights for Strategic Financial Communication. International Journal of Strategic Communication, vol. 12, pp. 1-19, https://doi.org/10.1080/1553118X.2017.1378220.
TASS 2020. (2020) Workshop on Semantic Analysis at Sepln 2020. TASS 2020, http://tass.sepln.org/2020/.
Tetlock, P. C. (2007). Giving Content to Investor Sentiment: The Role of Media in the Stock Market. The Journal of Finance, vol. 62, no. 3, pp. 1139-1168, https://doi.org/10.1111/j.1540-6261.2007.01232.x
Tetlock, P. C., Saar-Tsechansky, M., & Macskassy, S. (2008) More Than Words: Quantifying Language to Measure Firms' Fundamentals. The Journal of Finance, vol. 63, no. 3, pp. 1437-67, doi:10.1111/j.1540-6261.2008.01362.x.
Thurman, W., & Fisher, M. E. (1988). Chickens, Eggs, and Causality, or Which Came First? American Journal of Agricultural Economics, vol. 70, no. 2, pp. 237-38, https://webdoc.agsci.colostate.edu/koontz/arec-econ535/papers/thurman%20fisher%20(ajae%201988).pdf.
Tohoku-nlp (2020). Bert-base-japanese, Hugging Face, https://huggingface.co/tohoku-nlp/bert-base-japanese
Uhl, M. W. (2014). Reuters Sentiment and Stock Returns. Journal of Behavioral Finance, vol. 15, no. 4, pp. 287-98, https://doi.org/10.1080/15427560.2014.967852.
Wold, C. (2023). Top 3 Defense ETFs (PPA, XAR). Investopedia https://www.investopedia.com/news/top-3-defense-etfs-ppa-xar/.
Yilmazkuday, H. (2024). Geopolitical Risk and Stock Prices. Department of Economics, Florida International University, Working Paper 2407, https://economics.fiu.edu/research/working-papers/2024/2407.pdf
YuJeong, S., Yun, D. Y., Hwang, C., & Moon, S. J. (2021). Korean Sentiment Analysis Using Natural Network: Based on Ikea Review Data. International Journal of Internet, Broadcasting and Communication, vol. 13, no. 2, pp. 173-78, https://doi.org/10.7236/IJIBC.2021.13.2.173.
Zach. (2023). How to Perform a Granger-Causality Test in Python. statology https://www.statology.org/granger-causality-test-in-python/.
Zote, J. (2025). 45+ Twitter (X) stats to know in marketing in 2025. Sprout Blog. Sproutsocial https://sproutsocial.com/insights/twitter-statistics/
Copyright (c) 2025 John Burns, Tom Kelsey, Carl Donovan

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.