Developing a Multifaceted Central Bank Communication Dataset for Natural Language Processing-Driven Economic Analysis

Hasta Pradana; Ardik Ardianto; Abdurrahman Thaha; Kurnia Kasmiarno

Hasta Dwi Pradana *¹, Ardik Ardianto ², Abdurrahman Rahim Thaha ³, Kurnia Sari Kasmiarno ⁴

¹ Department of Development Economics, Universitas Terbuka, 15437, Indonesia
² Department of English Language and Literature, Universitas Terbuka, 15437, Indonesia
³ Department of Business Administration, Universitas Terbuka, 15437, Indonesia
⁴ Department of Sharia Economics, Universitas Terbuka, 15437, Indonesia

Academic Editor: Thanasis Stengos

Published: 13 June 2025 by MDPI in The 1st International Online Conference on Risk and Financial Management, session: Machine Learning in Economics and Finance

Abstract:

Central bank communication is a pivotal component in supporting economic and monetary policy in many countries. Its efficacy affects market perception and the credibility of monetary policy, which makes analytical tools for assessing it necessary. This study develops CentralBankCorpus, the first multifaceted dataset in Indonesia designed for comprehensive analysis of monetary policy and central bank communication. The study employed a document analysis method with a labeling technique. Official Bank Indonesia communication documents were first collected through transcription and web scraping. The collected data were then pre-processed and labeled with six linguistic tags. The resulting CentralBankCorpus comprises nearly half a million linguistically tagged tokens, with tags covering economic agent, topic, sentiment, transparency, key terms, and economic impact. The dataset is expected to contribute in several ways. Academically, it is intended to serve as a primary reference for NLP-focused research in economics, public policy, and organizational communication. Practically, it can assist Bank Indonesia in understanding and addressing public perceptions of its policies, thereby enhancing institutional accountability. The research also supports Bank Indonesia’s digital transformation through the innovative application of NLP technology. Furthermore, it addresses a gap in the literature and contributes to Indonesia’s economic development, while strengthening the nation’s role in using modern technology for policy communication at a broader level.
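
To make the six-layer tagging concrete, the following is a minimal sketch of how one tagged token might be represented and summarized. The abstract names only the six tag layers; the class name, field names, helper function, and every label value below are hypothetical and do not reflect the actual CentralBankCorpus annotation scheme or file format.

```python
from collections import Counter
from dataclasses import dataclass, fields


# One token carrying the six tag layers named in the abstract.
# Field names and label values are illustrative assumptions only.
@dataclass(frozen=True)
class TaggedToken:
    token: str
    economic_agent: str
    topic: str
    sentiment: str
    transparency: str
    key_term: str
    economic_impact: str


# Names of the six tag layers, derived from the dataclass definition.
TAG_LAYERS = [f.name for f in fields(TaggedToken) if f.name != "token"]


def label_counts(tokens, layer):
    """Count how often each label occurs within a single tag layer."""
    if layer not in TAG_LAYERS:
        raise ValueError(f"unknown tag layer: {layer}")
    return Counter(getattr(t, layer) for t in tokens)


# Hypothetical rows, invented purely for illustration.
sample = [
    TaggedToken("Bank", "central_bank", "monetary_policy",
                "neutral", "explicit", "no", "none"),
    TaggedToken("raised", "central_bank", "monetary_policy",
                "negative", "explicit", "no", "contractionary"),
    TaggedToken("rates", "central_bank", "monetary_policy",
                "neutral", "explicit", "yes", "contractionary"),
]

sentiment_summary = label_counts(sample, "sentiment")
```

A flat record per token keeps each tag layer independently queryable, so a per-layer summary such as a sentiment distribution reduces to a single counting pass over the tokens.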

Keywords: Central Bank; corpus; dataset; economy; natural language processing
