Central bank communication is a pivotal component in supporting economic and monetary policy in many countries. Its efficacy shapes market perception and the credibility of monetary policy, which calls for analytical tools to assess it. This study develops CentralBankCorpus, the first multi-faceted dataset in Indonesia designed for comprehensive analysis of monetary policy and central bank communication. The study employed a document analysis method with a labeling technique. It began by collecting official Bank Indonesia communication documents through transcription and scraping. The collected data were then pre-processed and labeled with six linguistic tags. The resulting CentralBankCorpus comprises nearly half a million linguistically tagged tokens, spanning economic agent, topic, sentiment, transparency, key terms, and economic impact. The dataset is expected to contribute on several fronts. Academically, it can serve as a primary reference for NLP-focused research in economics, public policy, and organizational communication. Practically, it can assist Bank Indonesia in understanding and addressing public perceptions of its policies, thereby enhancing institutional accountability. The research also supports Bank Indonesia's digital transformation through an innovative application of NLP technology. Furthermore, it addresses a gap in the literature and contributes to Indonesia's economic development, while strengthening the nation's role in using modern technology for policy communication more broadly.
                    Developing a Multifaceted Central Bank Communication Dataset for Natural Language Processing-Driven Economic Analysis
                
                                    
                
                
                    Published:
13 June 2025
by MDPI
in The 1st International Online Conference on Risk and Financial Management
session Machine Learning in Economics and Finance
                
                
                
                    Abstract: 
                                    
                        Keywords: Central Bank; corpus; dataset; economy; natural language processing
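
The six tag facets named in the abstract suggest a per-token record with one optional label per facet. The following is a minimal sketch of such a record and a simple per-facet tally, not the authors' actual schema: the class name `TaggedToken`, the helper `facet_counts`, the field names, and every label value in the sample (for example `"central_bank"` or `"hawkish"`) are assumptions made for illustration, since the source does not list the label inventory.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

# Facet names taken from the abstract; the snake_case spelling is an assumption.
FACETS = (
    "economic_agent",
    "topic",
    "sentiment",
    "transparency",
    "key_term",
    "economic_impact",
)


@dataclass(frozen=True)
class TaggedToken:
    """Hypothetical record: one token with an optional label per facet."""
    text: str
    economic_agent: Optional[str] = None
    topic: Optional[str] = None
    sentiment: Optional[str] = None
    transparency: Optional[str] = None
    key_term: Optional[str] = None
    economic_impact: Optional[str] = None


def facet_counts(tokens: Iterable[TaggedToken], facet: str) -> Counter:
    """Count how often each label occurs for one facet, skipping untagged tokens."""
    if facet not in FACETS:
        raise ValueError(f"unknown facet: {facet!r}")
    labels = (getattr(token, facet) for token in tokens)
    return Counter(label for label in labels if label is not None)


# Illustrative sample only; these labels are invented, not drawn from the corpus.
sample = [
    TaggedToken("Bank", economic_agent="central_bank"),
    TaggedToken("Indonesia", economic_agent="central_bank"),
    TaggedToken("raises", topic="interest_rate", sentiment="hawkish"),
    TaggedToken("rates", topic="interest_rate", key_term="policy_rate"),
]

topic_counts = facet_counts(sample, "topic")
```

Keeping each facet as a separate optional field lets a token carry labels on some facets and none on others, which matches the abstract's description of tokens tagged across several independent dimensions.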