Legal Document Classification into High-Frequency Procedural Categories Using Machine Learning
1 PPGEEL - Postgraduate Program in Electrical Engineering, State University of Amazonas, Manaus, Amazonas, Brazil
Academic Editor: Lucia Billeci

Abstract:

The steadily growing number of cases submitted to the judiciary places a significant burden on the court system and makes it increasingly difficult to analyze, identify, and classify similar actions within large, complex datasets. Performed manually, this task is time-consuming and prone to human error, which can compromise the efficiency and reliability of judicial procedures. This study investigates Natural Language Processing (NLP) techniques for automating the classification of procedural acts into high-frequency categories. Two word-embedding models, Continuous Bag of Words (CBOW) and Skip-gram, were used to represent the text, and Logistic Regression was used for supervised classification. The dataset comprises approximately 311,000 legal documents from the Court of Justice of the State of Amazonas (TJAM) and was processed through a pipeline of automated web scraping, text preprocessing, vocabulary construction, and model training. The models achieved an accuracy of 95% and an F1-score of 95%, indicating that combining NLP with machine learning can support procedural management. By automating repetitive, labor-intensive classification, the approach reduces processing time and human workload and allows judicial institutions to direct more resources to complex cases that require expert judgment, which can help improve efficiency, reduce backlog, and enhance access to justice.

Keywords: Natural Language Processing, Text Classification, Legal Classification
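
The abstract names CBOW and Skip-gram without describing how they differ. As a rough illustration only, the sketch below shows how each model frames its prediction task over a token window: CBOW predicts the center word from its surrounding context, while Skip-gram predicts each context word from the center word. The function names, the window size, and the sample tokens are hypothetical and are not taken from the study. The sketch does not reproduce the authors' training procedure or the Logistic Regression classifier.

```python
from typing import List, Tuple


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[Tuple[str, ...], str]]:
    """CBOW framing: the surrounding context words jointly predict the center word."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = tuple(left + right)
        if context:  # skip tokens with no neighbors (single-token input)
            pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-gram framing: the center word predicts each context word separately."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical, already preprocessed tokens standing in for a procedural act.
tokens = ["court", "grants", "request", "for", "extension"]

cbow = cbow_pairs(tokens, window=1)
skip = skipgram_pairs(tokens, window=1)

for context, target in cbow:
    print("CBOW     ", context, "->", target)
for center, context_word in skip:
    print("Skip-gram", center, "->", context_word)
```

In a pipeline of the kind the abstract describes, vectors learned from such pairs would typically be combined into one feature vector per document before classification. How the authors perform that step is not stated in the abstract.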