Comparing Machine Learning and an Expert System for Legal Document Classification

DOI:

https://doi.org/10.59490/dgo.2025.947

Keywords:

machine learning, legal document classification, expert systems, overfitting, natural language processing

Abstract

This study compares the performance of machine learning models and a rule-based expert system in classifying legal documents, specifically in distinguishing relevant from irrelevant cases. The evaluated models include Random Forest, Naive Bayes, XGBoost, SVM, and Decision Tree, alongside an expert system developed by a State Attorney from PGE-PE. The datasets, representing the Alvará, Arrolamento, and Inventário legal processes, contain labeled instances of legal cases. The models were assessed on accuracy, precision, recall, and F1-score. The results suggest that while the machine learning models, particularly Random Forest, achieve higher accuracy and precision, the expert system outperforms them in recall and F1-score, ensuring that no relevant cases are overlooked. The choice between machine learning models and an expert system therefore depends on the legal context and requires balancing efficiency (reducing false positives) against reliability (capturing all relevant cases).
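
The trade-off described in the abstract can be made concrete with a small sketch of the four evaluation metrics. The function name `binary_metrics` and the toy label lists are hypothetical and are not taken from the study's data or code; they only illustrate how one classifier can lead on accuracy and precision while another leads on recall and F1-score.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the positive class (1 = relevant)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # relevant, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # irrelevant, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # relevant, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # irrelevant, skipped

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels (NOT results from the study): 1 = relevant, 0 = irrelevant.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# A cautious classifier: no false positives, but it misses two relevant cases.
cautious = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# An inclusive classifier: catches every relevant case, with three false positives.
inclusive = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

for name, y_pred in (("cautious", cautious), ("inclusive", inclusive)):
    scores = binary_metrics(y_true, y_pred)
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

On these toy labels the cautious classifier scores higher on accuracy and precision, while the inclusive one scores higher on recall and F1-score. This is the same pattern the abstract reports, though the numbers here carry no evidential weight.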

Published

2025-05-19

How to Cite

de Queiroz Santos Filho, J. J., Araújo Dantas, F., da Silva Lima, M., Barbosa dos Santos, S., Genesis, G., Lima da Salva, M. G., Farias Pinheiro, Álvaro, & Galdino da Silva, E. (2025). Comparing Machine Learning and an Expert System for Legal Document Classification. Conference on Digital Government Research, 1. https://doi.org/10.59490/dgo.2025.947