Comparing Machine Learning and an Expert System for Legal Document Classification
DOI:
https://doi.org/10.59490/dgo.2025.947
Keywords:
machine learning, legal document classification, expert systems, overfitting, natural language processing
Abstract
This study assesses the performance of machine learning models and a rule-based expert system in classifying legal documents, specifically in distinguishing relevant from irrelevant cases. The evaluated models are Random Forest, Naive Bayes, XGBoost, SVM, and Decision Tree, alongside an expert system developed by a State Attorney from PGE-PE. The datasets, which represent the Alvará, Arrolamento, and Inventário legal processes, contain labeled instances of legal cases. The models were assessed on accuracy, precision, recall, and F1-score. The results suggest that while the machine learning models, particularly Random Forest, achieve higher accuracy and precision, the expert system outperforms them in recall and F1-score, ensuring that no relevant cases are overlooked. The choice between machine learning models and expert systems therefore depends on the legal context, and requires a balance between efficiency (reducing false positives) and reliability (capturing all relevant cases).
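The four metrics named in the abstract can be illustrated with a minimal sketch. The function name `binary_metrics` and the toy labels below are hypothetical and are not taken from the study's datasets or results; they only show how a cautious classifier can score higher on accuracy and precision while an inclusive one scores higher on recall and F1-score.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for labels in {0, 1},
    where 1 means "relevant case"."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical toy data: 4 relevant cases, 6 irrelevant ones.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# A cautious classifier: no false positives, but it misses two relevant cases.
cautious = binary_metrics(y_true, [1, 1, 0, 0, 0, 0, 0, 0, 0, 0])

# An inclusive classifier: catches every relevant case, with three false positives.
inclusive = binary_metrics(y_true, [1, 1, 1, 1, 1, 1, 1, 0, 0, 0])

print(cautious)
print(inclusive)
```

On this toy data the cautious classifier has the higher accuracy and precision, and the inclusive one has the higher recall and F1-score, which is the kind of trade-off the abstract describes.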
License
Copyright (c) 2025 José Jorge de Queiroz Santos Filho, Filipe Araújo Dantas, Melquezedeque da Silva Lima, Shirley Barbosa dos Santos, Galileu Genesis, Maria Gabriely Lima da Salva, Álvaro Farias Pinheiro, Eraylson Galdino da Silva

This work is licensed under a Creative Commons Attribution 4.0 International License.