Assessing the capability of open government data for process mining
a study on a Brazilian open data portal
DOI: https://doi.org/10.59490/dgo.2025.1041
Keywords: Digital Government, Process Mining, Open Government Data, Public Service Optimization
Abstract
The growth in Open Government Data (OGD) availability over the last decade offers significant potential for promoting transparency and improving operational efficiency in government. OGD can support the identification of inefficiencies and the extraction of process models through Process Mining (PM) applied to public service tasks. Various OGD portals, maintained by federal and subnational governments, provide access to diverse public datasets that can be explored for these purposes. Applying PM to OGD, however, requires addressing key aspects such as the tabular nature of data, variability in quality and standardization, and the frequent lack of process context. This study proposes a method to classify and evaluate OGD datasets based on their relevance and potential for process discovery. The method was applied to data from the Brazilian Federal OGD portal (dados.gov.br) to investigate whether OGD can be effectively used in PM tasks to identify processes and bottlenecks. The research involved steps of data collection, selection, and evaluation. Datasets were classified based on their potential utility and suitability for transformation into event logs. The results showed that 24% of the sampled datasets were considered relevant for PM, with 23% in dated tabular format and 1% already in event log structure. Moreover, 52% of the datasets addressed public policies, suggesting the potential of PM to reveal inefficiencies in processes that affect citizens. These findings demonstrate that, with structured evaluation criteria, PM can be effectively applied to public data. This research contributes by presenting an empirical and replicable method to support the evaluation and preparation of OGD for PM applications in digital government. The results reinforce the importance of adopting clear assessment protocols to broaden the use of OGD in understanding and improving public processes.
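To make the distinction between "dated tabular" datasets and datasets already in event log structure more concrete, the following is a minimal Python sketch. It is not the evaluation method used in the study: the heuristics, the category labels, the accepted date formats, the sample data, and the function names (`parse_date`, `date_columns`, `classify`, `to_event_log`) are all assumptions made for illustration. It treats a table as an event log candidate when it has at least one date column plus columns the analyst designates as case identifier and activity, and as dated tabular when only a date column is present.

```python
import csv
import io
from datetime import datetime

# Assumed date formats; real portal data may use others.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y-%m-%d %H:%M:%S")


def parse_date(value):
    """Return a datetime if the value matches an assumed format, else None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
    return None


def date_columns(rows):
    """Names of columns whose every non-empty value parses as a date."""
    if not rows:
        return []
    found = []
    for name in rows[0]:
        values = [r[name] for r in rows if r[name] and r[name].strip()]
        if values and all(parse_date(v) is not None for v in values):
            found.append(name)
    return found


def classify(rows, case_col=None, activity_col=None):
    """Hypothetical three-way label; not the paper's actual criteria."""
    if not date_columns(rows):
        return "not suitable"
    if case_col in rows[0] and activity_col in rows[0]:
        return "event log"
    return "dated tabular"


def to_event_log(rows, case_col, activity_col, time_col):
    """Build (case, activity, timestamp) events ordered by case, then time."""
    events = [
        (r[case_col], r[activity_col], parse_date(r[time_col])) for r in rows
    ]
    return sorted(events, key=lambda e: (e[0], e[2]))


# Invented sample table, for illustration only.
SAMPLE = """request_id,step,date
1,submitted,2024-01-02
2,submitted,2024-01-03
1,approved,2024-01-10
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

if __name__ == "__main__":
    print(classify(rows, "request_id", "step"))
    for event in to_event_log(rows, "request_id", "step", "date"):
        print(event)
```

In this sketch the analyst supplies the case and activity columns by hand, which reflects the lack of process context noted in the abstract: a date column can be detected automatically, but deciding which column identifies a process instance usually requires domain knowledge.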
License
Copyright (c) 2025 Gyslla Vasconcelos, Jose Viterbo, Flavia Bernardini

This work is licensed under a Creative Commons Attribution 4.0 International License.
