Data-Driven Analysis for Improving Educational Policies
A Case in Brazil’s Textbook Program
DOI: https://doi.org/10.59490/dgo.2025.1023
Keywords: Official Approval, Data Science, Descriptive Analysis, PNLD
Abstract
Data analytics can support evidence-based decision-making in public policy by enabling the identification of patterns, the forecasting of needs, and the prioritization of actions. Data-driven analysis can therefore help policymakers redesign and enhance educational policies, such as textbook distribution for public schools. However, there is no consensus on a structured approach to descriptive analytics in this context. This study presents a descriptive approach to textual data analysis aimed at improving policy implementation and monitoring, with a focus on the effort, productivity, and quality of reviews produced by textbook evaluators. Through a case study, we apply natural language processing techniques to analyze thousands of answers to rubrics written during the pedagogical evaluation of a public call for literary works under the Brazilian textbook program (Programa Nacional do Livro e do Material Didático - PNLD). The PNLD is one of the most extensive textbook policies, impacting millions of students. Our findings shed light on challenges related to the effort involved in the pedagogical evaluation process and the quality of its written reports. Analyzing these reports, which reflect some desired and undesired behaviors of evaluators, can offer policymakers insights for making informed decisions and improving textbook programs worldwide. Our descriptive approach to textual data analysis yields insights that can enhance transparency, inform improvements, and guide policy implementation through real-time monitoring.
License
Copyright (c) 2025 André Araújo, Rafael Araújo, Luciano Cabral, Luciane Silva, Hilario Tomaz, Emerson Martins, Diego Dermeval, Álvaro Sobrinho, Alan Pedro da Silva, Leonardo Marques, Filipe Recch, Sebastian Munoz-Najar Galvez, Seiji Isotani, Ig Ibert Bittencourt

This work is licensed under a Creative Commons Attribution 4.0 International License.
