Not only what, but also when

Understanding Brazilian political comments on legislative bills over time through Stance Detection and Topic Modeling

DOI:

https://doi.org/10.59490/dgo.2025.955

Keywords:

Natural Language Processing, Legal Text Analysis, Stance Detection, Topic Modeling

Abstract

Legislative public spaces are important structures for participatory democracy, allowing citizens’ voices to be heard in political decisions. With the popularization of information and communication technologies, internet-based tools have come to play an important role in improving public participation in political decisions, a practice known as e-Democracy. These tools are usually composed of a set of functionalities or small services, named microservices. The better the microservices, the higher the citizen participation. This work investigates how to extract useful knowledge from citizen participation in the microservices of the public portal of the Brazilian Chamber of Deputies. To this end, it analyzes public comments using Natural Language Processing and Artificial Intelligence techniques in a platform named Ulysses. The tasks developed in this paper focus on a temporal analysis of comments on bills in the portal through Stance Detection and dynamic Topic Modeling. For the first task, OxêSD, a BERTimbau-based model, was trained on two different corpora, one of them translated into Portuguese. Its predictive performance was evaluated using the F1 and ROC-AUC metrics, achieving 73% for both on our proposed Political-BRSD, a mixed dataset containing both translated content from a larger multilingual dataset (adapted from x-Stance) and bill-specific content (adapted from Ulysses-SD). For the second task, BERTopic, a Topic Modeling framework, was used. Visualization tools were also used to explore the extracted knowledge and to analyze how the proposed approach addressed the task. They allow the user to understand, over time, how the comments relate to each other and to a given legislative bill.
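
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how F1 and ROC-AUC can be computed for a binary stance classifier. It assumes a favor/against labelling (1 = favor, 0 = against) and a 0.5 decision threshold; the labels and scores are toy values invented for illustration, not data from Political-BRSD, and the averaging scheme used in the paper is not stated in the abstract.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1: harmonic mean of precision and recall for class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive example
    receives a higher score than a random negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = favor, 0 = against.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]  # model confidence for "favor"
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"F1 = {f1:.3f}, ROC-AUC = {auc:.3f}")
```

F1 depends on the chosen threshold, while ROC-AUC only depends on how the scores rank favorable comments against unfavorable ones, which is why the two are often reported together.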

Published

2025-05-19

How to Cite

Cerqueira, M., da Silva, N. F., Souza, E., Albuquerque, H. O., Dias, M. de S., & de Carvalho, A. C. (2025). Not only what, but also when: Understanding Brazilian political comments on legislative bills over time through Stance Detection and Topic Modeling. Conference on Digital Government Research, 26. https://doi.org/10.59490/dgo.2025.955

Section

Research papers