Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent dirichlet allocation. The Journal of Machine Learning Research, 3, 993–1022.
Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., … Amodei, D. (2020). Language Models are Few-Shot Learners. https://arxiv.org/abs/2005.14165
Chan, C., & Sältzer, M. (2020). Oolong: An R package for validating automated content analysis tools. Journal of Open Source Software, 5(55), 2461. https://doi.org/10.21105/joss.02461
Chen, Y., Peng, Z., Kim, S.-H., & Choi, C. W. (2023). What We Can Do and Cannot Do with Topic Modeling: A Systematic Review. Communication Methods and Measures, 17(2), 111–130. https://doi.org/10.1080/19312458.2023.2167965
Engel, U., Quan-Haase, A., Liu, S. X., & Lyberg, L. (2021). Digital trace data (1st ed., pp. 100–118). Routledge. https://www.taylorfrancis.com/books/9781003024583/chapters/10.4324/9781003024583-8
Grimmer, J., & Stewart, B. M. (2013). Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts. Political Analysis, 21(3), 267–297. https://doi.org/10/f458q9
Harari, G. M., Lane, N. D., Wang, R., Crosier, B. S., Campbell, A. T., & Gosling, S. D. (2016). Using Smartphones to Collect Behavioral Data in Psychological Science: Opportunities, Practical Considerations, and Challenges. Perspectives on Psychological Science, 11(6), 838–854. https://doi.org/10.1177/1745691616650285
Hase, V. (2023). Automated Content Analysis (F. Oehmer-Pedrazzi, S. H. Kessler, E. Humprecht, K. Sommer, & L. Castro, Eds.; pp. 23–36). Springer Fachmedien Wiesbaden. https://link.springer.com/10.1007/978-3-658-36179-2_3
Jurafsky, D., & Martin, J. H. (2024). Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition with language models (3rd ed.). https://web.stanford.edu/~jurafsky/slp3/
Krippendorff, K. (2018). Content Analysis: An Introduction to Its Methodology (4th ed.). SAGE Publications.
Maier, D., Waldherr, A., Miltner, P., Wiedemann, G., Niekler, A., Keinert, A., Pfetsch, B., Heyer, G., Reber, U., Häussler, T., Schmid-Petri, H., & Adam, S. (2018). Applying LDA Topic Modeling in Communication Research: Toward a Valid and Reliable Methodology. Communication Methods and Measures, 12(2–3), 93–118. https://doi.org/10.1080/19312458.2018.1430754
Mayring, P. (2022). Qualitative Inhaltsanalyse: Grundlagen und Techniken (13th ed.). Beltz.
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning Transferable Visual Models From Natural Language Supervision. https://arxiv.org/abs/2103.00020
Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://arxiv.org/abs/1506.02640
Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). https://arxiv.org/abs/1908.10084
Smith, R. (2007). An Overview of the Tesseract OCR Engine. Proceedings of the Ninth International Conference on Document Analysis and Recognition (ICDAR), 629–633. https://research.google/pubs/an-overview-of-the-tesseract-ocr-engine/
Struminskaya, B., Lugtig, P., Keusch, F., & Höhne, J. K. (2020). Augmenting Surveys With Data From Sensors and Apps: Opportunities and Challenges. Social Science Computer Review, 089443932097995. https://doi.org/10.1177/0894439320979951