Bachelor/Master Thesis Opportunity: Extreme Classification for Digital Humanities


Position filled!

Are you fascinated by the crossroads of machine learning and human culture? Work with us on a Bachelor's/Master's thesis on Extreme Classification, that is, classification over very large label sets, applied to complex datasets in Digital Humanities.

The Challenge

You will develop a human-in-the-loop algorithm for sorting texts into a large hierarchical categorization scheme of about 1.3k labels. A single data point can carry several labels at once, which mirrors the nuanced nature of humanities datasets. Your solution should work well for the majority classes in the dataset and also improve performance on the long tail of minority classes. You will show that it carries over to other extreme classification scenarios such as patent classification, news classification, or medical code identification. For a Master's thesis, we propose an additional focus on well-calibrated confidence scores to improve the ranking of the labels suggested to the human labeler.
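To make the challenge above more concrete, here is a minimal Python sketch of two of its ingredients: ranking label suggestions for a human labeler by confidence, and comparing micro- and macro-averaged F1 to see how rare labels are doing. It is an illustration only, not the method to be developed in the thesis. The function names, the toy label names, and the toy data are all hypothetical, and the sigmoid scores are raw model outputs, not calibrated confidences.

```python
import math


def suggest_labels(logits, label_names, k=3):
    """Return the k labels with the highest sigmoid score, best first."""
    scored = [(name, 1.0 / (1.0 + math.exp(-z)))
              for name, z in zip(label_names, logits)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def label_counts(y_true, y_pred, n_labels):
    """Count [true positives, false positives, false negatives] per label."""
    counts = [[0, 0, 0] for _ in range(n_labels)]
    for gold, pred in zip(y_true, y_pred):
        for label in range(n_labels):
            if label in pred and label in gold:
                counts[label][0] += 1
            elif label in pred:
                counts[label][1] += 1
            elif label in gold:
                counts[label][2] += 1
    return counts


def f1(tp, fp, fn):
    """F1 score; defined as 0.0 when the label never occurs or is predicted."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts):
    """Average of per-label F1: every label counts equally, rare or not."""
    return sum(f1(*c) for c in counts) / len(counts)


def micro_f1(counts):
    """F1 over pooled counts: dominated by the frequent labels."""
    tp, fp, fn = (sum(c[i] for c in counts) for i in range(3))
    return f1(tp, fp, fn)


# Hypothetical toy data: label 0 is frequent, label 2 is rare and is missed.
y_true = [{0}, {0}, {0, 1}, {0}, {2}]
y_pred = [{0}, {0}, {0, 1}, {0}, {0}]
counts = label_counts(y_true, y_pred, n_labels=3)

print(f"micro-F1: {micro_f1(counts):.3f}")  # 0.833
print(f"macro-F1: {macro_f1(counts):.3f}")  # 0.630

# Hypothetical label names and logits for one text.
top = suggest_labels([2.0, -1.0, 0.5], ["trade", "religion", "military"], k=2)
print([name for name, _ in top])  # ['trade', 'military']
```

In this toy example the missed rare label barely moves micro-F1 but pulls macro-F1 down sharply, which is why macro-averaged metrics are a common way to track long-tail performance.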

Interdisciplinary Collaboration

Bridge the gap between computer science and humanities by creating models that resonate with domain experts, fostering collaborative research. One use case in this thesis will be addressed together with the Chair of Ancient History at the Department of Philology and History.

Why Choose This Thesis?

Innovation: Apply cutting-edge machine learning to real-world challenges in Digital Humanities. You will get practical experience with pre-trained large language models / foundation models and deep learning frameworks such as HuggingFace/PyTorch.

Impact: Contribute to the advancement of knowledge preservation, cultural analysis, and interdisciplinary collaboration.

Guidance: During your thesis, we are committed to providing excellent and regular advice on experimental design, implementation, and scientific writing. You will work with, and be mentored by, experts in natural language processing as well as domain experts.


Open to passionate Bachelor’s/Master’s students with an interest in natural language processing and machine learning, strong Python skills, and a drive to make a meaningful impact.


Prof. Dr. Annemarie Friedrich / Dr. Jakob Prange