Arne Rubehn

Curriculum vitae
since 2023: PhD Student at the Chair for Multilingual Computational Linguistics, University of Passau.
2019-2023: Master of Arts, Computational Linguistics, University of Tübingen.
03-07/2018: Study abroad (ERASMUS+), Applied Linguistics, Universitat Pompeu Fabra, Barcelona.
2015-2019: Bachelor of Arts, General Linguistics and Latin, University of Tübingen.
Publications
- Bocklage, K., Georgakopoulos, T., van Dam, K. P., Ciucci, L., Blum, F., Kučerová, A., Rubehn, A., Stephen, A., Snee, D., and List, J.-M. (2025). Testing the Potential of Automatically Inferred Affix Colexifications for Linguistic Typology. Humanities Commons [preprint, not peer-reviewed, under review]. https://doi.org/10.17613/a06m1-c9939
- Rubehn, A., Rzymski, C., Ciucci, L., Bocklage, K., Kučerová, A., Snee, D., Stephen, A., van Dam, K. P., and List, J.-M. (2025). Annotating and Inferring Compositional Structures Across Languages. In Proceedings of the 7th Workshop on Research in Computational Linguistic Typology and Multilingual NLP (SIGTYP). https://doi.org/10.18653/v1/2025.sigtyp-1.4
- Snee, D., Ciucci, L., Rubehn, A., van Dam, K. P., and List, J.-M. (2025). Unstable Grounds for Beautiful Trees? Testing the Robustness of Concept Translations in the Compilation of Multilingual Wordlists. In Proceedings of the 7th Workshop on Research in Computational Linguistic Typology and Multilingual NLP (SIGTYP). https://doi.org/10.18653/v1/2025.sigtyp-1.3
- Rubehn, A. and List, J.-M. (2025). Partial Colexifications Improve Concept Embeddings. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). https://aclanthology.org/2025.acl-long.1004
- Rubehn, A., Nieder, J., Forkel, R., and List, J.-M. (2024). Generating Feature Vectors from Phonetic Transcriptions in Cross-Linguistic Data Formats. In Proceedings of the 2024 Meeting of the Society for Computation in Linguistics (SCiL). https://doi.org/10.7275/scil.2144
- Rubehn, A., Montemagni, S., and Nerbonne, J. (2024). Extracting Tuscan phonetic correspondences from dialect pronunciations automatically. Language Dynamics and Change, 14(1), 1-33. https://doi.org/10.1163/22105832-bja10034
- Rubehn, A. (2022). A feature-based neural model of sound change informed by global lexicostatistical data. Master's thesis, Eberhard Karls Universität Tübingen. https://doi.org/10.15496/publikation-94055
Focus areas
I am a PhD student in the "ProduSemy" project and focus on computer-assisted, data-driven methods for historical linguistics. I aim to advance comparative historical linguistics by means of intelligent algorithmic methods that can alleviate researchers’ workload by processing large-scale data efficiently. My current research focuses on embedding "intuitive" linguistic knowledge so that it becomes accessible to computational methods as well.
I studied Computational Linguistics, General Linguistics, and Latin at the University of Tübingen. For my MA thesis project, I trained a neural network that estimates global probabilities for arbitrary sound changes. Additionally, I have years of work experience as a software developer for EtInEn (Etymological Inference Engine), a software tool for historical linguists that is being developed at the Linguistics Department in Tübingen.