Publications of work that use omorfi

This is a collection of scientific work describing or using omorfi. If you find that something is missing, please notify us via the Google group or by other means.

To get citations in BibTeX or other formats, feel free to use Google Scholar.

How to cite

The ideal way to cite omorfi is to cite the most recent publication about it:

  1. Tommi A Pirinen (2015) Development and Use of Computational Morphology of Finnish in the Open Source and Open Science Era: Notes on Experiences with Omorfi Development, in SKY Journal of Linguistics (28), ISSN: 1456-8438

Google Scholar lists the newest article with the corresponding BibTeX:

  title={Development and Use of Computational Morphology of Finnish in the Open Source and Open Science Era: Notes on Experiences with Omorfi Development.},
  author={Pirinen, Tommi A},
  journal={SKY Journal of Linguistics},

If your publication venue allows modern-style software citations, please cite the version deposited in the LINDAT repository:

  1. Pirinen, Tommi A; Listenmaa, Inari; Johnson, Ryan; Tyers, Francis M.; Kuokkala, Juha (2017), Open morphology of Finnish, PID:

In BibTeX:

 title = {Open morphology of Finnish},
 author = {Pirinen, Tommi A and Listenmaa, Inari and Johnson, Ryan and Tyers, Francis M. and Kuokkala, Juha},
 url = {},
 note = {{LINDAT}/{CLARIN} digital library at the Institute of Formal and Applied Linguistics, Charles University},
 copyright = {{GNU} General Public Licence, version 3},
 year = {2017} 

If you are writing a paper for LREC or a similar venue, the ISLRN to use is 887-124-499-095-1.

Publications about omorfi specifically

This list might not be complete; I do not know whether anyone else has written about omorfi.

  1. Tommi A Pirinen (2015) Development and Use of Computational Morphology of Finnish in the Open Source and Open Science Era: Notes on Experiences with Omorfi Development, in SKY Journal of Linguistics (28), ISSN: 1456-8438
  2. Tommi A Pirinen (2015) Omorfi—Free and open source morphological lexical database for Finnish in Proceedings of the 20th Nordic Conference of Computational Linguistics, NODALIDA 2015.
  3. Tommi A Pirinen (2011) Modularisation of Finnish Finite-State Language Description—Towards Wide Collaboration in Open Source Development of Morphological Analyser in Proceedings of Nodalida 2011 (18).
  4. Tommi Pirinen (2008), Suomen kielen äärellistilainen automaattinen morfologinen analyysi avoimen lähdekoodin menetelmin, Master’s Thesis, University of Helsinki (in Finnish)

Publications using omorfi

This list is most likely not complete; please suggest additions (or removals) if you have any.

Machine Translation

  1. Raphael Rubino, Tommi Pirinen, Miquel Esplà-Gomis, Nikola Ljubešić, Sergio Ortiz Rojas, Vassilis Papavassiliou, Prokopis Prokopidis and Antonio Toral (2015), Abu-MaTran at WMT 2015 Translation Task: Morphological Segmentation and Web Crawling at WMT2015
  2. Lane Schwartz, Bill Bryce, Chase Geigle, Sean Massung, Yisi Liu, Haoruo Peng, Vignesh Raja, Subhro Roy and Shyam Upadhyay (2015), The University of Illinois submission to the WMT 2015 Shared Translation Task at WMT2015
  3. Jörg Tiedemann, Filip Ginter and Jenna Kanerva (2015) Morphological Segmentation and OPUS for Finnish-English Machine Translation at WMT2015
  4. Ann Clifton, Anoop Sarkar (2011). Combining morpheme-based machine translation with post-processing morpheme prediction, in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL 2011.
  5. Ann Clifton, Anoop Sarkar, Morphology Generation for Statistical Machine Translation
  6. Ann Clifton (2010). Unsupervised Morphological Segmentation For Statistical Machine Translation, Doctoral dissertation, Simon Fraser University.

Universal Dependencies, syntax, treebanking

  1. Sampo Pyysalo (2015) Universal Dependencies for Finnish, In: Nordic Conference of Computational Linguistics NODALIDA
  2. Jenna Kanerva et al. (2014), Syntactic n-gram collection from a large-scale corpus of internet Finnish, Proceedings of the Sixth International Conference Baltic HLT.
  3. Bernd Bohnet, Joakim Nivre, Igor Boguslavsky, Filip Ginter and Jan Hajič (2013), Joint Morphological and Syntactic Analysis for Richly Inflected Languages in Transactions of the Association for Computational Linguistics
  4. Kristiina Muhonen, Tanja Purtonen (2012), Rule-Based Detection of Clausal Coordinate Ellipsis in LREC 2012
  5. Kristiina Muhonen, Tanja Purtonen (2012), Detecting Semantic Ambiguity: Alternative Readings in Treebanks
  6. Mozgovoy, Maxim (2010) Extensible dependency grammar for education: ideas and experiments, J Converg (JoC) 1.1

Optical character recognition

  1. Silfverberg, Miikka, and Jack Rueter (2015) Can Morphological Analyzers Improve the Quality of Optical Character Recognition? In: First International Workshop of Computational Linguistics for Uralic Languages, Septentrio Conference Series, No. 2, 2015.

Semantic Web

  1. Eetu Mäkelä (2014), Combining a REST lexical analysis web service with SPARQL for mashup semantic annotation from text. In: The Semantic Web: ESWC 2014 Satellite Events.
  2. Reetta Sinkkilä, O. Suominen, E. Hyvönen (2011), Automatic semantic subject indexing of web documents in highly inflected languages, The Semantic Web: Research and …, 2011
  3. E. Ahonen, Eero Hyvönen (2009), Publishing Historical Texts on the Semantic Web-A Case Study, in Semantic Computing, 2009. ICSC’09.

Spell-checking and predictive text entry

  1. Tommi A Pirinen (2014), Weighted Finite-State Methods for Spell-Checking and Correction, PhD thesis
  2. Tommi A Pirinen (2014), State-of-the-Art in Weighted Finite-State Spell-Checking, in CICLing 2014, proceedings in LNCS
  3. Tommi A Pirinen, Sam Hardwick (2012), Effect of Language and Error Models on Efficiency of Finite-State Spell-Checking and Correction, in Proceedings of 10th International Workshop on Finite-State Methods and/in Natural Language Processing FSMNLP 2012
  4. Tommi A Pirinen, Miikka Silfverberg (2012), Improving Finite-State Spell-Checker Suggestions with Part-of-Speech N-grams in Proceedings of International Conference on Intelligent Text Processing and Computational Linguistics CICLING 2012
  5. Miikka Silfverberg, Mirka Hyvärinen, Tommi A Pirinen (2011), Improving Predictive Entry of Finnish Text Messages using IRC Logs in Proceedings of the Computational Linguistics-Applications Conference 2011.

for further results, see