omorfi 0.9.9
Open morphology of Finnish
com.github.flammie.omorfi.Omorfi Class Reference

An object holding automata for all functions of omorfi.

Public Member Functions

 Omorfi ()
 Construct an empty omorfi holder.
 
void loadAnalyser (String path) throws java.io.FileNotFoundException, java.io.IOException, net.sf.hfst.FormatException
 Load an omorfi analyser from the given file. The file should contain a single HFST automaton for omorfi-style analyses.
 
Collection< String > analyse (String wf) throws net.sf.hfst.NoTokenizationException
 Perform a simple morphological analysis lookup.
 
List< String > tokenise (String line)
 Perform tokenisation with the loaded tokeniser, if any; otherwise split the line with basic string functions.
 

Static Public Member Functions

static void main (String[] args)
 Example CLI analysis app.
 

Detailed Description

An object holding automata for all functions of omorfi.

Currently supported automata functions are:

The Java code can perform minimal string munging: tokenisation and recasing.

Member Function Documentation

◆ analyse()

Collection< String > com.github.flammie.omorfi.Omorfi.analyse ( String  wf) throws net.sf.hfst.NoTokenizationException

Perform a simple morphological analysis lookup.

If can_titlecase does not evaluate to False, the analysis is also performed with the first letter uppercased and the rest lowercased. If can_uppercase does not evaluate to False, the analysis is also performed on the all-uppercase variant. If can_lowercase does not evaluate to False, the analysis is also performed on the all-lowercase variant.

Analyses obtained through case mangling carry an additional element that identifies the casing.

Parameters
wf    the token to analyse, as a string.
Returns
a collection of analyses for the token.
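
The recasing described above can be illustrated with a minimal, self-contained sketch. It is not omorfi's implementation: the class `RecasingSketch` and the method `caseVariants` are hypothetical names, and because the Java signature shown here takes only `wf`, the sketch simply generates every variant instead of honouring the can_titlecase, can_uppercase and can_lowercase switches. A caller could then look up each variant with analyse().

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only; not omorfi's actual implementation.
class RecasingSketch {

    // Returns the token followed by its title-case, all-uppercase and
    // all-lowercase variants, in that order and without duplicates.
    static Set<String> caseVariants(String wf) {
        Set<String> variants = new LinkedHashSet<>();
        variants.add(wf);
        if (!wf.isEmpty()) {
            // Cut after the first code point so that letters outside the
            // Basic Multilingual Plane are not split in half.
            int firstEnd = wf.offsetByCodePoints(0, 1);
            String titled = wf.substring(0, firstEnd).toUpperCase(Locale.ROOT)
                    + wf.substring(firstEnd).toLowerCase(Locale.ROOT);
            variants.add(titled);
        }
        variants.add(wf.toUpperCase(Locale.ROOT));
        variants.add(wf.toLowerCase(Locale.ROOT));
        return variants;
    }

    public static void main(String[] args) {
        // prints [helsinki, Helsinki, HELSINKI]
        System.out.println(caseVariants("helsinki"));
    }
}
```

A `LinkedHashSet` is used so that the original form always comes first and a variant identical to it (here the lowercase one) is not looked up twice.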

◆ loadAnalyser()

void com.github.flammie.omorfi.Omorfi.loadAnalyser ( String  path) throws java.io.FileNotFoundException, java.io.IOException, net.sf.hfst.FormatException

Load an omorfi analyser from the given file. The file should contain a single HFST automaton for omorfi style analyses.

Parameters
path    the path to the analyser file, as a string.

◆ main()

static void com.github.flammie.omorfi.Omorfi.main ( String[]  args)

Example CLI analysis app.

Parameters
args    command-line arguments.
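
What such a command-line loop might look like can be sketched without the omorfi library itself. Everything below is an assumption made for illustration: the class `CliLoopSketch`, the method `analyseLines`, the tab-separated output rows, and the `Function` standing in for a loaded analyser are hypothetical and are not taken from the actual main(). Code that calls the real analyse() would also have to handle net.sf.hfst.NoTokenizationException, which this stand-in does not model.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

// Illustrative sketch only; the real main() may work differently.
class CliLoopSketch {

    // Splits each line on whitespace, looks every token up with the given
    // stand-in analyser, and returns one "token<TAB>analysis" row per result.
    static List<String> analyseLines(List<String> lines,
                                     Function<String, Collection<String>> analyser) {
        List<String> rows = new ArrayList<>();
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue; // blank line: nothing to analyse
                }
                for (String analysis : analyser.apply(token)) {
                    rows.add(token + "\t" + analysis);
                }
            }
        }
        return rows;
    }
}
```

With a real analyser, the stand-in function would wrap a call to analyse() on an Omorfi object that has already loaded its automaton via loadAnalyser().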

◆ tokenise()

List< String > com.github.flammie.omorfi.Omorfi.tokenise ( String  line)

Perform tokenisation with the loaded tokeniser, if any; otherwise split the line with basic string functions.

If a tokeniser is available, it is applied to the input line. If it produces a result, that result is split into tokens according to the tokenisation strategy and returned as a list.

If no tokeniser is present, or none gives a result, the line is tokenised using Java's basic string functions.

Parameters
line    a string containing a line from a corpus to split into tokens.
Returns
an ordered list of the tokens that make up the line.
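
The two-stage strategy described for tokenise() can be sketched as follows. This is a sketch under stated assumptions, not omorfi's code: the class `FallbackTokeniserSketch`, the `Function` standing in for a loaded tokeniser, and the choice of whitespace splitting as the "basic string functions" fallback are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative sketch only; not omorfi's actual implementation.
class FallbackTokeniserSketch {

    // Fallback: split on runs of whitespace and drop empty pieces.
    static List<String> splitOnWhitespace(String line) {
        List<String> tokens = new ArrayList<>();
        for (String piece : line.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Try the stand-in tokeniser first; fall back to plain splitting when
    // there is no tokeniser or it yields no result.
    static List<String> tokenise(String line,
                                 Function<String, List<String>> tokeniser) {
        if (tokeniser != null) {
            List<String> result = tokeniser.apply(line);
            if (result != null && !result.isEmpty()) {
                return result;
            }
        }
        return splitOnWhitespace(line);
    }
}
```

Passing `null` as the tokeniser exercises the fallback path directly, which is convenient for checking how plain splitting treats punctuation attached to words.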
