omorfi 0.9.9
Open morphology of Finnish
omorfi.analyser.Analyser Class Reference

Public Member Functions

def __init__ (self)
 
def load_analyser (self, str hfstfile)
 
def load_udpipe (self, str filename)
 
def load_lexical_frequencies (self, lexfile)
 
def load_omortag_frequencies (self, omorfile)
 
def analyse (self, Token token)
 
def analyse_sentence (self, tokens)
 
def accept (self, token)
 

Data Fields

 analyser
 
 udpiper
 
 udpipeline
 
 uderror
 
 can_udpipe
 Whether a UDPipe model is loaded.
 
 lexlogprobs
 
 taglogprobs
 

Static Public Attributes

int PENALTY = 28021984
 

Detailed Description

An object for omorfi’s morphological analysis.

Constructor & Destructor Documentation

◆ __init__()

def omorfi.analyser.Analyser.__init__ (   self)
Initialise an empty analyser.

Member Function Documentation

◆ accept()

def omorfi.analyser.Analyser.accept (   self,
  token 
)
Check if the token is in the dictionary or not.

Returns:
    False for out-of-vocabulary (OOV) tokens, True otherwise. Note that
    this is not necessarily more efficient than bool(analyse(token)).

◆ analyse()

def omorfi.analyser.Analyser.analyse (   self,
Token  token 
)
Perform a simple morphological analysis lookup.

The analysis is also performed for re-cased variants, depending on the
state of the member variables. Re-cased analyses carry a higher penalty
weight and additional analyses indicating the changes.

Side-Effects:
    The analyses are stored in the token, and only the new analyses
    are returned.

Args:
    token: token to be analysed.

Returns:
    An HFST structure of raw analyses, or None if there are no matches
    in the dictionary.
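
The re-casing behaviour described above can be pictured with a small stand-alone sketch. It is not omorfi's implementation: the toy dictionary `LEXICON`, the helpers `recased_variants` and `lookup_with_recasing`, the choice of variants, the marker labels, and the value of `RECASE_PENALTY` are all assumptions made for illustration. Only the general idea (re-cased matches get extra penalty weight and a marker of the change, and no match yields None) is taken from the description.

```python
# Toy lexicon: surface form -> list of (analysis, weight). Hypothetical data.
LEXICON = {
    "talo": [("talo NOUN Sg Nom", 0.0)],
    "Helsinki": [("Helsinki PROPN Sg Nom", 0.0)],
}

RECASE_PENALTY = 10.0  # assumed value, for illustration only


def recased_variants(surface):
    """Yield (variant, label) pairs for simple case changes of the surface form."""
    seen = {surface}
    candidates = [
        (surface.lower(), "LOWERCASED"),
        (surface.upper(), "UPPERCASED"),
        (surface[:1].upper() + surface[1:].lower(), "TITLECASED"),
    ]
    for variant, label in candidates:
        if variant not in seen:
            seen.add(variant)
            yield variant, label


def lookup_with_recasing(surface):
    """Return analyses for the surface form and its re-cased variants, or None."""
    # Exact matches carry no penalty and no re-casing marker.
    results = [(a, w, None) for a, w in LEXICON.get(surface, [])]
    for variant, label in recased_variants(surface):
        for analysis, weight in LEXICON.get(variant, []):
            # Re-cased matches are penalised and labelled with the change made.
            results.append((analysis, weight + RECASE_PENALTY, label))
    return results or None
```

Because the penalty is added to the weight, a consumer that sorts analyses by weight would rank exact-case matches ahead of re-cased ones.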

◆ analyse_sentence()

def omorfi.analyser.Analyser.analyse_sentence (   self,
  tokens 
)
Analyse a full tokenised sentence.

For details of the analysis, see analyse(self, token).
If further models such as UDPipe are loaded, they may be used to fill in gaps.
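
The gap-filling idea can be sketched without any omorfi or UDPipe calls. In the sketch below, `analyse_sentence_with_fallback`, `dictionary_lookup`, and `fallback_tagger` are hypothetical stand-ins: the first callable plays the role of the dictionary-based analysis (returning None for unknown tokens), the second the role of a secondary, sentence-level model. How omorfi actually merges the two sources is not specified here.

```python
def analyse_sentence_with_fallback(tokens, dictionary_lookup, fallback_tagger):
    """Analyse each token; fill dictionary gaps from a sentence-level fallback.

    dictionary_lookup: callable taking one token, returning an analysis or None.
    fallback_tagger: callable taking the full token list, returning one
        guess per token (hypothetical secondary model).
    """
    analyses = [dictionary_lookup(tok) for tok in tokens]
    if any(a is None for a in analyses):
        # The fallback sees the whole sentence, so it can use context.
        guesses = fallback_tagger(tokens)
        analyses = [a if a is not None else g for a, g in zip(analyses, guesses)]
    return analyses
```

The fallback is consulted only when at least one token has no dictionary analysis, and dictionary analyses are never overwritten.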

◆ load_analyser()

def omorfi.analyser.Analyser.load_analyser (   self,
str  hfstfile 
)
Load analyser model from a file.

Args:
    hfstfile: file containing a single HFST automaton binary.

◆ load_lexical_frequencies()

def omorfi.analyser.Analyser.load_lexical_frequencies (   self,
  lexfile 
)
Load a frequency list for lemmas. Experimental.
Currently in uniq -c format, subject to change.

Args:
    lexfile: file with frequencies.
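
For orientation, `uniq -c` writes one line per distinct item, with the count first and the item after it. The sketch below parses such lines and converts the counts to log-probabilities, which is one plausible way to arrive at values like those held in `lexlogprobs`. The helpers `parse_uniq_c` and `to_logprobs` are hypothetical, and the normalisation (natural log of relative frequency) is an assumption rather than a description of what omorfi computes.

```python
import math


def parse_uniq_c(lines):
    """Parse `uniq -c` style lines ("<count> <item>") into a dict of counts."""
    counts = {}
    for line in lines:
        fields = line.strip().split(None, 1)
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        count, item = fields
        counts[item] = counts.get(item, 0) + int(count)
    return counts


def to_logprobs(counts):
    """Convert raw counts to natural-log relative frequencies (assumed scheme)."""
    total = sum(counts.values())
    return {item: math.log(n / total) for item, n in counts.items()}
```

A list such as `["      3 talo", "      1 kissa"]` would give counts of 3 and 1, and log-probabilities of `log(3/4)` and `log(1/4)`.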

◆ load_omortag_frequencies()

def omorfi.analyser.Analyser.load_omortag_frequencies (   self,
  omorfile 
)
Load a frequency list for tags. Experimental.
Currently in uniq -c format, subject to change.

Args:
    omorfile: path to file with frequencies.

◆ load_udpipe()

def omorfi.analyser.Analyser.load_udpipe (   self,
str  filename 
)
Load UDPipe model for statistical parsing.

UDPipe can be used as an extra information source for OOV symbols
or for all tokens. It works best with sentence-based analysis;
token-based analysis does not keep track of context.

Args:
    filename: path to the UDPipe model.
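
A possible loading order, sketched as a small helper. `prepare_analyser` is a hypothetical name; it calls only `load_analyser` and `load_udpipe` as listed on this page, and it takes the analyser object as an argument so that nothing is assumed about how the object is imported or constructed. The paths are placeholders supplied by the caller.

```python
def prepare_analyser(analyser, hfst_path, udpipe_path=None):
    """Load the HFST model, and optionally a UDPipe model, into `analyser`.

    analyser: an object exposing load_analyser() and load_udpipe(),
        such as an instance of the class documented here.
    hfst_path: path to the HFST automaton binary.
    udpipe_path: optional path to a UDPipe model; skipped when None.
    """
    analyser.load_analyser(hfst_path)
    if udpipe_path is not None:
        analyser.load_udpipe(udpipe_path)
    return analyser
```

After such a call, checking the `can_udpipe` data field should indicate whether the UDPipe model is available before relying on it in sentence-level analysis.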
