omorfi 0.9.9
Open morphology of Finnish
Public Member Functions | |
| def | __init__ (self) |
| def | load_analyser (self, str hfstfile) |
| def | load_udpipe (self, str filename) |
| def | load_lexical_frequencies (self, lexfile) |
| def | load_omortag_frequencies (self, omorfile) |
| def | analyse (self, Token token) |
| def | analyse_sentence (self, tokens) |
| def | accept (self, token) |
Data Fields | |
| analyser | |
| udpiper | |
| udpipeline | |
| uderror | |
| can_udpipe | udpipe is loaded |
| lexlogprobs | |
| taglogprobs | |
Static Public Attributes | |
| int | PENALTY = 28021984 |
An object for omorfi’s morphological analysis.
| def omorfi.analyser.Analyser.__init__ | ( | self | ) |
Initialise an empty analyser.
| def omorfi.analyser.Analyser.accept | ( | self, token | ) |
Check if the token is in the dictionary or not.
Returns:
False for out-of-vocabulary (OOV) tokens, True otherwise. Note that
this is not necessarily more efficient than bool(analyse(token)).
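As an illustration of the return contract above, the following sketch splits a list of tokens into known and OOV groups. It assumes `analyser` is an already-loaded Analyser (or any object with a compatible `accept(token)` method); the helper name `partition_by_vocabulary` is hypothetical and not part of omorfi.

```python
from typing import Any, Iterable, List, Tuple


def partition_by_vocabulary(
    analyser: Any, tokens: Iterable[Any]
) -> Tuple[List[Any], List[Any]]:
    """Split tokens into (known, oov) lists using accept().

    Hypothetical helper: `analyser` is assumed to expose the documented
    accept(token) method, and `tokens` are whatever token objects that
    analyser expects.
    """
    known: List[Any] = []
    oov: List[Any] = []
    for token in tokens:
        # accept() is documented to return False for OOVs, True otherwise.
        if analyser.accept(token):
            known.append(token)
        else:
            oov.append(token)
    return known, oov
```

Because `accept` is not necessarily cheaper than `analyse`, this kind of pre-filtering is mainly useful for readability, not as an optimisation.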
| def omorfi.analyser.Analyser.analyse | ( | self, Token token | ) |
Perform a simple morphological analysis lookup.
The analysis will be performed for re-cased variants based on the
state of the member variables. The re-cased analyses will have more
penalty weight and additional analyses indicating the changes.
Side-Effects:
The analyses are stored in the token, and only the new analyses
are returned.
Args:
token: token to be analysed.
Returns:
An HFST structure of raw analyses, or None if there are no matches
in the dictionary.
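Since `analyse` may return None, callers need to check the result before using it. The sketch below shows one way to do that under stated assumptions: `analyser` is an already-loaded Analyser (or a compatible object), and the helper name `analyse_all` is hypothetical.

```python
from typing import Any, Dict, Iterable


def analyse_all(analyser: Any, tokens: Iterable[Any]) -> Dict[int, Any]:
    """Call analyse() on each token and keep only the non-None results.

    Hypothetical helper: returns a mapping from token position to
    whatever analyse() returned for that token.
    """
    results: Dict[int, Any] = {}
    for position, token in enumerate(tokens):
        new_analyses = analyser.analyse(token)
        # analyse() is documented to return None when nothing matches.
        if new_analyses is None:
            continue
        results[position] = new_analyses
    return results
```

The returned values are only the new analyses; according to the description above, the analyses are also stored in each token as a side effect.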
| def omorfi.analyser.Analyser.analyse_sentence | ( | self, tokens | ) |
Analyse a full tokenised sentence. For details of the analysis, see analyse(self, token). If further models such as UDPipe are loaded, they may be used to fill in gaps.
| def omorfi.analyser.Analyser.load_analyser | ( | self, str hfstfile | ) |
Load analyser model from a file.
Args:
hfstfile: file containing a single HFST automaton binary.
| def omorfi.analyser.Analyser.load_lexical_frequencies | ( | self, lexfile | ) |
Load a frequency list for lemmas. Experimental.
Currently in uniq -c format, subject to change.
Args:
lexfile: file with frequencies.
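For readers unfamiliar with the `uniq -c` format: each line holds a right-aligned count followed by the counted item. The sketch below parses such lines and turns the counts into log-probabilities. This is only an illustration, not omorfi's implementation: how the loader actually fills `lexlogprobs` is not stated here, the relative-frequency estimate is an assumption, and the function names are hypothetical.

```python
import math
from typing import Dict, Iterable


def read_uniq_c(lines: Iterable[str]) -> Dict[str, int]:
    """Parse `uniq -c` style lines ("   3 talo") into item -> count."""
    counts: Dict[str, int] = {}
    for line in lines:
        # Split off the leading count; the rest of the line is the item.
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank lines and lines without an item
        count_text, item = parts
        counts[item] = counts.get(item, 0) + int(count_text)
    return counts


def to_log_probabilities(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert counts to natural-log relative frequencies (an assumption)."""
    total = sum(counts.values())
    return {item: math.log(count / total) for item, count in counts.items()}
```

A file in this format can be produced with standard tools, for example by piping a one-lemma-per-line list through `sort | uniq -c`.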
| def omorfi.analyser.Analyser.load_omortag_frequencies | ( | self, omorfile | ) |
Load a frequency list for tags. Experimental.
Currently in uniq -c format. Subject to change.
Args:
omorfile: path to file with frequencies.
| def omorfi.analyser.Analyser.load_udpipe | ( | self, str filename | ) |
Load UDPipe model for statistical parsing.
UDPipe can be used as an extra information source for OOV symbols or for
all tokens. It works best with sentence-based analysis; token-based
analysis does not keep track of context.
Args:
filename: path to UDPipe model.
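A typical call order, sketched under assumptions: load the analyser model first, optionally load a UDPipe model, then analyse whole sentences so that UDPipe has context to work with. The helper name `analyse_with_optional_udpipe` and both path parameters are hypothetical, and the return value of `analyse_sentence` is not documented on this page, so it is passed through unchanged.

```python
from typing import Any, Optional, Sequence


def analyse_with_optional_udpipe(
    analyser: Any,
    hfst_path: str,
    tokens: Sequence[Any],
    udpipe_path: Optional[str] = None,
) -> Any:
    """Load models into `analyser`, then analyse one tokenised sentence.

    Hypothetical helper: `analyser` is assumed to be a fresh Analyser
    (or a compatible object); the paths are supplied by the caller.
    """
    # The analyser model is a file with a single HFST automaton binary.
    analyser.load_analyser(hfst_path)
    if udpipe_path is not None:
        # Optional: UDPipe can serve as an extra information source.
        analyser.load_udpipe(udpipe_path)
    # Sentence-level analysis keeps context, which UDPipe works best with.
    return analyser.analyse_sentence(tokens)
```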