Rules Ruling Neural Networks —
Neural vs. Rule-Based Grammar Checking for a Low Resource Language (original available in the ACL Anthology)

Linda Wiechetek
UiT Norgga árktalaš
universitehta
Norway

Flammie A Pirinen
UiT Norgga árktalaš
universitehta
Norway

Mika Hämäläinen
University of Helsinki,
Rootroo Ltd
Finland

Chiara Argese
UiT Norgga árktalaš
universitehta
Norway

(September 7, 2021)

Abstract

We investigate both rule-based and machine learning methods for the task of compound error correction and evaluate their efficiency for North Sámi, a low resource language. The lack of error-free data needed for a neural approach is a challenge to the development of these tools, which is not shared by bigger languages. In order to compensate for that, we used a rule-based grammar checker to remove erroneous sentences and insert compound errors by splitting correct compounds. We describe how we set up the error detection rules, and how we train a bi-RNN based neural network. The precision of the rule-based model tested on a corpus with real errors (81.0%) is slightly better than the neural model (79.4%). The rule-based model is also more flexible with regard to fixing specific errors requested by the user community. However, the neural model has a better recall (98%). The results suggest that an approach that combines the advantages of both models would be desirable in the future. Our tools and data sets are open-source and freely available on GitHub and Zenodo.

1 Introduction

This paper presents our work on automatically correcting compound errors in real world North Sámi text, exploring both rule-based and neural network methods. We chose this error type because it is the most frequent grammatical error type (after spelling and punctuation errors) and twice as frequent as the second most frequent grammatical error (agreement error). It also concerns both spelling and grammar: the error itself is a space between two words, but its correction requires grammatical context.

A grammar checker is a writer’s tool and is particularly relevant for improving writing skills in a minority language in a bilingual context, as is the case for North Sámi. According to UNESCO [25], North Sámi, spoken in the North of Norway, Sweden and Finland, has around 30,000 speakers. It is a low resource language in a bilingual setting, and language users frequently face bigger challenges to writing proficiency as there is always a competing language [30]. Developing a reliable grammar checker with a high precision that at the same time covers a lot of errors has therefore been our main focus. Good precision (i.e. avoiding false alarms) is a priority because users are easily frustrated if a grammar checker gives false alarms and underlines correct sentences.

In this paper we focus on the correction of compound errors. This type of error is easy to generate artificially in the absence of large amounts of error marked-up text, and we have a good amount of manually marked-up corpus data for evaluating this error type. Compound errors (i.e. one-word compounds that are erroneously written as two words) can be automatically inserted by using a rule-based morphological analyser on the corpus and splitting the word wherever we get a compound analysis. Unlike other error types (e.g. real word errors), they are easily inserted, and existing compounds are seldom errors. In addition, they are interesting from a linguistic point of view, as they are proper (complex) syntactic errors and not just spelling errors, and they serve as an example for higher level tools. Two adjacent words can either be syntactically related or erroneous compounds, depending on the syntax. In North Sámi orthography, as in the majority languages spoken in the region (Norwegian, Swedish and Finnish), nouns that form a new concept are usually written together. For example, the North Sámi word boazodoalloguovlu ‘reindeer herding area’ consists of three words, boazu ‘reindeer’, doallu ‘industry’ and guovlu ‘area’, and thus it is written together as a single compound. The task of our methods is to correct spellings such as boazodoallu guovlu into boazodoalloguovlu in case the words have been written separately in error.

We develop both a rule-based and a neural model for the correction of compound errors. The rule-based model (GramDivvun) is based on finite-state technology and Constraint Grammar. The neural model is a bi-directional recurrent neural network (BiRNN). While the rule-based model has earlier produced good precision, it did not handle unknown compounds well, which is why we were interested in a neural approach. However, neural models depend on large amounts of ‘clean’ data and synthetic error generation (or alternatively marked-up data).

As is typical for low-resource languages, the North Sámi corpora are not clean and contain a fair number of different spelling and grammatical errors (see [4]).

Therefore, efficiently preparing the data to make it usable for neural model training is an important part of this paper. In our case, we make use of the existing rule-based tools both to generate synthetic error data and to clean the original data for training. For evaluation, on the other hand, we use real world error data.

Our free and open-source rule-based tools can be found on the GiellaLT GitHub (https://github.com/giellalt/). The training data and the neural models are freely available on Zenodo (https://zenodo.org/record/5172095). We hereby want to promote a wider academic interest in conducting NLP research for North Sámi.

2 Background

Sámi open source rule-based language tools have a long and successful tradition (nearly 20 years) [38, 27, 3, 34]. North Sámi is a low-resource language in terms of available corpus data (32.24M tokens raw data). Although there is a fair amount of data, it contains many real errors and only a small amount is marked up for errors.

Applying neural approaches for high-level language tasks to low resource languages is an interesting research question due to the various limitations of minority language corpora, in contrast to the existing research on the topic in well-resourced majority languages and artificially constrained setups [28]. Rule-based methods have been used, and are still in widespread use, in the context of endangered Uralic languages. There is recent work on grammar checking for North Sámi [40] and spell checking for Skolt Sámi [37]. Other rule-based approaches to grammar checking are extensively described in Wiechetek [41].

Before the era of neural models, it was common to use statistical machine translation (SMT) as a method for grammar error correction [6, 23, 15]. Many recent papers on grammar checking use bi-directional LSTM models that are trained to tag errors in an input sentence. Such methods have been proposed for Latvian [9], English [33] and Chinese [17]. Similar LSTM based approaches have also been applied for error correction [43, 11, 18]. Other recent approaches [19, 29] use methods that take advantage of BERT [10] and other data-hungry models. While such rich sentence embeddings can be used for English and a few other languages with a large amount of data, their use is not viable for North Sámi.

3 Data

For evaluation and for training the neural model we use SIKOR [36] (the Sámi International KORpus), which is a collection of texts in different Sámi languages compiled by UiT The Arctic University of Norway and the Norwegian Sámi Parliament. It consists of two subcorpora: GT-Bound (https://gtsvn.uit.no/boundcorpus/orig/sme/; texts restricted by copyright, available only by request) and GT-Free (https://gtsvn.uit.no/freecorpus/orig/sme/; the publicly available texts). As a preprocessing step, we run a rule-based grammar checker [42] and remove sentences with potential compound errors, as we cannot automatically determine whether these errors are real or not. This is needed because this data serves as the target side of our neural model and must therefore be fully free of compound errors.

Thereafter, we take each sentence in this error-free data and analyse it with a rule-based morphological analyser (https://github.com/giellalt/lang-sme). When the analyser sees a potential compound word, it indicates the word boundary with a compound (+Cmp#) tag. We use this information to automatically split all compounds identified by the rule-based analyser. This results in a parallel corpus of the original sentences as the prediction target and their corresponding versions with synthetically introduced compound errors. Many of the compound boundaries are ambiguous, and the algorithm chooses the one used in the training data heuristically: it takes the analysis with the maximum number of compound boundaries for which splitting does not cause any other modification of the word stems or other content.
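The splitting step can be illustrated with a minimal Python sketch. It assumes that the analyser output has already been reduced to a mapping from a surface word to the parts between its compound boundaries; the mapping `segments`, the function name `make_training_pair` and the sample sentence are illustrative assumptions, not the GiellaLT interface. In the spirit of the heuristic above, a word is split only when its parts concatenate back to the original form, so that no stem is modified.

```python
from typing import Dict, List, Tuple


def make_training_pair(
    tokens: List[str], segments: Dict[str, List[str]]
) -> Tuple[str, str]:
    """Build one (source, target) pair from a clean, tokenized sentence.

    `segments` is a hypothetical stand-in for the analyser output: it maps
    a surface word to the parts found between its compound boundaries.
    """
    source: List[str] = []
    for token in tokens:
        parts = segments.get(token, [token])
        # Split only if the parts rebuild the word exactly (no stem change).
        if len(parts) > 1 and "".join(parts) == token:
            source.extend(parts)
        else:
            source.append(token)
    # Source carries the synthetic errors; target is the original sentence.
    return " ".join(source), " ".join(tokens)


# Hypothetical analyser result for the compound used in Table 2.
segments = {"geahččaladdanprošeaktan": ["geahččaladdan", "prošeaktan"]}
source, target = make_training_pair(["geahččaladdanprošeaktan", "birra"], segments)
print(source)
print(target)
```

The source side then contains the compound written apart, while the target side keeps the original spelling.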

As an additional data source, we use the North Sámi Universal Dependencies treebank [39]. We parse the corpus with UralicNLP [14] and split the compounds the rule-based morphological analyser identifies as consisting of two or more words in order to synthetically introduce errors. We also run the rule-based morphological analyser and morpho-syntactic disambiguator to add part-of-speech (POS) information to produce an additional data set with POS tags. For the Universal Dependencies data, we use the POS tags provided in the data set.

We then make sure that all sentences have at least one generated compound error and that the only type of error the sentences have is the compound error (no other changes introduced by the rule-based models). We shuffle this data randomly and split it on a sentence level into 70 % training, 15 % validation and 15 % testing. The size of the data set can be seen in Table 1. The sentences were tokenized based on punctuation marks.

	Sentences	Source tokens
Train	43,658	388,167
Test	9,356	83,107
Validation	9,355	82,566
Real-world errors	3,291	26,565

Table 1: Training, testing and validation sizes for the neural model (corpus with synthetic errors)
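The filtering and the 70/15/15 split described before Table 1 could look roughly as follows. This is a sketch under assumptions: the function name, the seed and the representation of the data as (source, target) string pairs are ours, and the actual scripts may differ.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (sentence with synthetic errors, original sentence)


def filter_and_split(
    pairs: List[Pair], seed: int = 0
) -> Tuple[List[Pair], List[Pair], List[Pair]]:
    """Keep valid pairs, shuffle them and split them 70/15/15."""
    kept = [
        (src, tgt)
        for src, tgt in pairs
        # At least one compound was split, and nothing but spaces changed.
        if src != tgt and src.replace(" ", "") == tgt.replace(" ", "")
    ]
    random.Random(seed).shuffle(kept)
    n_train = len(kept) * 70 // 100
    n_valid = len(kept) * 15 // 100
    train = kept[:n_train]
    valid = kept[n_train:n_train + n_valid]
    test = kept[n_train + n_valid:]  # the remainder, roughly 15 %
    return train, valid, test
```

Integer arithmetic is used for the split points to avoid floating-point rounding at the boundaries.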

For the rule-based model GramDivvun we do not generate synthetic errors. We have hand-selected a large corpus for rule development and as regression tests, consisting of representative sentences from GT-Free. The current selection for syntactic compound errors includes 3,291 sentences with real world compound errors (and possibly other errors in addition).

4 Methods

We use a neural model and a rule-based model for compound error correction.

4.1 Neural Model

Input	Output
g e a h č č a l a d d a n _ p r o š e a k t a n	g e a h č č a l a d d a n p r o š e a k t a n
g e a h č č a l a d d a n _ p r o š e a k t a n _ p r o š e a k t a n	g e a h č č a l a d d a n p r o š e a k t a n _ p r o š e a k t a n
V> g e a h č č a l a d d a n <V _ N> p r o š e a k t a n <N	g e a h č č a l a d d a n p r o š e a k t a n
V> g e a h č č a l a d d a n <V _ N> p r o š e a k t a n <N _ N> j a g i <N	g e a h č č a l a d d a n p r o š e a k t a n _ p r o š e a k t a n

Table 2: Examples of the character-level input and output, where n indicates the chunk size. The first examples are without POS tags and the last with POS tags

We model the problem at the character level rather than the word level, as a neural machine translation (NMT) task. The reason for using a character-level model instead of a word-level model is that it copes better with out-of-vocabulary words. This is important due to the low-resourced nature of North Sámi, although there are other deep learning methods for endangered languages that do not utilize character level models [2]. In practice, we split words into characters separated by white spaces and mark actual spaces between words with an underscore (_). We train the model to predict from text with compound errors into text without compound errors. As previous research [31, 1] has found that using chunks of words instead of full sentences at a time improves the results in character level models, we train different models with different chunk sizes. This means that we train a model to predict two words at a time, three words at a time, all the way to five words at a time.
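A minimal sketch of this encoding is given below. The function names are our own, and the choice of consecutive, non-overlapping chunks is an assumption, since the exact chunking procedure is not spelled out here.

```python
from typing import List


def to_char_sequence(text: str) -> str:
    """Separate characters by spaces and mark real spaces with '_'."""
    return " ".join("_" if ch == " " else ch for ch in text)


def to_chunks(sentence: str, n: int) -> List[str]:
    """Cut a sentence into consecutive chunks of at most n words."""
    words = sentence.split()
    return [" ".join(words[i:i + n]) for i in range(0, len(words), n)]


for chunk in to_chunks("geahččaladdan prošeaktan birra", 2):
    print(to_char_sequence(chunk))
```

With a chunk size of two, the first chunk is rendered in the same form as the first input row of Table 2.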

We train the models with and without POS tags. For the models with POS tags, we surround each word with a token indicating the beginning and the end of the POS tag. The POS tags are included only on the source side, not on the target side. They are separated from the word with a white space.
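The source-side format with POS tags can be sketched as follows, assuming the words arrive as (word, POS) pairs from the analyser. The function name and the input representation are illustrative assumptions; the output format follows the POS rows of Table 2.

```python
from typing import List, Tuple


def encode_with_pos(tagged_words: List[Tuple[str, str]]) -> str:
    """Encode (word, POS) pairs as a character sequence with POS markers."""
    encoded = []
    for word, pos in tagged_words:
        chars = " ".join(word)  # characters separated by white space
        # Opening and closing POS tokens surround the word's characters.
        encoded.append(f"{pos}> {chars} <{pos}")
    # Actual spaces between words are marked with an underscore.
    return " _ ".join(encoded)


print(encode_with_pos([("geahččaladdan", "V"), ("prošeaktan", "N")]))
```

Only the source side is encoded this way; the target side stays a plain character sequence.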

An example of the data can be seen in Table 2. Even though every sentence in the training data has a compound error, this does not mean that every input chunk the model sees would have a compound error. This way, the model will also learn to leave the input unchanged if no compound errors are detected.

We train all models using a bi-directional long short-term memory (LSTM) based model [16] by using OpenNMT-py [22] with the default settings except for the encoder where we use a BiRNN [35] instead of the default RNN (recurrent neural network), since BiRNN based models have been shown to provide better results in character-level models [13]. We use the default of two layers for both the encoder and the decoder and the default attention model, which is the general global attention presented by [24]. The models are trained for the default of 100,000 steps. All models are trained with the same random seed (3,435) to ensure reproducibility.

During the training of the neural models, we evaluate the models using simple sentence level scores. There we look only at full-sentence matches and evaluate their accuracy, precision and recall, as opposed to the evaluations in Section 5, where we study them more carefully at the word-level. The results of the neural models for the generated corpus (where errors were introduced by splitting compounds) can be seen in Table 3. The results indicate that both models receiving a chunk of two words at a time reached the highest accuracy, and the model without POS tags also reached the highest precision.

Chunk	POS	Accuracy	Precision	Recall
2	no	0.925	0.949	0.974
3	no	0.847	0.883	0.955
4	no	0.852	0.892	0.950
5	no	0.869	0.909	0.952
2	yes	0.925	0.948	0.976
3	yes	0.906	0.934	0.968
4	yes	0.856	0.896	0.951
5	yes	0.857	0.895	0.953

Table 3: Sentence level scores for different neural models tested on a corpus with artificially introduced errors

The POS tags were not important for the models, as the results with and without them are fairly similar. The largest gain was when the compound error correction was done for three words at a time. As this performance gain only occurred for that specific model, it suggests that it is more of an artefact of the training data and how it is fed into the model than any actual improvement.

4.2 Rule-based Model

The rule-based grammar checker GramDivvun is a full-fledged grammar checker fixing spelling errors, (morpho)-syntactic errors (including real word spelling errors, inflection errors, and compounding errors) and punctuation and spacing errors. Real word errors are spelling errors where the outcome is an actual word that does not fit the context.

It passes the output of the finite-state transducer (FST) through a number of other modules, the core of which are several Constraint Grammar modules for tokenization disambiguation, morpho-syntactic disambiguation and a module for error detection and correction. The full modular structure (Figure 1) is described in Wiechetek [40]. This work predominantly concerns the modification of the disambiguation and error detection modules mwe-dis.cg3, grc-disambiguator.cg3, and grammerchecker-release.cg3. We are using finite-state morphology [5] to model word formation processes. The technology behind our FSTs is described in Pirinen [32]. Constraint Grammar is a rule-based formalism for writing disambiguation and syntactic annotation grammars [21, 20]. In our work, we use the free open source implementation VISL CG-3 [7]. All components are compiled and built using the GiellaLT infrastructure [26]. The code and data for the model are available for download (https://github.com/giellalt/lang-sme/releases/tag/naacl-2021-ws), with a specific version tagged for reproducibility.

Figure 1: System architecture of the North Sámi grammar checker (GramDivvun)

The syntactic context is specified in hand-written Constraint Grammar rules. The REMOVE-rule below removes the compound error reading (identified by the tag Err/SpaceCmp) if the head is a 3rd person singular verb (cf. l.2) and the first element of the potential compound is a noun in nominative case (cf. l.3). The context condition further specifies that there should be a finite verb (VFIN) somewhere in the sentence (cf. l.4) for the rule to apply.

REMOVE (Err/SpaceCmp)
    (0/0 (V Sg3))
    (0/1 (N Sg Nom))
    (*0 VFIN);

All possible compounds written apart are considered to be errors by default, unless the lexicon lists the combination as a compound of two or more separately written words or a syntactic rule removes the error reading.
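The "error by default" logic can be pictured with a toy Python sketch. It is only an illustration and not how Constraint Grammar or GramDivvun is implemented: the data structures, the function names and the simplification that a compound is the plain concatenation of its parts are all assumptions, and `noun_before_sg3_verb` is a loose approximation of the REMOVE rule above.

```python
from typing import Callable, List, Set, Tuple

Token = Tuple[str, Set[str]]  # (word form, morphological tags)
Rule = Callable[[List[Token], int], bool]


def flag_split_compounds(
    tokens: List[Token],
    compounds: Set[str],
    two_word_entries: Set[Tuple[str, str]],
    remove_rules: List[Rule],
) -> List[int]:
    """Return the positions i where tokens[i] and tokens[i + 1] are flagged."""
    flagged = []
    for i in range(len(tokens) - 1):
        first, second = tokens[i][0], tokens[i + 1][0]
        if first + second not in compounds:
            continue  # not a possible compound, so no error reading at all
        if (first, second) in two_word_entries:
            continue  # the lexicon lists the two-word spelling as correct
        if any(rule(tokens, i) for rule in remove_rules):
            continue  # a syntactic rule removes the error reading
        flagged.append(i)  # default: a possible compound written apart
    return flagged


def noun_before_sg3_verb(tokens: List[Token], i: int) -> bool:
    """Toy counterpart of the REMOVE rule: N Sg Nom + V Sg3 head, with VFIN."""
    has_finite_verb = any("VFIN" in tags for _, tags in tokens)
    first_is_nom_noun = {"N", "Sg", "Nom"} <= tokens[i][1]
    head_is_sg3_verb = {"V", "Sg3"} <= tokens[i + 1][1]
    return has_finite_verb and first_is_nom_noun and head_is_sg3_verb


tokens = [("geahččaladdan", {"V"}), ("prošeaktan", {"N"})]
print(flag_split_compounds(tokens, {"geahččaladdanprošeaktan"}, set(), [noun_before_sg3_verb]))
```

The order of the checks mirrors the description: lexicon exceptions and syntactic rules can only remove an error reading that the default has already assigned.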

The process of rule writing includes several consecutive steps and, like the training of neural network models, it requires data. The process is as follows:

1. Modelling an error detection rule based on at least one actual sentence containing the error
2. Adding constraints based on the linguist’s knowledge of possible contexts (remembered data)
3. A corpus search for sentences containing similar forms/errors, testing of the rule and reporting rule mistakes
4. Modification of constraints in the rule based on this data and testing against regression tests, so that unfit constraints are adjusted depending on the results for precision and recall (focus on precision)

The basis of rule development is continuous integration. Typical shortcomings and bad errors can be fixed right away with added conditions. Neural models are not usually trained in this way.

The frequent experience of false alarms can decrease the users’ trust in the grammar checker. Typically, full-fledged user-oriented grammar checkers, e.g. DanProof, focus on keeping false alarms low and precision high [8], because experience has shown that false alarms frustrate users and stop them from using the application.

For rule development, regression tests are used. These consist of error-specific YAML (https://yaml.org/spec/1.2/spec.html) tests and are manually marked up.

The regression test for compound errors contains 3,291 sentences (1,368 compound errors, used for development and regression) and gives the results shown in Table 4.

Precision	Recall	$F_{1}$ score
94.95	86.22	90.80

Table 4: The rule-based model tested on the developer’s corpus (regression tests)

5 Results

We evaluate the models both quantitatively and qualitatively. We evaluate on accuracy, precision and recall, and do a linguistic evaluation. The measurements are defined in this article as follows: accuracy $A=\frac{C}{S}$, where $C$ is the number of correct sentences (1:1 string match) and $S$ is the corpus size in sentences; precision $P=\frac{tp}{tp+fp}$ and recall $R=\frac{tp}{tp+fn}$, where $tp$ is the number of true positives, $fp$ the number of false positives and $fn$ the number of false negatives. The $F_{1}$ score is the harmonic mean of precision and recall, $F_{1}=2\times\frac{P\times R}{P+R}$. Accuracy is thus the sentence-level correctness rate (as used in the method section to probe model qualities), whereas precision measures how often corrections were right and recall measures how many of the errors we found. The word-level errors are counted once per error in the marked-up corpus. Thus, if a three-part compound contains two compounding errors, it is counted towards the total as one error, but if a sentence has three separate compounds with a wrong split each, we count three errors.
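The definitions above translate directly into code. The following sketch uses our own function names and made-up counts for the demonstration; it is not the evaluation script used for the reported results.

```python
from typing import List, Tuple


def accuracy(predictions: List[str], references: List[str]) -> float:
    """A = C / S: share of sentences that match the reference exactly."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """P = tp / (tp + fp), R = tp / (tp + fn), F1 = 2 * P * R / (P + R)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, chosen only to exercise the formulas.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(round(p, 3), round(r, 3), round(f1, 3))
```

The functions assume non-empty inputs and at least one true positive; a production script would need to guard against division by zero.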

The error marked-up corpus we used includes 140 syntactic compound errors (there are other compound errors that can be discovered by the spellchecker as they are word internal) and is from GT-Bound. We chose GT-Bound to make sure that the sentences had not been used to develop rules. It is part of our error marked-up corpus, which makes it possible to run an automatic analysis. This error corpus contains only real world (as opposed to synthetic) errors.

Chunk	POS	Accuracy	Precision	Recall
2	no	0.781	0.794	0.980
3	no	0.707	0.720	0.974
4	no	0.726	0.747	0.963
5	no	0.727	0.757	0.950
2	yes	0.777	0.788	0.982
3	yes	0.761	0.775	0.976
4	yes	0.720	0.744	0.958
5	yes	0.751	0.765	0.976

Table 5: Sentence level scores for the neural models tested on a real world error corpus

Table 5 shows the results for the neural models on this corpus. The drop in results is expected as the models were trained on synthetic data, whereas this data consists of real world errors. However, the results stay relatively good, given that synthetic data was the only way to produce enough training data for North Sámi.

We ran the neural and rule-based model on two different corpora of compound error materials, i.e. synthetic and real world. Table 6 shows the evaluation on a real world error corpus.

Model	Precision	Recall	$F_{1}$
Rule-based model	81.0	60.7	69.3
Neural model	79.4	98.0	87.7

Table 6: Results for both models based on a manually marked-up evaluation corpus

The neural network performs well in terms of numbers, but has the following shortcomings that are problematic for end users. It introduces new (types of) errors unrelated to compounding, such as randomly changing km² to either kmy or km, which is hard to forgive because the end user cannot make sense of it. It also introduces compounds like Statoileamiálbmogiid ‘Statoil (national oil company and gas station) indigenous people’, as in ex. (1). The rule-based grammar checker presupposes that the compound is listed in the lexicon, which is why such corrections can easily be avoided there.

(1) Statoil eamiálbmogiid eatnamiid billisteami birra
    Statoil indigenous.people.acc.pl land.acc.pl destruction.gen about
    ‘about the destruction of the indigenous peoples’ territories by Statoil’

It also produces atypically long nonsense words like NorggaSámiidRiidRiidRiidRiidRiidRiidRiikasearvvi. In addition, there are false positives for certain grammatical combinations that are systematically avoided by the rule-based grammar checker. These are combinations of attributive adjectives and nouns (17 occurrences) like boares eallinoainnuid in ex. (2) and genitive modifier and noun combinations (11 occurrences) like njealjehaskilomehtera eatnamat in ex. (3).

(2) boares eallinoainnuid ja modearna servodaga váikkuhusaid gaskii.
    old life.view.acc.pl and modern society.gen impact.acc.pl between
    ‘between old philosophies and the impact of modern society’

(3) Dasalassin 137000 njealjehaskilomehtera eatnamat biđgejuvvojit seismalaš linnjáid
    in.addition 137000 square.kilometre.gen land.pl split.pass.pl3 seismic line.acc.pl
    ‘In addition, 137,000 square kilometres of land are split by seismic lines’

The rule-based model, on the other hand, typically suggests compounding where both the compound and the two-word combination would be adequate, for example when the first part of the compound has homonymous genitive and nominative analyses. The suggested compound is not an error, but the written form is grammatically correct as well, so these suggestions still count as false positives. Other typical errors are cases where there are two accepted ways of spelling a compound/MWE, as in ex. (4), where both Riddu Riđđu and Riddu-Riđđu are correct spellings and the latter is suggested as a correction of the former.

(4) ovdanbuktojuvvojit omd. jahkásaš Riddu Riđđu festiválas.
    present.pass.prs.pl3 e.g. annual Riddu Riđđu festival.loc
    ‘they are presented at the annual Riddu Riđđu festival.’

The main weakness of the rule-based model is false negatives, like njunuš olbmot ‘leading people’ in ex. (5), which are due to missing entries in the lexicon.

(5) Sii leat gieldda njunuš olbmot.
    they are municipality.gen leading people
    ‘They are the leading people of the municipality’

6 Discussion

In the future, we would like to look into hybrid grammar checking of other error types and other (Sámi) languages.

The neural approach gives us relatively high recall in the real world situation with lower precision, whereas the rule-based model is designed to give us high precision even at the cost of lower recall (user experience), which is why hybrid approaches that combine the best of two worlds are interesting.
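One simple way to picture such a combination is a guard around the neural output. This is an illustration only, not a component of either system described in this paper: a neural suggestion is accepted only if it can be obtained from the input by deleting spaces, which would reject unrelated changes such as km² becoming kmy. The function names are our own.

```python
def only_removes_spaces(source: str, suggestion: str) -> bool:
    """True if `suggestion` is `source` with at least one space deleted."""
    position = 0  # next unmatched character in the suggestion
    removed = 0   # number of spaces deleted from the source
    for ch in source:
        if position < len(suggestion) and suggestion[position] == ch:
            position += 1
        elif ch == " ":
            removed += 1
        else:
            return False  # a non-space character was changed or dropped
    return position == len(suggestion) and removed > 0


def guarded_correction(source: str, suggestion: str) -> str:
    """Keep the neural suggestion only if it merely joins words."""
    return suggestion if only_removes_spaces(source, suggestion) else source


print(guarded_correction("geahččaladdan prošeaktan", "geahččaladdanprošeaktan"))
print(guarded_correction("km²", "kmy"))
```

A fuller hybrid could additionally check the joined word against the lexicon or the rule-based error detector, at the cost of losing some of the neural model's coverage of unknown compounds.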

Noisy data is to be expected in any endangered language context, as the language norms are to a lesser degree internalized. We will therefore need a way of preparing the data to train neural networks, which can either consist in creating synthetic data or automatically fixing errors and creating a parallel corpus.

When creating synthetic data for neural networks, the amount of data is hardly the main issue. Many generative systems are capable of over-generating data. The main question that arises is the quality and representativeness [12] of the generated data. If the rules used to generate the data are not in line with the real world phenomenon the neural model is meant to solve, we cannot expect very high quality results on real world data.

Generated sentences can easily be less complex ‘text book examples’ that are not representative of real world examples. In the case of agreement errors between subjects and verbs, for example, there are long distance relationships and complex coordinated subjects including personal pronouns that can change the structure of a seemingly straightforward relation. Therefore, we advocate the use of high quality rule-based tools to prepare the data, i.e. fix the errors and create a parallel corpus.

While synthetic error data generation for compound errors is relatively straightforward, as it only affects adjacent words, the generation of synthetic error corpora for other error types is harder, in part because generating synthetic errors of other kinds can potentially create valid and grammatically correct sentences with different meanings. We therefore predict that (hybrid) neural network approaches for other error types that either involve specific morphological forms (of which there are many in North Sámi) or changes in word order will be more difficult to develop.

7 Conclusion

In this paper, we have developed both a neural network and a rule-based grammar checker module for compound errors in North Sámi. We have shown that a neural compound corrector for a low-resource language can be built on synthetic error data by introducing the compound errors with high-level rule-based grammar models. The approach relies on the rule-based tools both to generate errors and to clean the data, using part-of-speech analysis, disambiguation and even the error detector.

The rule-based module is embedded in the full-fledged GramDivvun grammar checker and achieves a good precision of 81% and a lower recall of 61%. A higher precision, even at the cost of a lower recall, is in line with our objective of keeping false alarms low, so users will be comfortable using our language tools. The neural network achieves a slightly lower precision of 79% and a much higher recall of 98%.

However, the rule-based model has more user-friendly suggestions and some false positives are simply other correct alternatives to the ones in the text, while the neural network’s false positives sometimes introduce new and unrelated errors. On-the-fly fixes that avoid false positives are an advantage of rule-based models. Rule-based models, on the other hand, are not so good at recognizing unknown combinations. Hybrid models that combine the benefits of both approaches are therefore desirable for efficient compound error correction in the future.

Acknowledgments

Thanks to Børre Gaup for his work on the evaluation script. Some computations were performed on resources provided by UNINETT Sigma2 — the National Infrastructure for High Performance Computing and Data Storage in Norway.

References

[1] K. Alnajjar, M. Hämäläinen, N. Partanen, and J. Rueter (2020) Automated prediction of medieval Arabic diacritics. arXiv preprint arXiv:2010.05269. Cited by: §4.1.
[2] K. Alnajjar (2021-03) When word embeddings become endangered. In Multilingual Facilitation, M. Hämäläinen, N. Partanen, and K. Alnajjar (Eds.), pp. 275–288 (English). External Links: Document Cited by: §4.1.
[3] L. Antonsen and T. Trosterud (2011) Next to nothing–a cheap south saami disambiguator. In Proceedings of the NODALIDA 2011 Workshop Constraint Grammar Applications, pp. 1–7. Cited by: §2.
[4] L. Antonsen (2013) Cállinmeattáhusaid guorran.. University of Tromsø. Note: [English summary: Tracking misspellings.] Cited by: §1.
[5] K. R. Beesley and L. Karttunen (2003) Finite state morphology. CSLI publications. External Links: ISBN 978-1575864341 Cited by: §4.2.
[6] B. Behera and P. Bhattacharyya (2013) Automated grammar correction using hierarchical phrase-based statistical machine translation. In Proceedings of the Sixth International Joint Conference on Natural Language Processing, pp. 937–941. Cited by: §2.
[7] E. Bick and T. Didriksen (2015) CG-3 – beyond classical Constraint Grammar. In Proceedings of the 20th Nordic Conference of Computational Linguistics (NoDaLiDa 2015), B. Megyesi (Ed.), pp. 31–39. External Links: ISSN 1650-3740 Cited by: §4.2.
[8] E. Bick (2015) DanProof: pedagogical spell and grammar checking for Danish. In Proceedings of the 10th International Conference Recent Advances in Natural Language Processing (RANLP 2015), G. Angelova, K. Bontcheva, and R. Mitkov (Eds.), Hissar, Bulgaria, pp. 55–62. Cited by: §4.2.
[9] D. Deksne (2019) Bidirectional lstm tagger for latvian grammatical error detection. In International Conference on Text, Speech, and Dialogue, pp. 58–68. Cited by: §2.
[10] J. Devlin, M. Chang, K. Lee, and K. Toutanova (2019-06) BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, pp. 4171–4186. External Links: Link, Document Cited by: §2.
[11] T. Ge, X. Zhang, F. Wei, and M. Zhou (2019) Automatic grammatical error correction for sequence-to-sequence text generation: an empirical study. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 6059–6064. Cited by: §2.
[12] M. Hämäläinen and K. Alnajjar (2019) A template based approach for training nmt for low-resource uralic languages - a pilot with Finnish. In Proceedings of the 2019 2nd International Conference on Algorithms, Computing and Artificial Intelligence, ACAI 2019, New York, NY, USA, pp. 520–525. External Links: ISBN 9781450372619, Link, Document Cited by: §6.
[13] M. Hämäläinen, T. Säily, J. Rueter, J. Tiedemann, and E. M äkelä (2019) Revisiting nmt for normalization of early English letters. In Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Herita ge, Social Sciences, Humanities and Literature, pp. 71–75. Cited by: §4.1.
[14] M. Hämäläinen (2019) UralicNLP: an NLP library for Uralic languages. Journal of Open Source Software 4 (37), pp. 1345. External Links: Document Cited by: §3.
[15] D. T. Hoang, S. Chollampatt, and H. T. Ng (2016) Exploiting n-best hypotheses to improve an smt approach to grammatical error correction. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pp. 2803–2809. Cited by: §2.
[16] S. Hochreiter and J. Schmidhuber (1997) Long short-term memory. Neural computation 9 (8), pp. 1735–1780. Cited by: §4.1.
[17] S. Huang and H. Wang (2016) Bi-LSTM neural networks for Chinese grammatical error diagnosis. In Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016), pp. 148–154. Cited by: §2.
[18] M. N. Jahan, A. Sarker, S. Tanchangya, and M. A. Yousuf (2021) Bangla real-word error detection and correction using bidirectional LSTM and bigram hybrid model. In Proceedings of International Conference on Trends in Computational and Cognitive Engineering, pp. 3–13. Cited by: §2.
[19] Y. Kantor, Y. Katz, L. Choshen, E. Cohen-Karlik, N. Liberman, A. Toledo, A. Menczel, and N. Slonim (2019) Learning to combine grammatical error corrections. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 139–148. Cited by: §2.
[20] F. Karlsson, A. Voutilainen, J. Heikkilä, and A. Anttila (1995) Constraint grammar: a language-independent system for parsing unrestricted text. Mouton de Gruyter, Berlin. Cited by: §4.2.
[21] F. Karlsson (1990) Constraint grammar as a framework for parsing unrestricted text. In Proceedings of the 13th International Conference of Computational Linguistics, H. Karlgren (Ed.), Vol. 3, Helsinki, pp. 168–173. Cited by: §4.2.
[22] G. Klein, Y. Kim, Y. Deng, J. Senellart, and A. M. Rush (2017) OpenNMT: open-source toolkit for neural machine translation. In Proc. ACL, External Links: Link, Document Cited by: §4.1.
[23] A. Kunchukuttan, S. Chaudhury, and P. Bhattacharyya (2014) Tuning a grammar correction system for increased precision. In Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task, pp. 60–64. Cited by: §2.
[24] M. Luong, H. Pham, and C. D. Manning (2015) Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025. Cited by: §4.1.
[25] C. Moseley (Ed.) (2010) Atlas of the world’s languages in danger. 3rd edition, UNESCO Publishing. Note: Online version: http://www.unesco.org/languages-atlas/ Cited by: §1.
[26] S. N. Moshagen, T. Pirinen, and T. Trosterud (2013-05) Building an open-source development infrastructure for language technology projects. In Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), Oslo, Norway, pp. 343–352. External Links: Link Cited by: §4.2.
[27] S. Moshagen (2011) Tilgjengelegheit for samisk og andre nasjonale minoritetsspråk. In Språkteknologi för ökad tillgänglighet, Cited by: §2.
[28] W. Nekoto, V. Marivate, T. Matsila, T. Fasubaa, T. Fagbohungbe, S. O. Akinola, S. Muhammad, S. Kabongo Kabenamualu, S. Osei, F. Sackey, R. A. Niyongabo, R. Macharm, P. Ogayo, O. Ahia, M. M. Berhe, M. Adeyemi, M. Mokgesi-Selinga, L. Okegbemi, L. Martinus, K. Tajudeen, K. Degila, K. Ogueji, K. Siminyu, J. Kreutzer, J. Webster, J. T. Ali, J. Abbott, I. Orife, I. Ezeani, I. A. Dangana, H. Kamper, H. Elsahar, G. Duru, G. Kioko, M. Espoir, E. van Biljon, D. Whitenack, C. Onyefuluchi, C. C. Emezue, B. F. P. Dossou, B. Sibanda, B. Bassey, A. Olabiyi, A. Ramkilowan, A. Öktem, A. Akinfaderin, and A. Bashir (2020-11) Participatory research for low-resourced machine translation: a case study in African languages. In Findings of the Association for Computational Linguistics: EMNLP 2020, Online, pp. 2144–2160. External Links: Link, Document Cited by: §2.
[29] K. Omelianchuk, V. Atrasevych, A. Chernodub, and O. Skurzhanskyi (2020) GECToR–grammatical error correction: tag, not rewrite. In Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 163–170. Cited by: §2.
[30] H. Outakoski (2013) Davvisámegielat čálamáhtu konteaksta [The context of North Sámi literacy]. Sámi dieđalaš áigečála 1/2015, pp. 29–59. Cited by: §1.
[31] N. Partanen, M. Hämäläinen, and K. Alnajjar (2019-11) Dialect text normalization to normative standard Finnish. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), Hong Kong, China, pp. 141–146. External Links: Link, Document Cited by: §4.1.
[32] T. A. Pirinen and K. Lindén (2010) Finite-state spell-checking with weighted language and error models. In Proceedings of the Seventh SaLTMiL workshop on creation and use of basic lexical resources for less-resourced languages, Valletta, Malta, pp. 13–18. External Links: Link Cited by: §4.2.
[33] M. Rei and H. Yannakoudakis (2016-08) Compositional sequence labeling models for error detection in learner writing. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany, pp. 1181–1191. External Links: Link, Document Cited by: §2.
[34] J. Rueter and M. Hämäläinen (2020) FST morphology for the endangered Skolt Sami language. In Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL), pp. 250–257. Cited by: §2.
[35] M. Schuster and K. K. Paliwal (1997) Bidirectional recurrent neural networks. IEEE transactions on Signal Processing 45 (11), pp. 2673–2681. Cited by: §4.1.
[36] SIKOR (2018) SIKOR UiT Norgga árktalaš universitehta ja Norgga Sámedikki sámi teakstačoakkáldat, veršuvdna 06.11.2018. Note: online. Accessed: 2018-11-06 External Links: Link Cited by: §3.
[37] T. Trosterud and S. Moshagen (2021) Soft on errors? The correcting mechanism of a Skolt Sami speller. In Multilingual Facilitation, M. Hämäläinen, N. Partanen, and K. Alnajjar (Eds.), pp. 197–207. Cited by: §2.
[38] T. Trosterud (2004) Porting morphological analysis and disambiguation to new languages. In SALTMIL Workshop at LREC 2004: First Steps in Language Documentation for Minority Languages, pp. 90–92. Cited by: §2.
[39] F. M. Tyers and M. Sheyanova (2017-01) Annotation schemes in North Sámi dependency parsing. In Proceedings of the Third Workshop on Computational Linguistics for Uralic Languages, St. Petersburg, Russia, pp. 66–75. External Links: Link, Document Cited by: §3.
[40] L. Wiechetek, S. Moshagen, and K. B. Unhammer (2019) Seeing more than whitespace—tokenisation and disambiguation in a North Sámi grammar checker. In Proceedings of the 3rd Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers), pp. 46–55. Cited by: §2, §4.2.
[41] L. Wiechetek (2017) When grammar can’t be trusted – valency and semantic categories in North Sámi syntactic analysis and error detection. PhD Thesis, UiT The Arctic University of Norway. Cited by: §2.
[42] L. Wiechetek (2012) Constraint Grammar based correction of grammatical errors for North Sámi. In Proceedings of the Workshop on Language Technology for Normalisation of Less-Resourced Languages (SALTMIL 8/AFLAT 2012), G. D. Pauw, G. de Schryver, M.L. Forcada, K. Sarasola, F.M. Tyers, and P.W. Wagacha (Eds.), Istanbul, Turkey, pp. 35–40. Cited by: §3.
[43] Z. Yuan and T. Briscoe (2016) Grammatical error correction using neural machine translation. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 380–386. Cited by: §2.
