Lingsoft® FINSPELL is Lingsoft's high-quality spelling checker component for Finnish, designed for checking basic spelling errors in standard written Finnish. It adheres to the commonly known and accepted spelling norms presented in established reference works.
Lingsoft endeavors to maintain the delicate balance between recall (the rate at which correctly spelled words are recognized) and precision (the rate at which errors are detected) by performing rigorous regression testing whenever the language model is changed. Particular care has been taken to avoid masking, in which a frequent spelling error goes undetected because a rare word happens to be spelled exactly like the error.
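The trade-off described above can be made concrete with a small sketch. It computes recall and precision in the sense used in this text for two toy lexicons. Everything in it is an assumption made for illustration: the function names, the word lists, and in particular the string "tallo", which merely stands in for a rare valid word that coincides with a common typo. None of it is taken from FINSPELL's lexicon or test data.

```python
# Toy illustration only: all names and word lists here are invented for the
# example and say nothing about FINSPELL's actual lexicon or test suite.

def recall(lexicon, correct_words):
    """Share of correctly spelled words that the checker accepts."""
    accepted = sum(1 for w in correct_words if w in lexicon)
    return accepted / len(correct_words)

def precision(lexicon, misspellings):
    """Share of misspelled words that the checker flags (this text's usage)."""
    flagged = sum(1 for w in misspellings if w not in lexicon)
    return flagged / len(misspellings)

correct_words = ["kirja", "kirjoja", "talo", "talossa"]
misspellings = ["kirjja", "taloo", "tallo"]  # placeholder typos

# A small lexicon misses a valid word but flags every typo.
small = {"kirja", "kirjoja", "talo"}

# A larger lexicon accepts every valid word, but the hypothetical rare entry
# "tallo" now masks the typo spelled the same way.
large = small | {"talossa", "tallo"}

for name, lexicon in (("small", small), ("large", large)):
    r = recall(lexicon, correct_words)
    p = precision(lexicon, misspellings)
    print(f"{name}: recall={r:.2f} precision={p:.2f}")
```

In this toy setup, enlarging the lexicon raises recall and lowers precision at the same time, which is the kind of shift that regression testing after a model change is meant to catch.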
Based on Lingsoft's Model of Finnish
FINSPELL uses Lingsoft's comprehensive two-level model of Finnish morphology to recognize inflected, derived and compound word forms, and to generate correction suggestions. The model contains more than 55 000 lexical entries, covering the central vocabulary of Finnish, including abbreviations, acronyms, proper names and numerals. Two-level rules handle morphophonological alternations such as "kirja, kirjoja" (book, books).
The inflectional mechanism recognizes all morphologically correct inflected word forms. The derivational and compositional mechanisms allow new words to be formed from words known to the model. The generative mechanisms have been restricted to increase precision, meaning that not all morphologically acceptable compounds or derivatives are recognized. Because Finnish is a highly agglutinative and freely compounding language, the number of recognized word forms runs into the billions.
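The idea of restricted compounding can be sketched with a toy recognizer. This is not how FINSPELL works internally (the real model is a compiled finite-state transducer). The mini-lexicon, the per-word "may compound" flag, and the limit on the number of parts are all hypothetical and chosen only to show how restrictions trade coverage for precision.

```python
# Hypothetical mini-lexicon: the boolean says whether the word may take part
# in generated compounds. The flag and the part limit are assumptions of this
# sketch, not documented FINSPELL behavior.
LEXICON = {
    "kirja": True,    # book
    "kauppa": True,   # shop
    "talo": True,     # house
    "ja": False,      # and: a valid word, but barred from compounding here
}

def is_compound(word, max_parts=2):
    """True if word splits into 2..max_parts lexicon words that all allow compounding."""
    def split(rest, parts_left):
        # Exactly one part left: the remainder must itself be a usable entry.
        if parts_left == 1:
            return LEXICON.get(rest, False)
        # Otherwise try every split point with a usable head.
        for i in range(1, len(rest)):
            head, tail = rest[:i], rest[i:]
            if LEXICON.get(head, False) and split(tail, parts_left - 1):
                return True
        return False

    return any(split(word, n) for n in range(2, max_parts + 1))

def is_recognized(word, max_parts=2):
    """Accept plain lexicon words and permitted compounds."""
    return word in LEXICON or is_compound(word, max_parts)
```

With these toy settings, "kirjakauppa" (bookshop) is accepted as a two-part compound, while "kirjaja" is rejected because "ja" is barred from compounding, and "kirjakauppatalo" is accepted only if `max_parts` is raised to 3.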
The lexical content and two-level rules of the language model are compiled into a fast and compact finite-state transducer, which, along with the program code and other data, is included in a binary file of only about 1.5 MB.
A Suggestion Mechanism that Works
FINSPELL attempts to suggest corrections for words it doesn't recognize as correctly spelled. The basic suggestion mechanism suggests all recognized words within an edit distance of one (a one-letter addition, deletion or transposition, except for the first letter of the word). More wide-ranging and more specific suggestions are given for common spelling errors. Certain common spelling errors receive only the typically appropriate correction(s).
FINSPELL generally avoids suggesting words that may seem awkward or incomprehensible to the user. In particular, generated compounds and derivatives are suggested only on the basis of segment-specific correction rules. FINSPELL also endeavors not to suggest words that may seem offensive to the user. If no suitable suggestions are found, none are given.
Stunning Performance and Precision
FINSPELL can analyze more than 30 000 words per second on an Intel Xeon @ 3.0 GHz running Linux, and recognizes more than 95% of the correctly spelled words in typical running text.
Software Integration Made Easy
FINSPELL can be integrated into almost any software application, including web-based services, to provide spell-checking through Lingsoft's proprietary LSPROOF-API application programming interface for Windows, Linux, Mac and Java. The character set used with LSPROOF is Unicode.
Lingsoft® FINSPELL: Copyright © Lingsoft, Inc. 1986-2010. Two-Level Compiler: Copyright © Xerox Corporation 1994. Lingsoft is a registered trademark, and FINSPELL and LSPROOF are trademarks of Lingsoft, Inc. All rights reserved. Details subject to change.
Copyright ©1986-2016, Lingsoft Ltd.