This section provides an overview of the Nepali Spell Checker app, describing its major features and documenting its main functions.
Nepali Spell Checker makes it easy to spell-check Nepali text by using data from the official Nepali dictionary and applying Nepali grammar rules. Potentially misspelled words that are not found in the dictionary are presented with relevant suggestions, which are drawn from a comprehensive set of Nepali words and phrases.
To spell-check, paste your Nepali text into the editor box and click the "Check Spelling" button. Nepali Spell Checker analyzes your text and, if it finds a misspelled term (either a word or a phrase), offers relevant suggestions. You can then either replace the misspelled term with one of the suggestions or ignore the suggestions and leave the term unchanged.
When replacing a misspelled term with one of the suggestions, you can apply the change to just that occurrence or to all occurrences of the term. Similarly, you can ignore a single occurrence or all occurrences. Note that Replace All and Ignore All operate from the current position of the term to the end of the text, so earlier occurrences are not affected.
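The "from the current position to the end of the text" behavior of Replace All can be illustrated with a minimal Python sketch. The function name `replace_from` and the plain substring matching are assumptions made for illustration; the app's actual implementation is not documented here and presumably works on whole terms rather than raw substrings.

```python
def replace_from(text: str, term: str, replacement: str, start: int) -> str:
    """Replace every occurrence of `term` at or after index `start`.

    Hypothetical helper: `start` is assumed to be the index where the
    currently selected occurrence of the term begins.
    """
    head, tail = text[:start], text[start:]
    # Occurrences before `start` stay in `head` and are left untouched.
    return head + tail.replace(term, replacement)


text = "gahr one gahr two gahr three"
second = text.index("gahr", 1)  # position of the second occurrence
print(replace_from(text, "gahr", "ghar", second))
```

In this sketch the first "gahr" is kept as typed, because it lies before the position from which the operation was started.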
How suggestions are made
Suggestions offered by Nepali Spell Checker are generated by an algorithm that takes into account a number of factors, including the entries in the official Nepali dictionary, Nepali grammar and orthography rules, and the guidelines from Nepal Academy. The order in which the suggestions are presented is based on objective factors, such as the popularity of each suggestion.
The approach taken to predict suggestions can be divided broadly into two steps. In the first step, the term is looked up against a comprehensive set of words and phrases, which includes a variety of entries, such as entries from the official Nepali dictionary, new words not yet included in the dictionary, proper nouns, etc. If the term matches an entry in the set, the spell checker treats the term as correct and moves on to the next term.
If the term does not exist in the set, the spell checker, as part of the second step, processes the term against the grammar and orthography rules. A typical text often contains many valid terms, such as inflections and derived words, that are not root terms. The spell checker decomposes such a term into its components. A lookup is performed for each component, and if every lookup succeeds, i.e., if all components exist in the set, the spell checker finally checks whether the components relate to each other according to the grammar and orthography rules.
Take the word "Gharma," meaning "in the house," as an example. It consists of the root word Ghar, meaning "house," and the case ending Ma, meaning "in." In the first step, a lookup is made for the entire term. Although the term is valid, the lookup fails because the set contains the root word and the case ending as separate entries, not the combined term as a single entry. The second step then takes over. The term is decomposed into the root word and the case ending, and a lookup is performed for each component. Both lookups succeed, so the spell checker proceeds to check the grammar and orthography rules. The term passes the grammar rule that a case ending can be applied to a noun, and it passes the orthography rules that a case ending must appear after the root word and that the case ending and the root word must be written as one single word. The spell checker then treats the term as correct.
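The two-step check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the tiny `LEXICON`, its category labels, the romanized spellings and the function name `is_correct` are all hypothetical, and the real spell checker applies many more rules than the single noun-plus-case-ending rule shown here.

```python
# Hypothetical miniature word set: entry -> grammatical category.
LEXICON = {
    "ghar": "noun",
    "ma": "case_ending",
    "ra": "conjunction",
}


def is_correct(term: str) -> bool:
    word = term.lower()

    # Step 1: look up the entire term in the set.
    if word in LEXICON:
        return True

    # Step 2: try every split into two components and look up each one.
    for i in range(1, len(word)):
        root, suffix = word[:i], word[i:]
        if root in LEXICON and suffix in LEXICON:
            # Grammar rule: a case ending can be applied to a noun.
            # Orthography rule: the ending follows the root in one word,
            # which holds by construction because root precedes suffix.
            if LEXICON[root] == "noun" and LEXICON[suffix] == "case_ending":
                return True
    return False


print(is_correct("Gharma"))  # noun followed by case ending
print(is_correct("Maghar"))  # case ending placed before the noun
```

"Gharma" fails the whole-term lookup but succeeds after decomposition, while "Maghar" decomposes into known entries that do not satisfy the rules.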
If, after both steps, the spell checker still cannot determine that the term is correct, it looks for the closest matches in the set and builds a list of suggestions ranked by popularity and by how closely they match the term. The suggestions are then presented to the user. They may be entries from the set or other forms of entries, e.g., auto-generated inflections.
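As a rough illustration of ranking by closeness and popularity, the sketch below scores candidates with `difflib.SequenceMatcher` from the Python standard library. The similarity measure, the cutoff, the popularity counts and the choice to sort by similarity first and popularity second are all assumptions; the text above does not say how the app measures closeness or combines the two factors.

```python
from difflib import SequenceMatcher

# Hypothetical word set with made-up popularity counts.
POPULARITY = {"ghar": 900, "gharma": 400, "garma": 50, "kalam": 300}


def suggest(term: str, limit: int = 3, cutoff: float = 0.6) -> list[str]:
    scored = []
    for candidate, popularity in POPULARITY.items():
        similarity = SequenceMatcher(None, term, candidate).ratio()
        if similarity >= cutoff:  # drop candidates that are too different
            scored.append((similarity, popularity, candidate))
    # Closest match first; popularity breaks ties (an assumed policy).
    scored.sort(key=lambda item: (-item[0], -item[1]))
    return [candidate for _, _, candidate in scored[:limit]]


print(suggest("gharm"))
```

A production ranking would more likely use an edit distance tuned to Devanagari and a weighted combination of the two factors; the ordering policy here is only one simple option.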
Many other rules are considered. The rules are updated on a regular basis to reflect any changes to the Nepal Academy guidelines and to cover a wider range of use cases. The rules that catch the most common errors are those on conjugation, declension, derived words and grammatical categories, such as aspect, number, gender, voice, etc.
The following list outlines examples of common rules.
- Number and case endings should appear as suffixes to a noun phrase.
- A word that begins with a noun phrase can have a postposition, a number and case endings as suffixes. If a case ending suffix has been applied, it must be the end of the word, e.g., Gharma (noun followed by case ending). After a postposition or a number suffix, however, a case ending suffix may still be applied, e.g., Gharharuma (noun followed by number followed by case ending).
- Conjunctions and interjections must be written as standalone words.
- When multiple verbs combine to form one word, the ending vowel of all verbs except the last one must be Raswa.
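The suffix-ordering rule from the list above can be expressed as a small check over a sequence of component categories. This is a hedged sketch: the tag names and the function `valid_suffix_order` are hypothetical, and it assumes the term has already been decomposed and each component labeled.

```python
ALLOWED_SUFFIXES = {"postposition", "number", "case_ending"}


def valid_suffix_order(tags: list[str]) -> bool:
    """Check that a noun is followed only by allowed suffixes and that a
    case ending, if present, is the last component of the word."""
    if not tags or tags[0] != "noun":
        return False
    suffixes = tags[1:]
    for position, tag in enumerate(suffixes):
        if tag not in ALLOWED_SUFFIXES:
            return False  # e.g. a conjunction must be a standalone word
        if tag == "case_ending" and position != len(suffixes) - 1:
            return False  # nothing may follow a case ending
    return True


print(valid_suffix_order(["noun", "case_ending"]))            # Gharma
print(valid_suffix_order(["noun", "number", "case_ending"]))  # Gharharuma
print(valid_suffix_order(["noun", "case_ending", "number"]))  # wrong order
```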
Reasons you might not see suggestions
To avoid flagging too much while analyzing the text, Nepali Spell Checker ignores certain types of terms by default.
Terms containing characters that are not used to write Nepali are ignored and are not flagged as incorrect, e.g., the English word "cat" or any term containing characters from a non-Devanagari script.
Similarly, terms containing one or more Devanagari digits (0 to 9) are ignored. Finally, by default, terms for which the spell checker cannot find suggestions are also ignored.
Users, however, will have the option to enable strict checking to catch all these exceptions.
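The character-based defaults described in this section can be sketched as a simple filter. The function name `should_check` and the `strict` parameter are hypothetical. The sketch relies on the Unicode Devanagari block (U+0900 to U+097F) and its digits (U+0966 to U+096F), and it additionally tolerates the zero-width joiner characters that often appear in Devanagari text, which is an assumption on top of the description above. The third default, skipping terms that have no suggestions, is not modeled.

```python
JOINERS = {"\u200c", "\u200d"}  # zero-width non-joiner / joiner (assumed allowed)


def is_devanagari(ch: str) -> bool:
    return "\u0900" <= ch <= "\u097f"


def is_devanagari_digit(ch: str) -> bool:
    return "\u0966" <= ch <= "\u096f"


def should_check(term: str, strict: bool = False) -> bool:
    """Decide whether a term is analyzed at all (hypothetical helper)."""
    if strict:
        return True  # strict checking analyzes every term
    if any(not is_devanagari(ch) and ch not in JOINERS for ch in term):
        return False  # contains characters not used to write Nepali
    if any(is_devanagari_digit(ch) for ch in term):
        return False  # contains one or more Devanagari digits
    return True


print(should_check("घर"))
print(should_check("cat"))
print(should_check("cat", strict=True))
```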
Auto correct common errors
Based on a popular enhancement request from our users, we are pleased to introduce the "Auto correct common errors" feature. With this feature, common misspelled terms are corrected automatically, without you having to accept a suggestion for each one.
This feature is powered by a list of the most common misspelled terms and their correct forms. While analyzing a term, Nepali Spell Checker consults the list, and if the term matches an entry, it automatically replaces the term with the corresponding correct form. Occasionally, a misspelled term may match multiple entries in the list. In such a case, the spell checker does not replace the term automatically and instead presents the user with all the correct forms as suggestions. The list is updated regularly so that new common misspelled terms and their correct forms are added.
The "Auto correct common errors" feature is not enabled by default. To activate this feature, select the "Auto correct common errors" checkbox before spell-checking your text.
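A minimal sketch of this lookup, assuming the list is stored as a mapping from each misspelled term to its possible correct forms. The name `COMMON_ERRORS`, its romanized sample entries and the function `auto_correct` are hypothetical and only illustrate the single-match versus multiple-match behavior.

```python
# Hypothetical list: misspelled term -> correct forms it may stand for.
COMMON_ERRORS = {
    "gahr": ["ghar"],
    "kalm": ["kalam", "kalma"],
}


def auto_correct(term: str, enabled: bool = False) -> tuple[str, list[str]]:
    """Return (resulting term, suggestions to show the user)."""
    if not enabled:  # the feature is off by default
        return term, []
    corrections = COMMON_ERRORS.get(term, [])
    if len(corrections) == 1:
        return corrections[0], []  # unambiguous: replace automatically
    return term, corrections  # ambiguous or unknown: leave term as typed


print(auto_correct("gahr", enabled=True))
print(auto_correct("kalm", enabled=True))
print(auto_correct("gahr"))
```

Returning the suggestions alongside the unchanged term keeps the ambiguous case in the user's hands, which matches the behavior described above.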
Benefits of using the app
By using the app, you help make it better, because the app is designed to learn from its usage. The more it is used, the better the algorithm gets. Its usage also contributes to new knowledge and insights on the usage of the Nepali language and its trends, reports of which are published by NLRC. Such reports contain only non-personally identifying information, such as top words over a given period, common misspelled words, etc. This information is beneficial to the community interested in learning and promoting the Nepali language.
Using the app for checking a large amount of text
To keep the service equitable for the community that uses the app on a regular basis, and given the infrastructure we currently have, guests (users who are not signed in) can spell-check text containing up to 13,000 characters at one time. There is no limit on how many times you can use the app. For signed-in users, the limit is higher. If you need to spell-check a large amount of text in one go, please let us know and we'll try our best to accommodate your needs.
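Since there is no limit on the number of checks, a longer text can be checked in several passes. The sketch below splits a text into pieces that fit the guest limit, breaking only at whitespace. It is not part of the app: the function name `split_for_guest` is hypothetical, "characters" is assumed to mean Unicode code points as counted by Python's `len`, and runs of whitespace are collapsed to single spaces.

```python
GUEST_LIMIT = 13000  # guest limit stated above, in characters


def split_for_guest(text: str, limit: int = GUEST_LIMIT) -> list[str]:
    """Split text into whitespace-delimited chunks of at most `limit` characters."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        if len(word) > limit:
            raise ValueError("a single word exceeds the limit")
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)  # current chunk is full, start a new one
            current = word
    if current:
        chunks.append(current)
    return chunks


print(split_for_guest("aa bb cc", limit=5))
```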