Textual analysis - The language of digitalisation is structured

We produce enormous amounts of information, but much of it has a short life cycle. Information typically takes the form of free text, which suits human readers. Automated processing of unstructured text is challenging, however, particularly in Finnish and other highly inflected languages.

Lingsoft's language technology allows you to write a message as free text and then convert it into a machine-readable format using Natural Language Processing (NLP).

Key use cases of language structure analysis include:

  • data enrichment and mining
  • data indexing
  • term indexing

Textual structure analysis enables a wide range of tools for managing and utilising large amounts of information.

Natural Language Processing (NLP) was designed to facilitate the technological processing of data in text format. In practice, Lingsoft's textual analysis means teaching the word formation and inflection rules of a specific language to a machine. This allows the machine to recognise a word in all its inflected forms in free text, reduce it to its base form, carry out a grammatical analysis, and recognise the boundaries of words. For example, the MS Word spell-checkers developed by Lingsoft detect spelling errors in inflected and compound words.
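The idea of recognising word boundaries and reducing inflected forms to a base form can be sketched in a few lines of Python. This is only an illustration, not Lingsoft's implementation: the `LEMMAS` table and the `tokenize` and `lemmatize` functions are hypothetical, and the table is written by hand for a few forms of the Finnish word "talo" (house), whereas a real analyser derives base forms from word formation and inflection rules.

```python
import re

# Hypothetical, hand-written table of inflected forms of Finnish "talo" (house).
# A real analyser would derive these from inflection rules, not a lookup table.
LEMMAS = {
    "talon": "talo",    # genitive
    "talossa": "talo",  # inessive ("in the house")
    "talosta": "talo",  # elative ("from the house")
    "taloon": "talo",   # illative ("into the house")
    "talot": "talo",    # nominative plural
}


def tokenize(text):
    """Split free text into lowercase word tokens (word boundary detection)."""
    return re.findall(r"\w+", text.lower())


def lemmatize(text):
    """Return the base form of each token, or the token itself if unknown."""
    return [LEMMAS.get(token, token) for token in tokenize(text)]


print(lemmatize("Talossa on ovi, talon katto."))
```

Unknown words pass through unchanged here, which is one reason a lookup table does not scale: every inflected form would have to be listed in advance.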

In data retrieval, search engines use textual analysis to also find documents where the search word is inflected. 
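As a rough sketch of why this matters for search, the Python example below compares documents and queries by base form rather than by exact string. The `LEMMAS` table, the `DOCS` collection and the `search` function are all assumptions made for illustration; a production search engine would use a full morphological analyser and an index rather than scanning every document.

```python
import re

# Hypothetical base-form table for a few inflected forms of Finnish "talo" (house).
LEMMAS = {"talon": "talo", "talossa": "talo", "talot": "talo"}

# Hypothetical document collection.
DOCS = {
    "a": "Talossa on kolme huonetta.",
    "b": "Auto on pihalla.",
    "c": "Talon katto on punainen.",
}


def lemmas(text):
    """Return the set of base forms found in the text."""
    return {LEMMAS.get(token, token) for token in re.findall(r"\w+", text.lower())}


def search(query, documents):
    """Return IDs of documents that contain every query word in any listed form."""
    wanted = lemmas(query)
    return [doc_id for doc_id, text in documents.items() if wanted <= lemmas(text)]


print(search("talo", DOCS))
```

A plain string comparison for the word "talo" would match neither document "a" nor "c", because the word appears there only in inflected forms.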

Indexing can significantly improve the discoverability of information. The process involves adding detailed metadata – that is, information about information. This is particularly useful when organising and processing large datasets. 
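One common way to make metadata searchable is an inverted index that maps each index term to the documents tagged with it. The sketch below assumes each document carries a hypothetical `keywords` field; the record layout, the sample data and the `build_index` function are illustrative only.

```python
from collections import defaultdict

# Hypothetical records: each document ID maps to its metadata.
RECORDS = {
    "doc1": {"title": "Invoice 2023", "keywords": ["Invoice", "Finance"]},
    "doc2": {"title": "Meeting notes", "keywords": ["Meeting"]},
    "doc3": {"title": "Invoice 2024", "keywords": ["invoice"]},
}


def build_index(records):
    """Map each lowercased keyword to the set of document IDs tagged with it."""
    index = defaultdict(set)
    for doc_id, metadata in records.items():
        for term in metadata["keywords"]:
            index[term.lower()].add(doc_id)
    return index


index = build_index(RECORDS)
print(sorted(index["invoice"]))
```

Looking up a term then becomes a single dictionary access instead of a scan over every document, which is what makes this approach useful for large datasets.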

Sometimes, data needs to be supplemented with semantic information, i.e. information about the meaning of concepts and the relationships between them. Take an iPhone, for example: it is a type of smartphone, which in turn is a type of mobile phone. Along the same line of thought, a mobile phone is a specific type of phone. At the very core of semantics, an iPhone is an inanimate, physical object. This is how information is connected.
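The iPhone example can be written down as a simple "is a type of" hierarchy. The Python sketch below is a minimal illustration; the `IS_A` table and the `ancestors` function are hypothetical, and the link from "phone" to "physical object" is an assumption added to close the chain described above.

```python
# Each concept points to the broader concept it is a type of.
# The last link ("phone" -> "physical object") is an assumption for illustration.
IS_A = {
    "iPhone": "smartphone",
    "smartphone": "mobile phone",
    "mobile phone": "phone",
    "phone": "physical object",
}


def ancestors(concept):
    """Follow the is-a links upwards and return every broader concept in order."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def is_a(concept, broader):
    """Return True if `concept` is, directly or indirectly, a type of `broader`."""
    return broader in ancestors(concept)


print(ancestors("iPhone"))
print(is_a("iPhone", "phone"))
```

With relationships stored this way, a search for "phone" could also return material about iPhones, even when the word "phone" never appears in it.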

