Microservices at Your Service: Bridging the Gap Between NLP Research and Industry

Our mission is to simplify finding and using the many superb open source speech and language processing tools that the European Research Community has to offer.

The European Union’s Connecting Europe Facility has given us support to fill the ELRC-SHARE and European Language Grid with such tools! We are taking yet another step towards fulfilling the vision of a European digital single market!

See also the project description at the INEA site.

Gradiant

Lingsoft

 

Reykjavik University

University of Tartu

Our plan 2021-2023

  1. Reach out to the European research community to help us identify suitable open source tools (2021)
     
  2. Help researchers package the tools to facilitate re-use by other developers and researchers (2021-2022)
     
  3. Make the tools available on the European Union’s own platforms for language technology, ELRC-SHARE and the European Language Grid (2022-2023)
     

Do you know of open source tools that could be of interest to us? Perhaps tools you or your group have developed?

In March 2022, we are organizing workshops on how to add a server API to your tool and package it as an easily distributable Docker image. Come join us!
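As a rough illustration of what adding a server API to a tool can look like, the following minimal Python sketch wraps a tool in a small HTTP endpoint using only the standard library. Everything named here is an assumption made for illustration: `tokenize` is a hypothetical stand-in for a real tool, and the JSON request and response shapes are invented. They do not follow the ELG API specification or the workshop material.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def tokenize(text: str) -> list:
    """Hypothetical stand-in for a real NLP tool: split on whitespace."""
    return text.split()


def process(payload: dict) -> dict:
    """Map a request such as {"text": "..."} to a JSON-serializable response.

    The request/response shape is an assumption made for this sketch.
    """
    text = payload.get("text", "")
    if not isinstance(text, str):
        raise ValueError('"text" must be a string')
    return {"tokens": tokenize(text)}


class ToolHandler(BaseHTTPRequestHandler):
    """Accept a JSON body via POST and answer with the tool's output as JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
            if not isinstance(payload, dict):
                raise ValueError("request body must be a JSON object")
            body, status = process(payload), 200
        except ValueError as err:  # also covers json.JSONDecodeError
            body, status = {"error": str(err)}, 400
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port: int = 8080) -> HTTPServer:
    """Build (but do not start) the HTTP server; listens on all interfaces."""
    return HTTPServer(("0.0.0.0", port), ToolHandler)
```

Calling `make_server(8080).serve_forever()` would start the service. A Docker image would then typically copy this script in and run it as the container's start command, so that other developers can use the tool without installing its dependencies.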

Would you like to be kept up to date with the progress of our project? Join our concluding seminar in February 2023 to see which tools we found and made available!

Lingsoft's Tiina Lindh-Knuutila presented the project and some of the tools we have made available at the META-FORUM 2022 conference at the beginning of June.

Lingsoft's Sebastian Andersson presented the Microservices project at the 6th ELRC Conference on March 31, 2022. Watch his presentation here!

Workshops

We organize workshops in which we present our work, the tools, and ways to contribute to the ELG community.

Our newest workshop "ELG, a bridge for NLP development" was held in March. See the recording on our workshop site.

Project results

Our project contributes easy-to-use Docker containers and services to the ELG platform. This list is updated as new tools become available.

| Partner | Tool name | Language | Docker image | Original creator | ELG catalogue |
|---|---|---|---|---|---|
| Lingsoft | HeLI OTS | Multilingual | Docker image | University of Helsinki | |
| Lingsoft | Finto AI | Finnish | Docker image | National Library of Finland | ELG catalogue |
| Lingsoft | Finto AI | Swedish | Docker image | National Library of Finland | ELG catalogue |
| Lingsoft | Finto AI | English | Docker image | National Library of Finland | ELG catalogue |
| Lingsoft | KB BERT NER SV | Swedish | Docker image | National Library of Sweden (KBLab) | ELG catalogue |
| Lingsoft | KB BERT Senti SV | Swedish | Docker image | Martin Malmsten (National Library of Sweden / KBLab) | ELG catalogue |
| Lingsoft | KB BERT NER NO | Norwegian | Docker image | National Library of Norway (NbAiLab) | ELG catalogue |
| Lingsoft | Aalto-kaldi-align | Finnish | Docker image | Aalto University | ELG catalogue |
| Lingsoft | Aalto-kaldi-align | Estonian | Docker image | Aalto University | ELG catalogue |
| Lingsoft | Aalto-kaldi-align | English | Docker image | Aalto University | ELG catalogue |
| Lingsoft | Aalto-kaldi-align | Komi | Docker image | Aalto University | ELG catalogue |
| Lingsoft | MeMAD lidbox | Multilingual | Docker image | Aalto University | |
| Lingsoft | Lithuanian spaCy | Lithuanian | Docker image | Explosion | |
| Lingsoft | FinBERT NER | Finnish | Docker image | University of Turku | |
| Tartu | EstNLTK tokenizer | Estonian | Docker image | University of Tartu | ELG catalogue |
| Tartu | Vabamorf morf | Estonian | Docker image | Filosoft | ELG catalogue |
| Tartu | Vabamorf disambiguator | Estonian | Docker image | Filosoft | ELG catalogue |
| Tartu | Est TTS preprocessor | Estonian | Docker image | University of Tartu | ELG catalogue |
| Tartu | HTS Speech Synthesiser | Estonian | Docker image | Institute of the Estonian Language | |
| Tartu | Vabamorf generator | Estonian | Docker image | Filosoft | |
| Tartu | CG syntax parser | Estonian | Docker image | University of Tartu | |
| Tartu | spaCy tagger | Estonian | Docker image | University of Tartu | |
| Tartu | Grapheme-to-phoneme engine | Estonian | Docker image | Tallinn University of Technology | |
| Reykjavík University | Tokenizer | Icelandic | Docker image | Reykjavík University | ELG catalogue |
| Reykjavík University | Icenip | Icelandic | Docker image | Reykjavík University | ELG catalogue |
| Reykjavík University | Iceparser | Icelandic | Docker image | Reykjavík University | ELG catalogue |
| Reykjavík University | NER | Icelandic | Docker image | Reykjavík University | |
| Reykjavík University | POS | Icelandic | Docker image | Reykjavík University | |
| Reykjavík University | ABLTagger | Faroese | Docker image | University of Iceland | ELG catalogue |
| Reykjavík University | Icesum | Icelandic | Docker image | Reykjavík University | |
| Gradiant | TranslateAlignRetrieve - Spanish QA | Spanish | Docker image | TALP - Center for Language and Speech Technologies and Applications | |
| Gradiant | TWilBert | Spanish | Docker image | ELiRF - Enginyeria del Llenguatge Natural i Reconeiximent de Formes | |
| Gradiant | BETO: Spanish BERT | Spanish | Docker image | ReLeLa - Departamento de Ciencias de la Computación, Universidad de Chile | |
| Gradiant | LM-SPANISH | Spanish | Docker image | BSC - Barcelona Supercomputing Center - Text Mining Unit | |
| Gradiant | Emoevales-iberlef2021 | Spanish | Docker image | GSI - Grupo de Sistemas Inteligentes (UPM) | |
| Gradiant | QAPTNET | Portuguese | Docker image | Independent Development | |
| Gradiant | QueLingua | Multilingual | Docker image | CiTIUS - Centro Singular de Investigación en Tecnoloxías Intelixentes | |
| Gradiant | BERTimbau | Portuguese | Docker image | NeuralMind Inteligencia Artificial | |
| Gradiant | Bertinho | Galician | Docker image | LyS-CITIC - Lengua Y Sociedad de la Información | |
| Gradiant | Nlpnet | Portuguese | Docker image | NILC - Interinstitutional Center for Computational Linguistics (ICMC - University of São Paulo) | |
| Gradiant | Julibert | Catalan | Docker image | SOFTCATALA | |

Introducing our Consortium

Gradiant

Gradiant is a Spanish ICT technology centre that aims to improve the competitiveness of companies by transferring knowledge and technologies in the fields of connectivity, intelligence and security. With more than 100 professionals and 285 R&D&i projects, it is becoming one of the main engines of innovation in Galicia.

Gradiant is backed by a board that includes representatives of the three Galician universities (Vigo, Santiago and A Coruña) and seven companies from the telecommunications industry: Altia, Arteixo Telecom, Egatel, Indra, Plexus, R, Telefónica, Televés; and the INEO business association.

Gradiant is positioned as a technology partner for industry, oriented to its needs in the ICT field. It contributes national and international experience in technologies for security and privacy; multimedia signal processing; the Internet of Things; natural language processing, biometrics and data analytics; and advanced communications systems.

Lingsoft

Lingsoft Oy and its sister company Lingsoft Language Services Oy are part of the Lingsoft Group, which had a consolidated turnover of about 12.5 million euros in 2019, making us one of the 100 largest language service providers in the world. Founded in 1986, Lingsoft is a reliable, experienced and innovative partner. Lingsoft offers a wide variety of language technology solutions and language services designed for the analysis, processing and utilization of written and spoken language. Our solutions make text FAIR (Findable, Accessible, Interoperable and Reusable) in the online society. Lingsoft's core technologies and solutions have been tested by tens of millions of users around the world as part of the Microsoft Office suite of proofing tools. Lingsoft is the coordinator of the “Microservices at Your Service” project.

Reykjavik University

Reykjavik University is a dynamic international university with 3800 registered students and 250 permanent faculty and staff. The university focuses on research, excellence in teaching, entrepreneurship, technology development, and co-operation with the business community. The Language and Voice Lab (LVL) was established in 2016 as part of the research center in Artificial Intelligence with the aim of carrying out research and development in speech and language processing. LVL is part of the Icelandic National Language Technology Programme, a consortium of universities, institutions, associations, and private companies that aims to ensure that Icelandic is supported in modern language technology applications.

University of Tartu

The University of Tartu is the leading centre of research and training in Estonia. It preserves the culture of the Estonian people and spearheads the country's reputation in research and provision of higher education. The University of Tartu is the leading partner of the Center of Estonian Language Resources (CELR) consortium; the other partners are Tallinn University of Technology, the Institute of the Estonian Language and the Estonian Literary Museum. The goal of CELR is to create and manage an infrastructure that makes Estonian digital language resources (dictionaries, corpora, various language databases) and language technology tools (software) available to everyone working with digital language materials. The main users of CELR are researchers from Estonian R&D institutions and Social Sciences and Humanities researchers all over the world, who reach it via the CLARIN ERIC network of similar centers in Europe.
 

The contents of this publication are the sole responsibility of the Microservices project and do not necessarily reflect the opinion of the European Union.