MICROSERVICES AT YOUR SERVICE: BRIDGING THE GAP BETWEEN NLP RESEARCH AND INDUSTRY
Our mission is to simplify finding and using the many superb open source speech and language processing tools that the European Research Community has to offer.
The European Union’s Connecting Europe Facility has given us support to fill the ELRC-SHARE and European Language Grid with such tools! We are taking yet another step towards fulfilling the vision of a European digital single market!
See also the project description at the INEA site.
OUR PLAN 2021-2023
- Reach out to the European research community to help us identify suitable open source tools (2021)
- Help researchers package the tools to facilitate re-use by other developers and researchers (2021-2022)
- Make the tools available on the European Union’s own platforms for language technology, ELRC-SHARE and the European Language Grid (2022-2023)
Do you know of open source tools that could be of interest to us? Perhaps tools you or your group have developed?
In March 2022 we are organizing workshops on how to add a server API to your tool and package it as an easily distributable Docker image. Come join us!
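To give a rough idea of what "adding a server API" can mean, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the workshop material and not the ELG service specification: the `tokenize` function is a hypothetical stand-in for a real tool, and the JSON fields `text` and `tokens`, the port, and the helper names are likewise assumptions made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def tokenize(text: str) -> list:
    """Hypothetical stand-in for the real tool: split on whitespace."""
    return text.split()


def handle_request(body: bytes) -> tuple:
    """Turn a raw JSON request body into an (HTTP status, JSON payload) pair."""
    try:
        request = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return 400, {"error": "request body must be valid JSON"}
    if not isinstance(request, dict) or not isinstance(request.get("text"), str):
        return 400, {"error": "expected a JSON object with a string field 'text'"}
    return 200, {"tokens": tokenize(request["text"])}


class ToolHandler(BaseHTTPRequestHandler):
    """Answers POST requests by passing the body to handle_request."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_request(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port: int = 8000) -> HTTPServer:
    """Create the server; call .serve_forever() on the result to start it."""
    return HTTPServer(("0.0.0.0", port), ToolHandler)
```

Calling `make_server().serve_forever()` would start the service. A Docker image for such a tool would typically contain this file together with the tool's dependencies and use that call as its start command.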
Would you like to be kept up-to-date with the progress of our project? Join our concluding seminar in February 2023 to see which tools we found and made available!
Lingsoft's Sebastian Andersson presented the Microservices project at the 6th ELRC Conference on March 31, 2022. Watch his presentation here!
Workshops
We organize workshops in which we present our work, the tools, and ways to contribute to the ELG community.
Our newest workshop, "ELG, a bridge for NLP development", will be held in March. Sign-up details and materials from past workshops are available on our workshop site.
Project results
Our project contributes easy-to-use Docker containers and services to the ELG platform. This list is updated as new tools become available.
Partner | Tool name | Language | Docker image | Original creator | ELG catalogue |
Lingsoft | HeLI OTS | Multilingual | Docker image | University of Helsinki | |
Lingsoft | Finto AI | Finnish | Docker image | National Library of Finland | ELG catalogue |
Lingsoft | Finto AI | Swedish | Docker image | National Library of Finland | ELG catalogue |
Lingsoft | Finto AI | English | Docker image | National Library of Finland | ELG catalogue |
Lingsoft | KB BERT NER SV | Swedish | Docker image | National Library of Sweden (KBLab) | ELG catalogue |
Lingsoft | FKB BERT Senti SV | Swedish | Docker image | Martin Malmsten (National Library of Sweden / KBLab) | ELG catalogue |
Lingsoft | KB BERT NER NO | Norwegian | Docker image | National Library of Norway (NbAiLab) | ELG catalogue |
Tartu | EstNLTK tokenizer | Estonian | Docker image | University of Tartu | ELG catalogue |
Tartu | Vabamorf morf | Estonian | Docker image | Filosoft | ELG catalogue |
Tartu | Vabamorf disambiguator | Estonian | Docker image | Filosoft | ELG catalogue |
Tartu | Est TTS preprocessor | Estonian | Docker image | University of Tartu | ELG catalogue |
Reykjavík University | Tokenizer | Icelandic | Docker image | Reykjavík University | ELG catalogue |
Reykjavík University | Icenip | Icelandic | Docker image | Reykjavík University | ELG catalogue |
Reykjavík University | Iceparser | Icelandic | Docker image | Reykjavík University | ELG catalogue |
Reykjavík University | NER | Icelandic | Docker image | Reykjavík University | |
Reykjavík University | POS | Icelandic | Docker image | Reykjavík University | |
Reykjavík University | ABLTagger | Faroese | Docker image | University of Iceland | ELG catalogue |
Reykjavík University | Icesum | Icelandic | Docker image | Reykjavík University | |
Gradiant | TranslateAlignRetrieve - Spanish QA | Spanish | Docker image | TALP - Center for Language and Speech Technologies and Applications | |
Gradiant | TWilBert | Spanish | Docker image | ELiRF - Enginyeria del Llenguatge Natural i Reconeiximent de Formes | |
Gradiant | BETO: Spanish BERT | Spanish | Docker image | ReLeLa - Departamento de Ciencias de la Computación Universidad de Chile | |
Gradiant | LM-SPANISH | Spanish | Docker image | BSC - Barcelona Supercomputing Center - Text Mining Unit | |
Gradiant | Emoevales-iberlef2021 | Spanish | Docker image | GSI - Grupo de Sistemas Inteligentes (UPM) | |
Gradiant | QAPTNET | Portuguese | Docker image | Independent Development | |
Gradiant | QueLingua | Multilingual | Docker image | CiTIUS - Centro Singular de Investigación en Tecnoloxías Intelixentes | |
Gradiant | BERTimbau | Portuguese | Docker image | NeuralMind Inteligencia Artificial | |
Gradiant | Bertinho | Galician | Docker image | LyS-CITIC - Lengua Y Sociedad de la Información | |
Gradiant | Nlpnet | Portuguese | Docker image | NILC - Interinstitutional Center for Computational Linguistics (ICMC - University of São Paulo) | |
Gradiant | Julibert | Catalan | Docker image | SOFTCATALA | |
INTRODUCING OUR CONSORTIUM
Gradiant
Gradiant is a Spanish ICT technology centre that aims to improve the competitiveness of companies by transferring knowledge and technologies in the fields of connectivity, intelligence and security. With more than 100 professionals and 285 R&D&i projects, it is becoming one of the main engines of innovation in Galicia.
Gradiant is backed by a board that includes representatives of the three Galician universities (Vigo, Santiago and A Coruña) and seven companies from the telecommunications industry: Altia, Arteixo Telecom, Egatel, Indra, Plexus, R, Telefónica, Televés; as well as the INEO business association.
Gradiant is positioned as a technology partner for industry, oriented to its needs in the ICT field. It contributes national and international experience in technologies for security and privacy; processing of multimedia signals; Internet of Things; Natural Language Processing, biometrics and data analytics; and advanced communications systems.
Lingsoft
Lingsoft Oy and its sister company Lingsoft Language Services Oy are part of the Lingsoft Group, whose consolidated turnover of about 12.5 million euros in 2019 makes us one of the 100 largest language service providers in the world. Founded in 1986, Lingsoft is a reliable, experienced and innovative partner. Lingsoft offers a wide variety of language technology solutions and language services designed for the analysis, processing and utilization of written and spoken language. Our solutions make text FAIR (Findable, Accessible, Interoperable and Reusable) in online society. Lingsoft's core technologies and solutions have been tested by tens of millions of users around the world as part of the Microsoft Office suite of proofing tools. Lingsoft is the coordinator of the “Microservices at Your Service” project.
Reykjavik University
Reykjavik University is a dynamic international university with 3800 registered students and 250 permanent faculty and staff. The university focuses on research, excellence in teaching, entrepreneurship, technology development, and co-operation with the business community. The Language and Voice Lab (LVL) was established in 2016 as a part of the research center in Artificial Intelligence with the aim of carrying out research and development in speech and language processing. LVL is part of the Icelandic National Language Technology Programme, a consortium of universities, institutions, associations, and private companies that aims to ensure that Icelandic is supported in modern language technology applications.
University of Tartu
The University of Tartu is the leading centre of research and training in Estonia. It preserves the culture of the Estonian people and spearheads the country's reputation in research and provision of higher education. The University of Tartu is the leading partner of the Center of Estonian Language Resources (CELR) consortium; the other partners are Tallinn University of Technology, the Institute of the Estonian Language and the Estonian Literary Museum. The goal of CELR is to create and manage an infrastructure that makes Estonian-language digital resources (dictionaries, corpora, various language databases) and language technology tools (software) available to everyone working with digital language materials. The main users of CELR are researchers from Estonian R&D institutions and Social Sciences and Humanities researchers all over the world, who reach it via the CLARIN ERIC network of similar centers in Europe.
The contents of this publication are the sole responsibility of the Microservices project and do not necessarily reflect the opinion of the European Union.