Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
Use the categories below to limit the search results to those matching the selected value(s).
These levels provide an indication of the degree to which resources and tools are publicly accessible. Please check the specific conditions on any resource or tool that you end up using.
The corpus contains issues of the Karjalan Sanomat newspaper published in 2012-2014. The corpus is available in Kielipankki - the Language Bank of Finland (http://urn.fi/urn:nbn:fi:lb-2016112501). In case you are not a member of an academic institution please read the access rights instructions at https://www.kielipa…
This version of The Magazine Corpus of the Institute for the Languages of Finland contains the data for which the OCR (optical character recognition) has been checked. The size of this sub-corpus is 670 000 tokens. It contains one 1935 issue from 'Historiallinen Aikakauskirja', 'Lakimies' and 'Suomi', as well as 4 iss…
The resource is available via Kielipankki – The Language Bank of Finland. This parallel dataset can be used for training simplification models and/or studying simplification strategies that experts apply for Finnish news articles. The languages of the dataset are Finnish and Easy-to-read Finnish. The articles of which…
The corpus is available in Kielipankki - the Language Bank of Finland, download: http://urn.fi/urn:nbn:fi:lb-2015040801 (see Suomi24-2015-10-29_VRT there). License details: http://urn.fi/urn:nbn:fi:lb-20150304151. The corpus contains all the texts available in the Suomi24 API from the discussion forums of the Suomi24 o…
The corpus consists of 953 articles (193,742 word tokens) with six named entity classes (organization, location, person, product, event, and date). The articles are extracted from the archives of Digitoday, a Finnish online technology news source. The data sets are available at https://github.com/mpsilfve/finer-data a…
This audiovisual dataset contains:
* audio files, subtitles and ground truth transcripts, speaker diarizations and NER annotations of 16 factual programs in Finnish and Swedish
* video files, subtitles, metadata and annotations for 8 factual programs that have been used for demonstration and test purposes in the MeMAD …
The corpus is available in Kielipankki - the Language Bank of Finland (ling.helsinki.fi), download location: http://urn.fi/urn:nbn:fi:lb-2015030301. This multimodal corpus, which consists of the tourist brochures produced by the city of Helsinki, Finland, is fully annotated using XML schema provided for the Genre an…
This version of The Magazine Corpus of the Institute for the Languages of Finland contains data for which the OCR (optical character recognition) has not been checked. It contains different volumes of four magazines: Suomen Kuvalehti's volumes: one issue from 1916 'sample', 1917, 1925, 1935, 1945, 1955, 1965, 1972 (app…
The corpus is available for download in Kielipankki - the Language Bank of Finland: http://urn.fi/urn:nbn:fi:lb-2017022403. You should be able to download it by logging in with your university credentials. If you cannot log in even though you are affiliated with a university, see the instructions at https://www.ki…
A description of the Iijoki series can be found at http://urn.fi/urn:nbn:fi:lb-2019041401. The 26 books of the series have been parsed in Kielipankki with two different parsers. This version has been parsed with the Turku Neural Parser Pipeline (TNPP). It is a neural network parser developed in the TurkuNLP project at the University of Turku; further details…