Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
Use the categories below to limit the search results to those matching the selected value(s).
These levels provide an indication of the degree to which resources and tools are publicly accessible. Please check the specific conditions on any resource or tool that you end up using.
This resource is available for download in Kielipankki – the Language Bank of Finland. This is a parallel corpus created from the Yle news articles from 2014-2020 by aligning the standard Finnish versions with the easy-language versions. The dataset, created by Anna Dmitrieva and available in CSV format, is aligned on t…
The corpus is available for download in Kielipankki - the Language Bank of Finland. The data is annotated and identical to the data used as the basis for lehdet90ff-v2. A short documentation of the VRT file format can be found via the Documentation section. Reference instructions: See Attribution Details under Documentati…
The corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi). The Corpus of Contemporary American English (COCA) contains about 440 million words and 190 000 texts from the years 1990-2012. The corpus is evenly divided into spoken, fiction, magazine, newspaper, and academic genres (~88 million words…
This corpus includes files for evaluating language identification efficacy on the suomi24-2018-2020 (http://urn.fi/urn:nbn:fi:lb-2021101521) and the new part of the klk-v2 (http://urn.fi/urn:nbn:fi:lb-202009152) corpora. The lines are random "sentences" from the new material processed by the language bank of Finland d…
This corpus consists of the matriculation exam answers from spring 2008 in the Swedish essay test. For the time being, the corpus can only be accessed by a project at the University of Helsinki, but when the preparation of the material has been completed, it will also be possible for other researchers to use the corp…
The corpus is available via Korp in Kielipankki - the Language Bank of Finland (korp.csc.fi). This most recent version of Corpus of Contemporary American English (COCA), released in March 2020, contains about 1 billion words in 485,000 texts from the years 1990-2019. The corpus is evenly divided into spoken, fiction, …
The corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi), http://urn.fi/urn:nbn:fi:lb-2015040103. The corpus contains text from discussions of the Ylilauta online discussion board from 2012 to 2014. Short fragments from the discussions, e.g. sentences or paragraphs, are publicly available in…
The corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi), http://urn.fi/urn:nbn:fi:lb-2014052711. A 34-volume collection of Finnic oral poetry, lyric, short rhymes, incantations etc., collected and recorded from the 16th century to the 1930s and published mostly between 1908 and 1948, with a…
This corpus is available in Kielipankki, Korp service. The corpus contains all the discussion forums of the Suomi24 online social networking website from 1st January 2018 to 31st December 2020 obtained via the Suomi24 API. Researchers can also download the entire corpus (for downloadable versions, see the resource g…
The corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi). This dataset consists of the Yle Selkokieliset uutiset in Finnish (Yle Easy-to-read Finnish News). The dataset was created from the contents of the Yle News Archive for the language code "fi" for each month from the year 2011 to the y…