Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
Use the categories below to limit the search results to those matching the selected value(s).
These levels provide an indication of the degree to which resources and tools are publicly accessible. Please check the specific conditions on any resource or tool that you end up using.
This dataset contains semi-automatically cleaned, parallel professional subtitles from 44 programs, containing 10.3k aligned sentence pairs for these language pairs: FIN-SWE, FIN-ENG, SWE-ENG. This dataset does not contain video or audio, but the total content length covered by the subtitles is 22.46 hours. --- Yle h…
The corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi), http://urn.fi/urn:nbn:fi:lb-2014052711. A 34-volume collection of Finnic oral poetry, lyric, short rhymes, incantations etc., collected and recorded from the 16th century to the 1930s and published mostly between 1908 and 1948, with a…
The corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi). The Corpus of Historical American English (COHA) contains about 385 million words and 115 000 texts from the years 1810-2009. Each decade has roughly the same balance of fiction, popular magazine, newspaper, and non-fiction books. Ac…
This resource is available for download in Kielipankki – the Language Bank of Finland. The FinChat corpus consists of 85 Finnish chat dialogs collected in 2019-2020. The participants (N=62) were native speakers of Finnish in three age-based user groups: high school students (16-19 years), university students (20-25 ye…
The Acquis Communautaire (AC) is the total body of European Union (EU) law applicable in the EU Member States. This collection of legislative text changes continuously and currently comprises selected texts written between the 1950s and now. As of the beginning of the year 2007, the EU had 27 Member States and 23 o…
The corpus, which is the downloadable version of the years 2013-2015 of the Aalto University DSP Course Conversation Corpus 2013- (http://urn.fi/urn:nbn:fi:lb-2015101901), is available in Kielipankki - the Language Bank of Finland at https://korp.csc.fi/download/DSPCON. This version contains transcribed utterances fro…
The corpus, containing the articles from Svenska YLE (https://svenska.yle.fi) from 2012 up to and including 2018, is available at Korp. The licence is available at http://urn.fi/urn:nbn:fi:lb-2019120401
The resource is available via Kielipankki – The Language Bank of Finland. This parallel dataset can be used for training simplification models and/or studying the simplification strategies that experts apply to Finnish news articles. The languages of the dataset are Finnish and Easy-to-read Finnish. The articles of which…
The corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi). The Corpus of Global Web-Based English (GloWbE) contains about 1.8 billion words and 1 800 000 texts from web pages in the United States, Great Britain, Australia, India, and 16 other countries. About 60% of the texts come from blogs. A…