Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
Use the categories below to limit the search results to those matching the selected value(s).
These levels provide an indication of the degree to which resources and tools are publicly accessible. Please check the specific conditions on any resource or tool that you end up using.
The Norwegian Spanish Parallel Corpus (NSPC) was created at the University of Bergen, Norway. The corpus is primarily constructed for research in Translation Studies, and is built to be roughly comparable to the Spanish-English P-ACTRES corpus. The NSPC is a parallel, unidirectional translation corpus of contemporary N…
Recording equipment: The recordings were made with a digital recorder (Fostex FR-2LE) and two AKG C451 B microphones placed on the table in front of the speakers. The recording took place in the home of one of the participants, speaker 04. The speakers: The set consists of four speakers, two women, born in 1929 a…
Recording equipment: The recordings were made with a cassette recorder (Sony TC-D5M) and Sony lavaliere microphones. The recordings took place in the speakers’ homes or in a hotel room. Sigurd Nordlie was recorded in his office at the University of Oslo. The tapes were digitized in the 1990s. The speakers: …
The experiment was conducted in a quiet experimental room with an SR Research EyeLink 1000 eye tracker (desktop mount) with a 35 mm lens, 13-point calibration, and 1k sample rate and pacing interval. A game pad and keyboard were used to navigate in the experiment. Participants viewed the stimuli on a 21-inch monitor 70 cm a…
Word and tag embeddings trained on TüDP-D/W and TüPP-D/Z using Wang2Vec.
This content is available in Kielipankki. This collection contains two sets of Suomi24 data: "The Suomi24 Sentences Corpus 2001-2017, Korp version" and "The Suomi24 Sentences Corpus 2018-2020, Korp version". Together, the two corpora cover all the discussion forums of the Suomi24 online social networking website fro…
The resource, containing entire newspaper and magazine articles, has been made available for download in Kielipankki - the Language Bank of Finland at http://urn.fi/urn:nbn:fi:lb-201712201. The data consists of source data in PDF form or as plain text and is not annotated. An annotated version (lehdet90ff-vrt-v2) is av…
The corpus is available via Korp in Kielipankki - the Language Bank of Finland (korp.csc.fi). This most recent version of the Corpus of Contemporary American English (COCA), released in March 2020, contains about 1 billion words in 485,000 texts from the years 1990-2019. The corpus is evenly divided into spoken, fiction, …
This version of the Magazine Corpus of the Institute for the Languages of Finland contains data in which the OCR (optical character recognition) has not been checked. It contains different volumes of four magazines: Suomen Kuvalehti's volumes: one issue from 1916 'sample', 1917, 1925, 1935, 1945, 1955, 1965, 1972 (app…
This corpus includes files for evaluating language identification efficacy on the suomi24-2018-2020 (http://urn.fi/urn:nbn:fi:lb-2021101521) and the new part of the klk-v2 (http://urn.fi/urn:nbn:fi:lb-202009152) corpora. The lines are random "sentences" from the new material processed by the Language Bank of Finland d…