Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
Use the categories below to limit the search results to those matching the selected value(s).
These levels provide an indication of the degree to which resources and tools are publicly accessible. Please check the specific conditions on any resource or tool that you end up using.
The treebank "Annotations of fiction text from 'Nynorskkorpuset ved Norsk Ordbok 2014'" is a syntactically annotated corpus which uses text extracts from Nynorskkorpuset ved Norsk Ordbok 2014 (no2014.uio.no). This treebank is part of the INESS NorGramBank collection (see URL in metadata). Text Preprocessing: When a corpus…
This collection consists of scripted recordings from different rural dialects spoken in Norway and Sweden, in total 33 recordings of 46 different speakers. The speakers’ year of birth ranges from 1909 to 1973. The sets of target words were designed to capture the quantity system and tonal system of the different dialec…
Recording equipment: The recordings were made with a digital recorder (Fostex FR-2LE) and two AKG C451 B microphones placed on the table in front of the speakers. The recording took place in the home of one of the participants, speaker 04. The speakers: The set consists of four speakers, two women, born in 1929 a…
The resource NOJU is a terminological database containing terms, definitions and other conceptual information in Norwegian and German within legal domains.
NOT-basen is a TBX export of a terminology database developed by Norsk termbank. This termbase is to be considered a historical resource, and has not been updated for a while.
Recording equipment: The recordings were made with a cassette recorder (Sony TC-D5M) and Sony lavalier microphones. The recordings took place in the speakers’ homes or in a hotel room. Sigurd Nordlie was recorded in his office at the University of Oslo. The tapes were digitized in the 1990s. The speakers …
The Norwegian Spanish Parallel Corpus (NSPC) was created at the University of Bergen, Norway. The corpus is primarily constructed for research in Translation Studies, and is built to be roughly comparable to the Spanish-English P-ACTRES corpus. The NSPC is a parallel, unidirectional translation corpus of contemporary N…
This is the dataset corresponding to the GermEval 2014 NER Shared Task. The data is sampled from German Wikipedia articles and online news. It is annotated following the NoSta-D guidelines, which are included in the dataset. The guidelines suggest four NER categories (PER, LOC, ORG, MISC/OTH) and are extended to account…
LIA Norwegian is a speech corpus with old recordings (1939–1996) from four Norwegian universities: NTNU, UoB, UoO and UoT. The recordings were mainly made for dialect and onomastic research, and the topics of the interviews and conversations typically concern old trades such as agriculture, fisheries, logging and lif…
NoTa-Oslo is a speech corpus with interviews and conversations from 166 informants born and raised in Oslo and the Oslo area. The informants are carefully selected with respect to sociolinguistic variables and are therefore representative in terms of age, gender, place of residence and education. NoTa-Oslo consists of approx. 957 0…