Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
Use the categories below to limit the search results to those matching the selected value(s).
These levels provide an indication of the degree to which resources and tools are publicly accessible. Please check the specific conditions on any resource or tool that you end up using.
The TAXI dialog database was created in June 2001 in collaboration with the DFKI, Saarbruecken. TAXI contains 86 recorded dialogues between a cab dispatcher and a client recorded over public phone lines (network and GSM). The dispatcher always speaks German, while the clients always speak English. Starting from versio…
The SC10 corpus contains read and non-prompted German and mother tongue speech of 70 different speakers from 17 mother tongues (L1) in a variety of speaking styles e.g. reading, retelling, free talk etc. Starting from version 1.5 (BAS CLARIN repository version 3), the corpus is distributed as an emuDB. BAS CLARIN repos…
The ZipTel telephone speech database contains recordings of people applying for a SpeechDat prompt sheet via telephone. For the SpeechDat data collection, calls for participation were published in "phone", the customer magazine of the mobile telephone provider "e-plus", and in numerous newspapers all over Germany. In t…
This corpus contains multimodal recordings of 73 actors who use the SmartKom system. SmartKom Mobil is a portable PDA equipped with a net link and additional intelligent communication devices. Naive users were asked to test a 'prototype' for a market study, not knowing that the system was in fact controlled by two huma…
The corpus contains read speech of 201 different speakers. Each speaker read a subcorpus of 450 different sentence equivalents (including alphanumericals and two shorter passages of prose text); 8 speakers read the whole sentence corpus; 40 speakers read the subcorpora BR and MR; 112 speakers read 70 utterances of the …
The Verbmobil (VM) dialog database is a collection of German, American and Japanese dialog recordings in the appointment scheduling task. The data were collected during the first phase (1993 - 1996) of the German VM project funded by the German Ministry of Science and Technology (BMBF). Starting with version 3, the cor…
The RVG-J Corpus (Regional Variants of German - Junior) was recorded in 2001 at the Institute of Phonetics and Speech Communication at the University of Munich, Germany. The corpus contains both read and non-scripted German utterances. It comprises the original RVG prompts (telephone numbers, sentences, commands, digit…
The CI_2 corpora contain synchronous speech recordings of 48 cochlear implant users (CI) and 48 speakers without hearing impairment (control group, KG). The data were analyzed in Veronika Neumeyer's dissertation "Akustische Analysen der Sprachproduktion von CI-Trägern" (2015). CI_2_VOT contains recordings used for the …
This corpus contains recordings of 162 speakers, each recorded both sober and intoxicated. Beginning with version 3, this corpus edition also contains an emuR-compatible database version of the corpus (with a minor bugfix in the database in version 3.1). Speech data collection of intoxicated speakers of German, aged 21-75.
The SMARTWEB UMTS data collection was created within the publicly funded German SmartWeb project in the years 2004 - 2006. It comprises a collection of user queries to a naturally spoken Web interface with the main focus on the soccer world series in 2006. The recordings include field recordings using a hand-held UMTS …