Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
The corpus contains read speech of 101 different speakers (50 female, 50 male, 1 unknown). Each speaker has read approx. 100 sentences from either the SZ subcorpus or the CeBit subcorpus. The language is German. The subcorpus SZ contains 544 sentences from newspaper articles ("Sueddeutsche Zeitung"). The subcorpus Ce…
The TAXI dialog database was created in June 2001 in collaboration with the DFKI, Saarbruecken. TAXI contains 86 recorded dialogues between a cab dispatcher and a client recorded over public phone lines (network and GSM). The dispatcher always speaks German, while the clients always speak English. Starting from versio…
The speech corpus aGender contains speech sample recordings over public telephone lines with read and (semi-)spontaneous speech. Native German speakers called a voice portal from their private phones, read text, and answered some open questions. The purpose of the corpus is the automatic detection of gender and/or age …
Hempels Sofa is a collection of more than 3900 spontaneous speech items recorded as extra material during the German SpeechDat-II project. Speakers were asked to report what they had been doing during the last hour: "Was haben Sie in der letzten Stunde gemacht?". This item was recorded as the last item of the recording…
NSC contains scripted, semi-spontaneous, and spontaneous human-human dialogs. In total, 300 speakers of German without noticeable accent participated and were recorded in an acoustically-isolated room. Interactions between speakers and their interlocutor are provided in separate mono files, accompanied by timestamps an…
The corpus is a collection of recordings of more than 500 speakers from different dialect regions of Germany. The recordings were made using four different microphones (two in low and two in high quality) and consist of single digits, connected digits, phone numbers, phonetically balanced sentences, computer command phrases prompted o…
The CI_2 corpora contain German speech recordings of 48 cochlear implant users (CI) and 48 speakers without hearing impairment (control group, KG). The data were analyzed in Veronika Neumeyer's dissertation "Akustische Analysen der Sprachproduktion von CI-Trägern" (2015). CI_2_Vowels contains recordings used for the an…
This corpus contains recordings of 162 speakers, both sober and intoxicated. Beginning with version 3, this corpus edition also contains an emuR-compatible database version of the corpus (with a minor bugfix in the database in version 3.1). Speech data collection of alcoholized speakers of German, age 21-75.
The Ph@ttSessionz speech database contains recordings of 1019 adolescent speakers of German (age range 12-20). The recordings were performed via the WWW in public schools (Gymnasium) in 45 locations in Germany. The speech material recorded is a superset of the German SpeechDat-II and RVG-I corpora. It is now also avail…
The CI_2 corpora contain synchronous speech recordings of 48 cochlear implant users (CI) and 48 speakers without hearing impairment (control group, KG). The data were analyzed in Veronika Neumeyer's dissertation "Akustische Analysen der Sprachproduktion von CI-Trägern" (2015). CI_2_Sibilants contains recordings used fo…