Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
This corpus contains the audio recordings of all actors who use the SmartKom system; it covers the audio recordings (no video) and annotations of all three original SmartKom corpora Public, Mobile and Home. Naive users were asked to test a 'prototype' for a market study not knowing that the system was in fact controlle…
The MOCHA database was compiled as part of the Engineering and Physical Sciences Research Council grant number GR/L78680, "Speech recognition using articulatory data." It features a set of 460 short sentences designed to include the main connected speech processes in English (e.g. assimilations, weak forms ...). All recordings were made in the same sound-damped studio at the Edinburgh Speech Production Facility, based in the Department of Speech and Language Sciences, Queen Margaret University College, UK. The database contains audio files, laryngograph waveforms, electromagnetic articulograph (EMA) tracks and electropalatograph (EPG) tracks. It is distributed as an emuR-compatible EMU database. Conversion into this format was done at the Bavarian Archive for Speech Signals. The original database is available here: http://www.cstr.ed.ac.uk/research/projects/artic/mocha.html
The SMARTWEB UMTS data collection was created within the publicly funded German SmartWeb project in the years 2004 - 2006. It comprises a collection of user queries to a naturally spoken Web interface with the main focus on the soccer world series in 2006. The recordings include 156 field recordings using a hand-held U…
This corpus contains multimodal recordings of 86 actors who use the SmartKom system. SmartKom Public is comparable to a traditional public phone booth but equipped with additional intelligent communication devices. Naive users were asked to test a 'prototype' for a market study, not knowing that the system was in fact …
The CI_2 corpora contain German speech recordings of 48 cochlear implant users (CI) and 48 speakers without hearing impairment (control group, KG). The data were analyzed in Veronika Neumeyer's dissertation "Akustische Analysen der Sprachproduktion von CI-Trägern" (2015). CI_2_Cluster contains recordings used for the a…
Verbmobil 2 contains the speech of 401 speakers participating in 810 recordings. The emotionally tagged recordings are not part of this edition but are collected in the corpus 'BAS VMEmo'. The total VM2 corpus amounts to 17.6 GB of data containing 58961 conversational turns distributed over 39 CD-Rs. VM2 contains dialogs in G…
The SMARTWEB UMTS data collection was created within the publicly funded German SmartWeb project in the years 2004 - 2006. It comprises a collection of user queries to a naturally spoken Web interface with the main focus on the soccer world series in 2006. The recordings include field recordings using a hand-held UMTS …
This corpus contains multimodal recordings of 65 actors who use the SmartKom system. SmartKom Home is intended as an intelligent communication assistant for the private environment. Naive users were asked to test a 'prototype' for a market study, not knowing that the system was in fact controlled by two human operators. The…
This corpus contains tasks in which one subject (the instructor) describes different Tangram figures to another subject (the receiver) so that the receiver can recreate the same order of figures that the instructor has in front of them. The subjects initially don't know each other and work together to solve these tasks i…