Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
Use the categories below to limit the search results to those matching the selected value(s).
These levels provide an indication of the degree to which resources and tools are publicly accessible. Please check the specific conditions on any resource or tool that you end up using.
The database is a collection of semantically related word pairs in German which was compiled via human judgement experiments hosted on Amazon Mechanical Turk. We address the three paradigmatic relations antonymy, hypernymy and synonymy. The database consists of three parts: A representative se…
Feature norms are short descriptions of typical attributes for a set of objects. They often describe the visual appearance (a firetruck is red), function or purpose (a cup holds liquid), location (mushrooms grow in forests), and relationships between objects (a cheetah is a cat). The underlying features are usually eli…
This corpus contains 3847 sentences, taken from 125 documents annotated for Sentiment Relevance. The data is a subset of the v2.0 movie polarity dataset (Pang & Lee, 2004).
The TreeTagger is a tool for annotating text with part-of-speech and lemma information. It was developed by Helmut Schmid in the TC project at the Institute for Computational Linguistics of the University of Stuttgart. The TreeTagger has been successfully used to tag German, English, French, Italian, Dutch, Spanish, Bul…
This dataset is the result of a corpus study of German verbs (anfangen (mit), aufhören (mit), beenden, beginnen (mit), geniessen), based on data obtained from the deWaC corpus. We built a dataset of logical metonymies, which were manually annotated and compared with the qualia structures of their objects, then we cont…
The TIGERSearch software lets you explore syntactically annotated text corpora. If you are a grammar engineer who is developing a grammar, you might use TIGERSearch to obtain sample sentences for the syntactic phenomena you are interested in. If you are a lexicographer or terminologist, you can employ TIGERSearch to f…
The SRCMF is a dependency treebank for Old French. It consists of syntactically annotated parts of two text corpora of Medieval French: Base de Français Médiéval (BFM), see Guillot et al. (2007) and http://bfm.ens-lyon.fr/; Nouveau Corpus d'Amsterdam (NCA), see Kunstmann and Stein (2007); Ste…
TüPP-D/Z is a collection of articles from the taz newspaper ("die tageszeitung") which have been automatically annotated with clause structure, topological fields, and chunks, in addition to more low-level annotation including parts of speech and morpholog…
A trained parameter file for the TreeTagger (part-of-speech tagger) to tag Middle High German text. DHd 2017; gzip file; trained on semi-automatically annotated data.
The IMS HOTCoref system is a data-driven coreference resolution system. It models coreference within a document as a directed rooted tree. For learning it adopts the idea of latent antecedents and exploits the tree structure for the purpose of non-local (with respect to a single pair of mentions) features.