The acronym OVIS stems from “Openbaar Vervoer InformatieSysteem” (Dutch for: Public Transportation Information System). This was a speech-input and speech-output system for providing information about train travel over the telephone.
The system was used experimentally for a short period, during which the OVIS corpus was collected at the University of Groningen; see www.let.rug.nl/vannoord/Ovis
The DIAMOND dialogues form a small corpus of problem-solving dialogues in which a user, speaking through a high-quality microphone, interacts with a helpdesk to resolve problems in operating a fax machine that is new to the user.
The dialogues were transcribed and annotated by groups of students using the DIT annotation scheme. Inter-annotator agreement was calculated and is reported in Geertzen, Petukhova & Bunt (2008).
The dialogues were collected by Jeroen Geertzen, Roser Morante, Hans van Dam, Yann Girard, Ielka van der Sluis, Barbara Suijkerbuijk, Rintse van der Werf and Harry Bunt; see Geertzen et al. (2004).
Before inclusion in the DialogBank the dialogues were re-annotated according to the ISO 24617-2 standard; the annotations are represented in the DiAML-MultiTab format. The functional segments in the MultiTab representation refer to the tokenisation of the transcription, which is also made available.
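The relation between functional segments and the tokenisation can be pictured with a small sketch. It is an illustration only, not the DiAML-MultiTab file format: the class name FunctionalSegment, the helper segment_text and the example utterance are all hypothetical. It assumes that a segment is stored as a list of token positions, which allows a segment to be discontinuous.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FunctionalSegment:
    """Hypothetical segment record: an id plus positions in the token list."""
    seg_id: str
    token_ids: Tuple[int, ...]  # positions may be discontinuous


def segment_text(tokens: List[str], segment: FunctionalSegment) -> str:
    """Recover the words of a segment from the shared tokenisation."""
    return " ".join(tokens[i] for i in segment.token_ids)


# Invented utterance, tokenised on whitespace (not taken from the corpus).
tokens = "uh how do I uh send a fax".split()

# fs1 skips the filled pause at position 4, so it is discontinuous.
fs1 = FunctionalSegment("fs1", (1, 2, 3, 5, 6, 7))
fs2 = FunctionalSegment("fs2", (0,))

print(segment_text(tokens, fs1))
print(segment_text(tokens, fs2))
```

Storing positions rather than strings keeps every segment tied to the same tokenisation, so that several annotations of one utterance can be compared token by token.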
Below is a short annotated fragment of dialogue ‘TRAINS 2’ represented in the DiAML-XML, DiAML-MultiTab and DiAML-TabSW formats. The dialogue was collected and annotated in the TRAINS project, and re-segmented and re-annotated according to ISO 24617-2 for inclusion in the DialogBank.
The following dialogue fragment is covered:
S: Hello, can I help you?
U: Yes, I have a problem I need to transport two tankers of OJ to Avon and three boxcars to Elmire, the Bananas must arrive in Elmire by nine p.m.
S: Okay
The DBOX dialogue corpus was collected and annotated at the University of Saarland, in Saarbrücken, in the context of Eureka project 7152 “D-Box, A generic dialog box for multilingual conversational applications”. This project’s main goal is to develop and test an innovative architecture for conversational agents whose purpose is to support multilingual collaboration. The project develops interactive games based on spoken natural language human-computer dialogues, in three European languages: English, French and German. The first D-Box game scenario is a quiz game, in which a player may ask any type of question, such as “What are you famous for?”, in order to guess the name of a famous person. For this game situation, dialogues have been collected in a Wizard-of-Oz setup with a human Wizard who simulates the system’s behaviour by acting according to a pre-defined script. For further details see Petukhova et al. (2014).
Dialogue material
Five annotated DBOX dialogues are included in the DialogBank. For more information see the website of the D-Box project: www.idiap.ch/project/d-box/
The annotations follow the ISO 24617-2 standard and make use of the option that this standard offers to add extra dimensions (with dimension-specific communicative functions). In particular, the dimension Contact Management has been taken over from the DIT++ annotation scheme, and an additional dimension called “Task Management” (also present in DAMSL) has been added for the annotation of utterances that discuss the rules of the quiz game. Moreover, three communicative functions are used that are not part of the ISO standard but are defined in DIT++, namely Dialogue Act Announcement (announcing the next dialogue act), Threat, and Pre-Closing (indicating the imminent closing of the dialogue). Finally, the extra communicative function Congratulation has been added to account for dialogue acts in which a player is congratulated for having correctly guessed the identity of the famous person and thus having won the game.
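The extensions described above can be summarised in a small sketch that reports which parts of an annotation go beyond the ISO 24617-2 core. It is an illustration under stated assumptions, not part of the DBOX tooling: the class DialogueAct, the function extensions_used and the camel-case spellings of the labels are hypothetical, and the dimension assigned to the Congratulation example is chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List

# The nine dimensions of ISO 24617-2 (label spelling is an assumption).
ISO_DIMENSIONS = {
    "task", "autoFeedback", "alloFeedback", "turnManagement",
    "timeManagement", "discourseStructuring", "ownCommunicationManagement",
    "partnerCommunicationManagement", "socialObligationsManagement",
}
# Dimensions and communicative functions added for the DBOX annotations.
DBOX_EXTRA_DIMENSIONS = {"contactManagement", "taskManagement"}
DBOX_EXTRA_FUNCTIONS = {
    "dialogueActAnnouncement", "threat", "preClosing", "congratulation",
}


@dataclass
class DialogueAct:
    """Hypothetical, simplified dialogue act record."""
    sender: str
    dimension: str
    communicative_function: str


def extensions_used(act: DialogueAct) -> List[str]:
    """List the non-ISO dimension and/or function used by an act."""
    if act.dimension not in ISO_DIMENSIONS | DBOX_EXTRA_DIMENSIONS:
        raise ValueError("unknown dimension: " + act.dimension)
    used = []
    if act.dimension in DBOX_EXTRA_DIMENSIONS:
        used.append("dimension:" + act.dimension)
    if act.communicative_function in DBOX_EXTRA_FUNCTIONS:
        used.append("function:" + act.communicative_function)
    return used


question = DialogueAct("player", "task", "propositionalQuestion")
rules = DialogueAct("wizard", "taskManagement", "inform")
# Dimension below is illustrative; the source does not state it.
congrats = DialogueAct("wizard", "socialObligationsManagement", "congratulation")

for act in (question, rules, congrats):
    print(act.communicative_function, extensions_used(act))
```

An act that uses only core labels yields an empty list, which makes it easy to see how much of a corpus relies on the extensions.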
The series of Joint ACL-ISO workshops on Interoperable Semantic Annotation (ISA)
ISA-18, the 18th Joint ACL-ISO Workshop on Interoperable Semantic Annotation, Marseille, France, June 20, 2022, at the LREC 2022 conference
ISA-17, the 17th Joint ACL-ISO Workshop on Interoperable Semantic Annotation, at IWCS 2021, the 14th International Conference on Computational Semantics (Groningen/online).
The DIT++ annotation scheme is the result of two converging lines of research:
the development of a semantic theory of dialogue acts, called Dynamic Interpretation Theory (DIT);
the study of alternative systems of dialogue acts and dialogue annotation schemes, with the aim of defining a comprehensive taxonomy of dialogue acts, useful both for the design of natural-language based dialogue systems, and for the analysis and annotation of spoken and multimodal human dialogue.
Work in the former line resulted in the definition of a multidimensional taxonomy of dialogue acts for which a dynamic update semantics was defined (see Bunt 1989; 1995; 2000; 2013; 2014). Work in the latter line resulted in the definition of the DIT++ taxonomy and annotation scheme (Bunt 2009), which incorporated ideas from a variety of annotation schemes, notably DAMSL, SWBD-DAMSL, HCRC Map Task, Gothenburg IM, TRAINS, Verbmobil, and AMI. The DIT++ scheme Release 5.0 served as the basis for defining the ISO 24617-2 standard, and conversely benefited from the establishment of the latter.
A preliminary version of the DIT++ taxonomy, with the update semantics of its dialogue acts, has been applied in the PARADIME multimodal dialogue system (Keizer & Bunt, 2006; 2007), and the taxonomy is currently being applied in the multimodal Metalogue system.
The DIT++ annotation scheme was tested for its usability in the European project LIRICS and in PhD studies involving the manual annotation of dialogues in several European languages (see e.g. Geertzen, 2005; 2006; Petukhova, 2009; 2011). Petukhova & Bunt (2010) showed that the scheme can be applied in the automatic annotation of raw speech in human dialogue with very high accuracy.
A new version of the DIT++ scheme with some improvements and extensions was released in April 2019 (Release 5.2) and is the basis for a proposed second edition of ISO 24617-2 (November 2019), which is currently under review.
For full documentation and explanation of the communicative functions, dimensions, qualifiers, and relations among dialogue acts see the DIT++ home page.