IVR Technical Glossary

Below, Interactions has compiled a list of some of the more technical terms frequently used in discussions related to IVR – please feel free to review/pass around to others who might find this information useful. As always, you can contact us at contact -at- interactions -dot- net with any questions!

A-law - A-law is a lossy method of audio encoding that will compress 16-bit linear PCM audio samples into 8-bit samples thereby reducing bitrate by 50%. Like its encoding cousin, u-law, a-law assumes that the audio stream contains predominantly voice data (as opposed to, say, music data) which has a low dynamic range. It then uses a pseudo-logarithmic algorithm to compress the data, favoring detail for samples in the “middle” of the range and effectively glossing over samples at the extremes.
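As a rough illustration of the pseudo-logarithmic idea, the sketch below implements the continuous A-law companding curve on samples normalized to the range -1.0 to 1.0. The function name alaw_compress and the use of floating-point samples are assumptions made for illustration; real telephony codecs use a segmented 8-bit approximation of this curve rather than the exact formula.

```python
import math

A = 87.6  # compression parameter of the standard A-law curve


def alaw_compress(x):
    """Map a normalized sample in [-1.0, 1.0] onto the A-law curve."""
    magnitude = abs(x)
    if magnitude < 1.0 / A:
        # Linear segment for very quiet samples
        y = (A * magnitude) / (1.0 + math.log(A))
    else:
        # Logarithmic segment for louder samples
        y = (1.0 + math.log(A * magnitude)) / (1.0 + math.log(A))
    return math.copysign(y, x)


# A quiet sample is boosted proportionally far more than a loud one
print(alaw_compress(0.01), alaw_compress(0.5))
```

Because quiet samples are spread over a larger share of the output range, more of the 8 available bits are spent where speech carries most of its detail.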

Apache - The Apache web server group formed in 1995 to build an open-source web server. The name was a play on the phrase “A patchy web server,” referring to the initial state of the code base.

Application – Software created to help a user perform a given task.

Application server – An application server in the context of IVR is a server that generates VoiceXML to drive the progress of a callflow. Besides generating VoiceXML, it may access a database, call webservices, read and write from files, send emails, or perform other useful actions. When an IVR server (such as an on-site Plum Voice Platform or our hosting platform) receives or initiates a call, it requests VoiceXML from the application server via an HTTP request. Typically a start URL pointing to your application server is configured for each inbound DNIS.
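To make the idea of "generating VoiceXML" concrete, here is a minimal sketch of the kind of function an application server might run for each request. The function name build_vxml and the prompt text are hypothetical; a real application server would return this string as the body of its HTTP response, usually after consulting a database or other back-end system.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML namespace


def build_vxml(prompt_text):
    """Return a one-prompt VoiceXML document as a string."""
    ET.register_namespace("", VXML_NS)
    vxml = ET.Element("{%s}vxml" % VXML_NS, {"version": "2.0"})
    form = ET.SubElement(vxml, "{%s}form" % VXML_NS)
    block = ET.SubElement(form, "{%s}block" % VXML_NS)
    prompt = ET.SubElement(block, "{%s}prompt" % VXML_NS)
    prompt.text = prompt_text
    return ET.tostring(vxml, encoding="unicode")


print(build_vxml("Thank you for calling."))
```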

Array – A data type offered by most programming languages that provides indexed access into a collection of variables.

ASR - ASR stands for Automatic Speech Recognition. It represents a collection of technologies required for an IVR system to listen for specific words and phrases, detect the speech from within an audio sample, analyze the speech relative to what is being listened for, and provide the final text representing what the ASR system believes was uttered.

Assign – The assign element assigns a value to a predefined variable.

Attribute - Attributes are options that you can set on a VoiceXML tag. Each VoiceXML tag accepts its own specific set of attributes. Some attributes, such as maxage and maxstale, control caching for the tag; others set the scope or a specific mode for the tag.

Audio – The audio element retrieves and plays the specified audio file.

Audiofetchhint – This property tells the platform whether or not it can attempt to optimize dialog interpretation by pre-fetching audio.

Audiomaxage – This attribute tells the platform the maximum acceptable age, in seconds, of cached audio resources. The audiomaxstale attribute tells the platform the maximum acceptable staleness, in seconds, of expired cached audio resources.

Automated attendant application – An interactive voice response (IVR) application that replaces the role of a human attendant by answering incoming telephone calls with a recorded or synthesized greeting or message.

Barge-in – Barge-in is the feature of interactive voice response (IVR) applications that controls whether a caller can interrupt a prompt with input. When barge-in is enabled in an automated telephone system, users can move rapidly through prompts, interrupting the system to navigate quickly to the next prompt.

Bit – A bit is a single binary digit. Unlike a single decimal digit representing a value of 0 through 9, a bit holds a value of either 0 or 1. Similarly, while decimal numbers are, implicitly, base 10, binary numbers, which are composed of many bits, are base 2.
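The base-2 relationship can be shown with a short sketch that converts a string of bits into its decimal value. The helper name bits_to_int is an assumption for illustration; Python's built-in int(text, 2) does the same job.

```python
def bits_to_int(bits):
    """Convert a string of '0' and '1' characters to its integer value."""
    value = 0
    for bit in bits:
        # Each new bit doubles the running value, just as each new
        # decimal digit multiplies the running value by ten
        value = value * 2 + int(bit)
    return value


print(bits_to_int("1011"))  # 8 + 0 + 2 + 1 = 11
```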

Block – The block element is a form item.

Boolean Built-in Grammar – Boolean built-in grammar inputs include affirmative and negative phrases appropriate to the current language.

Break – The break tag in a VoiceXML application instructs the TTS engine to insert a pause into synthesized text, with the length of the pause being a user-specified time value determined either in milliseconds or as a predetermined size value.

Cache – A cache is a temporary storage area where frequently accessed data can be stored for quick access. The caching feature of the Plum IVR operates like a proxy cache. However, for on-site Plum IVR systems, this caching feature must be enabled or else nothing will be cached.

Callee type detection - Callee type detection is a feature for the IVR Outbound System that is used to figure out whether the receiving end that picks up an outbound call (the “callee”) is a human or an answering machine. For outbound calls that are not picked up, callee type detection marks whether the failed call is a result of going to a fax machine, getting a busy signal, getting a no answer, or going to an operator.

Calling Campaign – A Calling Campaign is used to contact a specific population over the phone by placing a series of outbound calls through an interactive voice response system (IVR).

Catch - The catch element associates a catch with a document, dialog, or form item.

Child tag – VoiceXML, which is an XML format, is structured as a tree of elements, or text enclosed in markup tags. Nesting tags creates a hierarchy of elements that is traversed by the Plum VoiceXML platform to direct your callflow. A child tag is a tag that is enclosed by another tag. VoiceXML has limitations on the valid types of child tags for each VoiceXML tag.

Choice - The choice element serves several purposes: It specifies a speech and/or dtmf grammar fragment that determines when that choice has been selected, the contents are used to form the prompt string, and it specifies the URI to go to when the choice is selected.

Clear - In VoiceXML applications, the clear element is used to reset one or more form items.

Completetimeout - The completetimeout property determines the length of silence required following user speech before the recognizer finalizes a result.

Confidencelevel – Confidencelevel is a VXML property that specifies the speech recognition confidence level.

Data - The data element allows a VoiceXML application to fetch arbitrary XML data from a document server without transitioning to a new VoiceXML document.

Database – An organized collection of data that allows for fast storage, retrieval, and deletion. Databases are also known as data banks. Applications use query languages such as SQL to read and modify the contents of a database.

Datamaxage – The datamaxage property tells the platform the maximum acceptable age, in seconds, of cached data documents.

Datamaxstale - The datamaxstale property tells the platform the maximum acceptable staleness, in seconds, of expired cached data documents.

DNIS - DNIS stands for Dialed Number Identification Service. It is the number that the caller dialed. The DNIS, along with the ANI (Automatic Number Identification, the caller’s own number), is delivered to the IVR by the telco carrier for every phone call.
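Since a start URL is typically configured per inbound DNIS, routing can be pictured as a simple table lookup. The sketch below is only an illustration: the phone numbers, URLs, and the function name start_url_for are all hypothetical placeholders, not settings of any real platform.

```python
# Hypothetical DNIS-to-start-URL table; numbers and URLs are placeholders
START_URLS = {
    "8005550100": "http://example.com/billing/start.vxml",
    "8005550101": "http://example.com/support/start.vxml",
}
DEFAULT_START_URL = "http://example.com/main/start.vxml"


def start_url_for(dnis):
    """Return the start URL configured for the dialed number."""
    # Keep only digits so "800-555-0100" and "8005550100" match
    digits = "".join(ch for ch in dnis if ch.isdigit())
    return START_URLS.get(digits, DEFAULT_START_URL)


print(start_url_for("800-555-0100"))
```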

Document Object Model - An interface for accessing and manipulating the elements of an XML, HTML, or XHTML document, standardized by the World Wide Web Consortium and typically offered as an API within the JavaScript engine of a web browser. A read-only subset of this interface is available for processing fetched XML within the Plum Voice Platform.
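For readers who have not used a DOM before, the following sketch shows read-only access to a small XML document using Python's standard xml.dom.minidom module. The XML content and element names are made up for illustration, and the Plum platform exposes its DOM through ECMAScript rather than Python, so treat this only as a picture of the general interface.

```python
from xml.dom.minidom import parseString

# Made-up XML, standing in for data fetched from a server
xml_text = (
    "<accounts>"
    "<account id='1001'><balance>25.50</balance></account>"
    "</accounts>"
)

doc = parseString(xml_text)

# Walk the tree: find the element, read an attribute, read child text
account = doc.getElementsByTagName("account")[0]
account_id = account.getAttribute("id")
balance = account.getElementsByTagName("balance")[0].firstChild.data

print(account_id, balance)
```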

Documentfetchhint - The documentfetchhint property tells the platform whether or not documents may be pre-fetched.

Documentmaxage - The documentmaxage property tells the platform the maximum acceptable age, in seconds, of cached documents.

DOM – An abbreviation for Document Object Model.

DTMF - Dual-Tone Multi-Frequency is a standard that defines the tones generated when pressing keys on your telephone keypad.
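Each key is signaled by one "row" tone and one "column" tone played together. The sketch below uses the standard DTMF frequency grid; the function names, the 8000 Hz sample rate, and the amplitude scaling are assumptions chosen for illustration.

```python
import math

ROW_FREQS = (697, 770, 852, 941)       # Hz, one per keypad row
COL_FREQS = (1209, 1336, 1477, 1633)   # Hz, one per keypad column
KEYPAD = ("123A", "456B", "789C", "*0#D")
SAMPLE_RATE = 8000  # samples per second, typical for telephony


def dtmf_frequencies(key):
    """Return the (row, column) frequency pair for a keypad key."""
    if len(key) != 1:
        raise ValueError("expected a single key")
    for row_index, row in enumerate(KEYPAD):
        col_index = row.find(key)
        if col_index != -1:
            return ROW_FREQS[row_index], COL_FREQS[col_index]
    raise ValueError("not a DTMF key: %r" % key)


def dtmf_tone(key, duration_ms):
    """Return the tone for a key as a list of samples in [-1.0, 1.0]."""
    low, high = dtmf_frequencies(key)
    count = SAMPLE_RATE * duration_ms // 1000
    return [
        0.5 * (math.sin(2 * math.pi * low * n / SAMPLE_RATE)
               + math.sin(2 * math.pi * high * n / SAMPLE_RATE))
        for n in range(count)
    ]


print(dtmf_frequencies("5"))
```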

DTMF Grammar - VoiceXML can be programmed to detect caller input through the use of Dual-Tone Multi-Frequency (DTMF) grammars. DTMF is the system used by the public switched telephone network (PSTN) to recognize tones produced by a telephone keypad.

ECMAScript - ECMAScript is a scripting language that is supported by the Plum VoiceXML IVR Platform. ECMAScript, strictly speaking, refers to the standard developed by ECMA International in ECMA-262, but it is more commonly known by its most popular dialect: JavaScript.

Else - The else element is used in conjunction with the if tag.

Elseif - The elseif element is used in conjunction with the if tag to specify content that is executed when its own condition evaluates to true and all preceding if and elseif conditions have evaluated to false.

Enumerate - The enumerate element is used to read a list of menu choices to a caller, using either TTS or a user-defined audio file.

Error Handling - The ability to prepare, find, and fix any errors that may be located within a website or application.

Event Handler - The subroutine called to handle inputs by a user or system. Also known as ‘listeners’.

Exit - The exit element returns control to the interpreter context that determines what to do next.

External Grammar - An external grammar is a grammar that is defined in a file separate from the VXML application file and is incorporated into the VXML document by reference.

Fetchaudio - This property indicates the URI of the audio to play while waiting for a document to be fetched.

Fetchaudiodelay - Fetchaudiodelay indicates the time interval to wait at the start of a fetch delay before playing the fetchaudio source.

Fetchaudiominimum - The fetchaudiominimum specifies the minimum time interval to play a fetchaudio source, once started, even if the fetch result arrives in the meantime.
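The interaction between fetchaudiodelay and fetchaudiominimum is easier to see as a small timeline calculation. The sketch below is one reading of the two definitions above, not platform code: the function name is hypothetical, all times are in seconds, and the handling of the exact boundary (a fetch that finishes precisely at the delay) is an assumption.

```python
def fetchaudio_playback(fetch_seconds, fetchaudiodelay, fetchaudiominimum):
    """Return (start, stop) of fetchaudio playback, or None if never played.

    fetch_seconds is how long the fetch actually took.
    """
    if fetch_seconds <= fetchaudiodelay:
        # The document arrived before the delay elapsed: stay silent
        return None
    start = fetchaudiodelay
    # Once started, play at least the minimum, even if the fetch is done
    stop = max(fetch_seconds, start + fetchaudiominimum)
    return (start, stop)


print(fetchaudio_playback(3.0, 2.0, 5.0))
```

With a 2-second delay and a 5-second minimum, a 3-second fetch still yields audio from second 2 to second 7, while a 1-second fetch plays nothing at all.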

Fetchtimeout - The interval to wait for the content to be returned before throwing an error.badfetch event.

Filled Block - In VoiceXML, a filled block is the content of a filled element, which specifies the actions to perform once one or more input items have been filled with caller input.

Foreach - The foreach element allows a VoiceXML application to iterate through an ECMAScript array and to execute the content contained within the foreach element for each item in the array.

Form – Forms are the key components of VoiceXML documents. A form contains a set of form items, which are elements that are visited in the main loop of the form interpretation algorithm.

Form Interpretation Algorithm - The form interpretation algorithm (FIA) drives the interaction between the user and a VoiceXML form or menu.
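A heavily simplified sketch of the FIA main loop follows: select the first form item that is still unfilled, collect input for it, then process the result. The function name run_form and the list of canned caller answers are assumptions for illustration; the real algorithm also handles prompts, events, guard conditions, and grammars, all of which are omitted here.

```python
def run_form(field_names, answers):
    """Toy FIA loop; answers holds caller inputs (None means no input)."""
    form = {name: None for name in field_names}
    answers = iter(answers)
    visited = []
    while True:
        # Select phase: the first item whose variable is still unset
        item = next((n for n in field_names if form[n] is None), None)
        if item is None:
            break  # every item is filled, so the form is complete
        visited.append(item)
        # Collect phase: a real platform would prompt and listen here
        try:
            value = next(answers)
        except StopIteration:
            break  # no more input in this toy (e.g. caller hung up)
        # Process phase: fill the item only if input was received
        if value is not None:
            form[item] = value
    return form, visited


print(run_form(["city", "date"], ["Boston", None, "Friday"]))
```

Note that the unfilled "date" item is visited twice: the loop keeps selecting it until it receives a value.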

Goto - In a VoiceXML application, the goto element is used to transition to another form item in the current form, transition to another dialog in the current document, and transition to another document.

Grammar - Grammars are used by speech recognizers to determine what the recognizer should listen for, and so describe the utterances a user may say.

Grammarfetchhint - The grammarfetchhint property specifies whether or not grammars may be pre-fetched.

Grammarmaxage - The grammarmaxage property tells the platform the maximum acceptable age, in seconds, of cached grammars.

Grammarmaxstale - The grammarmaxstale property tells the platform the maximum acceptable staleness, in seconds, of expired cached grammars.

Hosted Environment - Hosted environment, as it relates to Interactive Voice Response (IVR) services, is a facility or data center where a third-party has established technical infrastructure to run its IVR systems.

Hosting – Providing a managed service to many clients from a centralized infrastructure. Plum offers world-class IVR hosting service to businesses of all sizes.

HTTP GET - The method used for requesting a resource from a web server, optionally submitting some data along with the request. A part of the HTTP protocol.

HTTP POST - A method for submitting data to a web server and a part of the HTTP protocol. This method is typically used instead of GET when the submission is expected to change resources on the server or cause side effects.
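The difference between the two methods shows up in where the submitted data travels. The sketch below builds (but does not send) one request of each kind with Python's standard urllib; the example.com URLs and parameter names are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"caller": "6175550100", "account": "1001"}

# GET: the data rides along in the URL's query string
get_request = Request("http://example.com/lookup?" + urlencode(params))

# POST: the data is carried in the request body instead
post_request = Request(
    "http://example.com/update",
    data=urlencode(params).encode("ascii"),
)

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```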

If Conditional - Used to express branching based on specific criteria. The most common form is If-Then-Else: if the condition evaluates to true, the application runs one set of code; else (the condition evaluated to false) it runs a different set of code.

Incompletetimeout - The incompletetimeout property sets the duration of silence required after user speech before the recognizer finalizes a result when the speech so far is only an incomplete match of the active grammars.

Initial - In a typical mixed initiative form, the initial element is visited when the user is initially being prompted for form-wide information, and has not yet entered in the directed mode where each field is visited individually.

Inline Grammar - A grammar element specifies a permissible vocabulary for user interaction and an inline grammar is a list of phrases and subgrammars included within the grammar element.

Inputmodes – The inputmodes property is able to both enable and disable DTMF and voice.

Interdigittimeout - In an IVR application, the interdigittimeout property is the maximum amount of time the application will wait between key presses while a user enters a value on their keypad.
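One way to picture an inter-digit timeout is as a cutoff on the gap between consecutive key presses. In the sketch below, the function name collect_digits and the representation of key presses as (time in seconds, digit) pairs are assumptions for illustration, not a description of how any particular platform is implemented.

```python
def collect_digits(key_presses, interdigittimeout):
    """Collect digits until the gap between two presses exceeds the timeout.

    key_presses is a list of (time_seconds, digit) pairs in time order.
    """
    digits = []
    last_time = None
    for press_time, digit in key_presses:
        if last_time is not None and press_time - last_time > interdigittimeout:
            break  # the caller paused too long; input is considered finished
        digits.append(digit)
        last_time = press_time
    return "".join(digits)


presses = [(0.0, "1"), (1.0, "2"), (1.5, "3"), (6.0, "4")]
print(collect_digits(presses, 3.0))  # "4" arrives after a 4.5 second gap
```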

Internet Protocol – A protocol used for communicating data across a packet-switched internetwork using the Internet Protocol Suite, also referred to as TCP/IP.

IP - An abbreviation for Internet Protocol.

Item - Items are the fundamental structural units used to configure VoiceXML applications.

IVR - Interactive Voice Response, a technology that allows a computer to detect voice and keypad inputs, typically used to automate dialogue during a phone call.

JavaScript - A dynamic, weakly typed, prototype-based language with first-class functions that is in widespread use within web browsers and other client-side programmable environments. More formally, it is a dialect of ECMAScript, the most widely supported version being JavaScript 1.5, which parallels the ECMA-262 standard (3rd edition).

JavaScript engine - A JavaScript engine is a software component that runs JavaScript for a host environment, usually a web browser. The Plum IVR platform includes a JavaScript engine for running scripts and code embedded into VoiceXML documents.

JSGF Grammar - The Java Speech Grammar Format (JSGF) is a textual representation of grammars used to create speech recognition applications.

Leaf Document - Leaf documents are VoiceXML documents that make reference to a root document, which is also a VoiceXML document. Leaf documents are used in conjunction with a root document to organize VoiceXML scripts in a structured manner.

Link - A link element may have one or more grammars that are scoped to the element containing the link.

Log – The log element allows an application to generate a logging or debug message, which a developer can use to help in application development or post-execution analysis of application performance.

Markup Language - A markup language is a system for annotating a document in which the annotations (such as tags) are syntactically distinguishable from the content itself, allowing users to create structured documents and applications.

Maxage - Attribute within certain tags that sets a maximum amount of time before the platform should pull a new copy for caching. Several similarly-named properties can set this interval on a more global level.

Maxspeechtimeout - The maxspeechtimeout property indicates the maximum duration of the user’s speech input.

Maxstale - The maxstale attribute sets the maximum amount of time, beyond its expiration, that a stale cached file may still be used.
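Maxage and maxstale work together when the platform decides whether a cached copy is acceptable. The sketch below shows one plausible form of that decision under stated assumptions: the function and parameter names are hypothetical, all values are in seconds, and expires_after stands for the freshness lifetime the origin server assigned to the resource.

```python
def can_use_cached(age, expires_after, maxage, maxstale):
    """Decide whether a cached copy may be used instead of re-fetching.

    age: seconds since the copy was fetched.
    expires_after: seconds after fetching at which the copy expires.
    """
    if age > maxage:
        return False  # older than the application is willing to accept
    if age <= expires_after:
        return True   # not yet expired
    # Expired: acceptable only if not stale by more than maxstale
    return (age - expires_after) <= maxstale


print(can_use_cached(90, 60, 120, 45))  # expired 30 seconds ago, within maxstale
```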

Maxtime - The maxtime attribute is a VoiceXML attribute used in IVR applications to set a maximum amount of time (usually in seconds) for a specific action to last.

Menu - The menu construct in VoiceXML gives the programmer the ability to quickly build simple IVR dialogs that ask the caller to select from a list of choices.

Mixed Initiative - Mixed initiative forms are created when both the computer and the human can direct the flow of the phone call. To make a form mixed initiative, it must have one or more form-level grammars.

Noinput - The noinput element is used when a VoiceXML application expects to receive voice or DTMF input, but the caller has neither spoken nor entered anything via the keypad.

Nomatch - In a VXML application, a nomatch indicates a user input element that is not recognized as part of the active grammar.
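The distinction between noinput and nomatch can be sketched with a toy "grammar" that is simply a set of accepted phrases. The function name classify_input and the phrase set are assumptions for illustration; a real recognizer works on audio and confidence scores rather than on clean text.

```python
def classify_input(utterance, grammar):
    """Return 'noinput', 'nomatch', or 'match' for a caller's input."""
    if utterance is None or not utterance.strip():
        return "noinput"  # the caller said and pressed nothing
    # Normalize case and whitespace before comparing
    normalized = " ".join(utterance.lower().split())
    return "match" if normalized in grammar else "nomatch"


departments = {"sales", "support", "billing"}
print(classify_input("Billing", departments))
print(classify_input("pizza", departments))
print(classify_input(None, departments))
```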

One-of - A one-of block specifies a set of alternative legal rule expansions, each of which is contained within an item block.

Onsite System - An onsite system is an internally maintained IVR setup that includes an integrated hardware and software package that is deployed at the purchaser’s premises.

Option – An option list is represented by a set of option elements contained in a field element.

Outbound - An outbound call is one that is initiated by the IVR system itself rather than by the caller, and is used to contact the customer on behalf of the client.

Paragraph - The paragraph tag tells the TTS engine to change the prosody element to reflect the end of a paragraph, regardless of the surrounding punctuation.

Param - The param element is used to specify values that are passed to subdialogs.

Phoneme - A phoneme is the smallest segmental unit of sound that can be used to form contrasts between utterances.

PHP - A scripting language that can be used to generate dynamic VoiceXML and to process and store data on a server; also commonly used for generating dynamic webpages.

Prompt - The prompt element controls the output of synthesized speech and prerecorded audio in a VXML application.

Property - The property element sets a property value, and is used to define any setting which impacts the behavior of the application.

Prosody Element - A prosody element is a feature of VoiceXML that allows developers to control the pitch, contour, range, speaking rate, duration, and volume of an IVR application’s speech output.

Proxy Server - A proxy server is a server that acts as an intermediary for requests from clients seeking resources from other servers.

Query - A statement used to store content in, or retrieve content from, a database based on specific criteria. These storage or retrieval statements are presented as questions to the database in a specific format.

Queue - A queue is a data structure for processing items in sequence; in the context of the Plum IVR Hosting Platform, it is a buffer that allows Outbound IVR users to line up outbound calls to a phone number or list of phone numbers.
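The "process in sequence" behavior of a queue is first in, first out. The sketch below uses Python's standard collections.deque; the function name place_calls and the phone numbers are placeholders, and the dialing step is only simulated.

```python
from collections import deque


def place_calls(numbers):
    """Process queued numbers in the order they were added."""
    queue = deque(numbers)
    placed = []
    while queue:
        number = queue.popleft()  # take from the front of the line
        placed.append(number)     # a real system would dial here
    return placed


print(place_calls(["6175550101", "6175550102", "6175550103"]))
```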

Record – The record element is an input item that collects a recording from the user in a VXML application.

Recordcall - The recordcall property enables call recording for your script.

Recordcallappend - If the recordcallappend property is set to true when call recording transitions from disabled to enabled, any previous call recorded audio will be appended to instead of being overwritten.

Regular expression - A regular expression is a programming construct for matching strings of text. Implemented by many programming languages and frameworks, regular expressions (also known as regexes or regexps) are useful because of their combination of brevity and expressivity.
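As a small example of that brevity, the sketch below uses a single pattern to accept several common ways of writing a ten-digit phone number. The pattern and the function name normalize_phone are illustrative assumptions, not a complete phone number validator.

```python
import re

# Optional parentheses around the area code, optional -, . or space separators
PHONE_PATTERN = re.compile(r"^\(?(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})$")


def normalize_phone(text):
    """Return the ten digits of a phone number, or None if it does not match."""
    match = PHONE_PATTERN.match(text.strip())
    if match is None:
        return None
    return "".join(match.groups())


print(normalize_phone("(617) 555-0100"))
print(normalize_phone("555-0100"))
```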

Return - The return element ends execution of a subdialog and returns control and data to a calling dialogue in a VoiceXML application.

Root Document – Root documents can be used to pass variables from one VoiceXML page to another VoiceXML page. Root documents are specified per VoiceXML script and are themselves a VoiceXML script.

Rule - The rule element defines the named rule expansion of an XML grammar.

Ruleref - New to VoiceXML 2.0, a ruleref tag can reference another defined rule within the same local grammar.

Say-as – Say-as tags provide contextual hints to the TTS engine about how the text should be pronounced.

Scope - Scope is the boundary within which variables and expressions are visible. In VoiceXML, we generally refer to local scope and global scope when referring to IVR code. Scope is used to determine which information (i.e. variables) is available to different parts of your application.

Scratchpad - One of the features included in Plum’s Voice Platform v. 3.0 is a scratchpad, which allows users to test application prototypes and new code ideas.

Script - The script element allows the specification of a block of client-side scripting language code, and is analogous to the HTML SCRIPT element.

Sensitivity – The sensitivity property in a VXML application sets how sensitive the ASR (automatic speech recognition) engine is to quiet input; a higher value makes the engine more likely to treat low-volume sound as speech.

Sentence – The sentence element is used to format a particular region of TTS to be read out in sentence structure, rather than paragraph structure.

Session Initiation Protocol – Session Initiation Protocol is a signaling protocol, widely used for controlling multimedia communication sessions such as voice and video calls over Internet Protocol (IP).

SIP - An abbreviation for Session Initiation Protocol.

Speak - The speak tag tells the text-to-speech engine to synthesize the specified text.

SRGS+XML – SRGS+XML is a grammar format for use with speech recognition technology that allows a programmer to specify a list of words that the speech recognition engine will identify.

String - A string is a sequence of characters, often represented in memory as an array of bytes. They may contain encoded text, binary data, or formatted data. Strings are a fundamental data type in almost all programming languages.
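The relationship between characters and bytes can be seen by encoding a short string. This sketch assumes UTF-8, one common encoding among several; the sample text is arbitrary.

```python
text = "café"

# Encoding turns characters into bytes; the accented letter needs two bytes
encoded = text.encode("utf-8")
print(len(text), len(encoded))

# Decoding with the same encoding recovers the original characters
decoded = encoded.decode("utf-8")
print(decoded == text)
```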

Sub - The sub tag suggests a substitute text to the TTS engine.

Subdialog – A subdialog tag allows users to execute a new VoiceXML document in a new context. Subdialogs are mechanisms for reusing common dialogs and building libraries of reusable applications.

Submit - The submit element is used to submit information to the origin web server and then transition to the document sent back in the response.

SynApps - A web-based, graphical IVR application development environment created by Plum that can be used to rapidly prototype, create, share, and deploy IVR applications. Currently in private beta.

SynApps - SynApps, later titled QuickFuse, is Plum Voice’s intuitive cloud telephony service for creating smarter voice applications.

Tag - A tag is an arbitrary string that may be included inline within any legal rule expansion.

Telco - Short for telephone company, a service provider of telecommunications services such as telephony and data communications access.

Telecommunication - Telecommunication is the transmission of messages over significant distances, for the purpose of communication.

Telephony - Relates to the technologies on which telephones, faxes, and other electronic audio transmission equipment are based.

Throw - The throw element throws an event. These can be predefined or application defined events.

Token – Also known as a terminal symbol, a token is a part of grammar in a VoiceXML application that defines words or other entities that may be spoken.

Transfer - The transfer element directs the interpreter to connect the caller to another entity (e.g. a telephone line or another voice application).

TTS – TTS is the acronym for text-to-speech; it is a speech synthesis feature of VoiceXML applications that converts written text into artificial (synthesized) speech.

U-law - U-law is a lossy method of audio encoding that will compress 16-bit linear PCM audio samples into 8-bit samples thereby reducing bitrate by 50%. Like its encoding cousin, a-law, u-law assumes that the audio stream contains predominantly voice data (as opposed to, say, music data) which has a low dynamic range. It then uses a pseudo-logarithmic algorithm to compress the data, favoring detail for samples in the “middle” of the range and effectively glossing over samples at the extremes.
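The continuous u-law curve and its inverse are compact enough to sketch directly. The function names below and the use of floating-point samples normalized to -1.0 through 1.0 are assumptions for illustration; deployed codecs use a segmented 8-bit approximation, so this shows the shape of the curve rather than a bit-exact encoder.

```python
import math

MU = 255.0  # compression parameter of the standard u-law curve


def ulaw_compress(x):
    """Map a normalized sample in [-1.0, 1.0] onto the u-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def ulaw_expand(y):
    """Invert ulaw_compress, recovering the original normalized sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


# Quiet samples receive a large share of the output range
print(ulaw_compress(0.01), ulaw_compress(0.5))
```

The loss in a real codec comes from rounding the compressed value to one of 256 levels; the curve itself, as shown here, is exactly invertible.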

UNIX - UNIX is an operating system, the set of software that controls a computer’s basic functions.

User Interface - A user interface is a tool that allows for interaction between the human user and a machine. User interfaces allow for a user to perform certain tasks more easily and efficiently.

Value – The value element is used to insert the value of an expression into a prompt.

Variable - A variable is a programming facility for storing a piece of data while associating it with a memorable name.

Voice – The voice tag allows the application to change the TTS voice used to speak the enclosed text.

VoiceXML – A language designed to allow a user to interact with an application through voice recognition software. The language is based on the W3C’s XML.
