Human-robot language interaction refers to the ability of humans to use spoken language to interact with robotic systems, and of robotic systems to generate speech and use spoken language to interact with humans. As robotics advances, some researchers believe spoken language interaction is becoming increasingly necessary because it lets human operators interact more seamlessly and naturally with robotic systems. For consumers, the ability to interact with robotic systems through natural language can increase adoption of those systems. However, incorporating speech processing capabilities into robotic systems has proven difficult, in part because speech applications are hard to adapt to the specific needs of robot applications.
Research in human-robot language interaction aims to move robotic language interaction beyond a template-based approach, which requires users of robotic systems to phrase requests or commands in a specific, predefined form. The alternative is to develop robotic systems capable of fluent, flexible linguistic interaction, which would also give robotic systems increased social intelligence.
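To make the limitation of a template-based approach concrete, the following is a minimal sketch of such an interface. The templates, action names, and the `parse_command` function are hypothetical and chosen only for illustration; they are not taken from any particular robotic system.

```python
import re

# Hypothetical command templates: each maps one rigid phrasing to an action name.
TEMPLATES = [
    (re.compile(r"^move (forward|backward) (\d+) meters?$"), "move"),
    (re.compile(r"^pick up the (\w+)$"), "pick_up"),
]


def parse_command(utterance):
    """Return (action, arguments) if the utterance fits a template, else None."""
    text = utterance.strip().lower()
    for pattern, action in TEMPLATES:
        match = pattern.match(text)
        if match:
            return action, match.groups()
    # Anything phrased outside the fixed templates is simply not understood.
    return None


print(parse_command("Move forward 2 meters"))
print(parse_command("Could you go ahead a couple of meters?"))
```

The first request fits a template and is parsed; the second expresses a similar intent in freer wording and is rejected, which is the rigidity that more flexible linguistic interaction is meant to remove.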
Natural language understanding has long been a goal of artificial intelligence, but it has proven challenging and has been abandoned in some cases. However, as robots have come into wider use and more people consider language interaction in robotics important, natural language understanding for robotic systems has received renewed emphasis.
A spoken dialogue system is one type of system developed to let users issue spoken language commands. Such a system is usually composed of six components. Speech input is processed by a speech recognizer, which converts the speech to written form and passes it to a language analyzer, which constructs a logical representation of the utterance. Using this representation together with information on prior discourse and task knowledge, the robotic system determines what task is to be performed. The system can also have the robot convey a follow-up or confirmation message to the user.
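The flow described above can be sketched as a small pipeline. This is a toy illustration under stated assumptions, not a real dialogue system: the recognizer is a stub that treats its input as an already transcribed string, the analyzer uses a naive first-word/last-word rule, and all names (`DialogueState`, `recognize_speech`, `analyze_language`, `interpret`, `handle_utterance`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    # Prior discourse: previously interpreted commands.
    history: list = field(default_factory=list)
    # Task knowledge: the actions this hypothetical robot can perform.
    known_tasks: set = field(default_factory=lambda: {"fetch", "clean"})


def recognize_speech(audio):
    # Stand-in for a speech recognizer: the "audio" is assumed to be a transcript.
    return audio.lower().strip()


def analyze_language(text):
    # Toy analyzer: first word is the action, last word is the object.
    words = text.split()
    if not words:
        return {"action": None, "object": None}
    return {"action": words[0], "object": words[-1] if len(words) > 1 else None}


def interpret(representation, state):
    action, obj = representation["action"], representation["object"]
    # Use prior discourse to resolve the pronoun "it".
    if obj == "it" and state.history:
        obj = state.history[-1]["object"]
    resolved = {"action": action, "object": obj}
    state.history.append(resolved)
    # Use task knowledge to decide whether the request can be carried out.
    if action in state.known_tasks:
        return resolved, f"Okay, I will {action} the {obj}."
    return None, "Sorry, I do not know how to do that."


def handle_utterance(audio, state):
    text = recognize_speech(audio)
    representation = analyze_language(text)
    return interpret(representation, state)


state = DialogueState()
print(handle_utterance("Fetch the cup", state))
print(handle_utterance("Clean it", state))
```

The second call shows why discourse history matters: "it" can only be resolved to the cup because the previous utterance was recorded. The returned string stands in for the confirmation message conveyed back to the user.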
Another development in robotic understanding, one that can be embedded in other systems such as spoken dialogue systems, is a family of natural language algorithms that aim to enable natural language understanding between humans and robots. These algorithmic models are designed to bridge the semantic gap between high-level concepts in language and their low-level metric representations. To do so, researchers have developed generalized grounding graphs and distributed correspondence graphs, which infer a grounding for language descriptions in the robot's perceived representation of the world. Another development has been the adaptive distributed correspondence graph, intended for reasoning about abstract spatial concepts. All of these work toward a semantic understanding that lets robotic systems ground more natural human language in factual knowledge they can act on.
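As a rough illustration of what bridging a high-level concept and a low-level metric representation means, the sketch below grounds the word "near" in object coordinates. It is deliberately simplistic and is not an implementation of generalized grounding graphs or distributed correspondence graphs, which are probabilistic models; the world model, the distance threshold, and the function names are all assumptions made for the example.

```python
import math

# Hypothetical perceived world: object name -> (x, y) position in meters.
WORLD = {
    "cup": (1.0, 0.5),
    "box": (4.0, 3.0),
    "table": (1.2, 0.6),
}

# Assumed cutoff, in meters, for what counts as "near".
NEAR_THRESHOLD_M = 1.0


def distance(a, b):
    # Euclidean distance between two 2D points.
    return math.hypot(a[0] - b[0], a[1] - b[1])


def ground_near(landmark, world):
    """Return the objects whose position lies within the threshold of the landmark."""
    landmark_pos = world[landmark]
    return sorted(
        name
        for name, pos in world.items()
        if name != landmark and distance(pos, landmark_pos) <= NEAR_THRESHOLD_M
    )


# Which perceived objects could "the thing near the table" refer to?
print(ground_near("table", WORLD))
```

A fixed threshold is the simplest possible choice; the graph-based models named above instead infer the most likely correspondence between phrases and perceived entities.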
One of the difficulties that human-robot language interaction has to overcome is understanding the semantics of human speech. Most verbal commands in human-robot interaction tend to be direct and frequently include specific keywords that allow the robotic system to understand the command. However, this is not how humans naturally communicate, as human communication tends to combine verbal and visual semantics. These can change the intention behind a command: instructions can be clear, vague, or feeling-based, and a robotic system capable of understanding all three kinds is better able to satisfy human intentions.
Another approach to developing understanding in robotic systems is acoustic communication, in which unique, covert, tonal languages can be used to extract semantic understanding. Part of this research has explored semantic understanding tailored to a given robot-human pair, as each human has a near-unique semantic approach to language. This can include different acoustic applications of language depending on the scenario, such as social or tactical settings, and could also be used to generate tonal languages for robots that strengthen robot-human relationships.