Pint of Science Edinburgh 2017: Future of human-robot interaction and communications

Last Wednesday night, in the cosy downstairs space of the Safari Lounge, we watched two passionate scientists present their work on how machines can learn to talk like humans, and how robots can learn to behave like humans. We played games and won prizes for the best questions and tweets. All this while enjoying a pint!

Did you know that behind the voice on your phone and the voice on your car’s GPS there is a real human? Dr Korin Richmond from the University of Edinburgh walked us through the history of speech synthesis. People tend to think that producing artificial sound and speech is a relatively new achievement, but that is not the case. The first attempt to produce sound mechanically was made by Kratzenstein in 1779, followed by Von Kempelen (1791) and Wheatstone (mid 1800s). Dudley was the first to produce sound electrically, but the process was still controlled by a human operator. Dr Richmond played a video clip showing that in Edinburgh in the 1950s, Lawrence managed to build a machine that created sound automatically, without anyone operating it. The mid-1980s saw the synthesis of coherent speech and the use of units of sound, and in 1996 Black, Taylor and Caley developed this technique further by creating an audio database, selecting units and putting them in the right sequence.

Current research focuses on getting machines to talk more like humans, by observing human speech movements and then training a machine learning model to interpolate voice. An interesting application of this procedure is the digital cloning of the voice of patients with motor neurone disease or other medical conditions such as Parkinson’s disease, multiple sclerosis, cancer or stroke, who tend to lose their voice as a result of the disease. The idea here is to preserve the identity of the patient by replicating the voice before the patient loses it. This recording, together with speech data from donors, is used to create an average model that the patient can use after voice loss. With all this, and the recent production of the most natural-sounding speech by Google’s ‘Tacotron’, the production of synthetic sound has come on in leaps and bounds since the first attempt in 1779. However, there is still a lot of work to be done to achieve greater efficiency and more controllability, and to go beyond the sentence to express the environment and the context of speech.

After the break and a game with winners and prizes, it was time to learn about the social interaction of humans and robots. Dr Frank Broz from Heriot-Watt University explained how his interdisciplinary research on social robotics brings together robotics and artificial intelligence with psychology, sociology, ethnography, arts and design, in an effort to put robots in a human environment and make them fit in. It is important for robots to understand human behaviour and for humans to understand robots. He showed us the graph of the ‘Uncanny Valley’ hypothesis, according to which people favour robots that resemble humans to a certain extent, but not those that resemble them too closely. He discussed a possible approach to bringing the behaviour of robots closer to human behaviour. Observation is the key to this approach: observe the human-robot interaction, produce a model, evaluate the interaction and improve it to make it more natural. An example of such an approach is the ‘mutual gaze’ experiment. Mutual gaze, or simply eye contact, is correlated with other parameters (social engagement, conversation, familiarity). It can be measured with two people wearing cameras that detect the line of gaze. Then, a minimal computational model is created and placed in a robot, which enables the robot to engage in mutual gaze in a pattern similar to that of humans. And the observation continues: do humans interact with the robot in the same way as with another human?

Robots with social behaviour can be used to help people with autism spectrum disorder (ASD) to overcome their difficulty in interpreting social signals and social interaction. Facial expression data for basic emotions, such as happiness, anger and fear, are collected from many volunteers to understand what the robot needs in order to model each emotion, so that it can be used as an intermediate step in social interaction training for people with ASD. This discussion raised the question of how current knowledge of human psychology might itself be called into question in the process of teaching robots.

Both presentations were astonishing, and the night can be summed up with two phrases from the discussions: “If you can’t communicate, you are isolated”; and “We put the psychology into the technology, and the technology into the psychology”. If you’re feeling like you’ve missed out, don’t worry: ‘Pint of Science’ is back again next year, so watch this space.


This article was written by Athina Frantzana and edited by Bonnie Nicholson.
