When talking with people face-to-face, we may experience the "Cocktail Party Effect": even in a crowded, noisy room, we can use our binaural hearing to focus on a single person speaking. With current telepresence technologies, however, we lose this important ability. Thankfully, researchers are already giving us new tools for bridging remote distances more effectively.
Kyoto University's Professor Hiroshi Okuno and Assistant Professor Toru Takahashi, Honda Research Institute-Japan's Dr. Kazuhiro Nakadai, and four Kyoto University and Tokyo Institute of Technology graduate students spent a week at Willow Garage, integrating HARK with a Texai telepresence robot. HARK stands for Honda Research Institute-Japan (HRI-JP) Audition for Robots with Kyoto University. The robot audition system provides sound source localization, sound source separation, acoustic feature extraction, and automatic speech recognition.
The HARK system integrated well with ROS and our Texai. The Texai was outfitted with a green salad bowl helmet embedded with eight microphones, and there is now a hark package for ROS. Using this setup, their team put together three demos showing off the potential for telepresence technologies.
In the first demo, four people, including one present through a second Texai, talk over each other while the HARK-Texai separates out each voice. The second demo shows sound source localization, with the direction and power of each sound displayed in a radar chart. The final demo puts these two together into a powerful new interface for the remote operator: the Texai pilot can determine where various sounds and voices are coming from, and select which sound to focus on. The HARK system then provides the pilot with the desired audio, cutting out any background noise or additional voices. Even in a crowded room, you can have a one-on-one conversation.
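To make the pilot's "select which sound to focus on" step concrete, here is a minimal Python sketch of how an interface might map a chosen direction to one of several localized sources. It is not HARK's API or the code used in the demos: the `LocalizedSource` type, the `select_source` function, the 20-degree tolerance, and the sample values are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LocalizedSource:
    """Hypothetical record for one localized sound (not a HARK type)."""
    source_id: int
    azimuth_deg: float  # direction of arrival; 0 = straight ahead
    power_db: float     # estimated power, as shown on the radar chart


def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    diff = (a - b) % 360.0
    return min(diff, 360.0 - diff)


def select_source(
    sources: List[LocalizedSource],
    target_azimuth_deg: float,
    tolerance_deg: float = 20.0,  # assumed tolerance, purely illustrative
) -> Optional[LocalizedSource]:
    """Return the source closest to the pilot's chosen direction, or None
    if no source lies within the tolerance."""
    candidates = [
        s for s in sources
        if angular_distance(s.azimuth_deg, target_azimuth_deg) <= tolerance_deg
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda s: angular_distance(s.azimuth_deg, target_azimuth_deg),
    )


# Illustrative values only, not measurements from the demos.
sources = [
    LocalizedSource(source_id=0, azimuth_deg=-30.0, power_db=32.0),
    LocalizedSource(source_id=1, azimuth_deg=10.0, power_db=35.0),
    LocalizedSource(source_id=2, azimuth_deg=170.0, power_db=28.0),
]
chosen = select_source(sources, target_azimuth_deg=15.0)
print(chosen)
```

The wrap-around handling in `angular_distance` matters because a source at 170 degrees and a selection at -175 degrees are only 15 degrees apart. A real system would then pass through only the separated audio stream for the chosen source.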
HARK came out of close collaboration between HRI-JP and Kyoto University, and Professor Okuno's passion to make computer/robot audition helpful for the hearing impaired. HARK is provided free and open source for research purposes and can be licensed for commercial applications.