A way to let robots learn by listening will make them more useful

by James O'Donnell
July 3, 2024

Most AI-powered robots today use cameras to understand their surroundings and learn new tasks, but it’s becoming easier to train robots with sound too, helping them adapt to tasks and environments where visibility is limited. 

Sight is important, but for some everyday tasks sound is actually more helpful, like listening to onions sizzling on the stove to tell whether the pan is at the right temperature. So far, however, training robots with audio has been done only in highly controlled lab settings, and the techniques have lagged behind other fast robot-teaching methods.

Researchers at the Robotics and Embodied AI Lab at Stanford University set out to change that. They first built a system for collecting audio data, consisting of a GoPro camera and a gripper with a microphone designed to filter out background noise. Human demonstrators used the gripper for a variety of household tasks, and the team then used this data to teach robotic arms to execute the tasks on their own. The team’s new training algorithms help robots gather clues from audio signals to perform more effectively.
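To make the idea concrete, here is a minimal, hypothetical sketch of an audio-visual imitation-learning policy in PyTorch: a camera frame and a short clip from the gripper microphone are encoded separately, fused, and mapped to an arm action, trained by behavior cloning on human demonstrations. The class name, layer sizes, sample rate, and action dimension are illustrative assumptions, not the Stanford team’s actual code.

```python
# Hypothetical sketch of an audio-visual imitation-learning policy.
# Layer sizes, sample rate, and action_dim are illustrative assumptions.
import torch
import torch.nn as nn
import torchaudio


class AudioVisualPolicy(nn.Module):
    def __init__(self, action_dim: int = 7):
        super().__init__()
        # Vision branch: a small CNN over the camera frame.
        self.vision = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Audio branch: mel spectrogram of the gripper microphone, then a CNN.
        self.melspec = torchaudio.transforms.MelSpectrogram(
            sample_rate=16_000, n_mels=64)
        self.audio = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Fuse both modalities and regress an arm action.
        self.head = nn.Sequential(
            nn.Linear(32 + 32, 128), nn.ReLU(),
            nn.Linear(128, action_dim),
        )

    def forward(self, frame: torch.Tensor, waveform: torch.Tensor) -> torch.Tensor:
        v = self.vision(frame)                      # (batch, 32)
        spec = self.melspec(waveform).unsqueeze(1)  # (batch, 1, mels, time)
        a = self.audio(spec)                        # (batch, 32)
        return self.head(torch.cat([v, a], dim=-1))


# Behavior cloning: predict the action a human demonstrator took.
policy = AudioVisualPolicy()
frame = torch.randn(2, 3, 96, 96)       # dummy camera frames
waveform = torch.randn(2, 16_000)       # 1 s of gripper audio at 16 kHz
predicted = policy(frame, waveform)     # (2, 7)
target = torch.zeros_like(predicted)    # demonstrated actions would go here
loss = nn.functional.mse_loss(predicted, target)
loss.backward()
```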

“Thus far, robots have been training on videos that are muted,” says Zeyi Liu, a PhD student at Stanford and lead author of the study. “But there is so much helpful data in audio.”

To test how much more successful a robot can be if it’s capable of “listening”, the researchers chose four tasks: flipping a bagel in a pan, erasing a whiteboard, sticking two Velcro strips together, and pouring dice out of a cup. In each task, sounds provide clues that cameras or tactile sensors struggle to capture, like whether the eraser is properly contacting the whiteboard or whether the cup contains any dice.

After demonstrating each task a couple hundred times, the team compared the success rates of training with audio against training with vision alone. The results, published in a paper on arXiv that has not been peer-reviewed, were promising. When using vision alone in the dice test, the robot could tell whether there were dice in the cup only 27% of the time, but that rose to 94% when sound was included.

It isn’t the first time audio has been used to train robots, Liu says, but it’s a big step toward doing so at scale. “We are making it easier to use audio collected ‘in the wild,’ rather than being restricted to collecting it in the lab, which is more time-consuming.” 

The research signals that audio might become a more sought-after data source in the race to train robots with AI. Researchers are teaching robots more quickly than ever before using imitation learning, showing them hundreds of examples of tasks being done instead of hand-coding each task. If audio could be collected at scale using devices like the one in the study, it could provide an entirely new “sense” for robots, helping them adapt more quickly to environments where visibility is limited or not useful.

“It’s safe to say that audio is the most understudied modality for sensing” in robots, says Dmitry Berenson, associate professor of robotics at the University of Michigan, who was not involved in the study. That’s because the bulk of robotics research on manipulating objects has been for industrial pick-and-place tasks, like sorting objects into bins. Those tasks don’t benefit much from sound, instead relying on tactile or visual sensors. But, as robots broaden into tasks in homes, kitchens, and other environments, audio will become increasingly useful, Berenson says.

Consider a robot trying to find which bag contains a set of keys, all with limited visibility. “Maybe even before you touch the keys, you hear them kind of jangling,” Berenson says. “That’s a cue that the keys are in that pocket, instead of others.”

Still, audio has limits. The team points out that sound won’t be as useful with so-called soft or flexible objects like clothes, which don’t create as much usable audio. The robots also struggled to filter out the noise of their own motors during tasks, since that noise was not present in the training data produced by humans. To fix this, the researchers needed to add robot sounds (whirs, hums, and actuator noises) to the training sets so the robots could learn to tune them out.
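As a rough illustration of that kind of fix, the sketch below mixes recorded motor noise into a clean demonstration clip at a random signal-to-noise ratio, a common audio-augmentation trick. It assumes PyTorch; the function name, tensor shapes, and SNR range are illustrative assumptions rather than the paper’s actual recipe.

```python
# Hypothetical augmentation: blend robot motor noise into human-collected
# demonstration audio so the trained policy learns to ignore it later.
import torch


def mix_in_motor_noise(clean: torch.Tensor,
                       motor_noise: torch.Tensor,
                       snr_db_range: tuple = (5.0, 20.0)) -> torch.Tensor:
    """Add motor noise to a demonstration clip at a random signal-to-noise ratio."""
    # Tile or crop the noise recording to the length of the clip.
    if motor_noise.numel() < clean.numel():
        reps = -(-clean.numel() // motor_noise.numel())  # ceiling division
        motor_noise = motor_noise.repeat(reps)
    noise = motor_noise[: clean.numel()]

    # Pick a random SNR and scale the noise so that
    # 10 * log10(clean_power / scaled_noise_power) equals that SNR.
    snr_db = torch.empty(1).uniform_(*snr_db_range)
    clean_power = clean.pow(2).mean()
    noise_power = noise.pow(2).mean().clamp_min(1e-8)
    scale = torch.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise


# Example: one second of demonstration audio plus a short loop of actuator hum.
demo_audio = torch.randn(16_000)
motor_loop = torch.randn(4_000)
augmented = mix_in_motor_noise(demo_audio, motor_loop)
```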

The next step, Liu says, is to see how much better the models can get with more data, which could mean more microphones, collecting spatial audio, and adding microphones to other types of data-collection devices. 
