Boston Dynamics is turning its robots into talking tour guides

Boston Dynamics has released a video showing its Spot robot, dressed up with a hat, a mustache and big eyes, speaking to employees in a British accent while walking around the company's facilities.

“Let's start our journey at the charging station, where the Spot robot is resting and charging,” Spot says in the video. “This is our first point of interest. Follow me, gentlemen.”

As seen in the video, Spot is able to answer questions and even opens its mouth to appear as if it is talking.

To make Spot talk and respond, Boston Dynamics used OpenAI's ChatGPT API along with some open source large language models.

The company equipped the robot with speakers, added text-to-speech capabilities, and made its mouth move to imitate speech.

“The team gave Spot a very short script for each part of the tour,” said Matt Klingensmith, senior software engineer at Boston Dynamics. “The robot then combined this text with images from its body cameras, which gave it more information about what it sees before composing a reply.”

According to the company, Spot uses a visual question answering (VQA) model to caption images and answer questions about them.
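
The pipeline described above (a short script for each stop, camera images turned into text, and a language model that composes the reply) can be pictured with a rough sketch. This is not Boston Dynamics' code. The names `TourStop` and `build_prompt` and the prompt wording are hypothetical, and the sketch assumes the image caption has already been produced by a separate visual question answering model. It only shows how the pieces of text might be combined before being sent to a language model and then to text-to-speech.

```python
from dataclasses import dataclass


@dataclass
class TourStop:
    """One point of interest on the tour (hypothetical structure)."""
    name: str
    script: str  # the short, hand-written script for this stop


def build_prompt(persona: str, stop: TourStop, caption: str, question: str = "") -> str:
    """Combine the stop's script, an image caption and an optional
    visitor question into one text prompt for a language model.

    `caption` is assumed to come from a separate image-captioning step.
    """
    lines = [
        f"You are a robot tour guide. Persona: {persona}.",
        f"Current stop: {stop.name}.",
        f"Script for this stop: {stop.script}",
        f"What the cameras see: {caption}",
    ]
    if question:
        lines.append(f"A visitor asks: {question}")
    lines.append("Reply in one or two spoken sentences.")
    return "\n".join(lines)


if __name__ == "__main__":
    stop = TourStop("charging station", "Spot robots rest and recharge here.")
    print(build_prompt("1920s archaeologist", stop,
                       "a yellow robot sitting on a dock",
                       "Who are your parents?"))
```

In a setup like this, changing only the `persona` string would be enough to switch between characters, since the script and the camera description stay the same.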

In the video, Spot does not only play a tour guide. The four-legged robot also takes on other personas, including a 1920s archaeologist, a teenager, a time-traveling Shakespeare, and even a satirical character.

Boston Dynamics points out that Spot's stint as a tour guide produced some surprises. When the team asked Spot about its parents, the robot walked to a location at the company's headquarters where an earlier Spot model was standing.

The company explained that large language models still make things up at times. Spot, for example, said that the company's Stretch robot is designed for yoga, even though it is designed to move boxes.

“We look forward to continuing to explore the intersection between artificial intelligence and robotics,” Klingensmith wrote in a blog post on the Boston Dynamics website. “These large language models help provide cultural context, shared knowledge, and flexibility that are useful for many automated tasks.”

