Off-The-Shelf Hacker: Hedley the Robotic Skull Speaks

May 30th, 2018 9:28am by

We covered adding a servo-controlled jaw to Hedley the robotic skull a while back. As it turns out, making a jaw bone realistically track audible speech is pretty challenging. The human voice is shaped by air moving over the vocal cords, the shape of the mouth, the position of the tongue and the volume inside the mouth, which we all know is influenced by the opening and closing of the jaw. Well, not exactly.

Open your mouth and sing a song or talk without moving your jaw. You can make an almost infinite number of sounds without any jaw movement whatsoever. How does that work? Pretty complicated, right?

What we do with the robot is give visual cues for the speech sounds by moving the skull’s mandible up and down. It ends up being kind of an illusion that humans WANT to believe.

One way to do it is what’s known as the flapping-jaw method. This technique makes the jaw movement follow the sound-pressure peaks and valleys picked up by a small microphone. The effect is acceptable as a first cut at robot verbalization. It’s certainly not perfect. Playing with the mechanisms, Arduino code and other factors points the way toward further refining the illusion. With a reasonably good illusion, human observers willingly suspend disbelief and start to think that the skull is actually talking.

Making the Jaw Flap

My skull design uses a standard full-sized hobby servo to move Hedley’s jaw up and down. The old Big Mouth Billy Bass talking fish used DC motors and a controller board, in much the same role.

In the case of the skull, the standard Arduino potentiometer-to-servo example sketch was modded to use an analog microphone to convert the sound pressure of my voice into values from 0 to 1,023. The base code is in the Arduino integrated development environment (IDE) under the Examples -> Servo -> Knob topics. The Arduino then converts those values into a pulse width value that is used to move the servo arm to a corresponding higher or lower angle, as the sound changes. The total angular movement of the jaw is about 25 degrees.

Here’s the code.
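The embedded listing didn’t survive the page conversion, so here is a minimal reconstruction based on the stock Knob example, modified as described above: microphone on analog pin 0, roughly 25 degrees of jaw travel, a 90-millisecond loop delay and the extra serial output. The servo pin (9) is an assumption.

```cpp
#include <Servo.h>

Servo jawServo;        // servo that drives Hedley's jaw

const int micPin = 0;  // electret mic module output on analog pin 0 ("potpin")
int val;               // raw sound-pressure reading, 0 to 1023

void setup() {
  jawServo.attach(9);  // servo signal pin (assumed; change to match your wiring)
  Serial.begin(9600);  // serial link for the future Raspberry Pi data stream
}

void loop() {
  val = analogRead(micPin);        // read sound pressure as 0-1023
  Serial.println(val);             // echo the raw value over serial
  val = map(val, 0, 1023, 0, 25);  // scale to about 25 degrees of jaw travel
  jawServo.write(val);             // move the jaw to the new angle
  delay(90);                       // 90 ms settled on to avoid a "nervous" jaw
}
```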


Notice that I added some serial communication code. While a microphone is fine for testing, the real goal is to eventually use a pre-recorded audio file to play through a speaker while sending data to the Arduino to move the jaw. Likely, I’ll use a processing script on Hedley’s Raspberry Pi, one which reads an MP3 file, does some filtering and outputs a data stream that the Arduino will read over the serial line. The serial line data will replace the values generated from the microphone input. I’m still working on that part; as things come together, I’ll cover it in future how-to articles.

Hooking up the electret microphone to the Arduino was pretty easy since the module has a little pre-amplifier built in. Attach the VCC pin to the Arduino’s 3.3-volt supply and the GND terminal to ground. The output then connects to Arduino analog pin 0 (potpin in the sketch).

Prototyper Pro Tip: Make Mechanisms Adjustable

As I worked with the Arduino code, the servo angles and the value scaling, it became apparent that the servo response was a little too slow. You can’t have any noticeable disconnect between the sound and the jaw movement; otherwise, the visual cueing effect is ruined.

Reducing the delay in the main program loop makes the servo respond more quickly, but it also makes the jaw look “nervous.” A delay value of 90 gave a pretty good response without any twitchiness. Eventually, it occurred to me that I could simply change the ratio of the servo and jaw pivot levers to make the jaw respond faster. The standard-size hobby servo has plenty of power for this level of torque loading.

Adjustable servo and jaw pivot levers, with connecting rod

Changing ratios meant that I had to make the actuation lever attached to the jaw pivot adjustable. With these fast-moving prototype projects I usually just guesstimate the sizes and lengths of levers and mechanisms, based on my past fabrication experience. My guess is occasionally off a bit, so an equally fast-moving solution is to make provisions for mechanical adjustments. In this case, I simply removed the jaw pivot lever and drilled a few holes spaced out along the arm. The lever was reinstalled (and re-soldered) to align parallel with the servo lever. The servo lever also has adjustment holes, although readers will notice that I’m using the top hole already.

Want to move the jaw quicker? Keep the servo-side connecting rod end in the same hole and move the jaw lever pivot end down closer to the pivot shaft.

Did It Work?

With the current setup, I can speak into the microphone and the jaw will follow my voice fairly well. Sensitivity to air pressure while talking is pretty noticeable. Hard sounds like “p” and “k” cause the jaw to move a lot. More experimentation might yield better performance.

I’m going to try feeding the audio from the Raspberry Pi into the analog pin of the Arduino and see how that works. One set of instructions for a skeletal talker uses a Wave Shield attached to the Arduino. Take a look at the video and you’ll see a little lag in the servo response. While that’s a nice job, I’d like to make Hedley move more naturally.

Ultimately I’ll convert speech recorded in a file into values that the Arduino will interpret and use for servo jaw control. Canned robot routines and an Alexa-like experience would be pretty slick.

Stay tuned for further developments…

Check back for a new edition of Off The Shelf Hacker each week, only on The New Stack.  
