It’s always inspiring to see new technologies making a real difference in our everyday lives. And programmer Emily Shea provided another example at this month’s Perl Conference 2019 in Pittsburgh. She writes her Perl programs … by voice command.
Didn’t think that face being numb from dentist would make talking to my computer difficult… but now I know.
— Girl with a Perl Earring (@yomilly) June 26, 2019
Shea is a senior software engineer at Fastly Inc., the San Francisco-based content delivery network, where she works on the platform for delivering CDN configurations. And she had a good answer for the question of why she writes code by voice, describing herself as “an unfortunate soul who ended up with RSI symptoms,” which impaired her ability to type. After trying every possible remedy — including physical therapy and specialty keyboards — she found that whatever hours of typing she could manage during the day were broken up by frequently-needed breaks, a state she describes as “constantly being interrupted by your body.”
“Because I was feeling so limited, I started looking into alternative ways that would not involve my hands.”
She was eager to move beyond the traditional combination of a keyboard and a mouse and showed the audience an inspiring demo of how it worked — in which five lines of perfectly written and punctuated code required just 45 seconds of dictation.
Her speedy demo drew a round of applause.
Shea starts with a good microphone and speech-recognition software. Her toolkit includes Dragon NaturallySpeaking, but also Talon, “the technology that I’m really excited to tell you all about today,” which is supported by Patreon contributors. Using Dragon’s dictation API, Talon offers hands-free input for computers (responding to voice commands), “So you can get off your keyboard, but still control your computer.”
But best of all, the whole solution can be configured using Python code in locally-hosted files. “So I can customize this thing like crazy,” she said.
For example, if there are language-specific commands, she can indicate that they should only be active when working on files with the corresponding file extensions (like .pl). But she’s also made it easier to dictate letters. Many letters of the alphabet rhyme with other letters, she points out, which can lead to accuracy issues. She showed her audience a slide with one familiar alternative, the NATO phonetic alphabet (Alpha, Bravo, Charlie…). But rather than using that — saying “November” every time she wanted the letter N — Shea simply customized the software to recognize her own set of 26 words to represent letters of the alphabet.
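The two customizations described above — gating commands by file extension and a personal phonetic alphabet — could be sketched in plain Python like this. This is a schematic illustration, not Talon’s actual API; the mapping includes only the letter-words Shea dictates during the talk, and the rest of her 26-word set would follow the same pattern.

```python
# Illustrative sketch of two of Shea's customizations (not Talon's real API).

# Letter-words heard in the talk; her full set covers all 26 letters.
ALPHABET = {
    "air": "a", "cap": "c", "dip": "d", "each": "e", "harp": "h",
    "krunch": "k", "look": "l", "mad": "m", "near": "n", "odd": "o",
    "pit": "p", "red": "r", "sit": "i", "trap": "t", "yank": "y",
}

def perl_context_active(filename: str) -> bool:
    """Language-specific commands fire only while a Perl file is open."""
    return filename.endswith((".pl", ".pm"))

def dictate_letters(spoken: str) -> str:
    """Translate a spoken phrase like 'Pit Each Red Look' into 'perl'."""
    return "".join(ALPHABET[word] for word in spoken.lower().split())
```

With this mapping, `dictate_letters("Mad Krunch Dip Sit Red")` produces `mkdir`, matching the terminal demo later in the talk.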
In a second demo, she quickly blurts out “Pit Each Red Look” — and the word “perl” appears on the screen… Within roughly 10 seconds, she’s typed out the whole phrase “Perl is awesome” — including the correction of two typos.
Perl code includes a lot of symbols, but “these are pretty easy. It’s something I didn’t have to spend a whole lot of time solving. All the symbols are what you’d think they’d be,” she said. So the dollar sign is “dollar,” an at sign is “at sign,” and the pipe symbol is “pipe.” There’s also support for adding your own words — like undef.
I mean on one hand it makes sense he’s interested, he’s a linguist. But on the other hand it’s like, WOOOOOAH!!
— Kris Foster (@transitorykris) June 19, 2019
One of the unique problems of working by voice: homophones. These are words that sound exactly the same but are spelled differently — like “bite” and “byte.” She’s solved that problem with the voiceword “phones.” When followed by a homophone, it pulls up a menu with all possible choices for that word. So saying “phones byte” and “pick three” will quickly select the third possible spelling of byte from the pop-up window. And if there are only two choices, saying “phones” (after highlighting the first word) will just swap in its alternate spelling.
She found a good starter set of words in a children’s board book, “Llamaphones.”
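The “phones” behavior she describes could be sketched like this — a toy illustration rather than Talon’s actual API, with example homophone groupings of my own choosing:

```python
# Toy sketch of the "phones" homophone command (not Talon's real API).
HOMOPHONES = {
    "byte": ["bite", "byte", "bight"],
    "perl": ["pearl", "perl"],
}

def lookup(word: str) -> list[str]:
    """'phones <word>' pulls up every spelling that sounds the same."""
    for group in HOMOPHONES.values():
        if word in group:
            return group
    return [word]

def pick(word: str, n: int) -> str:
    """'pick three' selects the third entry from the pop-up menu."""
    return lookup(word)[n - 1]

def swap(word: str) -> str:
    """With only two choices, 'phones' on the highlighted word swaps it."""
    first, second = lookup(word)
    return second if word == first else first
```

So `pick("byte", 3)` returns the third spelling in that group, and `swap("pearl")` flips straight to “perl” without showing a menu.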
She’s also coded in voicewords to format her text (for example, with uppercase letters) when the voiceword precedes a phrase. Saying “snake” puts underscores between the words (a format often referred to as “snake case”), while “kabob” puts hyphens between them, in the style known as “kabob case.”
Saying “allcaps” before a phrase will enter it in all capital letters, but the most interesting one was probably “pac title.” It displays the words that follow as being separated by two colons — punctuating them the way you would if you were trying to identify a namespace using Perl’s package-identifying syntax.
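The formatting voicewords above amount to simple string transforms. A minimal sketch, with the caveat that the exact behavior of “pac title” is an assumption — per-word title-casing as written here would yield `Http::Daemon` rather than `HTTP::Daemon`:

```python
# Sketch of the formatting voicewords described in the talk (behavior assumed).

def snake(phrase: str) -> str:
    """'snake perl demo' -> perl_demo (snake case)."""
    return "_".join(phrase.split())

def kabob(phrase: str) -> str:
    """'kabob perl demo' -> perl-demo (kabob case)."""
    return "-".join(phrase.split())

def allcaps(phrase: str) -> str:
    """'allcaps yay perl' -> YAY PERL."""
    return phrase.upper()

def pac_title(phrase: str) -> str:
    """'pac title foo bar' -> Foo::Bar, Perl's package-identifying syntax."""
    return "::".join(word.title() for word in phrase.split())
```

Each voiceword precedes the phrase it formats, so the dictation “cd snake perl demo” lands in the terminal as `cd perl_demo`.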
About 20 minutes into the presentation, she played a video in which she writes a longer application. Within three and a half minutes she’s written a 17-line Perl program — and set its permissions so it can be run in her web browser. First Shea says “phrase iterm,” which pulls up the Mac terminal-emulating program iTerm2, then enters letters on its command line by saying each of their “phonetic alphabet” equivalents out loud.
Here’s what happens when Shea says “Mad Krunch Dip Sit Red space snake perl demo enter.”
> mkdir perl_demo
And then she immediately changes to that directory by saying “cd snake perl demo enter.”
> cd perl_demo
There are some handy shortcuts along the way. The first line of a Perl program identifies where the interpreting program is located — typically something like #!/usr/bin/env perl — but instead of typing all that, Shea simply says “Perl hash bang.”
It soon becomes too complicated to follow — there’s a word or phrase for every symbol that needs to be typed, and the code quickly appears on the screen, line after line. Shea spells out the string “html” by saying “Harp Trap Mad Look” — and “Yank Air Yank space Perl” translates into “yay perl.” The all-important Unix command chmod becomes “Cap Harp Mad Odd Dip” — which she uses to toggle her file’s executable flag. Daemon is “Dip Air Each Mad Odd Near,” and soon she’s ready to run the program in her web browser.
It displays the phrase “YAY PERL,” bookended on both sides with a randomly-changing emoji.
And then her next demo shows her committing to Git, pushing to GitHub, and making a pull request.
Obviously, there are also challenges if you’re working in an open office. (Your co-workers will hear you coding — while your own microphone may pick up the conversations of your co-workers.) One solution Shea points out is a Stenomask — a soundproof enclosure that straps over your mouth, with a small microphone inside to pick up your commands.
For a larger soundproof enclosure, there are acoustic office booths, also known as “acoustic pods.”
But of course, there’s a third option: working remotely. She tells the audience that her team has been supportive, giving her a flexible schedule and insisting she take care of herself. Asynchronous communication also helped, she said, and if a real-time conversation is necessary, a video call can be better than a long conversation on Slack.
In fact, there’s one unexpected advantage to coding by voice. “Because I don’t need the keyboard, I can lay on my couch and talk to the computer. Or lay on the floor, you know?” She even tells the audience that if she wants to, she can deploy software while doing a yoga pose. “It’s great.”
Met some incredibly welcoming folks this week. It can be a bit intimidating walking into a small, tight-knit conf, especially as as part of an underrepresented group, but I felt very welcome and accepted. Some really wonderful folks here at perl conf. ❤️ #TPCiP
— Girl with a Perl Earring (@yomilly) June 21, 2019
Shea acknowledges that the learning curve is steep, likening it to a customized keyboard. “Take that, but imagine now, you’re using a different part of your body. You’re now hearing sounds you weren’t used to hearing, so you might not be able to as quickly think while you’re working because now you’re hearing stuff. So that can be difficult.”
But another big problem is tools, apps, or web sites that don’t have good accessibility features. “It’s a web site where there’s a button, and I can’t click it because it’s not the actual HTML button and it’s not a link, so I can’t get to it. That’s really frustrating… Because then I have to spend a lot of extra time working around the thing, or finding a new tool, learning how to use a new tool…
“So this is my call, for all of us developers out there… There’s a lot of great talks about accessibility. Keep yourselves up on the accessibility space and find some great talks and help build tools that you someday might appreciate having access to.”
Someone this week told me they felt a little more at ease knowing that they wouldn’t have to give up writing code if they couldn’t use a keyboard. This is why I wanted to give this talk.
— Girl with a Perl Earring (@yomilly) June 21, 2019
But throughout her talk, there’s one unmistakable message that comes through loud and clear — that technology really can make a difference. “I would be doing a disservice if I didn’t reiterate the impact that this kind of technology has for people who can otherwise not use a keyboard or mouse,” Shea tells her audience.
“I was facing some pretty career-limiting or potentially career-ending injuries, feeling like the door is closing. And then Talon comes along, and it really helped save my career,” she said. “It helped me get quality of life back.”