This article examines the common vocal techniques employed by popular YouTubers to enhance audience engagement. A linguist, Naomi Baron, identified key features including overstressed vowels, sneaky extra vowels between consonants (epenthetic vowels), and long vowels and consonants. These techniques emphasize words and create a 'bouncy' rhythm.
Baron also pointed to aspiration: an exaggerated puff of air on sounds such as 'k,' 'p,' and 't' that makes a word stand out.
These techniques, while not exclusive to YouTube, are used strategically to maintain audience attention in a medium lacking physical presence.
The article suggests that this style may stem in part from a trend toward informality in TV newscasting, and compares it to the delivery on shows like 'The Daily Show'. It's described as a 'talking to the audience' voice, or an 'intellectual used-car-salesman voice', implying its persuasive nature. The style is less intense on YouTube than in other high-energy sales pitches because viewers choose to click on a video. However, its effectiveness has led to its adoption by many YouTubers.
The article concludes that while attempting to sound unique, many YouTubers converge on a similar style of speech, showcasing how trends emerge and spread in online communication.
Hey guys! What's up? It's Julie. And today I want to talk about YouTube voice.
So the other day, I was watching this YouTube video from the PBS Idea Channel about whether Ron Weasley from Harry Potter is really a time-traveling Dumbledore (as you do), and I realized: the guy talking sounds exactly like the Vlogbrothers. The Vlogbrothers are John and Hank Green, and their combined YouTube channel, on which they post videos of themselves musing on and explaining everything from world politics to farts, has more than 2 million subscribers.
And the guy in the PBS Ronbledore video, Mike Rugnetta, was talking just like the Green brothers do. It wasn't a matter of their accents, or the sound of their voices, it was the way they were talking. The only word that came to mind was ... bouncy.
I found more examples in other popular YouTube channels. Tyler Oakley does it. Franchesca Ramsey does it. Hannah Hart of My Drunk Kitchen does it (when she's not drinking, or using weird voices). This Game of Thrones fan-theory guy does it.
But I had a hard time putting my finger on exactly what "it" was, beyond a vague sense of similarity. So I asked a linguist.
Naomi Baron is a professor of linguistics at American University who studies electronically mediated communication. She watched some videos that I sent her, and was very patient with my continued pleas of, "No, but I feel like something is going on here." And so here, thanks to Baron, are the linguistic components of YouTube voice:
Overstressed vowels: A lot of the time, people are lazy about pronouncing certain vowels: they're un-emphasized and neutral, and just sort of hang loosely in the middle of the mouth, making an "euh" sound, regardless of which vowel it actually is. This "euh" is called the schwa. When you make the effort to actually pronounce a vowel that is usually a schwa, that's a way of emphasizing the word. For example: "If I say the word 'exactly,' you don't really know what that first vowel is. 'Euh,'" Baron says. "If I say 'eh-xactly,' you have the sound 'eh,' like in the word 'bet.'"
Sneaky extra vowels between consonants: Listen to the way Rugnetta says "trapping" at 35 seconds here. "Terraping." "I've added a little vowel between the t and the r," Baron says. "It elongates the word, it adds an extra syllable to the word, it emphasizes the word. There's a name for this: epenthetic vowel."
Long vowels: Stretching out vowels is a common way of emphasizing words. Sometimes it's obvious, and clearly done on purpose (listen to the word "five" in this Franchesca Ramsey video). But sometimes in these videos the vowels are just sliiightly longer than normal (see what I did there?), resulting in the kind of emphasis and "bounce" I wasn't able to put my finger on until Baron pointed it out to me. (See: every time Rugnetta says "magic," or when Ryan Higa says "channel.")
For that matter, long consonants as well: Especially those at the beginning of words. Take the word "fascinatingly" from this Vlogbrothers video as an example.
Aspiration: This was the part of our phone call where things got interactive.
"If you put your finger in front of your mouth, I'll teach you a very quick phonology lesson," Baron said. I did. "Are you ready? Say 'keep.'"
"Keep."
"Now say 'geep.'"
"Geep."
"When you said keep, did you feel a breath of air on your finger?" she asked. (Indeed I did.) "That's called an aspiration." There's normally an aspiration on the K, even if you say it normally, but if you huff and puff a little more, that makes the word stand out.
For an extreme example, see how Hank Green says "couldn't care less." A more subtle example is how Australian vlogger Natalie Tran says "fake" and "sick." You hear this sometimes with "p" and "t" sounds, too. Like when Charlie McDonnell says "salt."
* * *
So it turns out the "YouTube voice" is just a variety of ways of emphasizing words, none of which are actually exclusive to YouTube; people employ these devices in speech all the time. But they generally do it to grab the listener's attention, and when you're just talking to a camera without much action, it takes a little more to get, and keep, that attention. All the videos I used as examples in this article come from popular YouTube accounts, with hundreds of thousands or millions of subscribers, in other words, from people who know how to engage an audience.
There are other factors at play here, too. YouTubers' monologues often speed up and slow down, for example. "Changing of pacing, that gets your attention," Baron says. And elongating certain words helps change up the pace. People also tend to move their heads and hands a lot in these videos, raise their eyebrows, and open their mouths wider than necessary.
Baron says she suspects that this style comes at least in part from a trend toward informality in TV newscasting. Influential "infotainment" programs like The Daily Show use some of the same linguistic styles; in this clip of Jon Stewart, I detected aspiration, and elongated consonants and vowels.
Of course, people adopt these tricks to varying degrees. But I think there is something to the newscasting comparison: I notice this "YouTube voice" the most in videos where people are just talking to the camera as themselves, with no acting, no props, no action. And in videos where people monologue for a minute, and then break away into a sketch or a scene (such as this Natalie Tran video), the tics, if they're there, seem far less pronounced than when the person was speaking directly to the camera. It's a "talking to the audience" voice. Another linguist, Mark Liberman of the University of Pennsylvania, who runs the Language Log blog, called it "intellectual used-car-salesman voice."
"You get the same kind of thing in other high-energy sales pitches," he wrote to me in an email. "I guess the purest form of this style is the carnival barker." It's less intense on YouTube, of course, where the audience chooses to click on a video; the speaker isn't trying to grab the attention of people who just happen to be walking by, chewing on their cotton candy.
But it's a style video-makers have likely picked up on, because it works. And it may be a particularly popular way of speaking at the moment.
"Things become stylish. That happens with language all the time," Baron says. "What I think you have is an Internet platform that many people are taking to, and what they're doing is taking models they recognize from elsewhere. [They're] dressing up their language through particular kinds of spoken emphases, gestures, and facial expressions. What's interesting is how similar people end up sounding, rather than sounding like themselves. In an attempt to make yourself sound special, you end up sounding like this whole genre of other people."