The Complete Guide to Captioned Videos

Born profoundly deaf, I’ve depended on captioned videos since getting my first big clunky decoder in 1983. It was a box about the size of the older VHS and DVD players.

Back then, captioned shows and movies were hard to find. Few things excited me more than seeing the caption symbol on the cover of a movie or at the start of a TV show.

Thankfully, they’re now easy to find when it comes to TV networks, streaming services, and movies.

But the same can’t be said for the many, many videos companies and individuals put out on YouTube, LinkedIn, Facebook, and elsewhere.

In recent years, captioning has gotten so much easier and more affordable. This guide will dive into why captioning matters, how it provides a huge ROI, 10 rules for creating great captioned videos, and how to caption your videos.

7 Reasons Why You Want to Caption Your Videos

The World Health Organization reports that more than 5 percent of the world’s population is deaf or hard of hearing. [Source WHO]

And guess what.

The main users of captioned videos are NOT the deaf and hard of hearing! So, captioning your videos reaches far more than those who are deaf like me.

For some businesses, accessibility isn’t enough of a reason to add captions. But you know what? Accessibility can make you a profit as you can learn in this video.

Though it’s the right thing to do and good for business, they still want reasons that help them reach as many people as possible and yield ROI.

Here are some reasons you want to caption your videos. There are more, but these are the ones that tend to change someone’s mind about captioning.

Here are 7 reasons to caption videos

1. Reach far more than the deaf and HoH

Two separate surveys (OfCom and Verizon Media) have found that 80 percent of the people who use captions are not deaf or hard of hearing. The OfCom survey is older than I’d like it to be. So, I conducted two separate polls. Eerily, they had the same results: 86 percent who use captions aren’t deaf or HoH. It’s not surprising the number has climbed.

Many people who aren’t deaf or hard of hearing tell me they don’t watch videos with sound. Part of this is because more people are watching videos in public without their headphones. LinkedIn finds that 80 percent of video views occur with the sound off.

A study by Verizon Media and Publicis Media reveals that 69 percent of people view videos without sound in public places. What about when they’re in private places? Still, one in every four watch videos without sound.

Plus, everyone can experience temporary or situational hearing differences. A person with an ear infection or cold may not hear as well. People talking while wearing masks can also affect the sound.

Why do people watch videos without sound? Here’s what Verizon Media has found:

  • In a quiet space.
  • Didn’t have headphones.
  • Waiting in line.
  • Multitasking.

Video creators can expand their video’s reach by adding #Captioned in their post with the video. This helps caption viewers find captioned videos without scrolling through their feed and feeling disappointed in seeing non-captioned videos.

2. Clarify what the speaker says

Sometimes other noises get in the way or the person may have an accent like I do.

Friends and colleagues tell me they turn on the captions when watching videos because they catch more of what’s said. They say that some programs have actors or speakers who are hard to understand. Or there’s a lot of noise interfering with the voices.

3. Capture more viewers

Just like everyone has a different learning style that works, everyone has a different preference for how they like to watch videos.

  • Some prefer sound only.
  • Some prefer captions only.
  • Some prefer the text only. (The text that accompanies the video.)
  • Some like two or all three options.

You give people options for consuming content, thus giving you greater reach.

4. Helps viewers focus

In Why Gen Z Loves Closed Captions, Lance Ulanoff explains how Gen Z tends to multitask. Captions help them focus. This is especially true for people with ADHD and auditory or language processing disorders. Some people with no disabilities say they use captions for focus, too.

Gen Z also likes captions because they can text while they watch. For many folks, captions help them catch things they miss without going back.

5. Boosts overall brand awareness by almost 19%

Captions increase brand awareness by 18.8 percent according to research from AdColony and Millward Brown. Furthermore, a LinkedIn survey reports that marketers’ No. 2 priority is to build brand awareness, only 1 percent behind the No. 1 — no surprise! — the priority of driving more leads.

Several companies have been hit by not one, but two lawsuits. It hurts their reputation as customers boycott their brand. And adding captions costs very little compared to the cost of a lawsuit.

6. Avoid lawsuits and reputation damage

What do Hulu, Netflix, Harvard, MIT, and PornHub have in common? Yes, that PornHub. They’ve all been sued over the lack of captions. A man sued PornHub because some videos were not captioned. He’s probably one of those who reads Playboy for the articles.

Harvard and MIT lawyers said the videos had captions and pointed to automatic captions. This wasn’t good enough. They ended up settling and now provide high-quality captions.

UsableNet’s accessibility lawsuit report has found more than 20 percent of accessibility lawsuits are against companies that have been previously sued. And the rate of filing for ADA website lawsuits is one per hour. Contrary to what these companies may think, it’s NOT cheaper to deal with lawsuits than to bake accessibility into the organization.

Caption users are a passionate lot. They will speak loudly and boycott brands. It can hurt the company’s reputation. So, the lack of captioning costs more than the price of a lawsuit.

7. Benefit from search engine optimization

This only applies to videos that contain a text file, such as .SRT, with the captions. You can caption one of two ways: open or closed. And in some cases, SEO does not matter. For example, posting a video on LinkedIn, Instagram, or TikTok.

Open captions

Open captions are also known as burned-in or permanent captions. They always show up on the video. You can’t turn them off and on. The captions turn into an image. The viewer can’t control the captions. If the captions aren’t readable, they’ll move on.

The advantage of open captions is the captions always work. You don’t have to upload a second file (the SRT text file). With closed captions, not everyone has captions turned on, so they may not realize a video is captioned.

Closed captions

Closed captions only appear if you have the captions turned on. Unlike with open captions, anyone who doesn’t want captions can turn them off. Many video services allow you to change the font, color, and style of captions.

They also require a text-based file like the .SRT file to work. The text file has time codes to tell the video when to show each line of captions. Search engines can read this text file, which optimizes your video for search engines.
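
To make the time codes concrete, here’s a minimal Python sketch that reads the common SRT layout: a cue number, a timing line such as 00:00:01,000 --> 00:00:03,500, then one or more lines of caption text, with a blank line between cues. The names `Cue` and `parse_srt` and the sample text are made up for illustration. Real caption files can have quirks this sketch doesn’t handle.

```python
import re
from dataclasses import dataclass

# Timing line: start --> end, each written as hours:minutes:seconds,milliseconds.
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    number: int    # cue counter from the file
    start_ms: int  # when the caption appears, in milliseconds
    end_ms: int    # when the caption disappears, in milliseconds
    lines: list    # the caption text, one entry per line


def to_ms(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to a single millisecond count."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(text):
    """Parse SRT text into Cue objects (hypothetical helper, no error recovery)."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        rows = block.splitlines()
        if len(rows) < 3 or not rows[0].strip().isdigit():
            continue  # not a number + timing + text block, so skip it
        match = TIMING.match(rows[1].strip())
        if match is None:
            continue
        parts = match.groups()
        cues.append(Cue(int(rows[0]), to_ms(*parts[:4]), to_ms(*parts[4:]), rows[2:]))
    return cues


SAMPLE = """1
00:00:01,000 --> 00:00:03,500
I overcame the first one
by making a short video.

2
00:00:03,600 --> 00:00:05,000
[Crowd cheers]
"""

for cue in parse_srt(SAMPLE):
    print(cue.number, cue.start_ms, cue.end_ms, cue.lines)
```

Because the captions sit in plain text next to their timings, anything that can read a text file can read them.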

For a deeper dive into this topic, see What You Need to Know About Types of Captions.

Which one is better? Open or closed. Closed captions win because they give the viewer more control. Everyone has their preferences. Closed captions let them customize the captions. Some video platforms have more customization options than others.

I use open captions in a few instances like Instagram. I cannot edit the automatic closed captions on Instagram. So, I use open captions while following best practices.

There are many more reasons, such as helping second language learners. When a company says its target audience has no deaf people or second language learners, here are six reasons that do apply to them.

Statistics About Captions

  • A report from Preply has more statistics. Here are the highlights:
    • 50% of Americans watch content with subtitles most of the time.
    • 55% say it is harder to hear the dialogue in shows and movies than it used to be.
    • 72% watch captions because the audio is muddled.
    • 70% of Gen Z use captions.
  • 61% of 18- to 25-year-olds and 31% of 25- to 49-year-olds watch TV with subtitles. [YouGov]
  • 86% of 156 people polled who use captions are not deaf or HoH. [Source: Meryl poll]
  • 80% of the people who use captions are not deaf or HoH. [Sources: OfCom and Verizon Media]
  • 80% of members use subtitles at least once a month. [Netflix]
  • A survey of U.S. consumers found that 92% view videos with the sound off on mobile and 83% watch with the sound off. [Source: Verizon Media and Publicis Media]
  • 10% of broadcast viewers use subtitles regularly, increasing to 35% for some online content. [BBC]
  • 80% of viewers react negatively to videos autoplaying with sound. So, many social media outlets now autoplay videos on silent. [Facebook]
  • Almost 80% of video views on LinkedIn occur with the sound off. [LinkedIn]
  • Video content designed for silent viewing is 70% more likely to be watched all the way through to the end. [LinkedIn]
  • 69% view videos without sound when they are in public places and 25% in private [Verizon Media and Publicis Media]
  • Video Captions Benefit Everyone [Policy Insights from the Behavioral and Brain Sciences] [PDF]
  • About 90% of all students who use closed captions find them at least moderately helpful for learning. [Educause]
  • Increases overall brand awareness by 19%. [AdTechDaily]
  • 40% using Instastories have the sound off. [AdWeek]
  • 80% more likely to watch the entire video with captions available [Verizon Media]
  • Increases engagement. [Instapage]
  • Avoid potential lawsuits and negative publicity. [The Hollywood Reporter]
  • Gen Z prefers captions: no stats, but here’s why.
  • Captions help reading according to a study [Kids Read Now]
  • Orgs filed a petition with the FCC about ASR captioning [Vitac]
  • Why Your Brain Loves Closed Captions [Salon]
  • “More than 100 empirical studies document that captioning a video improves comprehension of, attention to, and memory for the video.” [HHS Manuscript]

Caption10 Guidelines: Create Great Captioned Videos

With more do-it-yourself software hitting the market to caption videos, we’re seeing a new problem. And that’s the quality of the captions. Yes, we want more videos with captions. However, it sometimes interferes with the user experience.

When you watch captions for decades, you learn what works and what doesn’t work. The key to great captions is invisibility.

Invisibility? Meryl, are you kidding me? If they’re not there, how can you read them?

By invisibility, I mean that I won’t notice anything about them.

If I notice something about the captions, that’s usually not a good sign.

Captions up top? I notice. (Unless it’s when the credits appear.)

Captions hard to read? I notice and quit watching.

Captions riddled with typos and clearly inaccurate? I notice and quit watching.

Captions that disappear. Ditto.

You get the idea.

Even though it’s been more than 30 years since I got hooked on the Carrington and Colby family feuds on “Dynasty,” the components of excellent captions haven’t changed much.

That’s because simple works.

Great captions are boring!

And simple leads to effortless captions. Adding captions is only half of the equation for great captions. The other half is quality. You can do a lot of things right and still make these common caption mistakes.

Why Good Captioned Videos Are Important shows you some of these Caption 10 guidelines and best practices in side-by-side videos. One video follows the rule. One breaks it. See what a difference great captions make.

10 guidelines for accessible captions: Readable, accurate, synchronized, length, position, sound, credits, voice changes, speaker identification, and motion with one or two sentences describing each one.

(Or you can watch all the videos of the Caption 10 guidelines and best practices for high-quality, accessible captions.)

1. Readability

If you only follow one rule, make it this one:

*** Captions should be readable ***

If they’re not readable, nothing else matters.

The accuracy, the position, nothing.

Five things factor into readability: size, color, background, font, and case.

Size


Small captions can be tiring to read even for people who don’t wear glasses.

Large captions can cover up too much of the video.

Goldilocks learned the hard way. You want to aim for just-right, not too big or too small.

Color


The key is contrast.

White text with no background can make or break the readability.

Check out Contrast Ratio, a useful tool for visually showing the readability of colors. [Hat tip: Marguerite Efdé]
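
If you’d rather check contrast in code, here’s a rough Python sketch. It assumes the contrast ratio formula from WCAG 2.x: the relative luminance of the lighter color plus 0.05, divided by the same for the darker color. Whether the tool above computes it exactly this way is an assumption, and the function names are illustrative.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (r, g, b), each 0-255."""
    def linearize(channel):
        c = channel / 255
        # Piecewise sRGB-to-linear conversion as written in WCAG 2.x.
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(value) for value in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1 (identical) up to about 21."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


WHITE, BLACK, YELLOW = (255, 255, 255), (0, 0, 0), (255, 255, 0)

print(round(contrast_ratio(WHITE, BLACK), 2))   # white text on a black background
print(round(contrast_ratio(WHITE, YELLOW), 2))  # white text on a yellow background
```

White on black lands at the top of the scale, which is one way to see why the standard caption look works. WCAG’s AA level asks for at least 4.5:1 for normal-size text, and white on yellow falls far short of that.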

The thin, white font strains the eyes because it tends to blend with the background. Like in this example.


The interesting thing about this example is that if you expand the video to full size, the readability is so much better.

Remember to check your captions on small and large screens. People may be watching on mobile devices. Or they don’t expand the screen to full size.

A little bigger, a little bolder white font can work because there’s enough contrast like this one. But it’s a far better experience to have a background for contrast.

I’ve seen some videos with more than one color in the captions.

The recommendation is to use one color mainly because many people are colorblind.

According to the National Institutes of Health, about 8 percent of men have red-green color blindness.

Using one color ensures a consistent and effortless reading experience.

Remember, captions need to be invisible. People will notice the captions when they have more than one color. You want viewers to grasp what the captions say, not what they look like.

Background


The background is the color behind the text. Standard captions use a black background with white text. It works well.

It’s my personal favorite.

But some think it hides too much of the video.

A possible compromise is to use a transparent background like this one.

After creating the video, I saw a video with transparent captions. The action behind the transparency from the person’s movements distracted me so much that I missed the content.

If you use transparency, use it wisely. Verify nothing behind the captions distracts the viewer.

White font and black background work well because of the strong contrast and muted colors. Some colored backgrounds like red and yellow feel harsh.

In short, a solid background *beats* a transparent one. Those using standard captions with an .SRT file may have slight transparency. And that’s OK.

Font


The Goldilocks rule applies to fonts too. The font needs balance in that it’s neither too thin nor so thick that the letters blend together.

As for the font style, aim for simplicity because fancy fonts can hurt readability.

Case


Case wasn’t originally part of readability. However, there’s a bad trend happening in captions. It’s not progress over perfection.

It’s a step backward.

When I first discovered captions, they were in UPPERCASE. They also had a black background with white text.

The black and white captions continue today. That’s because they work. At one point, the captioners experimented with other colors, but that didn’t last long. For good reason.

Subtitles are meant to be plain and readable. They’re not for branding or getting creative.

Anyway, one thing changed from the early days of captioning. The captioners switched to mixed case or sentence case. That’s because they’re far more readable than uppercase.

Let’s look at an example with my name.



All caps show zero shape variation. MERYL has a flat top edge and a flat bottom edge. That’s it. No change in shape. It’s like a rectangle block.

Meryl shows three variations on the top:

  • Once for the M
  • Once for e, r, and y
  • Once for the l (little L).

Meryl also has three variations on the bottom:

  • Once before the y
  • Once for the y itself
  • Once after the y

Visual differences in mixed case captions boost readability. The lack of visual differences in UPPERCASE adds friction to the reading experience.

People with disabilities and even no disabilities struggle with uppercase captions.

Side-by-side of the same video. Left shows mixed case captions and right shows uppercase captions

Some live shows resort to uppercase captions to increase speed. The purpose is to reduce delays between when the words are spoken and when they appear in the captions. So with live programs, the trade-off is readability vs. delay, which affects synchronization.

Personally, I try to avoid live captions and catch the recording. But that’s not always possible. Sometimes the recorded version isn’t cleaned up, so it’s the same captions as the live version.

It’d be great if the networks would re-air live shows with cleaned-up captions. Or at least, post it on their streaming networks with cleaned-up captions and free access.

Uppercase captions are also a problem in recordings as I see more and more recorded shows using uppercase captions.

Some captioners use uppercase letters for speaker identification like this:

MERYL: Please use mixed case captions.

Here’s a better way that improves readability:

[Meryl] Please use mixed case captions.

Subtitle viewers know that words in brackets aren’t spoken or represent sound. The brackets are a better differentiator than all caps.

Please use mixed case captions, including on Instagram and other social media networks. One of the options there is all caps and it’s animated. This isn’t a good option. Pick one of the latter two options out of the four available.

Video showing all the caption styles available in Instagram and Facebook. Most are not accessible because they’re uppercase, change size, and move. There’s one style that’s most accessible because it uses mixed case and doesn’t move.

Captions will never win beauty contests because pretty and colorful defeat their purpose.

The readability of great captions is effortless. It allows the viewer to quickly read the captions while watching the action on the screen.

That’s why readability is the No. 1 rule of great captions.

2. Accuracy

Why does readability get dibs on No. 1 over accuracy in captions???

Because accuracy won’t matter if it’s not readable.

So, No. 2 of #Caption10 to create great captions is Accuracy.

Great captions correctly capture everything that’s said word-for-word. Even bad words. Would you believe some captions have swapped out bad words with clean words?

Accuracy also means completeness. I’ve seen captions that paraphrase what’s actually said instead of showing every word. The paraphrasing misses important things.

Why should the viewer reading captions get a different script than those listening?

Accuracy means no disappearing captions. One TV channel is notorious for having one or two lines disappear. A character was reading a note she found. To this day, I don’t know what it said. Captions disappeared.

That’s why autocraptions have earned their bad reputation. Autocraptions are rarely better than no captions. Even W3C Web Accessibility Initiative says automatic captions are not sufficient. They need to be edited.

Automatic captions lead to a confusing and tiresome user experience because the viewer must work harder to figure out what’s said. It’s not simply filling in the blanks of a word here or there.

In her awesome weekly Marketing Minute tip email, Marcia Yudkin shared a story about an ad that claimed: “Learn 80% of Chinese in 3 months.” Even if this is true, this is a terrible accuracy rate.

She gave the following example:

“That’s two out of every ten _____ that are _____.”

“It’s the 20 percent missing words there that indicate the real meaning, the critic said, and if that’s as far as you’ve gotten, you’re barely out of the gate,” Yudkin wrote. “You’re not amazingly near the finish line at all.”

Example of automatic captions.

I assure you I did not say birth disrupting the utopian. Sounds like a bad scene from The Handmaid’s Tale.

That’s why it’s better to skip watching a video with automatic captions. Otherwise, a person walks away frustrated from a lousy captioning experience.

You can use automatic caption files as a starting point. They can save time for some folks and reduce editing time.

Damn (sorry, y’all!) good captioning is about capturing everything that’s said correctly. When captions show everything that’s said and happening, we enjoy the experience knowing that we’re getting the same experience as those listening.

Muddled and incomplete captions create a crummy experience.

That’s why Rule No. 2 of good captions is accuracy.

*** Just say no to autocraptions ***

3. Synchronized

Yes, we can tell when the captions are not in sync with what’s being said even with the sound off. That’s why the third element of #Caption10 for creating great captions is synchronized.

It could be the captions do not match what’s happening on the screen.

It could be the speaker’s lips say something completely different than what the captions show.

You want to ensure the timing follows the action as closely as possible. YouTube makes it easy to set the timings with its slider tool.
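
When the captions live in a text file and the whole file runs early or late by the same amount, the timings can also be nudged outside YouTube. Here’s a minimal Python sketch, assuming SRT-style timestamps (HH:MM:SS,mmm). `shift_srt` is a hypothetical helper, not part of YouTube or any caption tool. It only fixes a constant offset, not captions that drift further out of sync over time.

```python
import re

# Matches SRT-style timestamps such as 00:01:02,345.
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt(text, offset_ms):
    """Return the text with every timestamp moved by offset_ms.

    A positive offset makes captions appear later, a negative one earlier.
    Times are clamped at zero so nothing ends up before the video starts.
    """
    def shift(match):
        hours, minutes, seconds, millis = (int(part) for part in match.groups())
        total = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis + offset_ms
        total = max(0, total)
        hours, rest = divmod(total, 3_600_000)
        minutes, rest = divmod(rest, 60_000)
        seconds, millis = divmod(rest, 1000)
        return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"

    # Caveat: caption text that happens to look like a timestamp is shifted too.
    return TIMESTAMP.sub(shift, text)


# Captions showing up three quarters of a second too early? Push them later.
print(shift_srt("00:00:01,000 --> 00:00:03,500", 750))
```

After shifting, watch the result with the sound off to confirm the captions follow the action.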

If you watch the video … wait until you see the example clip. Boy, talk about a headache. If you want to challenge yourself, turn the sound off.


I tell ya, when the captions and the video are out of sync, it’s dizzying whether or not you can hear it. Your eyes see one thing happening, but the captions say something else.

That’s why Rule No. 3 of great captions is synchronized.

You want to ensure the timing follows the action as closely as possible.

Clip source: ABC’s 20/20 “The Dropout”

4. Length

Yes, size matters …

Really, really matters …

… in great captions

Oh … my … goodness! I’ve seen some videos that use an app, which shows the captions …

One …

word …

at …

a …

time …

Here’s a short clip.

Mine is slower than some of the one-word-at-a-time captioned videos I’ve seen. Picture a two-minute video showing captions one word at a time … FAST.

Exhausting! Even for fast readers.

It takes away from the video. You’re so focused on following the captions that you ignore the action on the screen. It also makes it harder to take in the message.

Then there’s the opposite problem. Captions that run 3, 4, 5 lines long and take up the entire length of the screen. It’s easy to lose your place. They disrupt the experience.

Check out the next example.

This one block of captions would work better when split up into four blocks.

You also want to avoid making them so wide that they take up the full length of the screen.

The reason you want them shorter is that it keeps you from losing your place or reading from the far left side to the far right side like you’re watching a long tennis match.

Good captions tend to run about one or two lines long with up to 60 characters. Viewers follow along better. Bonus points if you make the two lines about even in length depending on the text as per the next tip.

Another tip, discovered after filming the series: watch where you cut off the captions, aka breaking points or line division.

Let’s say you have one or two lines of captions. And the last word is of, to, the, and, or something similar that leaves you feeling like you’re facing a cliffhanger. Try to avoid leaving people hanging by ending with a preposition, conjunction, or another short word.

Keep names together. In other words, I wouldn’t break “Meryl” and “Evans.” They stay together.

You can find a detailed guide on line division at Caption Key. Aside from these points, I’m not quite that picky about it as a reader.

Sometimes, the top line may be longer. And sometimes the bottom line is longer. It depends on the cut-off. They don’t have to be identical in length, just balanced.

Here are two examples where the two lines are NOT balanced.

It would be better to have the second line start with “by”

You can make these two lines more even by inserting a break to split the captions like this:

I overcame the first one
by making a short video.

Better … and it can be even better by moving “about” to the second line

The second one is better but could use a little more tweaking by moving “about” to the second line. First, because it’s a preposition. Second, because it’ll make the two lines more even.
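
The balancing and breaking tips above can be sketched in Python. This is one possible approach, not a standard algorithm: `split_caption` and the `WEAK_ENDINGS` word list are illustrative assumptions. It tries every break between words, avoids leaving a short connecting word at the end of the first line when it can, and then picks the most even split. It doesn’t know about names, so keeping “Meryl” and “Evans” together still needs a human eye.

```python
# Short words that make a line feel like a cliffhanger when they end it.
# Illustrative list only; extend it to suit your own captions.
WEAK_ENDINGS = {
    "a", "an", "the", "of", "to", "and", "or", "but",
    "by", "in", "on", "for", "with", "about",
}


def split_caption(text, max_chars=32):
    """Split a caption into at most two lines that are as even as possible.

    Text that already fits within max_chars is returned as a single line.
    The sketch never makes a third line, so very long text can still
    produce lines longer than max_chars.
    """
    words = text.split()
    text = " ".join(words)
    if len(text) <= max_chars or len(words) < 2:
        return [text]

    best = None
    for i in range(1, len(words)):
        first = " ".join(words[:i])
        second = " ".join(words[i:])
        ends_weak = words[i - 1].lower().strip(".,!?") in WEAK_ENDINGS
        # Tuples compare left to right: avoid weak endings first,
        # then prefer the smallest difference in line length.
        score = (ends_weak, abs(len(first) - len(second)))
        if best is None or score < best[0]:
            best = (score, [first, second])
    return best[1]


for line in split_caption("I overcame the first one by making a short video."):
    print(line)
```

The second line starts with “by” here because that break gives the most even lines without stranding a connecting word at the end of the first.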

Also, check your captions on a mobile device. (I need to work on this!) They sometimes turn a two-line caption into three. Here’s an example of how two-line captions went to three on mobile. It was not like this on a monitor.

Two lines turn into three on a mobile device

I now make both lines shorter to avoid the mobile problem of it turning into three lines. Then, I check it on mobile to ensure it looks right.

How do you make sure your captions aren’t too long or have too many characters?

I just finished re-captioning one of my first videos because it had too many characters. I’ve learned a lot about captioning since I first started creating videos.

How long should captions be? One line to two lines at 32 characters each.

But who has time to count all the characters in all the captions?! Copying, pasting, and running a word count takes too long!

Great news! Here are two shortcuts to figuring out if your captions are short enough.

Get a 3″ x 5″ index card and mark one inch from both sides. Or cut a strip of paper that’s three inches (7.62 cm) long. (You may have to test and adjust based on your caption app.)

Put the strip of paper on the screen where the captions are to see if they stay inside. If they hang over, then the caption has too many characters or the line is too long.

Tested this and it worked great. The longest line was 34 characters. Good deal!

Deborah Edwards-Onoro uses another method. She set up a text editor to hold 60 characters and uses that as her guide. Great tip, Deborah! Often, I’ll open a caption file in a text editor to check the length and breaking points. It can be faster than editing it in a caption editor.
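
A small script can do the counting too. Here’s a hedged Python sketch: `find_long_captions` is a hypothetical helper that takes cues as (number, lines) pairs and flags anything over the limits. The defaults of 32 characters per line and two lines come from the guideline above. Pass a different `max_chars` if your caption app or platform behaves differently.

```python
def find_long_captions(cues, max_chars=32, max_lines=2):
    """Return (cue_number, problem) pairs for cues that break the limits.

    cues is a list of (number, lines) pairs, where lines is the list of
    text lines shown together on screen for that cue.
    """
    problems = []
    for number, lines in cues:
        if len(lines) > max_lines:
            problems.append((number, f"{len(lines)} lines"))
        for line in lines:
            if len(line) > max_chars:
                problems.append((number, f"{len(line)} characters: {line}"))
    return problems


# Made-up cues for illustration.
cues = [
    (1, ["I overcame the first one", "by making a short video."]),
    (2, ["This single caption line runs on for far too long"]),
    (3, ["One", "Two", "Three"]),
]

for number, problem in find_long_captions(cues):
    print(f"Cue {number}: {problem}")
```

Character counts are only a stand-in for width on screen, so it’s still worth checking the video itself, especially on mobile.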

So yes, size matters in good captions. That’s why length is Rule No. 4 in the #Caption10.

5. Position

Do you like the bottom or top position?

In captions, of course!

Like size … position matters … in great captions!

Where the captions appear on the video matters. And it needs to be consistent.

I’ve seen videos with captions shifting for no reason. This hurts the user experience.

When you use a text file with captions, the text appears on the bottom. It’s standard because it works.

It places the captions below people’s faces. So, the captions are closer to the mouth. And the eyes naturally look upward to view the entire screen.

But when captions show up top and the eyes look upward, they’ll see nothing.

You won’t find data on this. It’s not scientific. When you’ve watched captions for decades, you’ve seen it all.

Captions at the top disrupt the experience. I notice them when they’re up top and feel like I’m missing out on the video action.

Here’s an example.

Yes, there are valid exceptions as you’ll see in No. 7: Credits.

You want to stay consistent and keep the captions in one place for the entire video. And the best place tends to be on the bottom.

In an informal survey, everyone has picked the bottom. No one has said top. Some people didn’t have an opinion until seeing the video.

Recently, I caught a video with captions in the middle! The middle! It totally wreaked havoc on the viewing experience!

The bottom works best not because it’s always been done that way, but because our eyes get the best experience reading on the bottom while catching the action.

6. Sound

People viewing your video may have the sound off.

Or they can’t hear it.

Sound plays a critical role in videos.

  • It foreshadows.
  • It reveals what’s happening.
  • It explains why someone reacts.
  • It makes us dance.
  • It lets viewers know the captions didn’t break.
  • And so much more.

Take a look at this video showing what a difference captioning sound makes.

A simple way to indicate a sound is to put it in [brackets].

Silence is also important to note. The opening of one TV show’s episode went on and on with nary a caption. Did the captions muck up? Is it music? What?! After putting on my bionic ear, I watched it again. It was music. The captions should’ve noted that and any [silence].

This clip shows laughter, music, a phone ringing, and even moves the captions up top during the credits.

It also shows two speakers in one scene.

Take a look.

See how much the captions communicated just by highlighting all the sounds?

Would you believe there are times when a song plays with lyrics, but the captions show nothing or just musical notes? <headdesk>

That’s a surefire way to have unhappy viewers.

But of course, you want to make them happy.

Just caption those sounds and it’ll be music to their ears!

7. Credits / Chyron

Although I covered this a little bit in the fifth rule, which is position, credits deserve their own rule!

Position covers where the captions show up. Some videos have the captions shifting up and down, left and right, for no reason. There are no chyrons.

Even when reading the captions, the viewer wants to see credits and any other non-captioned text too.

Way back in 1980-something, I recorded a musical special.

When I watched it, my heart sank.

Fort Worth had bad weather that night.

Naturally, the local station scrolled the weather report on the bottom of the screen.

The captions disappeared.

Now when I watch it all these years later, I still get this weather report and no captions!

The trick is to ensure the captions and the text don’t overlap each other.

A lot of captioning apps put the captions on the bottom.

In this case, you’ll want to put credits or text up top or somewhere that’s logical without hiding behind the captions.

Generally, the credits end up at the bottom and the captions move up top … temporarily.

Here’s how many TV shows do it. It works great.

People often ask if the spoken text appears on the screen as a chyron, do you caption it? No. This is common when a show announces guest stars and displays their names on the screen. No captions are needed here.

The important thing is to let viewers see both ✌️

Clip source: ABC’s Modern Family

8. Voice

[Clears throat]

I shall attempt to speak with a British accent. [Does not sound British. #AccentFail]

That’s why voice plays an important role in captions. Voice refers to voice changes. Not who is speaking, but how a person speaks.

Voice changes can reveal when someone …

  • Imitates a person, character, or something else
  • Talks hoarsely
  • Changes their tone of voice
  • Uses a different accent
  • Screams. A big deal in horror films

The best way to explain it is by sharing these two clips in which one character claims she can do five ranges. I cannot tell much of a difference between the ranges. Thanks to captioning, I understand what happens. You might want to turn your sound off for the full effect.

See what a difference captions make to someone watching it without sound?

If it weren’t for captions, I would’ve never known what this performer did in this clip.

A voice changes for a reason. And viewers need to know when this happens. That’s why the eighth rule of the #Caption10 series on how to create great captions is voice. So caption those voice changes to keep everyone in the loop.

What are some other ways you might use captions with voice changes?

Clip sources: TV Land’s Hot in Cleveland and NBC’s America’s Got Talent (May 26, 2020)

9. Speaker

Who said that?!

There are times when it’s not clear who’s talking on the video.

Sometimes the speaker isn’t on-screen or two people speak and the captions show dialog from both in one shot.

Here are two different ways the captions show this.

The first clip uses a person’s name.

Take a look.

The second one puts the captions underneath the speaker.

Letting viewers know who speaks the line is important.

You have different ways to show this with captions.

  • Use a person’s name like this:
    • [Caesar: To be or not to be]
    • [Brutus sighs]
  • Put the captions under the speaker.
  • Indicate two different speakers with a dash as in:
    • - Are you hungry?
    • - Yes.

Speaker identification tells us …

  • Who apologizes.
  • Who is singing.
  • Who cracked a joke.
  • Who is acting like a villain.

Clip source: ABC’s Modern Family

10. Motion

Live show captions …

Well …

Suck … (I’m sorry for such a strong word, but it’s the truth.)

Every year, I look forward to one live TV show. The Macy’s Thanksgiving Parade. What I love about it is watching the performances from the musicals playing on Broadway.

While watching, I do this almost every year.



Because the captions almost always mess up.

Part of the problem is that it’s live. Live shows come with their own captioning challenges. The other part is that it uses rollup captions, a caption style commonly used in live programs. I call it scrolling because people understand what it means better than rollup.

This is one of two styles of how captions move.

The other is more common and that’s the pop-in. [Crowd cheers]

This video uses pop-in. The entire line or two of captions pops in and stays until I finish saying what you see. Then, it disappears.

In talking to other caption users, the consensus is we prefer pop-in.

But we know sometimes … scroll happens.

A lot of us quit watching live TV because of the flaws and the frustration with rollup captions. They tend to create a bad user experience.

It’s very hard to get it right.

I looked high and low for a good example of rollup captions. I couldn’t find one when I originally published this. Most of them were in all uppercase.

Then, I came across one from Microsoft’s Ability Summit. The live event captions not only used mixed case but also pop-in! A miracle! Granted, the captions aren’t perfect because some of the lines are long. But it’s a far better experience than what scrolling captions provide.

The video has four clips. The first is a side-by-side video with two of the same clips of me talking. One side shows pop-in captions. The other side shows moving captions. It was so hard to make because not many automatic captioning tools like my accent!

The second clip is a scene from ABC’s “The Good Doctor.” One side shows pop-in captions with mixed case. I viewed this on Hulu. The other side aired on TV and shows scrolling captions in uppercase AND ridiculously out of sync.

The third clip is from “Something’s Coming: West Side Story — A Special Edition of 20/20” from ABC. This is a recorded program yet it’s using live captions. One side shows pop-in captions with mixed case that I viewed on a streaming network. The other side aired on TV and shows scrolling captions in uppercase, and with a notable delay from the audio.

The fourth clip is from Microsoft Ability Summit. Its captions shocked me. First, they use mixed case! [Cheers] Second — and the miracle — they used pop-in captions! Granted, the captions aren’t perfect because some of the lines are long. Still, this is a huge improvement in the user experience.

For those of us not in the entertainment industry, a lot of videos contain motion because their creators choose the most inaccessible caption styles. Instagram has two styles of motion. Facebook has three. The fifth clip shows all three.

The only time rollup captions are acceptable is in live programming. There’s just not always a way around it. But avoiding rollup is possible, as the last clip in the previous video shows.

If you use rollup captions, ensure they follow these guidelines:

  1. Capture audio accurately. (The biggest problem in live captions.)
  2. Use mixed case.
  3. Follow readability recommendations.
  4. Move from left to right and top to bottom. (Get this! There have been times when the captions moved in the opposite direction. It was dizzying!)
  5. Keep up with the sound. (Little delay between the audio and the caption.)
  6. Scroll smoothly.
  7. Do not fade in or fade out.
  8. Do not block anything like credits and chyrons.

Why no fading? Because the animation from fading adds friction to the caption experience.

Many times in this series, I mention that good captions are simple and effortless.

As for examples of bad rollup, they weren’t hard to find.

As you watch these clips, think about the problems you see in the captions.


Here we go.

Can you handle another one?!?

Are you sure?!

Pretty painful, eh? Bet you spotted a lot more problems than just the rollup.

To do rollup effectively requires satisfying a lot of requirements.

Make it easier on yourself with good old-fashioned pop-in captions.

As hard as the captions try, I won’t let them stop me from watching that Thanksgiving parade!

Whichever you choose (hint: pop-in!), thank you for captioning your videos. That’s what matters most.

Apps and social media captioning tools that add open captions don’t provide accessible captioning styles. Here’s an example.

When you post a captioned video online, be sure to add #Captioned so we can find it!

I hope you’ve found the #Caption10 series helpful.

Caption Video FAQ

The frequently asked questions about captioning videos are in a separate article.

List of Captioning Software

Here’s what’s available. If anything is missing, contact me so I can add it to the list.


Audio or Video to Text





Special Caption Topics

Use #Captioned When Posting Videos

Expand your video’s reach by adding #Captioned when you post your captioned videos. This hashtag is unique in that it doesn’t tell you the topic of the video. Rather, it tells you the video has captions. This allows people to find videos that are captioned.

Yes, YouTube’s search tool has “Captioned” as an option. However, if a video has open captions, it likely will not show up as a captioned video. And most social networks do not have a search tool to find captioned videos. Thus, #Captioned is the way to search for captioned videos.

Why not #Captions? Because it refers to a lot of things besides captioned videos.

What’s the Difference Between Subtitles and Captions?

Although some people say “subtitles” when they mean “captions,” they are and they are not the same thing. Confusing, right?

English speakers outside of the U.S. typically refer to captions as subtitles, which is one reason why so many use captions and subtitles interchangeably.

The names don’t matter. However, there is an important difference: subtitles translate foreign-language dialog into the viewer’s language. An example of subtitles is a film with the dialog in Spanish that contains English subtitles.

Subtitles do not show sounds or voice changes. They don’t identify speakers when it’s not obvious. And they don’t show the words when the speaker speaks the same language as the viewer.

Subtitles only show the translation of what the speaker says. Nothing more.

Here’s a unique situation that’s a reverse of what usually happens with subtitles.

So, viewers still need captions in subtitled programs to cover the gaps.

Fortunately, many TV programs and movies are good about it.

Still, captions and subtitles run into problems:

  • Captions overlap the subtitles.
  • Subtitles are not always readable.
  • Subtitles cannot always work alone.

Three things that make great subtitles and captions:

  1. Are easy to read
  2. Play well with each other
  3. Leave no gaps (Captions should cover anything subtitles don’t)

When you apply these three things, your viewers will be happy!

Open Vs. Closed Captions

There’s no right or wrong answer here as long as the captions are readable, accurate, synchronized, and meet all the other factors in the #Caption10 Rules.

You want to know about both to help you decide which is better for your video.

They have a little war going on like the Cola Wars between Coke and Pepsi. Some folks are passionate about one over the other.

So, the two types of captions are open and closed.

Open is burned-in. It’s permanent.

Closed only appears if you have captions turned on.

Before we dig in, a couple of points. One is that the apps that add captions to a video vary in their features.

Some give you more options than others.

Some give you more control than others.

So, this will cover general differences.

Open Captions

Let’s start with open captions.

One plus is that these always show up on the video.

You won’t have to worry about whether the captions work. Sometimes, a social network or website has a glitch that won’t let you upload the caption file. This isn’t a problem with open captions.

Depending on the app you use, it may give you more control over the placement and look of captions.

Another plus is that you don’t have to take the extra step of uploading the text file along with the video. I once had a caption file disappear weeks after I originally uploaded it. (It worked for at least two weeks.)

Viewers can see the captions no matter what. But for some people, this is a drawback. Not everyone wants captions. You can’t turn off open captions.

Open captions are also not optimized for search engines. Search engines can’t read burned-in captions like they can read a text file.

The other downside is that open captions may not be readable. Remember, some people may view your captions on a mobile device.

The No. 1 rule for great captions is readability. So, check your captions on various screen sizes as well as when the video is small and full-screen.

Closed Captions

Text-based captions known as closed captions don’t run into this problem.

The default is a white font with a black background. They get bigger and smaller when you resize the screen.

Closed captions require a second file. It’s a text file with a time code. The time code tells the captions when to appear on the video.
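To give a sense of what that second file looks like, here is a minimal sketch in Python that writes captions in the common SubRip (SRT) layout: a cue number, a time code line in the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, and the caption text. The function names and the sample cues are made up for illustration, and your captioning tool may export a different format such as VTT.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT time code: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(cues):
    """Build SRT text from a list of (start_seconds, end_seconds, text) cues."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        # Each cue: its number, the time code line, then the caption text.
        time_code = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{number}\n{time_code}\n{text}\n")
    # A blank line separates one cue from the next.
    return "\n".join(blocks)


# Hypothetical sample cues (times in seconds).
cues = [
    (0.0, 2.5, "[Clears throat]"),
    (2.5, 6.0, "I shall attempt to speak\nwith a British accent."),
]

print(build_srt(cues))
```

The time code line is the part that tells the player when each caption pops in and when it disappears, so the text stays in sync with the audio.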

Closed captions also have one huge advantage.

Search. Engine. Optimization.

Because the captions appear in a text file, the search engine can read the captions. That’s music to marketing’s ears.

Of course, SEO is not always important. For me, captions showing up is No. 1. LinkedIn has lost my caption SRT file at least twice. (And it worked for at least one or two days.) This is unacceptable. So, I started posting videos with open captions.

The other upside of closed captions is that some networks or services allow you to pick the font, style, and color. Not all of them do.

LinkedIn doesn’t. But Netflix, Hulu, Facebook, and all the major video networks do.

Closed captions have a standard look and tend to show up at the bottom of the screen. Since you know where the captions show, you can make sure any credits and text don’t overlap them.

Closed captions are adaptable like responsive websites. These websites automatically resize for the screen or device they’re viewed on. Closed captions resize and move based on the screen or device. If you’re working with the player controls, the captions will shift up. If you flip your phone sideways, the captions will adapt and get larger.

The disadvantage of closed captions is that sometimes people may not realize the video is captioned because they don’t have captions turned on. To view captions on mobile requires going into the mobile device’s Settings > Accessibility and turning on the captions.

Not everyone knows to do this. Hopefully, the mobile operating systems will receive an update so that people can see the CC button even if their Accessibility settings aren’t set for closed captions.

Here’s a clip that uses both open and closed captions. Can you figure out which is which? Take a look.

Did you figure it out?

So, back to the Cola Wars. Which do you prefer? Open or closed captions and why? Please let me know in the comments below.

What You Need to Know About Captioning Live Events

Read this before you hire a company or freelancer to caption your live event.

Some people reported that a captioning service went downhill after another company acquired it.

I didn’t think anything of it other than not to hire them.

Then, a colleague shared her experience at a conference. The captions didn’t work. The company blamed the equipment. She stopped the conference, which forced them to fix it.

Ai Media bought the captioning service that went downhill. I have not heard one positive story about this company. Ai uses different methods for captioning. One of them is respeaking, which greatly lengthens the captioning process. Here’s a quick demo of how respeaking live captions works. The company labels it as “live” or “human-generated” captions.

Here’s the full process as Samantha Evans explains:

  1. Human listening to the conversation
  2. Human respeaking the words
  3. Machine-generated captions being generated
  4. Human editing those machine-generated captions
  5. Delivering the captions back to the “live” audience

Can you imagine the delay in this process?!

I’d pick autocraptions over this! This will have errors and the delay will be too long.

Ai calls these “human-generated captions” and “live captions.” This confuses people.

Also, be aware of C-Print and Typewell for captioning. Their purpose is better served elsewhere. Mirabai Knight explains it better than I can.

“C-Print and Typewell are non-verbatim text expansion systems typed on a regular QWERTY keyboard, and their speed is much lower than steno. Steno captioners can type up to 260 words per minute or more. Text expansion systems don’t often get faster than 160 or so, so they tend to paraphrase and summarize quite a lot in order to keep up.”

As you search for a captioning service, ask questions. Find out exactly how the captioning company does its captions. Transcripts are NOT a substitute for captions.

Quality captions matter. Here’s a list of recommended live captioning providers in a separate article.

More special video captioning topics are in a separate article.


Here are resources on captioning that you might find useful.

Originally published on April 9, 2019

Updated on August 6, 2023

24 thoughts on “The Complete Guide to Captioned Videos”

  1. Nice!

    Don’t know if it’s universal, but when I caption,
    I pay attention keeping phrasing, and pause,
    and how they affect where I put line breaks.

    Kinda related to your length discussion.

    Don’t know how if it helps others, it helps me.

    • Thanks, Bill! I’m thinking about revising the length to mention this as I discovered this in the middle of the series. If there are two lines, I try not to end the first line on a preposition, conjunction, or other short word that feels like it is hanging. Also, avoid ending the captions (whether one or two lines) with said items.

      Another item I’ve learned is to try to keep two lines close in length. One may be a little longer than the other. Depends on logical breaks. The challenge with these is there are no hard and fast rules. But worth mentioning.

  2. But I also make sure I take my medicine first, and review my captions before submitting.

    I pay attention *to keeping phrases together, and add line breaks for pauses*…

    • Ha! No worries, Bill. That’s another thing I learned from the series … to watch my videos with captions on mobile. I’ve seen it turn 1-2 liners into 3-4 liners! And the split (where it cut-off the lines) was terrible!

  3. Meryl, this is wonderful. I’ve shared with my co-workers because it will help them understand the benefits of captions and the importance of quality. On the question of length and where to break captions, refer to The Caption Key from the Described and Captioned Media Program. Everything you ever wanted to know about wrapping prepositions, when to spell out numbers, etc. They show good examples of each case.

  4. Hello, thank you for taking the time to explain this topic.
    But from an accessibility point of view, I was wondering why it is recommended to have subtitles via an SRT or VTT file instead of having them directly in the video (e.g. like Adobe Premiere could do). I don’t see the difference between these two ways of adding subtitles to a video. For translating, OK, I understand why, but what does it do for accessibility?

    • Farouk, great question. I don’t recommend one method over the other as many variables come to play. The advantage of using closed-captions via SRT is that it gives the user control over the captions: turning them off and on, resizing them, etc. But they’ve caused problems for me on LinkedIn (disappearing captions), so I’ve been using the open captions method.

  5. Hi Meryl, a very thorough and in-depth walk-through of many issues, some I have been troubled with but not the greatest one I am experiencing.
    I am subscribed to many of the Asian channels, for my wife and I have certainly grown fond of some of the genres of TV shows they provide. I am an avid Chinese history buff and whenever a new series is out with historical connotation, I have to rely on my 75% language skill (verbal). However, they also use a mix of dialects and old words, which make it a little harder.
    There is text in the released video, whether Youtube or other release and they encourage language conversion, heck I even signed up direct to one last week to see if I could get direct access to the vitals of the sub-titles, (no luck).
    I have tried all manners of ways to access the sub titles, so I can run them in outside translators, then with my knowledge of the language, I feel confident that I can do as well as the current translations or possibly better. I use my wife as a sounding board as well, while I am still studying.
    The base problem is getting this encoded/embedded chinese sub-title out in a format that I can read into my translator one line at a time and then translate first then interpret second.
    Can you guide me in this task?

  6. Awesome information Meryl!

    Question – I’m currently trying to optimize videos for a company and they only use onscreen text. No video content. Can I still add closed captioning of the text? I feel like search engines can’t fully understand the videos without it.

    • Howdy, Ann. That’s an intriguing question! Generally, if the text appears on screen, it should not be captioned unless it’s hard to read. Jeopardy! never captions the answers shown on the screen — only the questions.

  7. Great to see a guide to captions that actually gives recommendations AND is written by someone who would arguably need them the most – I will definitely be saving this and keeping it in mind.

    On a bit of a tangent, do you have any particular advice, tips, and/or guidelines regarding how to format audio transcripts? Unfortunately (and annoyingly) there are virtually no good ones on the Internet, as far as I can tell.

    • One of the biggest problems with audio transcripts is the lack of paragraphs. They often contain long blocks of text. Like online content, they should be short paragraphs (3 to 5 sentences). They should clearly identify the speaker. Those are the two biggest tips.

      • Thanks! After thinking about it, I have some more questions:

        Q1: I presume that the guidelines given in this article for sounds and voices apply (more or less) to audio transcripts as well?

        Q2: Say that I’m making an audio transcript of a Minecraft SMP (Survival Multiplayer) Let’s Play video, with no facecam. In these types of videos, the dialogue is more or less spontaneous. Is it still necessary to capture as much detail as possible? In other words, would I still have to try to capture all the um’s, ah’s, self-corrections etc.; or would I only need to capture the um’s and ah’s that contribute to the story/enjoyment of the content, such as a person hesitating because they’re scared or want to hide something from someone else? (I’m going for the latter.)

        Q3: In a YouTube video by Vox titled “Putin’s war on Ukraine, explained”, there are several points in the video where a “new” person (i.e. the point is the person’s first appearance in the video) is shown speaking and onscreen text is given to show who the person is. In this case, would it still be necessary to identify the speaker in the captions? (I’m presuming the answer is “no”.)

        Sorry for asking so many questions! Also, to answer a question that you asked somewhere else on this website, the Contents dropdown definitely helps a lot, given the length of some of these articles – however, it is a bit tedious scrolling all the way back to the top to get to the Contents. Is it possible to add a “Top” button somewhere? (I don’t use GeneratePress so I’m not sure if it’s possible or not.)

        • Q1. More or less, yes.

          Q2. ums and uhs can be skipped unless it’s part of a character — it demonstrates nervousness or reveals something about the person.

          Q3. No need to identify the speaker unless you don’t see the speaker.

          Bonus: GREAT catch on “Back to top.” I’ve enabled it! Thank you for thinking about this. Obviously, I don’t read my own stuff often to realize the tediousness of getting back up top! At least, my footer doesn’t scroll endlessly! HA!

