Murphy was a Captioner
Oregon's Deaf and Hard of Hearing Services (ODHHS)-Technical Assistance Center
ODHHS Information and Technical Assistance Series

(Source: Kevin Daniel, CRR-RDR, Bay Area Captioning, 7/96)
Murphy's Law: If anything can go wrong, it will.
As the co-owner of a closed captioning business, I'm constantly reading postings on deaf usegroups, newsgroups and listservs on the Internet, as well as articles in NAD Broadcaster and Silent News. I also receive occasional feedback from the viewers of our captions. From all that I've read and heard, it is apparent that most viewers of captioning do not understand the process or rigors of realtime captioning. In order to answer some of the questions I've seen, over the past few months I have kept a diary of some of the most common mistakes, along with an explanation for their occurrences. I have distilled the information in that diary for this article. Hopefully, readers will also come away with some appreciation for the skills necessary to perform realtime captioning.
"When it comes to humility, I'm the greatest."
- Bullwinkle Moose -
Before continuing, I feel it's necessary to list some of my qualifications to speak about the difficulties of being a captioner. I was a verbatim court reporter for 22 years before starting a captioning business in 1991. I hold several certificates associated with court reporting, including the Certified Realtime Reporter certificate (the highest captioning certification in the nation) and the Registered Diplomate Reporter certificate (arguably the most prestigious reporting certificate in the nation). Fewer than 100 people nationwide hold both certificates. I also hold a speed certificate issued by the National Shorthand Reporters Association for two-voice dictation at 260 words per minute, and a speed certificate issued by the California Court Reporters Association for two-voice dictation at 270 words per minute. These are among the highest speed certifications available in the nation. For the last three-plus years, I have been engaged full-time as a captioner for Bay Area Captioning, Inc., providing captioning several hours a day for San Francisco's NBC affiliate KRON-TV, as well as captioning for several professional sports teams. I am also a past member of the Closed Captioning Subcommittee of the National Court Reporters Association.
Next, it's important to understand that my business provides only realtime captioning and does not provide post-production services, so this article will only address issues having to do with realtime captioning. The difference is that post-production captioning works with broadcasts already on videotape, while realtime captioning covers live broadcasts, with captioners writing what is being said as it is being said.
Here's a brief description of how realtime captions are created. In a nutshell, converting speech to text virtually instantly requires the skills of a highly trained stenotype writer, sometimes referred to as a stenocaptioner, or simply "captioner." It is usually someone who had a career as a verbatim court reporter in the legal field before becoming a captioner. The same skills used to take verbatim testimony are even more finely tuned for accuracy and vocabulary. For the most part, the captioner views the broadcast on television, just as all the other viewers do. The captioner writes the broadcast on a specially equipped stenotype machine, which is connected to a computer that translates the captioner's writing, then sends the English text to the station. It is then encoded on the television signal. When you consider the process, from spoken word to encoded text appearing on your television screen at home, a 1.5- to 2-second delay doesn't seem very long.
Realtime captioning can be further broken down into several categories, depending on type and amount of realtime in a particular broadcast. One category is the 100% realtime broadcast. That would include sporting events, news conferences, talk shows and emergency broadcasts. Some talk shows and national news programs are originally written all or partly in realtime, then later scripted for replay in a different time zone. Most local news is a hybrid or blended mix of live and scripted programming. Captioners download teleprompter scripts from the stations and then control the feeding of the scripted captions, only providing realtime captions when scripts are not available or reporters deviate from the script. For example, realtime would be routinely required for late-breaking stories, weather and sports.
Often you will see stories where the captions read "NO CAPTIONING IS AVAILABLE FOR THIS STORY." That is an indication that the station you are watching only provides "passive" captioning, where the teleprompter scripts read by the anchors are provided, rather than true realtime captioning involving a captioner.
So far, everything I've written has just been background for what follows, namely, the reasons for some of the different errors seen in realtime captioning. I've identified at least nine different categories of errors.
"There is no mistake; there has been no mistake; and there shall be no mistake."
- Duke of Wellington -
MISTRANSLATES - Unlike a typist who types a character at a time to create words and sentences, the captioner presses several keys at a time to create either syllables, words, or in some cases entire phrases. The entire process of machine stenography is based on phonetics and the way words sound. But words that sound alike must be differentiated when written by the captioner in order for the computer to correctly translate the steno strokes into English. For instance, the words "nun" and "none" sound exactly alike, but have entirely different meanings. They must be written differently in realtime by the captioner. Sound easy? Consider that my captioning dictionary currently consists of over 100,000 separate entries. I must not only remember how to differentiate the words in my dictionary, but I must also be aware of which words are not in the dictionary and must be constructed or fingerspelled, letter by letter.
The keys of the stenotype machine are in an order that is confusing to the layperson. The keys are arranged in this order:
S T K P W H R   A O * E U   F R P B L G T S D Z
The keys to the left of the "A" are controlled by the left four fingers, the keys to the right of the "U" are controlled by the right four fingers, and the vowels are controlled by the thumbs. To write the word "POP" a captioner presses the "P" on the left, the "O" in the middle, and the "P" on the right, all at the same time. Since not all the letters appear on the stenotype keyboard, arbitrary letter combinations are assigned for sounds. For the beginning "L", the letters HR are pressed. In order to write "LOP", a captioner presses the combination of keys HR O P. If the captioner doesn't come down hard enough on the "R" key, the resulting stroke is H O P, another word, but the wrong one. If instead the captioner doesn't come down hard enough on the "H" key, the result is a nonword, R O P. It's possible that a nonword like R O P could be a "brief" for an entire phrase, such as "RATE OF SPEED." So you can see how a subtle change in a stroke can result in vastly different words when translated into English text.
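The LOP/HOP/ROP example can be sketched in a few lines of Python. This is only an illustration, not how any real captioning software is written: the three-entry dictionary, the function names, and the idea of treating ROP as a defined brief (which the text offers only as a possibility) are all assumptions made for the example.

```python
# Hypothetical mini-dictionary mapping one stroke (chord) to English text.
# Only the HR-for-L convention and these three strokes come from the article;
# the data structure and helper names are illustrative assumptions.
STENO_DICTIONARY = {
    "HROP": "LOP",
    "HOP": "HOP",
    "ROP": "RATE OF SPEED",  # assumed "brief" standing for a whole phrase
}


def translate_stroke(stroke: str) -> str:
    """Return the dictionary entry for a stroke, or the raw stroke if undefined."""
    return STENO_DICTIONARY.get(stroke, stroke)


def drop_key(stroke: str, key: str) -> str:
    """Simulate one key that was not pressed hard enough to register."""
    return stroke.replace(key, "", 1)


intended = "HROP"
print(translate_stroke(intended))                 # LOP
print(translate_stroke(drop_key(intended, "R")))  # HOP
print(translate_stroke(drop_key(intended, "H")))  # RATE OF SPEED
```

One missing key changes the lookup key, so the output jumps to a different entry rather than degrading gradually.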
Captioners are sometimes pressing as many as 20 keys simultaneously. Mistranslates or misfingerings can be compared to hitting the wrong key on a typewriter during a speed test. It's been gauged that the dexterity required to write 200 words per minute on a stenotype machine is equal to typing 70 words per minute on a typewriter. Imagine typing 70 words a minute for anywhere from 30 minutes (a news broadcast) to four hours at a time (a long baseball game), and you get some sense of the difficulty. Factor in speeds exceeding 260 words per minute and overlapping speakers, and the difficulty increases. At only 200 words per minute for a 22-minute newscast (30 minutes, minus commercials), that comes to 4400 words written. Writing with only a 1% error rate would still allow 44 errors during a half-hour broadcast, or two errors per minute. Keep in mind, this is new, unrehearsed material, with speeds outside the control of the captioner. Most captioning agencies consider a 1.5% error rate "acceptable" for beginning captioners.
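The arithmetic in that paragraph can be checked with a short calculation. The function name and its parameters are made up for this sketch; the inputs (200 words per minute, 22 minutes, 1% and 1.5% error rates) are the figures quoted above.

```python
def expected_errors(words_per_minute: int, minutes: int, error_percent: float):
    """Return (total words, expected errors, errors per minute) for a broadcast."""
    total_words = words_per_minute * minutes
    errors = total_words * error_percent / 100
    return total_words, errors, errors / minutes


# A 22-minute newscast at 200 words per minute with a 1% error rate.
total, errors, per_minute = expected_errors(200, 22, 1)
print(total, errors, per_minute)  # 4400 44.0 2.0

# The same newscast at the 1.5% rate tolerated for beginning captioners.
print(expected_errors(200, 22, 1.5))  # (4400, 66.0, 3.0)
```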
PHONETIC TRANSLATES - When a word is not in the computer's dictionary, the computer falls back on a list of rules to attempt to translate the word phonetically. Often, apparently incorrect spellings are actually phonetic translates from the computer.
WORD BOUNDARY - These are the most difficult errors to correct because they are often not apparent to the captioner until they occur as errors. An example of a word boundary problem I personally experienced occurs in this sentence: "FOR BREAKFAST, WE WILL HAVE EGGS AND APPLE SAUSAGE." I had "applesauce" in my dictionary, and the computer translated this sentence as "FOR BREAKFAST WE WILL HAVE EGGS AND APPLESAUCEAGE." Another time I "cheated" and put "LITTLE WOMEN" in my dictionary with quotation marks around it because it was a popular movie at that time and was frequently referred to in broadcasts. One night in a sports broadcast, it came up as:
Shortcuts, such as defining movies with quotation marks, always come back to haunt the captioner.
Here's another example of a word boundary problem. The name "Beauchamp" is sometimes pronounced "BEECH UM", and it's in my captioning dictionary that way. One day while captioning a baseball game, it happened to be Beach Umbrella Day. The computer interpreted the first two strokes for "Beach Um" as "BEAUCHAMP" then phonetically translated the remaining strokes. The resulting translation read something like this: "TODAY IS BEAUCHAMP BREL A DAY AT THE BALL PARK."
Still another word boundary example is two words that are sometimes joined and sometimes separate words. "Wildlife/wild life" is one such example that must be written differently, depending on which spelling is correct. First the captioner must discern which spelling is appropriate, then write the correct strokes for that spelling.
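One plausible way to reproduce the Beauchamp error is a translator that always takes the longest dictionary match starting at the current stroke. Whether any particular captioning program works this way is an assumption; the stroke spellings below are simplified stand-ins rather than real steno outlines, and the tiny dictionary is hypothetical.

```python
# Hypothetical dictionary keyed by tuples of strokes (simplified spellings).
DICTIONARY = {
    ("BEECH", "UM"): "BEAUCHAMP",
    ("BEECH",): "BEACH",
    ("UM", "BREL", "A"): "UMBRELLA",
    ("DAY",): "DAY",
}
MAX_ENTRY_LENGTH = max(len(entry) for entry in DICTIONARY)


def translate(strokes):
    """Greedy longest-match translation, left to right (assumed strategy)."""
    words = []
    i = 0
    while i < len(strokes):
        # Try the longest possible run of strokes first, then shorter ones.
        for length in range(min(MAX_ENTRY_LENGTH, len(strokes) - i), 0, -1):
            chunk = tuple(strokes[i:i + length])
            if chunk in DICTIONARY:
                words.append(DICTIONARY[chunk])
                i += length
                break
        else:
            # No entry found: emit the raw stroke as a stand-in for a
            # phonetic translate.
            words.append(strokes[i])
            i += 1
    return " ".join(words)


print(translate(["BEECH", "UM", "BREL", "A", "DAY"]))  # BEAUCHAMP BREL A DAY
```

Because "BEECH UM" is consumed as one entry, the strokes that would have completed "UMBRELLA" are left stranded and fall through to the fallback, even though both "BEACH" and "UMBRELLA" are in the dictionary.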
PILING or STACKING STROKES - At the high rates of speed that captioners write, we often barely have one finger leaving the keyboard, while another finger is approaching the keyboard. At 260 words per minute, the captioner is writing approximately four strokes per second! Sometimes two strokes are so close in succession that the stenotype machine registers two strokes as a single stroke. That is known as "piling" or "stacking". Often the "ed" or "ing" endings of words are stacked with other words and the computer translates them as another word or phrase.
"Zounds! I was never so bethump'd with words."
- William Shakespeare -
NEW WORDS - Some words are not in our dictionaries because they are new terminology in the language, or we just have not encountered them yet. Despite the fact that my dictionary contains over 100,000 separate entries, common words that we think are in our dictionary may not be, and as a result they come out badly spelled. "MOUNTAINEERING" came up in a recent broadcast for me. The first time it came up, it translated as "MOUNTAIN EARRING". The second time I tried a different way of writing it, and it translated as "MOUNTAIN NEARING". Many new high-tech words first appear in news broadcasts, and we either construct these words from other word parts or "fingerspell" them a letter at a time, each letter requiring a separate stroke. That can be difficult for long, technical words, especially when you consider we have to "carry" the rest of the story in our heads until we can finish spelling and then catch up to the speakers. Names of foreign leaders and cities fall into this category. In fact, you could include in this category all proper names.
"Rembrandt's first name was Beauregard, which is why he never used it."
- Dave Barry -
While there are several hundred common names, captioning is full of uncommon names. Take sports, for example. One three-minute sports broadcast can conceivably cover a number of different sports, including baseball, basketball, football, tennis, golf, hockey, automobile racing and the Olympics, to name but a few. Basketball alone can be broken down into professional, college, high school, Olympics, and both men's and women's teams.
There are several hundred college teams, each with 15 or more players. Among professional baseball players currently active, there are approximately 20 players with various spellings of the first name SHAWN/SHAUN/SHAWON/SEAN, all pronounced exactly the same. There are two baseball players with the names SCOTT SERVICE and SCOTT SERVAIS. Both last names are pronounced the same. One is a catcher; one is a pitcher. The Oakland A's have a catcher named Terry Steinbach. The name is common enough, but announcers mispronounce the name an incredible number of different ways: STINE BOK, STEEN BOK, STINE BEK, STINE BAK, STEEN BEK, STEEN BAK, and so on. The computer has to be programmed to accept all those different pronunciations and still spell the name correctly.
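In data terms, accepting every pronunciation of Steinbach means many dictionary keys pointing at one spelling. The sketch below builds the six variants listed above from two first syllables and three second syllables; the dictionary layout and the `spell_name` helper are assumptions for illustration only.

```python
from itertools import product

# The two first-syllable and three second-syllable pronunciations quoted above.
FIRST_SYLLABLES = ["STINE", "STEEN"]
SECOND_SYLLABLES = ["BOK", "BEK", "BAK"]

# Every heard combination maps to the single correct spelling.
NAME_DICTIONARY = {
    f"{first} {second}": "STEINBACH"
    for first, second in product(FIRST_SYLLABLES, SECOND_SYLLABLES)
}


def spell_name(heard: str) -> str:
    """Return the correct spelling for a heard pronunciation, if it is known."""
    return NAME_DICTIONARY.get(heard, heard)


print(len(NAME_DICTIONARY))     # 6
print(spell_name("STEEN BEK"))  # STEINBACH
```

Any pronunciation that was not anticipated, and so has no entry, would still come out wrong.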
"We experience moments absolutely free from worry. These brief respites are called panic."
- Cullen Hightower -
HEARING - Captioners can have problems correctly hearing what is being said. We can simply mishear someone. During a recent cooking segment, they were going to prepare "not whole eggs." It turned out they were saying "knothole eggs" where a hole (knothole) was cut in toast and the egg cooked in the hole. We sometimes have to deal with heavy accents and distraught people who are crying or screaming, usually in very stressful situations. Questions from the audience at news conferences are especially difficult to hear. Usually the questioner is not within range of a microphone and it's impossible to hear the question.
At times we try to identify an off-camera sound and we are mistaken. I once captioned a report from a farm and wrote [ MOOING ]. As it turned out, the animals were sheep! Captioners who cannot see the broadcast are at even more of a disadvantage. The new mayor of San Francisco donned a cap that had imprinted on the front, "DA MAYOR", but the captioners who could not see the printing on the cap continued to write "THE MAYOR".
Mishearing is bad enough, but we also have to deal with people misspeaking. I have had sportscasters announce the wrong team as the winner. I am then presented with the dilemma of writing what was said, which I know is incorrect, or "editing" and writing what I know to be correct. What happens when you write the correct information, rather than what was said? More often than not the broadcast can revolve around the mistake. If I have written anything other than what the speaker actually said, then the banter referring to the error is lost on the viewers who depend on the captions.
A hot topic among caption viewers is the bleeping out of certain words, usually of the four-letter variety. There are two ways bleeping occurs. The first is actually a reflection of what was said in the broadcast. Frequently swear words are bleeped from the audio portion, and in that case, the captioner does not dare lip-read and insert what they think was said. They are captioning the audio of the broadcast. The other possible occurrence of bleeping could come at the insistence of the station by policy. In other words, the captioner and caption company could be directed that in the event of certain unacceptable words being spoken, the captioner is to bleep the word. In that case, don't blame the captioner. Write the station to complain. If the captioner disobeys the station's policy, the station will just hire a caption company that will follow the policy. The solution is to get the station to change its bleeping policy, if you'll pardon the play on words.
One message from a caption viewer showed a complete lack of understanding of how captioners do their job, and I want to mention it specifically. This particular writer was irate that captioners display [ SPEAKING SPANISH ] when someone is speaking in Spanish. He demanded that the captioner write the actual Spanish words or other foreign language being spoken so that he and others who understand and read the foreign language could see the words for themselves. Speaking for myself, I am barely able to discern which foreign language is being spoken, let alone have foreign words in my dictionary. I also would have no idea how to fingerspell them, even if I had the time to fingerspell an entire sentence in a foreign language.
"The speed of the boss is the speed of the team."
- Lee Iacocca -
SPEED - Speed creates more errors than any other single factor. Many of the errors listed above are speed-related. In court settings, if anyone speaks too quickly or cannot be heard, the reporter can interrupt the proceedings and ask them to repeat what was said. With broadcast captioning, the captioner has no control of the speed. The most debated aspect I see in deaf discussions of captioning is the topic of editing versus verbatim captions. Most people who rely on captioning are adamant that they want verbatim captions with no editing. Make no mistake about it: No captioner in the world can write verbatim realtime captions 100% of the time. They all try to write as much verbatim as possible, but true verbatim realtime captioning is not possible.
There are several reasons for that. One is the speed of the newscast in general. Most stations are trying to pack a lot of information into a short span of time (especially the poor sports anchors who get about three minutes total air time). Some stations stake claim to the most information presented in each broadcast. They achieve that by having anchors and reporters who speak even faster. Overlapping speakers must be factored into speed as well. If two speakers are both speaking at the same time, each at 200 words per minute, for the period of the overlap, the word-per-minute rate then escalates to 400 words per minute. The captioner is required to remember what both speakers said and attempt to sort out the sense of what each said, and then write it in a logical order for reading -- all at 400 words per minute. I recently captioned an interview with a hat shop owner and three anchors, all trying on hats and remarking on the different styles, all while I was trying to fingerspell "Borsalino."
Also, while captioners try to include background sounds such as sirens and screaming, sometimes the pace of the speakers prevents any attempt to convey ambiance. I recently had a story about a sound library, but I was unable to describe any of the sounds because the pace of the interview prevented it. I could either describe the sounds or write the speakers, but I could not do both.
"We're all capable of mistakes, but I do not care to enlighten you on the mistakes we may or may not have made."
- Vice-President Dan Quayle -
SCRIPTING ERRORS - Sometimes there are mistakes that are not entirely the fault of the captioner. Most local newscasts are a combination of station scripts and realtime captioning from a captioner. Usually the captioner downloads the news scripts from the station's computer prior to the broadcast, then corrects misspellings in the scripts. During the broadcast, the captioner is controlling the flow and display of that news script, as well as jumping in live to provide realtime closed captions where necessary. It's a delicate balancing act for the captioner to download the most current script possible, while still leaving enough time to clean up the script and convert it for live display.
In addition, in order to adjust the display of the script to match the captions to the story, the captioner must lead the anchors' reading slightly. At times, the news changes between the time the captioner prepared the script, and the time of the live broadcast, and the captioner doesn't realize it until it's too late. On one occasion, I was feeding a script that read "JOHN DOE WAS EXECUTED A HALF HOUR AGO..." I realized the anchor was saying, "THE EXECUTION OF JOHN DOE WAS STAYED..." At that point, all I could do was stop feeding the script, write that the execution was stayed, and pick up as best as possible. There can also be typographical errors in the script that are not apparent. Here's an example: The script read, "THE TRANSIT SYSTEM IS NOT SAFE." The mayor was actually saying something quite different: "THE TRANSIT SYSTEM IS NOW SAFE."
"To err is human. To really foul things up, you need a computer."
- Anonymous -
HARDWARE and SOFTWARE - Some problems occur as a result of hardware malfunctions or software peculiarities. One evening, minutes before a broadcast, I discovered that the computer or the software had trashed about 5% of the words in my dictionary. Among the words missing from my computer's vocabulary were HAD, YOU, NAME, AN, HE, BACK, DAYS and NUMBER. Any time those words came up, the computer provided a phonetic translation.
Many times in every broadcast, the captioner will write an incorrect stroke and realize it. If there is sufficient time, the captioner can hit a key telling the computer to ignore the incorrect stroke, and then the correct stroke can be written. However, if the captioner does not catch the incorrect stroke in time, the computer may display all or part of the incorrect word, causing a mistake like translating "lesbians" into "less beans".
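The "ignore that stroke" key can be pictured as an undo on a buffer of strokes that have not yet been sent out. The sketch below is an assumption about the mechanism, not a description of any real product: it uses "*" as a hypothetical marker for the correction key, and the stroke spellings are simplified.

```python
DELETE_STROKE = "*"  # assumed marker for the "ignore the last stroke" key


def apply_deletes(strokes):
    """Drop each stroke that is immediately followed by a delete stroke."""
    kept = []
    for stroke in strokes:
        if stroke == DELETE_STROKE:
            if kept:          # nothing to undo if the buffer is already empty
                kept.pop()
        else:
            kept.append(stroke)
    return kept


# The captioner writes HOP by mistake, deletes it, then writes HROP.
print(apply_deletes(["HOP", "*", "HROP"]))  # ['HROP']
```

This only helps while the bad stroke is still in the buffer. Once its translation has gone to air, there is nothing left to pop, which is the situation described in the last sentence above.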
Because it's virtually impossible to put every word in the computer dictionary, good captioning software will exercise a small amount of artificial intelligence. For instance, I have the word "mace" in my dictionary, but I might not have "maces", "maced" and "macing" in my dictionary. I can define a stroke for the computer to create suffixes of "ing", "ed" and "s". However, the computer has to apply some spelling rules to make the words come out right. The computer would have to drop the "e" to add an "ing" or "ed" suffix, but leave the "e" for the "s" suffix. Occasionally spelling rules and artificial intelligence create misspellings, and that's another possible reason for misspelled words.
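The "mace" rule is simple enough to write down. The function below is a deliberately naive sketch of such a spelling rule, not the rule set of any actual captioning program; the final line, with "see", is an added example (not from the article) showing how the same rule can produce a misspelling.

```python
def add_suffix(word: str, suffix: str) -> str:
    """Attach a suffix, dropping a final 'e' before 'ing' or 'ed' (naive rule)."""
    if word.endswith("e") and suffix in ("ing", "ed"):
        return word[:-1] + suffix
    return word + suffix


print(add_suffix("mace", "s"))    # maces
print(add_suffix("mace", "ed"))   # maced
print(add_suffix("mace", "ing"))  # macing

# The same rule misfires on a word ending in a double "e".
print(add_suffix("see", "ing"))   # seing (should be "seeing")
```

Each exception needs another rule, and each new rule can misfire somewhere else.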
"The higher up you go, the more mistakes you´re allowed. Right at the top, if you make enough of them, it´s considered to be your style."
- Fred Astaire -
BAD DAYS - Captioners are human, and all humans have bad days. The difference is, when a captioner has a bad day, several hundred thousand people can tell. It can be due to lack of concentration, illness, a particularly difficult broadcast or any number of other daily life distractions that dull the edge required to caption competently. If there's any consolation for the viewer, no one feels worse about a lousy broadcast than the captioner. Some captioners react favorably by redoubling their efforts for the next broadcast, and some let it affect their mood for longer than they should. In any event, I've never met a captioner who didn't try their hardest to provide the best captioning.
"Blessed is the man who has some congenial work, some occupation in which he can put his heart, and which affords a complete outlet to all the forces there are in him."
- John Burroughs -
I hope this article has served to give you some understanding of how difficult a job realtime captioning is and to provide some insight into the kinds of errors commonly seen. Keep in mind that this is live television, and mistakes WILL occur. From now on, when those errors do occur, perhaps you will be able to decipher some of them and have an understanding of why they occurred. Two points are always expressed by every top-notch captioner I've ever met: Captioning is the most difficult job they've ever had; captioning is the most rewarding job they've ever had. Personally, no matter how difficult it is, I wouldn't trade it for any other job in the world.