Parsing Sentences for Easy Reading

For the past year, I’ve had the pleasure of working on and off for a Toronto-based consultancy that focuses on accessibility. In addition to doing training, accessibility audits, and development, they provide video captioning and description services, which is where I fit in.

Unlike those mysterious beings that closed-caption live television, I caption videos – generally university lectures. It’s interesting work, and was a lovely part-time job while I was finishing up my Master’s. I’ve kept up with it because I enjoy it… and extra pocket money is always enjoyable. What I did not realize when I started, however, was that it would also have a lasting (positive) impact on my work as an IA/tech writer.

You know how sometimes, you’re reading, and a line break hits in the wrong spot, and it’s weird? Bad sentence parsing. I usually only notice if it’s two or three lines of text, like a PowerPoint title, or subtitles on a video. Poor sentence parsing makes it harder for readers to read smoothly: in captioning in particular, you want people to be able to read quickly and glean all the information they’re reading without becoming confused by a subject-object split. As a person whose focus is on expressing concepts clearly, this has been something of a revelation. While I’m not a designer, I necessarily do some design in my work, particularly since I’ve started freelancing. Parsing is something that I increasingly think about, and I think it’s making my visual work easier to understand.

Take, for example, the post-it note that I have had stuck to my monitor, reminding me to write this post:
Post-it note reads Parsing Sentences for Easy Reading

That’s pretty solid. I’m happy with that parsing. I might easily have written this, though:

Parsing Sentences for
Easy Reading

Hopefully you see that this is a less smooth reading experience. Let’s take another example, this time borrowed from a fantastic Girl Geeks Toronto event that I went to last night (recap post to follow). The title of the event was “Designing for Digital: Processes and Planning for Powerful Solutions.” Here are two ways we could write that on the title slide of our presentation deck:

Designing for
Digital: Processes
and Planning
for Powerful Solutions

Designing for Digital:
Processes and Planning
for Powerful Solutions

I think the second one is easier to parse, and easier to read. With presentation software and design software, where you create text boxes and write in them, it’s easy to rely on the automatic text wrapping. I have started to fight that urge, and to think about how best to manage my line breaks. I’m not advocating for manually line breaking everything, but for titles or short, punchy sentences, it’s something to think about. If reading it aloud sounds wonky, try moving the line break. Your readers will thank you, even if they don’t realize it.
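The "read it aloud" test can be roughly approximated in code: a line that ends on a preposition, conjunction, or article usually splits a phrase in two. Here is a minimal sketch in Python. The word list and the `dangling_lines` helper are illustrative assumptions, not a complete rule, and a clean result doesn't guarantee a good break.

```python
# Words that rarely make a good line ending, because the phrase they
# introduce continues on the next line. Illustrative list, not exhaustive.
WEAK_ENDINGS = {"a", "an", "the", "and", "or", "for", "of", "to", "in", "on", "with"}


def dangling_lines(lines):
    """Return the lines whose last word leaves a phrase hanging."""
    flagged = []
    for line in lines:
        words = line.split()
        # Ignore trailing punctuation and case when checking the last word.
        if words and words[-1].lower().strip(".,:;!?") in WEAK_ENDINGS:
            flagged.append(line)
    return flagged


wrapped_badly = ["Designing for", "Digital: Processes", "and Planning", "for Powerful Solutions"]
wrapped_well = ["Designing for Digital:", "Processes and Planning", "for Powerful Solutions"]

print(dangling_lines(wrapped_badly))  # ['Designing for']
print(dangling_lines(wrapped_well))   # []
```

It only catches the most obvious kind of bad break (the first slide also splits "Digital: Processes" awkwardly, which this check misses), so reading aloud is still the better test.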

Edited to add:

Another excellent example care of Paul, from a Zipcar ad in the New York subway:

My Grade 7 & 11 English Teacher and the Word ‘Got’

As I’ve alluded to before, my high school program was pretty sweet. Its enriched program, which I was in, was challenging and fun, and meant that we had the best teachers at the school teaching us. Many of them were older, and most have since retired to lives of contemplation, pigeon racing (seriously!), poetry, and music. The youth of today don’t know what they’re missing.

I learned a lot in high school: a strong basis in chemistry and the physical sciences, the ability to play the French horn, a remarkable willingness to dissect deceased vertebrates without wearing gloves (why didn’t they give us gloves?), a general understanding of archery, a weirdly in-depth knowledge of the history of Quebec, and the ability to throw a football, among others.

Throwing a football has come in handy (my boyfriend’s brother, who played football in college, was impressed, for instance), but the thing that comes to mind most often, particularly as I’ve started writing professionally, is a lesson I learned first in Grade 7, and then again in Grade 11, when Mister Holt was my English teacher.

Mister Holt is a fantastic teacher. I say ‘is’, because, while I understand that he has retired, the lessons he taught me have stuck. One of the core components of English at my high school was public speaking. Most teachers made their students get up in front of the class and give a speech that they’d written. There was usually a list of topics. I recall “Nature versus Nurture” being one, as well as “Mass Media”. (It was obviously the late 90s / early 00s.) Mister Holt didn’t hold with that kind of hokum for us young grade sevens: he made us learn card tricks. Card tricks, he said, gave you something to do with your hands, while also forcing you to speak, and to keep your audience’s attention. He handed out a twenty-page photocopied packet of papers that detailed a bunch of card tricks: I remember Wild Bill Hickok’s Hand and Houdini’s Double Talk Card Trick as the hardest. They both required a great deal of memorization, and the ability to engagingly tell the story that went along with the cards. I’m 90% sure I can still do Houdini’s Double Talk, but I never really got the hang of Wild Bill. After two weeks of practice, we performed the card tricks one-on-one with Mister Holt. It was delightful, and a nice way to ease into the world of public speaking. Come grade eight, the concept of speaking was far less terrifying – it’s not like you had cards to screw up!

But, while that stands out, and – I think – helps to illustrate his somewhat unconventional approach to English education, card tricks are not what come to mind most days.

Mister Holt hated the word ‘got’. Got, he argued, was a lazy word. There is almost always a better word to express what you are trying to express. “I got home”, while perhaps accurate, is weaker than “I arrived home”. “I got an A” less expressive than “I earned an A”, “I’ve got the chicken pox” less evocative than “I have fallen ill with the chicken pox” or “I’m beset with the chicken pox” or “Shit, I’ve finally caught the chicken pox”.

As I recall, Mister Holt was a published writer of either short stories or poetry. I can’t find any trace of that online, so perhaps I’m mistaken, but in any case, much of our writing was of a creative nature. As such, using strong descriptive language was particularly prized. As we started writing more essays, precision of language only became more important. I asked some friends from high school about this, and everyone who had Mister Holt as a teacher seems to have a little voice in the back of their heads preventing them from writing ‘got’ or ‘get’. My friend who just finished law school is particularly susceptible.

As a tech writer, precision of language is everything. The other day I was typing a sentence, started to write ‘got,’ and said “No!”. A moment later, I had gone in a different direction. Teachers impact our lives: Mister Holt taught me the value of choosing my words carefully (and I always have a card trick up my sleeve for parties, pun intended); Madame Azar, the majority of my French grammar; and Mister Pharès taught me so much math and physics that I didn’t learn anything new until Calc 3, despite the three post-secondary math classes that preceded it.

I don’t write in French that much these days, though, and I haven’t needed to integrate in a long while… I write every day, though, and two university degrees don’t consciously impact me nearly as much as two years of Mister Holt’s instruction do. So, next time you go to type ‘got’, stop. Reread your sentence, and try to replace that ‘got’ with something better. Your writing will improve, and alumni of Mister Holt’s English classes won’t twitch when reading your wise words.

Better Writing is One Bundle/Package/Plugin Away

Two months ago, I downloaded and installed a writing tools bundle for TextMate 2, one of my favorite text editors. “English Highlight” as it is so innocuously named, does three awesome things:

  1. It highlights weasel words (few, very, fairly, quite, etc.)
  2. It highlights passive sentences (or, should I say, passive sentences are highlighted)
  3. It highlights duplicate words (not not that you’d ever do that).
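The duplicate-word check is the easiest of the three to picture. Here is a rough sketch in Python of how such a check might work. It is not the bundle's actual code, and the `repeated_words` helper is a name made up for illustration.

```python
import re

# A word, some whitespace (a line break counts), then the same word again.
# IGNORECASE means "The the" is caught as well as "the the".
REPEATED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def repeated_words(text):
    """Return each word that appears twice in a row, in order of appearance."""
    return [match.group(1) for match in REPEATED_WORD.finditer(text)]


print(repeated_words("It highlights duplicate words (not not that you'd ever do that)."))
# ['not']
```

Because `\s+` also matches a newline, this catches the sneakiest repeats: the ones that straddle a line break.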

Christopher Alfeld’s “English Highlight” is an adaptation of Matt Might’s shell scripts. Matt Might is an Assistant Professor at the University of Utah. He noticed that his students tended to ‘abuse’ the passive voice, use weasel words, and repeat words, so he wrote some bash scripts to identify these and integrated them into their LaTeX build. This has spawned a variety of plugins for common text editors. I’ve compiled a list of plugins at the bottom of this post.

TextMate 2 with English Highlight screenshot

Screenshot from my TextMate: purple highlighting indicates weasels, passives, or repeated words.

I am not going to lie: it was demoralizing when I first opened up a file and saw tons of purple. Apparently nine years of post-secondary education (6 of which were in the sciences) bred a deep love of the passive voice. Similarly, two years of graduate school, where the answer to every question is ‘it depends’, may have left me generous with my ‘various’, ‘numerous’, and ‘few’s. Highlighting my shortcomings in purple makes it easy for me to identify areas that need work, and to quickly make my writing stronger and clearer.

Weasel Words

The thing about weasel words is that they rarely add to a sentence: they either make your sentence vague or unnecessarily wordy, neither of which is a positive. Admittedly, sometimes you want to say that something is ‘quite’ something. That’s cool! You’re allowed! You might not realize how often you say ‘quite’ or ‘very’, though, and if it’s not helping, it’s hindering.

I went looking for a wishy-washy sentence that I’d recently written, but couldn’t find one: it seems my highlighter has done the trick! I’m afraid to open any of my old research papers, so I’m borrowing an example from Matt Might:

Bad: False positives were surprisingly low.
Better: To our surprise, false positives were low.
Good: To our surprise, false positives were low (3%).

I know I have a tendency to overuse ‘various’, ‘numerous’, and ‘fairly’. Highlighting those words draws my eye back to the sentence and makes me think about ways I can improve it. Often it’s as easy as deleting the word.
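For the curious, weasel-word highlighting boils down to matching against a word list. Here is a minimal sketch in Python. The list below is only the handful of words named in this post (an assumption for illustration, not the list the bundle ships with), and `mark_weasels` is a made-up helper that uses asterisks in place of purple.

```python
import re

# Illustrative list: only the words mentioned in this post.
WEASEL_WORDS = ["few", "very", "fairly", "quite", "various", "numerous", "surprisingly"]

# Match any listed word as a whole word, regardless of case.
WEASEL_PATTERN = re.compile(r"\b(" + "|".join(WEASEL_WORDS) + r")\b", re.IGNORECASE)


def mark_weasels(sentence):
    """Wrap each weasel word in asterisks so it stands out, like a highlight."""
    return WEASEL_PATTERN.sub(lambda match: "*" + match.group(0) + "*", sentence)


print(mark_weasels("False positives were surprisingly low."))
# False positives were *surprisingly* low.
print(mark_weasels("To our surprise, false positives were low (3%)."))
# To our surprise, false positives were low (3%).
```

The second sentence comes back untouched, which is the whole point: the highlight disappears once the vague word does.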

Passive vs. Active Voice

The passive voice thing is less straightforward than the weasel words. The passive voice has historically held a hallowed position in the sciences, where the prevailing opinion seems to be that science should mysteriously emerge completely independent of the scientists who do it. For this reason, students have to write “10mg of magnesium were massed” in their lab reports, rather than “I massed 10mg of magnesium.” This may have been a contributing factor in my changing majors from chemistry to environment, where I was occasionally allowed to write as though I existed.

During the aforementioned environment undergrad, I attended a somewhat rebellious lecture by Linda Cooper. Linda Cooper is a lecturer at McGill who studies science communication and teaches classes on science writing. She argued that using “direct, active-voiced sentences” makes sentences stronger and easier to read, and that we should all stop blathering on endlessly in the passive voice and instead choose to use the active when appropriate. It’s easy to see that she’s right when you compare passive-voiced to active-voiced sentences:

Original: If MMS is being run with DB Profiling enabled, further permissions are required.
Revised: If MMS is running with DB Profiling enabled, the user requires additional permissions.

Both sentences point to the same concept: running MMS with DB profiling means you’re going to have to do something with permissions. The first sentence, though, is far more vague. What sort of ‘further permissions’ are we talking about? Permissions for MMS? Permissions for you-the-user? Some sort of network permissions? Who knows! The second sentence gets to the point: the user requires additional permissions. In either case, the next paragraphs describe what those permissions are, but the revised sentence guides the reader more quickly to the correct answer.

There are certainly times where passive sentences are appropriate: for instance, I haven’t managed to rewrite “MongoDB is designed specifically with commodity hardware in mind…” as an active-voiced sentence, and I doubt I will. Expunging all passives from the record isn’t the goal here: the goal is to write as clearly as possible, and to be more aware of the choices you make when writing.
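Passive detection is necessarily a heuristic, which is part of why the highlighter can only prompt a second look rather than make the call for you. A rough sketch in Python of the general idea: a form of "to be" followed by something shaped like a past participle. The short irregular-participle list and the `passive_phrases` helper are assumptions for illustration, not the actual scripts.

```python
import re

BE_FORMS = r"(?:am|are|is|was|were|be|been|being)"
# A few irregular participles; a real list would be much longer.
IRREGULAR = r"(?:run|written|done|given|taken|known|made|seen|shown)"

# A form of "to be", then a word ending in -ed or a listed irregular participle.
PASSIVE = re.compile(rf"\b{BE_FORMS}\s+(?:\w+ed|{IRREGULAR})\b", re.IGNORECASE)


def passive_phrases(sentence):
    """Return the phrases in a sentence that look like passive constructions."""
    return [match.group(0) for match in PASSIVE.finditer(sentence)]


original = "If MMS is being run with DB Profiling enabled, further permissions are required."
revised = "If MMS is running with DB Profiling enabled, the user requires additional permissions."

print(passive_phrases(original))  # ['being run', 'are required']
print(passive_phrases(revised))   # []
print(passive_phrases("MongoDB is designed specifically with commodity hardware in mind"))
# ['is designed']
```

A pattern this crude only looks at the shape of the words, so it will also flag a harmless phrase like "is red", and it flags the MongoDB sentence even though that passive is worth keeping.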


As mentioned above, Matt Might’s scripts have been adapted for a number of text editors. I particularly like the name of the emacs / vim mode. If you’re doing any sort of writing – technical or not – I highly recommend installing one of these extensions and trying it out. It makes a huge difference.

Classification, MongoDB Operators, and Kitchen Chairs

ROCM: Representation, Organisation, Classification, and Meaning-Making was a required class when I did my Master of Information. At the time, I complained incessantly about how useless it was, how I was never going to use any of it in real life… Well, I was wrong.

A cat laying upon a Roomba

Despite being sat upon in the kitchen, this is probably not a kitchen chair. Credit: barbostick

Yesterday morning, I spent about four hours thinking about ROCM as I reorganized MongoDB’s operators page. One of our professors had a tendency to ask “What makes a kitchen chair a kitchen chair? Is any chair in a kitchen a kitchen chair? Is anything you sit on? If you sat on your dog while he’s in the kitchen, is he a kitchen chair? If you carry a chair that’s usually in your kitchen into the living room, is it still a ‘kitchen chair’?”, etc. The “kitchen chair” metaphor became something of a meme among my cohort, one which inevitably gets brought up whenever we’re together. The crux of the kitchen chair debate is the question of what makes stuff fit into the categories it fits into. This is really the issue that anyone involved in classification or organization has to grapple with, hopefully with the knowledge that no solution is perfect, but some are better than others.

This week, the question was “What makes these MongoDB operators fit together?”. Some groupings were obvious: the ‘Logical’ operators (‘or’, ‘and’, ‘not’, and ‘nor’) go together nicely. Same with the operators related to geospatial queries, and those specific to arrays. My problem was what’s left: the operators we’ve now classified as ‘Comparison’, ‘Element’, or ‘Evaluation’. In a sense, the ‘Element’ category is ridiculous: by their nature, queries involve elements… for that matter, all queries involve comparison so the ‘Comparison’ category is a bit fraught as well. But what’s an IA-minded technical writer to do? Either give up and put them all in one long, awful list (which is unideal), or categorize as best you can, knowing that it won’t be perfect (also unideal). Clearly, I went with the latter.
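One way to picture the result is as a plain mapping from category to operators. A rough sketch in Python: the category names are the ones above, but apart from the ‘Logical’ group the operator lists are an assumption drawn from MongoDB’s query operator reference rather than spelled out in this post, and `category_of` is a made-up helper.

```python
# Category names come from this post. Operator membership outside 'Logical'
# is an assumption based on MongoDB's query operator reference.
QUERY_OPERATOR_CATEGORIES = {
    "Comparison": ["$gt", "$gte", "$in", "$lt", "$lte", "$ne", "$nin"],
    "Logical": ["$and", "$nor", "$not", "$or"],
    "Element": ["$exists", "$type"],
    "Evaluation": ["$mod", "$regex", "$where"],
    "Geospatial": ["$geoIntersects", "$geoWithin", "$near", "$nearSphere"],
    "Array": ["$all", "$elemMatch", "$size"],
}


def category_of(operator):
    """Return the single category an operator is filed under, or None."""
    for category, operators in QUERY_OPERATOR_CATEGORIES.items():
        if operator in operators:
            return category
    return None


print(category_of("$nor"))     # Logical
print(category_of("$exists"))  # Element
```

The structure makes the kitchen chair problem visible: every operator gets exactly one home, even when an argument could be made for two.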

I think the categories we settled on will be meaningful for our readers, while also being factually accurate, which is really the goal. The fact that there’s some weirdness is unfortunate, but sadly unavoidable.

The inevitable failure of classification systems is a central theme of Geoffrey Bowker and Susan Leigh Star’s Sorting Things Out: Classification and Its Consequences. Bowker and Star’s examples really highlight the power of classification: they discuss tuberculosis patients, and the ways that wishy-washy diagnoses ruined people’s lives; talk about the incredibly inconsistent and damaging categorization of people under Apartheid in South Africa; and discuss nursing interventions classification and its impact on both nurses’ and patients’ lives. Sorting Things Out is one of those texts that I cited in (nearly) every paper I wrote during my MI, because it always applied. At its core, the book is about highlighting the ways that classification is – or becomes – invisible, and how people grow to accept categories as natural. As people who care about the ways that we structure information, recognizing the theory behind classification and its implications can make us better at determining an ideal structure for our use case, and at recognizing the ways that it might fail.

Sorting Things Out is on sale at Amazon right now, for under $20. I paid about $50 when I bought my copy, so now’s a good time to bone up. It’s a great read that has instilled in me the need to regard existing classification systems with a critical eye rather than accepting them at face value, and to be far more thoughtful when creating my own.