Spellcheck My Doc, Before It's Too Late
Writing is its own skill with its own set of challenges. Instantiating thought into representative syntax is fraught with the potential for errors and issues. It’s axiomatic of the human condition but, also axiomatic to the human condition, people have created some tools to assist with that. Browsing the site you may have seen other references to vale. You may have even already seen the little slugs at the bottom of every post with a link to this page. If you were curious then you’ve come to the right place. We’re going to talk tech writing.
Trying out tech writing
I took a technical writing class in university and I underestimated how challenging it was going to be. The idea of constructing communication to be precise is standard when you think about the syntax of (some) programming languages. Even the loosely typed languages are going to be more precise than the way that humans interact via language. I’m particularly thinking of English and its inherent ambiguity. Yet it still has a complicated set of grammar rules that make it difficult to learn. Through this one singular 3-credit course it quickly became obvious to me that precise communication that was technical in nature required forethought. The lesson that took longer to sink in was that precise technical communication also required revision.
Revision can easily take longer than the actual writing process. Checking, and double checking are important especially when the work product is factual in nature. It’s critical that the meaning, and intention, are clear and, as previously stated, English is ambiguous. To address this there is a burgeoning movement in the tech space called Write the Docs which promotes documentation as an asset. Through this community I, ultimately, discovered vale. I’ve since set up vale in the pipeline for this website and there will be some information at the bottom of each post based on the output from it.
What vale can do
Vale is a capable context-sensitive linter. In this section there will be some examples from documentation, from other sources, and from myself. The unifying statement here is that vale can leverage technology alongside proper configuration to improve consistency and overall quality in technical writing. You may find some of my posts which retain a significant number of errors and warnings. These are a badge of honor. My personal site, overall, is not a technical document. It’s an expression of me as a person involved in technology which is less formal than technical documentation.
Installation
I used Homebrew to install vale. I’ve harped upon the benefits of Homebrew before so I’ll save the diatribe. Having Homebrew installed, it would be as simple as typing brew install vale. There is a two-part post on the topic of how my pipeline is set up as well as several posts detailing changes so understand that this may not match up with what your environment is like. If you’re reading this far after the post has gone up then it’s possible that I’ve already changed the pipeline. Browsing the pipeline tag may be productive.
That’s it, though. Homebrew will handle it. This holds true for macOS, as well, which is great.
Configuration
Out of the box, vale doesn’t do much at all. This is the most laborious part of the process. People, in the world, know this and have made things easier. vale has the concept of a “style”, which is a collection of rules that can trigger alerts like errors, warnings, and suggestions. The quickest way forward is to pull a style from an established project. Note that there are different types of styles and, while most are focused on technical documentation, there are other styles that are aimed at other kinds of writing. I’ve seen styles from Google and Homebrew, and have implemented on this site the style from Microsoft as well as others.
The configuration of vale depends mostly on a .vale.ini file which is fairly straightforward. It does not require a significant amount of information.
StylesPath = styles
[*.md]
BasedOnStyles = marktoso, Microsoft
MinAlertLevel = suggestion
This is the .vale.ini that is presently in use (at the time of writing, not reading) on this site and has a declaration for StylesPath which, importantly, is relative to where you would be executing vale from. Similarly, .vale.ini should be in the directory from which you are executing vale; however, there are commandline flags that can override that. This is executed in the pipeline, therefore .vale.ini is located in the root of the repository and styles is a directory at the root of the repository.
The [*.md] declaration informs vale that the following is to be applied to *.md files (Markdown), which is the format that I use for writing my posts in Hugo. BasedOnStyles indicates which styles to apply to this type of file. The styles are located in subfolders and their declaration should match their folder name. I’d like to point out that the rules that I made were more for testing purposes than because I actually have a style that I adhere to. This is a personal blog. I will do what I like.
styles
├── Microsoft
│ ├── AMPM.yml
│ ├── Accessibility.yml
│ ├── Acronyms.yml
│ ├── Adverbs.yml
│ ├── Auto.yml
│ ├── Avoid.yml
│ ├── ComplexWords.yml
│ ├── Contractions.yml
│ ├── Dashes.yml
│ ├── DateFormat.yml
│ ├── DateNumbers.yml
│ ├── DateOrder.yml
│ ├── Ellipses.yml
│ ├── FirstPerson.yml
│ ├── Foreign.yml
│ ├── Gender.yml
│ ├── GenderBias.yml
│ ├── GeneralURL.yml
│ ├── HeadingAcronyms.yml
│ ├── HeadingColons.yml
│ ├── HeadingPunctuation.yml
│ ├── Headings.yml
│ ├── Hyphens.yml
│ ├── Negative.yml
│ ├── Ordinal.yml
│ ├── OxfordComma.yml
│ ├── Passive.yml
│ ├── Percentages.yml
│ ├── Quotes.yml
│ ├── RangeFormat.yml
│ ├── RangeTime.yml
│ ├── Ranges.yml
│ ├── Semicolon.yml
│ ├── SentenceLength.yml
│ ├── Spacing.yml
│ ├── Suspended.yml
│ ├── Terms.yml
│ ├── URLFormat.yml
│ ├── Units.yml
│ ├── Vocab.yml
│ ├── We.yml
│ ├── Wordiness.yml
│ └── meta.json
├── Readability
│ ├── AutomatedReadability.yml
│ ├── ColemanLiau.yml
│ ├── FleschKincaid.yml
│ ├── FleschReadingEase.yml
│ ├── GunningFog.yml
│ ├── LIX.yml
│ └── SMOG.yml
└── marktoso
    ├── Kiss.yml
    ├── Readability.yml
    ├── Spelling.yml
    ├── Substitute.yml
    └── TresComas.yml
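The requirement that every name in BasedOnStyles matches a folder under StylesPath is easy to break with a typo, so here is a minimal Python sketch that checks it. It is not part of vale. The helper names (parse_vale_ini, missing_styles) and the made-up [__root__] section are my own assumptions, and it only understands the simple .vale.ini layout shown above.

```python
import configparser
from pathlib import Path


def parse_vale_ini(text: str) -> configparser.ConfigParser:
    """Parse .vale.ini text with the standard library.

    .vale.ini has keys before any [section], which configparser rejects,
    so a made-up [__root__] header is prepended (an assumption of this sketch).
    """
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case: StylesPath, BasedOnStyles
    parser.read_string("[__root__]\n" + text)
    return parser


def missing_styles(ini_text: str, base_dir: Path) -> list:
    """Return style names from BasedOnStyles that have no folder under StylesPath."""
    parser = parse_vale_ini(ini_text)
    styles_path = base_dir / parser["__root__"].get("StylesPath", "styles")
    missing = []
    for section in parser.sections():
        if section == "__root__":
            continue
        declared = parser[section].get("BasedOnStyles", "")
        for name in (part.strip() for part in declared.split(",")):
            if name and not (styles_path / name).is_dir() and name not in missing:
                missing.append(name)
    return missing
```

Calling missing_styles(Path(".vale.ini").read_text(), Path(".")) from the root of the repository would return an empty list when every declared style has its folder.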
Usage
Usage is very straightforward. If you just want to check everything all at once, and you’re starting at the root of a directory structure where your posts are in content/posts, all you need to do is vale content/posts. This command will run vale on every file in that directory. You can substitute the path to an individual file for the path to the directory.
Readability style is actually giving me some issues in vale on macOS. Unsure if it’s limited to this particular machine or not but it throws a NaN error for just about every rule.
Hype beast linting
Styles are extensible; however, the best way to understand their implementation is to see some examples. One type of rule that didn’t quite make sense to me with the example provided in the docs was the spelling rule so I’ll break down one that was working. First, however, a Hunspell dictionary is required. errata.ai kindly provided an English dictionary to be used.
cd .. #get out of your project directory. we'll call it "hugo"
git clone https://github.com/errata-ai/en_US-web.git
mkdir -p hugo/assets/dictionaries
cp en_US-web/src/* hugo/assets/dictionaries
By creating a Spelling.yml in a subfolder of the styles directory (this is in marktoso for me but feel free to experiment. Feel out the room. Explore the space) and adding the lines below you’ve created a rule in that style.
extends: spelling
message: "Did you really mean '%s'?"
dicpath: assets/dictionaries
dictionaries:
- en_US-web
level: error
ignore:
- assets/ignoreWords.txt
dicpath is the relative path from where vale runs in my case. dicpath is a folder that can hold many dictionaries. They should be named uniquely (in this case the dictionary retrieved was en_US-web.aff and en_US-web.dic). Hunspell-compatible dictionaries use a pair of files, .aff and .dic, and both are required to have the same name.
dictionaries is which, of potentially many, dictionaries are to be applied in this rule. In most cases it won’t be important however this allows for the flexibility when assigning dictionaries to rules. Assigning multiple dictionaries is possible as this is a yaml list.
level is which of the three (error|warning|suggestion) levels of flag to be thrown when this rule hits.
ignore is a path to one or many text files that have one word per line. These words are to be ignored as spelling errors when checked against the above dictionaries.
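Two of the requirements above are easy to verify before vale ever runs: every dictionary needs both an .aff and a .dic file with the same name, and the ignore file is one word per line. This is a rough Python sketch of those two checks. The function names are hypothetical, and this is not how vale itself loads dictionaries.

```python
from pathlib import Path


def complete_dictionaries(dicpath: Path) -> set:
    """Return dictionary names that have both a .dic and an .aff file in dicpath."""
    dic_names = {p.stem for p in dicpath.glob("*.dic")}
    aff_names = {p.stem for p in dicpath.glob("*.aff")}
    return dic_names & aff_names


def load_ignore_words(ignore_file: Path) -> set:
    """Read an ignore file with one word per line, skipping blank lines."""
    lines = ignore_file.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}
```

With the layout from this post, complete_dictionaries(Path("assets/dictionaries")) should contain en_US-web. Any name listed under dictionaries in the rule that is absent from that set is worth a second look.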
The proof of the pudding is in the linting
I’m going to save this file as I’m working on it and run vale on it. These are the results:
content/posts/spellcheck-my-doc-before-its-too-late.md
10:58    error    Did you really mean 'Instatiating'?  marktoso.Spelling
10:558   warning  Try to avoid using first-person plural like 'We'.  Microsoft.We
15:1     warning  Use first person (such as 'I ') sparingly.  Microsoft.FirstPerson
15:51    warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
15:246   error    Did you really mean 'loosly'?  marktoso.Spelling
15:346   warning  Use first person (such as 'I'm') sparingly.  Microsoft.FirstPerson
15:541   warning  Consider removing 'quickly'.  Microsoft.Adverbs
15:567   warning  Use first person (such as 'me') sparingly.  Microsoft.FirstPerson
17:14    warning  Consider removing 'easily'.  Microsoft.Adverbs
17:195   error    More than 3 commas!  marktoso.TresComas
17:455   warning  Use first person (such as ' I, ') sparingly.  Microsoft.FirstPerson
17:459   warning  Consider removing 'ultimately'.  Microsoft.Adverbs
20:328   warning  Use first person (such as 'my') sparingly.  Microsoft.FirstPerson
20:423   warning  Use first person (such as 'My') sparingly.  Microsoft.FirstPerson
20:450   error    Use 'isn't' instead of 'is not'.  Microsoft.Contractions
20:501   warning  Use first person (such as 'me') sparingly.  Microsoft.FirstPerson
23:1     warning  Use first person (such as 'I ') sparingly.  Microsoft.FirstPerson
23:254   warning  Use first person (such as 'my') sparingly.  Microsoft.FirstPerson
23:333   error    Did you really mean 'taht'?  marktoso.Spelling
28:696   error    Did you really mean 'teh'?  marktoso.Spelling
30:73    warning  Consider removing 'fairly'.  Microsoft.Adverbs
30:100   error    Use 'doesn't' instead of 'does not'.  Microsoft.Contractions
38:25    error    Use 'that's' instead of 'that is'.  Microsoft.Contractions
38:286   error    Did you really mean 'commandline'?  marktoso.Spelling
40:9     error    Did you really mean 'md'?  marktoso.Spelling
40:126   warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
40:145   warning  Use first person (such as 'my') sparingly.  Microsoft.FirstPerson
40:259   error    Did you really mean 'subfolders'?  marktoso.Spelling
40:365   warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
40:413   warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
40:442   warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
40:495   warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
103:10   warning  Consider removing 'very'.  Microsoft.Adverbs
103:290  error    Did you really mean 'substitue'?  marktoso.Spelling
104:73   warning  Use first person (such as 'me') sparingly.  Microsoft.FirstPerson
108:245  warning  Use first person (such as 'me') sparingly.  Microsoft.FirstPerson
115:35   error    Did you really mean 'subfolder'?  marktoso.Spelling
115:96   warning  Use first person (such as 'me') sparingly.  Microsoft.FirstPerson
126:60   warning  Use first person (such as 'my') sparingly.  Microsoft.FirstPerson
126:183  error    Did you really mean 'retrived'?  marktoso.Spelling
127:33   warning  Consider removing 'potentially'.  Microsoft.Adverbs
127:96   warning  Consider using 'usually' instead of 'In most cases'.  Microsoft.Wordiness
127:267  error    Did you really mean 'yaml'?  marktoso.Spelling
132:1    warning  Use first person (such as 'I'm') sparingly.  Microsoft.FirstPerson
132:32   warning  Use first person (such as 'I'm') sparingly.  Microsoft.FirstPerson
✖ 15 errors, 30 warnings and 0 suggestions in 1 file.
But what does it all mean?
The point that I’m attempting to make here is that writing–human writing–is fraught with error. Writing within the constraints of a style can be very challenging when a style is designed in a way to strongly shape content for a specific purpose yet the content it is being applied to is broader or more informal as in the case of this blog and its posts. The lack of first person would be absurd as this is a personal blog and contains personal communication.
The spelling will be corrected but there is no covering up of flaws here as the intent is to be practically educational. The misspelling of “md” is particularly telling because the line and character in question is where I typed [*.md]. I’m unsure as to whether or not I will add that to ignoreWords.txt because I would like to be alerted if I simply typed “md” somewhere. Yet, in this case, it triggers an error. This could prove difficult when using the error condition to flag the build as failed in the pipeline.
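One way around a false positive like “md” failing the build is to let the pipeline decide what counts as a failure instead of relying on vale's exit status alone. The Python sketch below is one possible approach, not what my pipeline does. It assumes vale was run with --output=JSON and that the result is a mapping of file names to lists of alerts, each with a Severity field, which is my understanding of that format. The function names and the max_errors budget are hypothetical.

```python
import json
from collections import Counter


def count_severities(vale_json: str) -> Counter:
    """Tally alerts by severity from vale's JSON output (file name -> list of alerts)."""
    report = json.loads(vale_json)
    counts = Counter()
    for alerts in report.values():
        for alert in alerts:
            # "Severity" is assumed to hold error, warning, or suggestion.
            counts[alert.get("Severity", "unknown")] += 1
    return counts


def should_fail_build(vale_json: str, max_errors: int = 0) -> bool:
    """Fail only when the number of errors exceeds a tolerated budget."""
    return count_severities(vale_json)["error"] > max_errors
```

A budget is a blunt instrument. It tolerates a known false positive but it would also tolerate one real error, so adding the word to the ignore file is the more precise fix when that is acceptable.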
The other warnings, like the ones for contractions or adverbs, will help shape a consistent corpus of documentation or prose. In the event that your organization already possesses a mandated style then, by all means, adhere to it; however, it is important to use discretion and judgement in applying these rules when the scope of your work product is outside of styles generated by companies for business use. Why, then, would I pursue such an endeavour? The reason, I say to you, is because it’s cool.
Styles, by the way, are not the arbiter of quality–the authors are no less human (for the most part). I’ve read styles where the rules expressly check for, and forbid, the Oxford comma which, in my opinion, is a grave mistake. Additionally there are documentation style guidelines that insist on only the most simple “narrative”, for lack of a better term, which removes a certain amount of utility from documentation. I don’t feel judged by vale or the assigned styles, either. It is an impersonal parser that isn’t passing judgement on “quality” but on adherence to rules. Which can amount to quality over a large-enough body of work as it becomes more consistent but that’s a post for another blog.
I will have (and may already have) a small blurb at the bottom of every post with information from vale. In the sequel to this post I will be sticking this in my pipeline and smoking it.
./content/posts/spellcheck-my-doc-before-its-too-late.md
10:559   warning  Try to avoid using first-person plural like 'We'.  Microsoft.We
15:1     warning  Use first person (such as 'I ') sparingly.  Microsoft.FirstPerson
15:51    warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
15:246   warning  Consider removing 'loosely'.  Microsoft.Adverbs
15:347   warning  Use first person (such as 'I'm') sparingly.  Microsoft.FirstPerson
15:542   warning  Consider removing 'quickly'.  Microsoft.Adverbs
15:568   warning  Use first person (such as 'me') sparingly.  Microsoft.FirstPerson
17:14    warning  Consider removing 'easily'.  Microsoft.Adverbs
17:195   error    More than 3 commas!  marktoso.TresComas
17:455   warning  Use first person (such as ' I, ') sparingly.  Microsoft.FirstPerson
17:459   warning  Consider removing 'ultimately'.  Microsoft.Adverbs
20:328   warning  Use first person (such as 'my') sparingly.  Microsoft.FirstPerson
20:423   warning  Use first person (such as 'My') sparingly.  Microsoft.FirstPerson
20:450   error    Use 'isn't' instead of 'is not'.  Microsoft.Contractions
20:501   warning  Use first person (such as 'me') sparingly.  Microsoft.FirstPerson
23:1     warning  Use first person (such as 'I ') sparingly.  Microsoft.FirstPerson
23:254   warning  Use first person (such as 'my') sparingly.  Microsoft.FirstPerson
30:73    warning  Consider removing 'fairly'.  Microsoft.Adverbs
30:100   error    Use 'doesn't' instead of 'does not'.  Microsoft.Contractions
38:25    error    Use 'that's' instead of 'that is'.  Microsoft.Contractions
40:126   warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
40:145   warning  Use first person (such as 'my') sparingly.  Microsoft.FirstPerson
40:365   warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
40:413   warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
40:442   warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
40:495   warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
103:10   warning  Consider removing 'very'.  Microsoft.Adverbs
104:73   warning  Use first person (such as 'me') sparingly.  Microsoft.FirstPerson
108:245  warning  Use first person (such as 'me') sparingly.  Microsoft.FirstPerson
115:96   warning  Use first person (such as 'me') sparingly.  Microsoft.FirstPerson
126:60   warning  Use first person (such as 'my') sparingly.  Microsoft.FirstPerson
127:33   warning  Consider removing 'potentially'.  Microsoft.Adverbs
127:96   warning  Consider using 'usually' instead of 'In most cases'.  Microsoft.Wordiness
132:1    warning  Use first person (such as 'I'm') sparingly.  Microsoft.FirstPerson
132:32   warning  Use first person (such as 'I'm') sparingly.  Microsoft.FirstPerson
219:28   warning  Don't use end punctuation in headings.  Microsoft.HeadingPunctuation
220:16   warning  Use first person (such as 'I'm') sparingly.  Microsoft.FirstPerson
220:150  warning  Consider removing 'very'.  Microsoft.Adverbs
220:266  error    Use 'it's' instead of 'it is'.  Microsoft.Contractions
222:220  warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
222:239  warning  Use first person (such as 'I'm') sparingly.  Microsoft.FirstPerson
222:256  warning  Consider using 'whether' instead of 'whether or not'.  Microsoft.Wordiness
222:270  warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
222:315  warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
222:345  warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
224:239  error    Use 'it's' instead of 'it is'.  Microsoft.Contractions
224:425  warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
224:465  warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
226:21   error    Use 'aren't' instead of 'are not'.  Microsoft.Contractions
226:156  error    More than 3 commas!  marktoso.TresComas
226:197  warning  Use first person (such as 'my') sparingly.  Microsoft.FirstPerson
226:319  error    Punctuation should be inside the quotes.  Microsoft.Quotes
226:420  warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
226:481  error    Use 'it's' instead of 'It is'.  Microsoft.Contractions
228:1    warning  Use first person (such as 'I ') sparingly.  Microsoft.FirstPerson
228:134  warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
228:162  warning  Use first person (such as 'my') sparingly.  Microsoft.FirstPerson
230:210  warning  Use first person (such as ' I ') sparingly.  Microsoft.FirstPerson
✖ 10 errors, 48 warnings and 0 suggestions in 1 file.