Spellcheck My Doc, Before It's Too Late

Posted on Dec 26, 2021

Writing is its own skill with its own set of challenges. Instantiating thought into representative syntax is fraught with the potential for errors and issues. It’s axiomatic of the human condition but, also axiomatic to the human condition, people have created some tools to assist with that. Browsing the site you may have seen other references to vale. You may have even already seen the little slugs at the bottom of every post with a link to this page. If you were curious then you’ve come to the right place. We’re going to talk tech writing.

Trying out tech writing

I took a technical writing class in university and I underestimated how challenging it was going to be. The idea of constructing communication to be precise is standard when you think about the syntax of (some) programming languages. Even the loosely typed languages are going to be more precise than the way that humans interact via language. I’m particularly thinking of English and its inherent ambiguity. Yet it still has a complicated set of grammar rules that make it difficult to learn. Through this one singular 3-credit course it quickly became obvious to me that precise communication that was technical in nature required forethought. The lesson that took longer to sink in was that precise technical communication also required revision.

Revision can easily take longer than the actual writing process. Checking, and double checking are important especially when the work product is factual in nature. It’s critical that the meaning, and intention, are clear and, as previously stated, English is ambiguous. To address this there is a burgeoning movement in the tech space called Write the Docs which promotes documentation as an asset. Through this community I, ultimately, discovered vale. I’ve since set up vale in the pipeline for this website and there will be some information at the bottom of each post based on the output from it.

What vale can do

Vale is a capable context-sensitive linter. In this section there will be some examples from documentation, from other sources, and from myself. The unifying statement here is that vale can leverage technology alongside proper configuration to improve consistency and overall quality in technical writing. You may find some of my posts which retain a significant number of errors and warnings. These are a badge of honor. My personal site, overall, is not a technical document. It’s an expression of me as a person involved in technology which is less formal than technical documentation.

Installation

I used Homebrew to install vale. I’ve harped upon the benefits of Homebrew before so I’ll save the diatribe. Having Homebrew installed, it would be as simple as typing brew install vale. There is a two-part post on the topic of how my pipeline is set up as well as several posts detailing changes so understand that this may not match up with what your environment is like. If you’re reading this far after the post has gone up then it’s possible that I’ve already changed the pipeline. Browsing the pipeline tag may be productive.

That’s it, though. Homebrew will handle it. This holds true for macOS, as well, which is great.

Configuration

Out of the box, vale doesn’t do much at all. This is the most laborious part of the process. People, in the world, know this and have made things easier. vale has the concept of a “style”, which is a collection of rules that can trigger alerts like errors, warnings, and suggestions. The quickest way forward is to pull a style from an established project. Note that there are different types of styles and, while most are focused on technical documentation, there are other styles that are aimed at other kinds of work. I’ve seen styles from Google and Homebrew, and have implemented on this site the style from Microsoft as well as others.

The configuration of vale depends mostly on a .vale.ini file which is fairly straightforward. It does not require a significant amount of information.

StylesPath = styles

[*.md]
BasedOnStyles = marktoso, Microsoft
MinAlertLevel = suggestion

This is the .vale.ini that is presently in use (at the time of writing, not reading) on this site. It has a declaration for StylesPath which, importantly, is relative to where you would be executing vale from. Similarly, .vale.ini should be in the directory from which you are executing vale; however, there are commandline flags that can override that. This is executed in the pipeline, therefore .vale.ini is located in the root of the repository and styles is a directory at the root of the repository.

The [*.md] declaration informs vale that the following is to be applied to *.md files (Markdown), which is the format that I use for writing my posts in Hugo. BasedOnStyles indicates which styles to apply to this type of file. The styles are located in subfolders and their declaration should match their folder name. I’d like to point out that the rules that I made were more for testing purposes than because I actually have a style that I adhere to. This is a personal blog. I will do what I like.

styles
├── Microsoft
│   ├── AMPM.yml
│   ├── Accessibility.yml
│   ├── Acronyms.yml
│   ├── Adverbs.yml
│   ├── Auto.yml
│   ├── Avoid.yml
│   ├── ComplexWords.yml
│   ├── Contractions.yml
│   ├── Dashes.yml
│   ├── DateFormat.yml
│   ├── DateNumbers.yml
│   ├── DateOrder.yml
│   ├── Ellipses.yml
│   ├── FirstPerson.yml
│   ├── Foreign.yml
│   ├── Gender.yml
│   ├── GenderBias.yml
│   ├── GeneralURL.yml
│   ├── HeadingAcronyms.yml
│   ├── HeadingColons.yml
│   ├── HeadingPunctuation.yml
│   ├── Headings.yml
│   ├── Hyphens.yml
│   ├── Negative.yml
│   ├── Ordinal.yml
│   ├── OxfordComma.yml
│   ├── Passive.yml
│   ├── Percentages.yml
│   ├── Quotes.yml
│   ├── RangeFormat.yml
│   ├── RangeTime.yml
│   ├── Ranges.yml
│   ├── Semicolon.yml
│   ├── SentenceLength.yml
│   ├── Spacing.yml
│   ├── Suspended.yml
│   ├── Terms.yml
│   ├── URLFormat.yml
│   ├── Units.yml
│   ├── Vocab.yml
│   ├── We.yml
│   ├── Wordiness.yml
│   └── meta.json
├── Readability
│   ├── AutomatedReadability.yml
│   ├── ColemanLiau.yml
│   ├── FleschKincaid.yml
│   ├── FleschReadingEase.yml
│   ├── GunningFog.yml
│   ├── LIX.yml
│   └── SMOG.yml
└── marktoso
    ├── Kiss.yml
    ├── Readability.yml
    ├── Spelling.yml
    ├── Substitute.yml
    └── TresComas.yml
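
Since every name in BasedOnStyles is expected to match a folder under StylesPath, a quick sanity check can catch a typo before vale ever runs. What follows is a minimal sketch, not anything vale ships: parse_vale_ini and missing_styles are hypothetical helpers, the parser only understands the small subset of .vale.ini shown above, and it assumes every listed style lives on disk (a style that vale provides built in would show up as missing here).

```python
from pathlib import Path


def parse_vale_ini(text):
    """Tiny reader for the subset of .vale.ini shown above (not vale's own parser)."""
    settings = {"": {}}  # "" holds top-level keys such as StylesPath
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            settings.setdefault(section, {})
        elif "=" in line:
            key, value = line.split("=", 1)
            settings[section][key.strip()] = value.strip()
    return settings


def missing_styles(ini_text, root):
    """Return the BasedOnStyles names that have no folder under StylesPath."""
    settings = parse_vale_ini(ini_text)
    styles_dir = Path(root) / settings[""].get("StylesPath", "styles")
    missing = []
    for values in settings.values():
        for name in values.get("BasedOnStyles", "").split(","):
            name = name.strip()
            if name and not (styles_dir / name).is_dir() and name not in missing:
                missing.append(name)
    return missing


# The same configuration as shown earlier in this post.
VALE_INI = """\
StylesPath = styles

[*.md]
BasedOnStyles = marktoso, Microsoft
MinAlertLevel = suggestion
"""

if __name__ == "__main__":
    # Run from the repository root; an empty list means every style has a folder.
    print(missing_styles(VALE_INI, "."))
```

With the directory tree above in place, the list should come back empty, because both marktoso and Microsoft exist under styles.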

Usage

Usage is very straightforward. If you just want to check everything all at once, and you’re starting at the root of a directory structure where your posts are in content/posts, all you need to do is vale content/posts. This command will run vale on every file in that directory. You can substitute the path to the directory for the path to an individual file.

💡 The Readability style is actually giving me some issues in vale on macOS. Unsure if it’s limited to this particular machine or not but it throws a NaN error for just about every rule.
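
If you'd rather consume the results from a script than read the table, vale can emit JSON with --output=JSON. The sketch below tallies alerts per severity. run_vale and tally are hypothetical helper names, and SAMPLE is a hand-written, trimmed-down report rather than real output; it assumes the report maps each file path to a list of alerts that carry a Severity field.

```python
import json
import subprocess
from collections import Counter


def run_vale(path):
    """Run vale on a file or directory and return its JSON report as text."""
    result = subprocess.run(
        ["vale", "--output=JSON", path],
        capture_output=True,
        text=True,
    )
    return result.stdout


def tally(report_json):
    """Count alerts per severity across every file in a JSON report."""
    report = json.loads(report_json)
    counts = Counter()
    for alerts in report.values():
        for alert in alerts:
            counts[alert["Severity"]] += 1
    return counts


# Hand-written stand-in for a report (an assumption, trimmed to a few fields).
SAMPLE = json.dumps({
    "content/posts/example.md": [
        {"Check": "marktoso.Spelling", "Severity": "error", "Line": 10},
        {"Check": "Microsoft.We", "Severity": "warning", "Line": 10},
        {"Check": "Microsoft.Adverbs", "Severity": "warning", "Line": 15},
    ]
})

if __name__ == "__main__":
    print(dict(tally(SAMPLE)))
```

With vale installed, tally(run_vale("content/posts")) should give the same kind of count for the real posts.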

Hype beast linting

Styles are extensible; however, the best way to understand their implementation is to see some examples. One type of rule that didn’t quite make sense to me with the example provided in the docs was the spelling rule so I’ll break down one that was working. First, however, a Hunspell dictionary is required. errata.ai kindly provided an English dictionary to be used.

cd .. # get out of your project directory, which we'll call "hugo"
git clone https://github.com/errata-ai/en_US-web.git
mkdir -p hugo/assets/dictionaries # -p also creates assets if it's missing
cp en_US-web/src/* hugo/assets/dictionaries

By creating a Spelling.yml in a subfolder of the styles directory (this is in marktoso for me but feel free to experiment. Feel out the room. Explore the space) and adding the lines below you’ve created a rule in that style.

extends: spelling
message: "Did you really mean '%s'?"
dicpath: assets/dictionaries
dictionaries:
  - en_US-web
level: error
ignore:
  - assets/ignoreWords.txt
  • dicpath is the relative path from where vale runs in my case. dicpath is a folder that can hold many dictionaries. They should be named uniquely (in this case the dictionary retrieved was en_US-web.aff and en_US-web.dic). Hunspell-compatible dictionaries use a pair of files, .aff and .dic, and both are required to have the same name.
  • dictionaries is which, of potentially many, dictionaries are to be applied in this rule. In most cases it won’t be important however this allows for the flexibility when assigning dictionaries to rules. Assigning multiple dictionaries is possible as this is a yaml list.
  • level is which of the three levels (error, warning, or suggestion) is raised when this rule hits.
  • ignore is a path to one or many text files that have one word per line. These words are to be ignored as spelling errors when checked against the above dictionaries.
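
To make the relationship between the dictionaries and the ignore list concrete, here is a toy model of the lookup. It is a conceptual sketch only and not how vale or Hunspell work internally: it skips the .aff affix rules entirely, compares words case-insensitively (an assumption), and the tiny DICTIONARY and IGNORED word lists are made up for the example.

```python
import re


def load_words(text):
    """Read one word per line, skipping blank lines."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def unknown_words(text, dictionary, ignored):
    """Return words found in neither the dictionary nor the ignore list."""
    known = dictionary | ignored
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    return [word for word in words if word.lower() not in known]


# Made-up stand-ins for a .dic file and an ignoreWords.txt file.
DICTIONARY = load_words("the\ndictionary\nwas\nretrieved\nas\n")
IGNORED = load_words("yaml\nsubfolder\n")

if __name__ == "__main__":
    print(unknown_words("The dictionary was retrived as yaml", DICTIONARY, IGNORED))
```

Only the genuine typo is reported; "yaml" passes because it sits in the ignore list even though the dictionary doesn't know it.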

The proof of the pudding is in the linting

I’m going to save this file as I’m working on it and run vale on it. These are the results:

content/posts/spellcheck-my-doc-before-its-too-late.md
 10:58    error    Did you really mean             marktoso.Spelling      
                   'Instatiating'?                                        
 10:558   warning  Try to avoid using              Microsoft.We           
                   first-person plural like 'We'.                         
 15:1     warning  Use first person (such as 'I    Microsoft.FirstPerson  
                   ') sparingly.                                          
 15:51    warning  Use first person (such as ' I   Microsoft.FirstPerson  
                   ') sparingly.                                          
 15:246   error    Did you really mean 'loosly'?   marktoso.Spelling      
 15:346   warning  Use first person (such as       Microsoft.FirstPerson  
                   'I'm') sparingly.                                      
 15:541   warning  Consider removing 'quickly'.    Microsoft.Adverbs      
 15:567   warning  Use first person (such as       Microsoft.FirstPerson  
                   'me') sparingly.                                       
 17:14    warning  Consider removing 'easily'.     Microsoft.Adverbs      
 17:195   error    More than 3 commas!             marktoso.TresComas     
 17:455   warning  Use first person (such as ' I,  Microsoft.FirstPerson  
                   ') sparingly.                                          
 17:459   warning  Consider removing               Microsoft.Adverbs      
                   'ultimately'.                                          
 20:328   warning  Use first person (such as       Microsoft.FirstPerson  
                   'my') sparingly.                                       
 20:423   warning  Use first person (such as       Microsoft.FirstPerson  
                   'My') sparingly.                                       
 20:450   error    Use 'isn't' instead of 'is      Microsoft.Contractions 
                   not'.                                                  
 20:501   warning  Use first person (such as       Microsoft.FirstPerson  
                   'me') sparingly.                                       
 23:1     warning  Use first person (such as 'I    Microsoft.FirstPerson  
                   ') sparingly.                                          
 23:254   warning  Use first person (such as       Microsoft.FirstPerson  
                   'my') sparingly.                                       
 23:333   error    Did you really mean 'taht'?     marktoso.Spelling      
 28:696   error    Did you really mean 'teh'?      marktoso.Spelling      
 30:73    warning  Consider removing 'fairly'.     Microsoft.Adverbs      
 30:100   error    Use 'doesn't' instead of 'does  Microsoft.Contractions 
                   not'.                                                  
 38:25    error    Use 'that's' instead of 'that   Microsoft.Contractions 
                   is'.                                                   
 38:286   error    Did you really mean             marktoso.Spelling      
                   'commandline'?                                         
 40:9     error    Did you really mean 'md'?       marktoso.Spelling      
 40:126   warning  Use first person (such as ' I   Microsoft.FirstPerson  
                   ') sparingly.                                          
 40:145   warning  Use first person (such as       Microsoft.FirstPerson  
                   'my') sparingly.                                       
 40:259   error    Did you really mean             marktoso.Spelling      
                   'subfolders'?                                          
 40:365   warning  Use first person (such as ' I   Microsoft.FirstPerson  
                   ') sparingly.                                          
 40:413   warning  Use first person (such as ' I   Microsoft.FirstPerson  
                   ') sparingly.                                          
 40:442   warning  Use first person (such as ' I   Microsoft.FirstPerson  
                   ') sparingly.                                          
 40:495   warning  Use first person (such as ' I   Microsoft.FirstPerson  
                   ') sparingly.                                          
 103:10   warning  Consider removing 'very'.       Microsoft.Adverbs      
 103:290  error    Did you really mean             marktoso.Spelling      
                   'substitue'?                                           
 104:73   warning  Use first person (such as       Microsoft.FirstPerson  
                   'me') sparingly.                                       
 108:245  warning  Use first person (such as       Microsoft.FirstPerson  
                   'me') sparingly.                                       
 115:35   error    Did you really mean             marktoso.Spelling      
                   'subfolder'?                                           
 115:96   warning  Use first person (such as       Microsoft.FirstPerson  
                   'me') sparingly.                                       
 126:60   warning  Use first person (such as       Microsoft.FirstPerson  
                   'my') sparingly.                                       
 126:183  error    Did you really mean             marktoso.Spelling      
                   'retrived'?                                            
 127:33   warning  Consider removing               Microsoft.Adverbs      
                   'potentially'.                                         
 127:96   warning  Consider using 'usually'        Microsoft.Wordiness    
                   instead of 'In most cases'.                            
 127:267  error    Did you really mean 'yaml'?     marktoso.Spelling      
 132:1    warning  Use first person (such as       Microsoft.FirstPerson  
                   'I'm') sparingly.                                      
 132:32   warning  Use first person (such as       Microsoft.FirstPerson  
                   'I'm') sparingly.                                      

✖ 15 errors, 30 warnings and 0 suggestions in 1 file.

But what does it all mean?

The point that I’m attempting to make here is that writing–human writing–is fraught with error. Writing within the constraints of a style can be very challenging when a style is designed in a way to strongly shape content for a specific purpose yet the content it is being applied to is broader or more informal as in the case of this blog and its posts. The lack of first person would be absurd as this is a personal blog and contains personal communication.

The spelling will be corrected but there is no covering up of flaws here as the intent is to be practically educational. The misspelling of “md” is particularly telling because the line and character in question are where I typed [*.md]. I’m unsure as to whether or not I will add that to ignoreWords.txt because I would like to be alerted if I simply typed “md” somewhere. Yet, in this case, it triggers an error. This could prove difficult when using the error condition to flag the build as failed in the pipeline.
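
One way to soften that in a pipeline is to gate on the summary line with a small error budget instead of failing on the first error. This is a minimal sketch under assumptions: parse_summary and should_fail are hypothetical helpers, the pattern is written against the summary format shown in the output above, and the budget idea is mine rather than a vale feature.

```python
import re

# Matches lines like "15 errors, 30 warnings and 0 suggestions in 1 file."
SUMMARY_PATTERN = re.compile(
    r"(\d+) errors?, (\d+) warnings? and (\d+) suggestions? in (\d+) files?"
)


def parse_summary(line):
    """Pull the three alert counts out of a summary line."""
    match = SUMMARY_PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a vale summary line: {line!r}")
    errors, warnings, suggestions, _files = (int(g) for g in match.groups())
    return {"errors": errors, "warnings": warnings, "suggestions": suggestions}


def should_fail(line, max_errors=0):
    """Fail the build only when errors exceed the allowed budget."""
    return parse_summary(line)["errors"] > max_errors


if __name__ == "__main__":
    summary = "✖ 15 errors, 30 warnings and 0 suggestions in 1 file."
    print(parse_summary(summary))
    print(should_fail(summary))
```

Setting max_errors to a small number would let a known false positive such as “md” through while still failing the build when new errors appear.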

The other warnings, like the ones for contractions or adverbs, will help shape a consistent corpus of documentation or prose. In the event that your organization already possesses a mandated style then, by all means, adhere to it; however, it is important to use discretion and judgement in applying these rules when the scope of your work product is outside of styles generated by companies for business use. Why, then, would I pursue such an endeavour? The reason, I say to you, is because it’s cool.

Styles, by the way, are not the arbiter of quality–the authors are no less human (for the most part). I’ve read styles where the rules expressly check for, and forbid, the Oxford comma which, in my opinion, is a grave mistake. Additionally, there are documentation style guidelines that insist on only the most simple “narrative”, for lack of a better term, which removes a certain amount of utility from documentation. I don’t feel judged by vale or the assigned styles, either. It is an impersonal parser that isn’t passing judgement on “quality” but on adherence to rules, which can amount to quality over a large-enough body of work as it becomes more consistent, but that’s a post for another blog.

I will have (and may already have) a small blurb at the bottom of every post with information from vale. In the sequel to this post I will be sticking this in my pipeline and smoking it.

Hi, this post was checked with vale which is a content-aware linter. It was checked using the Microsoft style as well as some rules that I made. A summary of those results is below. For more details as to how this was put together, check out this post. This post had: 10 errors, 47 warnings and 0 suggestions. For details on the linting of this post:
 ./content/posts/spellcheck-my-doc-before-its-too-late.md
 10:559   warning  Try to avoid using              Microsoft.We                 
                   first-person plural like 'We'.                               
 15:1     warning  Use first person (such as 'I    Microsoft.FirstPerson        
                   ') sparingly.                                                
 15:51    warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 15:246   warning  Consider removing 'loosely'.    Microsoft.Adverbs            
 15:347   warning  Use first person (such as       Microsoft.FirstPerson        
                   'I'm') sparingly.                                            
 15:542   warning  Consider removing 'quickly'.    Microsoft.Adverbs            
 15:568   warning  Use first person (such as       Microsoft.FirstPerson        
                   'me') sparingly.                                             
 17:14    warning  Consider removing 'easily'.     Microsoft.Adverbs            
 17:195   error    More than 3 commas!             marktoso.TresComas           
 17:455   warning  Use first person (such as ' I,  Microsoft.FirstPerson        
                   ') sparingly.                                                
 17:459   warning  Consider removing               Microsoft.Adverbs            
                   'ultimately'.                                                
 20:328   warning  Use first person (such as       Microsoft.FirstPerson        
                   'my') sparingly.                                             
 20:423   warning  Use first person (such as       Microsoft.FirstPerson        
                   'My') sparingly.                                             
 20:450   error    Use 'isn't' instead of 'is      Microsoft.Contractions       
                   not'.                                                        
 20:501   warning  Use first person (such as       Microsoft.FirstPerson        
                   'me') sparingly.                                             
 23:1     warning  Use first person (such as 'I    Microsoft.FirstPerson        
                   ') sparingly.                                                
 23:254   warning  Use first person (such as       Microsoft.FirstPerson        
                   'my') sparingly.                                             
 30:73    warning  Consider removing 'fairly'.     Microsoft.Adverbs            
 30:100   error    Use 'doesn't' instead of 'does  Microsoft.Contractions       
                   not'.                                                        
 38:25    error    Use 'that's' instead of 'that   Microsoft.Contractions       
                   is'.                                                         
 40:126   warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 40:145   warning  Use first person (such as       Microsoft.FirstPerson        
                   'my') sparingly.                                             
 40:365   warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 40:413   warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 40:442   warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 40:495   warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 103:10   warning  Consider removing 'very'.       Microsoft.Adverbs            
 104:73   warning  Use first person (such as       Microsoft.FirstPerson        
                   'me') sparingly.                                             
 108:245  warning  Use first person (such as       Microsoft.FirstPerson        
                   'me') sparingly.                                             
 115:96   warning  Use first person (such as       Microsoft.FirstPerson        
                   'me') sparingly.                                             
 126:60   warning  Use first person (such as       Microsoft.FirstPerson        
                   'my') sparingly.                                             
 127:33   warning  Consider removing               Microsoft.Adverbs            
                   'potentially'.                                               
 127:96   warning  Consider using 'usually'        Microsoft.Wordiness          
                   instead of 'In most cases'.                                  
 132:1    warning  Use first person (such as       Microsoft.FirstPerson        
                   'I'm') sparingly.                                            
 132:32   warning  Use first person (such as       Microsoft.FirstPerson        
                   'I'm') sparingly.                                            
 219:28   warning  Don't use end punctuation in    Microsoft.HeadingPunctuation 
                   headings.                                                    
 220:16   warning  Use first person (such as       Microsoft.FirstPerson        
                   'I'm') sparingly.                                            
 220:150  warning  Consider removing 'very'.       Microsoft.Adverbs            
 220:266  error    Use 'it's' instead of 'it is'.  Microsoft.Contractions       
 222:220  warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 222:239  warning  Use first person (such as       Microsoft.FirstPerson        
                   'I'm') sparingly.                                            
 222:256  warning  Consider using 'whether'        Microsoft.Wordiness          
                   instead of 'whether or not'.                                 
 222:270  warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 222:315  warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 222:345  warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 224:239  error    Use 'it's' instead of 'it is'.  Microsoft.Contractions       
 224:425  warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 224:465  warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 226:21   error    Use 'aren't' instead of 'are    Microsoft.Contractions       
                   not'.                                                        
 226:156  error    More than 3 commas!             marktoso.TresComas           
 226:197  warning  Use first person (such as       Microsoft.FirstPerson        
                   'my') sparingly.                                             
 226:319  error    Punctuation should be inside    Microsoft.Quotes             
                   the quotes.                                                  
 226:420  warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 226:481  error    Use 'it's' instead of 'It is'.  Microsoft.Contractions       
 228:1    warning  Use first person (such as 'I    Microsoft.FirstPerson        
                   ') sparingly.                                                
 228:134  warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                
 228:162  warning  Use first person (such as       Microsoft.FirstPerson        
                   'my') sparingly.                                             
 230:210  warning  Use first person (such as ' I   Microsoft.FirstPerson        
                   ') sparingly.                                                

10 errors, 48 warnings and 0 suggestions in 1 file.