Documentation for DevOps: A DevOpsDays Open Space Writeup

I’m borrowing this image from my friends at Chef, because it’s really funny to pretend that it’s possible to just sprinkle on “devops”, something that is about cultural change at every level.

At every DevOpsDays I go to (and that’s quite a few), I propose/wrangle/run two open spaces. One is about feature flags/canary launches/hypothesis-driven development, and the other is this one, which I usually refer to as “Docs for DevOps”.

DevOps documentation is difficult because it’s not intended for external audiences, so it’s hard to make a case for hiring a writer, but it’s also mission-critical, which means that it can’t be ignored or neglected.

I also have a talk that I give on this topic. You can find recordings from DevOpsDays Minneapolis and NDC Minnesota here. but this is specifically about the open space. I am indebted to a number of people in the last year who have come to open spaces, shared their experiences, asked questions, and challenged my thinking and assumptions.

I’ve grouped the questions I hear most often into these categories:

  • How can I find anything?
  • How do I get people to read the docs?
  • How do I get people to write the docs?
  • My team is small, do you have particular advice?
  • My team is very large, what should we do?
  • Distributed team best practices?
  • Why is sharepoint?
  • What tools are most useful?

How can I find anything?

Frequently, the first problem that people have is not that nothing is written down, but that lots of things are, and the problem is discoverability and accuracy. It’s one thing to have instructions, and quite another to have instructions that you can find when you need them. Many people have their operations documents spread across one or many wikis, ticketing systems, and document storage platforms.

The short answer to this problem is “search”. That is also most of the long answer. The thing I think we overlook is that we don’t have to use the search engine that came with our wiki. We could put something else in front of it. With the demise of the physical Google Search Appliance, you still have a number of service options. Most of them will give you the option to index and search multiple sites.

As a consultant who helped companies structure documentation so it was usable, I have a lot of specific advice on how to change the architecture and the culture, but honestly, if you could get your company to hire someone to help with that, you could probably get them to hire a writer and solve the problem another way.

The last part of this answer is that you could get some consulting dollars and hire an archivist/librarian to help you come up with tags that help identify the types of data and make it easier to search for. When we’re trying to recall data, we frequently do so by keyword or tag, and an archivist can help you make a canonical list of what people should use for that.

How do I get people to read what we have?

First, see above, make it easy to find.

Second, make it as easy as possible to read. Sit down in a group and hash out what type of information has to be included to make a topic/article/page useful to others. Once you know what you need to have included, you can make a template so that people remember to include it all.

The next problem is to figure out how to point people to the documentation without being dismissive or making them feel like you are too good to answer their questions. This might be a good place for a chatbot, or a single web page full of resources.

Many people would rather be able to find an answer themselves than bug a coworker to get it, so if they’re not reading the documentation, there’s probably a systemic problem blocking them.

How do you get people to write the docs?

Do you know any sysadmin or SRE who has ever been fired for lack of documentation?

Do you know of anyone who has gotten promoted solely on their ability to write internal documentation?

Most of us don’t. No matter how much organizations say they care about documentation, they are not putting anything of meaning behind those words. It’s not malicious, but it does tell you that documentation is not actually something they care about.

One of the ways to fix this is by using a reward system. Buy a package of Starbucks cards, or awesome stickers, or ice cream certificates, and give them out every time someone completes a documentation task, immediately. Adding a tally mark for their eventual review is much less motivational than immediate rewards for work. You don’t have to be a manager to do this – anyone can motivate their co-workers. The reward doesn’t have to have cash value, but I do think it’s the easiest way to indicate your appreciation.

Another thing to pay attention to is how easy or hard it is to write the documentation. If it’s a complicated system, the barrier to entry is high. If it’s in a tool the team already uses, in a language they already know, it’s much easier to persuade them to drop in a few lines. if there is a form for them to fill in so they don’t have to figure out what they are supposed to include, that’s even better. Lowering the cognitive barrier to entry is as important as the tool barrier to entry. After all, we’d have a lot more trouble writing bug reports if we had to write them out by longhand without any fields to fill in. Documentation is the same way.

My team is small

Consider using just one single page of truth. No, seriously. If you’re working with a team under 8, and especially if they’re distributed, consider just putting everything on one single web page and searching it when you need to find something. The overhead of managing a documentation system or wiki is not something you need when your team is that small. As you write new things, they go at the top and older stuff gets pushed down to the bottom and becomes obviously less important or relevant. If you need to find something, you can easily search the page. No one has a secret pile of documents that the rest of the team doesn’t know about. It’s not an elegant solution, but it’s hella utilitarian.

My team is large

Hire a writer. And/or a librarian. I was talking to someone whose IT organization was over seven thousand people. They obviously do need some structure, some tooling, some information architecture. Rather than spend internal IT cycles on it, I suggest hiring experts. There are thousands of people in this world who are trained to manage, collate, write, and wrangle large sets of documents, and it’s wasteful to try to do it by yourself.

What are the best practices for distributed teams?

Pretty much the only effective way to have a distributed team is to change the team culture to one where you write down questions and answers, instead of popping by to share them. It’s important to make sure that key data is always written down and not passed on by word of mouth, because as soon as you have an oral culture, you are leaving remote members out.

It’s useful to have at least one team member working remotely who has distributed team experience. You can ask them for what has worked for them in the past and enlist their help in iterating your practices. For example, is it hard for the team to remember to set up the call-in for meetings? Change the meeting template to include a shared meeting room automatically. No one gets a distributed team right immediately, anymore than we do a colocated team. It’s just that it’s easier for distributed teams to end up in a broken state before someone notices.

Why is SharePoint?

Because the paradigm and core analogy is filing cabinets. “What if I could see inside the Operations filing cabinet?” Because it thinks of itself as a way to organize pieces of paper, it’s not very good at documents that need to be shared and altered simultaneously across organizational boundaries. It was a marvel for its time, a way for non-technical users to get their documents out of shared drives and file cabinets and give other people regulated access, but it is fundamentally hostile to searching.

What tools are most useful?

It depends, right? Tools exist on a sliding scale between absolutely zero barrier to entry on one end and and taxanomic usefulness on the other. Every team must find their own best place, which may not be the same across the company. But if a team dislikes a tool or finds it burdensome, they just won’t use it.

The best tool is the one you can get people to adopt. If your team is 100% people who know how to use LateX and created their resumes and gaming character sheets in it, then it would be a fine tool. If you had a team composed entirely of people under 20, maybe a a youtube channel would be the way to go. Since none of us have completely uniform teams, the best thing we can do is to find a tool that everyone can grudgingly agree only sucks a little bit.

In conclusion

  • Use a better search engine
  • Hire experts whenever possible
  • Make it easier to write things by using templates
  • Iterate and keep improving your devops docs process. There’s no solution, just a fix for now.