How to Write Developer Documentation People Actually Read and Use

The 40-Page Doc No One Opens

Your team spent 3 weeks writing The Complete Architecture Guide. Forty pages. Detailed diagrams. Every service documented. Posted to Confluence with celebration emojis.

Six months later:

New engineers still ask "how do I run this locally?" (answered on page 23).
Someone breaks production because they didn't know about the deployment process (documented on page 31).
A critical system change happens. No one updates the doc. Now half of it is wrong.

The doc exists. No one reads it. So people ask the same questions in Slack.

This isn't a failure of discipline. It's a failure of design. 40-page docs don't get read because they're optimized for comprehensive coverage, not for human usage.

Good developer docs aren't about completeness. They're about answering the right questions at the right time. They help someone get unblocked in 10 minutes, not after reading a novel.

Let's talk about documentation people actually open, trust, and return to.

What Good Developer Docs Actually Do

Before writing a doc, ask: What job is this doc doing?

Good docs serve specific purposes:

1. Help Someone Get Started Quickly

Job: New engineer (or your future self in 6 months) needs to run this service locally, understand what it does, and make their first change.

Success: They're productive in 30 minutes without asking anyone for help.

2. Help Someone Debug or Change Something Safely

Job: Engineer hits an error, needs to understand a component, or wants to add a feature without breaking things.

Success: They find the relevant context, understand the constraints, and make the change confidently.

3. Preserve Decisions and Context Over Time

Job: Six months from now, someone asks "Why did we build it this way?"

Success: The doc explains the tradeoffs, alternatives considered, and why this choice made sense at the time.

Types of Docs You Need

Different jobs need different doc types:

README / Getting Started: "How do I run this?" (for repos/services)
Architecture Overview: "How does this system work?" (high-level)
Runbooks: "Something broke, what do I do?" (on-call guides)
Decision Records (ADRs): "Why did we choose this?" (preserving context)

Don't write one giant doc that tries to be all four. Write focused docs for specific jobs.

The Core Types of Docs You Need

1. README / Getting Started

Purpose: Get someone from zero to running the service locally in <30 minutes.

What it should contain:

What this service does (2-3 sentences)
Prerequisites (dependencies, tools, versions)
How to run it locally (step-by-step)
How to run tests
Common issues and troubleshooting
Links to deeper docs (architecture, deployment, API docs)

Example README structure:

# Order Service

Handles order creation, payment processing, and fulfillment workflows.

## Prerequisites

- Node.js 18+
- PostgreSQL 14+
- Redis 6+

## Running Locally

1. Install dependencies:

npm install


2. Set up database:

createdb orders_dev
npm run migrate


3. Copy environment file:

cp .env.example .env

Edit `.env` and set `DATABASE_URL` and `REDIS_URL`.

4. Start the server:

npm run dev


Service runs at http://localhost:3000

## Running Tests

npm test


## Common Issues

**"Connection refused" when starting:**  
Make sure PostgreSQL and Redis are running:

brew services start postgresql
brew services start redis


**"Migration failed":**  
Drop and recreate the database:

dropdb orders_dev && createdb orders_dev
npm run migrate


## More Documentation

- [Architecture Overview](docs/architecture.md)
- [API Documentation](docs/api.md)
- [Deployment Guide](docs/deployment.md)

Key principles:

Start with the goal ("run this locally")
Be specific: exact commands, not "install the dependencies"
Anticipate common failures: include troubleshooting
Keep it short: If it's more than 2 screens, split it up

2. Architecture Overview

Purpose: Help someone understand the high-level system design without reading code.

What it should contain:

System context: What problem does this solve? How does it fit in the larger system?
Key components: Main services, databases, queues, caches
Data flow: How requests move through the system
Key design decisions: Why this architecture?

Focus on stable concepts, not implementation details.

Example architecture doc snippet:

# Order Service Architecture

## Context

Order Service handles the full order lifecycle: creation, payment, fulfillment, and cancellation. It's called by the Web/Mobile APIs and integrates with Payment Service and Fulfillment Service.

## High-Level Architecture

[Web/Mobile API]
       ↓
[Order Service]
   ↓         ↓
[Payment] [Fulfillment]
       ↓
  [Database]


## Key Components

**Order API**: REST API for creating/updating orders.  
**Order Processor**: Background worker that processes payment and fulfillment.  
**PostgreSQL**: Stores order data.  
**Redis**: Caches user cart data (TTL 24 hours).  
**SQS Queue**: Async jobs for fulfillment notifications.

## Data Flow: Creating an Order

1. Client calls `POST /orders` with cart data.
2. Order Service validates inventory and creates order record (status: `pending`).
3. Order Service calls Payment Service to charge card.
4. If payment succeeds, order status → `paid`. Job enqueued to Fulfillment Service.
5. If payment fails, order status → `failed`. Client notified.

## Key Design Decisions

**Why async fulfillment?**  
Fulfillment can take 5-30 seconds (warehouse API calls, label generation). We don't want to block the user's request. Orders are created immediately, fulfillment happens async.

**Why Redis for cart?**  
Carts are high-write, low-durability (OK to lose if Redis restarts). Keeping them out of Postgres reduces write load.

**Why SQS instead of in-process jobs?**  
Fulfillment jobs can fail and need retries. SQS gives us durability and visibility. We can also scale workers independently.
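The "Creating an Order" flow in the example above is essentially a small state machine. Here is a minimal Python sketch of steps 2 to 5, assuming two hypothetical callables, `charge` and `enqueue_fulfillment`, that stand in for Payment Service and the queue producer. Neither name comes from the example doc, and inventory validation is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Order:
    order_id: str
    status: str = "pending"  # step 2: the record is created as pending


def create_order(
    order_id: str,
    charge: Callable[[str], bool],               # hypothetical Payment Service call
    enqueue_fulfillment: Callable[[str], None],  # hypothetical queue producer
) -> Order:
    order = Order(order_id)
    if charge(order_id):               # step 3: charge the card
        order.status = "paid"          # step 4: mark the order paid...
        enqueue_fulfillment(order_id)  # ...and hand off to async fulfillment
    else:
        order.status = "failed"        # step 5: the caller notifies the client
    return order
```

The request returns as soon as the job is enqueued, which is the point of the "Why async fulfillment?" decision above.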

What to include:

Text + simple diagrams (ASCII art, Mermaid, or simple boxes-and-arrows)
Flows, not just static structures
Rationale for non-obvious choices

What to skip:

Implementation details (class names, file structure—that's in code comments)
Details that change frequently (specific config values, IP addresses)

3. Runbooks: On-Call Guides

Purpose: Help someone debug and fix a production issue at 3am with minimal context.

Audience: Your future stressed-out self.

Structure:

Symptoms: Alert name, error message, or user report
Immediate steps: Checklist to diagnose and mitigate
How to escalate: Who to contact if stuck
Links: Dashboards, logs, deeper investigation docs

Example runbook:

# Runbook: High Order API Latency

## Symptoms

- Alert: "Order API p99 latency > 2 seconds"
- Users reporting slow checkout

## Immediate Checks

1. **Check dashboard**: [Order Service Metrics](https://grafana.company.com/orders)
   - Look for: request rate spike, error rate, database query time

2. **Check database**:
   ```sql
   -- Find slow queries
   SELECT query, mean_exec_time 
   FROM pg_stat_statements 
   ORDER BY mean_exec_time DESC 
   LIMIT 10;
   ```

3. **Check Redis**: Is cache hit rate lower than normal? (normal: >80%)
   ```
   redis-cli INFO stats | grep keyspace
   ```

4. **Check Payment Service**: Is payment API slow? (check #payment-alerts in Slack)

## Common Causes & Fixes

### Cause 1: Database Connection Pool Exhausted

**Symptom**: Logs show "connection pool timeout"

**Fix**: Restart the service (temporarily):

kubectl rollout restart deployment/order-service -n production

**Follow-up**: Increase connection pool size (requires deploy).

### Cause 2: Redis Cache Miss Storm

**Symptom**: Redis hit rate <50%, database queries spiking

**Fix**: Warm the cache:

kubectl exec -it order-service-pod -n production -- npm run cache:warm

### Cause 3: Payment Service Timeout

**Symptom**: Payment Service latency >5 seconds

**Escalate**: Page Payment team (#payment-alerts).

**Mitigation**: Consider enabling circuit breaker (requires code change).

## Escalation

- During business hours: Post in #order-service Slack channel
- After hours: Page on-call engineer: @oncall-orders in PagerDuty

## Deeper Investigation


Key principles:

Start with what you see (alert, error), not "if you have problem X..."
Checklists, not paragraphs
Assume stress and low context: step-by-step commands, not "check the logs"
Link to monitoring: dashboards, log queries, metrics
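The Redis check in the example runbook prints raw `keyspace_hits` and `keyspace_misses` counters, so at 3am someone still has to do the division. A runbook can ship a small helper for that. The sketch below assumes the standard `key:value` lines that `redis-cli INFO stats` emits; the function name is made up for illustration.

```python
def cache_hit_rate(info_output: str) -> float:
    """Return the hit rate (0.0 to 1.0) from `redis-cli INFO stats` output."""
    stats = {}
    for line in info_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in ("keyspace_hits", "keyspace_misses"):
            stats[key] = int(value)
    hits = stats.get("keyspace_hits", 0)
    total = hits + stats.get("keyspace_misses", 0)
    # No lookups recorded yet: report 0.0 rather than dividing by zero.
    return hits / total if total else 0.0
```

These counters are cumulative since the server started (or since the stats were last reset), so comparing two samples taken a minute apart gives a better picture of the current hit rate than a single reading.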

4. Architecture Decision Records (ADRs)

Purpose: Document why key technical decisions were made, preserving context for future engineers.

Format: Short (1-2 pages), structured docs stored in the repo.

When to write an ADR:

Choosing a database, message queue, or major technology
Significant architecture changes (sharding, microservices split)
Trade-offs that will affect the team for years

Structure:
```markdown
# ADR ###: [Title]

**Status**: Accepted | Rejected | Superseded  
**Date**: YYYY-MM-DD  
**Decision Owner**: Name

## Context
Why are we making this decision? What problem or constraint drove it?

## Decision
What did we decide?

## Alternatives Considered
What other options did we evaluate? Why did we reject them?

## Consequences
What are the benefits and trade-offs of this decision?
```

Example ADR:

# ADR 005: Use PostgreSQL JSON Columns for Order Metadata

**Status**: Accepted  
**Date**: 2025-11-10  
**Decision Owner**: Ruchit Suthar

## Context
Orders have flexible metadata (gift messages, custom engraving text, promo codes applied). Different order types have different metadata. Adding a column for each field bloats the schema and requires migrations for every new field.

## Decision
Store flexible metadata in a JSONB column: `orders.metadata`.

## Alternatives Considered

**Option 1: Separate `order_metadata` table**  
Rejected: Requires JOIN on every order fetch. Complicates queries.

**Option 2: Add columns as needed**  
Rejected: Frequent schema migrations. Unclear which fields are actually used.

**Option 3: NoSQL document store (MongoDB)**  
Rejected: Rest of the system is PostgreSQL. Adds operational complexity. JSONB in Postgres gives us flexibility without leaving the RDBMS.

## Consequences

**Benefits**:
- No schema changes for new metadata fields.
- Can query JSON with PostgreSQL's JSON operators: `metadata->>'gift_message'`.

**Trade-offs**:
- Harder to enforce schema (no strong typing). Mitigated with application-level validation.
- Less efficient than indexed columns for high-frequency queries. Acceptable for metadata (queried infrequently).

**Migration plan**:
Existing metadata columns (`gift_message`, `promo_code`) will be migrated into `metadata` JSONB. Old columns deprecated in 3 months.

Benefits of ADRs:

New engineers understand "why" instead of reverse-engineering from code
Prevents re-litigating old decisions ("why didn't we use MongoDB?")
Searchable history of technical choices
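A "searchable history" is easier to get if the ADRs can be listed at a glance. The following sketch builds a one-line-per-ADR index from files that follow the template above. It assumes each file starts with a `# ADR NNN: Title` heading and contains a `**Status**:` line; the function names and the directory you pass in are illustrative, not an established convention.

```python
import re
from pathlib import Path
from typing import Optional

TITLE = re.compile(r"^# ADR (\d+): (.+)$", re.MULTILINE)
STATUS = re.compile(r"^\*\*Status\*\*: *(\w+)", re.MULTILINE)


def parse_adr(text: str) -> Optional[dict]:
    """Extract number, title, and status from one ADR; None if there is no ADR heading."""
    title = TITLE.search(text)
    if not title:
        return None
    status = STATUS.search(text)
    return {
        "number": int(title.group(1)),
        "title": title.group(2).strip(),
        "status": status.group(1) if status else "Unknown",
    }


def build_index(adr_dir: Path) -> list:
    """Return one summary line per ADR file in adr_dir, sorted by ADR number."""
    adrs = []
    for path in adr_dir.glob("*.md"):
        adr = parse_adr(path.read_text(encoding="utf-8"))
        if adr:
            adrs.append(adr)
    adrs.sort(key=lambda a: a["number"])
    return [f"ADR {a['number']:03d} [{a['status']}] {a['title']}" for a in adrs]
```

Printing the result of `build_index` into the README or a wiki page gives newcomers a quick answer to "has this already been decided?".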

Designing READMEs That Unblock People in 10 Minutes

READMEs are your most important doc. They're the first thing anyone reads.

Template for a good README:

Section 1: What This Does (2-3 sentences)

# Service Name

A short description (1-3 sentences) of what this service does and who uses it.

Example:

Order Service handles order creation, payment processing, and fulfillment for all e-commerce purchases.

Section 2: Quick Start (How to Run Locally)

## Quick Start

1. Install dependencies: [exact command]
2. Set up database: [exact command]
3. Configure environment: [exact steps]
4. Run the service: [exact command]

Section 3: Testing

## Running Tests

[exact command to run tests]

## Running Specific Tests

[how to run one test file or test case]

Section 4: Common Issues

## Troubleshooting

**Error: "X"**  
Cause: Y  
Fix: Z

Section 5: Links to Deeper Docs

## Documentation

- [Architecture Overview](docs/architecture.md)
- [API Documentation](docs/api.md)
- [Deployment Guide](docs/deployment.md)
- [Runbook](docs/runbook.md)

Keep it short: If your README is >200 lines, split sections into separate docs and link them.
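The template and the length limit above are mechanical enough to check automatically. This is a minimal sketch of such a check, assuming the section names from the template (`Quick Start`, `Running Tests`, `Troubleshooting`, `Documentation`) and the 200-line threshold; `lint_readme` is a hypothetical helper, not an existing tool.

```python
import re

REQUIRED_SECTIONS = ("Quick Start", "Running Tests", "Troubleshooting", "Documentation")
MAX_LINES = 200  # the threshold suggested above


def lint_readme(text: str) -> list:
    """Return a list of problems; an empty list means the README passes."""
    problems = []
    lines = text.splitlines()
    if len(lines) > MAX_LINES:
        problems.append(f"README is {len(lines)} lines; consider splitting (limit {MAX_LINES})")
    # Collect the text of every markdown heading, whatever its level.
    headings = set()
    for line in lines:
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            headings.add(match.group(1).strip())
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            problems.append(f"missing section: {section}")
    return problems
```

The heading scan is deliberately naive: it also counts `#` lines inside fenced code blocks, which is usually fine for a reminder-level check.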

Making Docs Discoverable and Trusted

Even great docs are useless if no one can find them.

1. Standard Locations and Naming

Put docs where people expect them:

README.md in repo root
/docs folder for deeper docs
Link from dashboards, alerts, and wikis

Use consistent naming:

architecture.md
runbook.md
api.md
deployment.md
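Consistent names make it cheap to check whether a repo has the docs people expect. As a rough sketch, assuming the layout described above (a root `README.md` plus the four files under `docs/`), a helper could report what is missing. The expected list is an assumption to adapt per team.

```python
from pathlib import Path

# Assumed layout: README.md in the repo root, deeper docs under docs/.
EXPECTED_DOCS = (
    "README.md",
    "docs/architecture.md",
    "docs/runbook.md",
    "docs/api.md",
    "docs/deployment.md",
)


def missing_docs(repo_root: Path) -> list:
    """Return the expected doc paths that do not exist under repo_root."""
    return [rel for rel in EXPECTED_DOCS if not (repo_root / rel).is_file()]
```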

2. Link from the Tools People Use

Dashboards: Link runbooks from Grafana/Datadog alerts
Error messages: Link troubleshooting guides from log messages
Repos: Link architecture docs from README
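Linking troubleshooting guides from error messages can be as simple as attaching a URL to the exception. The sketch below shows one way to do it; the error code, the `docs.example.com` URL, and the `DocumentedError` class are all hypothetical placeholders.

```python
RUNBOOK_URLS = {
    # Hypothetical error code and URL, for illustration only.
    "DB_POOL_EXHAUSTED": "https://docs.example.com/order-service/runbook#connection-pool",
}


class DocumentedError(Exception):
    """An error whose message points at the doc that explains how to fix it."""

    def __init__(self, code: str, message: str):
        self.code = code
        self.doc_url = RUNBOOK_URLS.get(code)
        # Only append the link when a doc is registered for this code.
        suffix = f" (see {self.doc_url})" if self.doc_url else ""
        super().__init__(f"[{code}] {message}{suffix}")
```

Whoever reads the log line then lands on the right runbook section without searching for it.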

3. Show Last-Updated Date and Owner

**Last Updated**: 2025-11-10  
**Owner**: @ruchit (Slack: @ruchit)

This answers two questions up front: "Is this doc current?" and "Who do I ask if it's wrong?"
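A machine-readable date also makes it possible to flag docs that have not been touched in a while. Here is a minimal sketch, assuming docs carry a `**Last Updated**: YYYY-MM-DD` line as shown above. The 180-day threshold is an arbitrary assumption, not a recommendation from any standard.

```python
import re
from datetime import date
from typing import Optional

LAST_UPDATED = re.compile(r"\*\*Last Updated\*\*:\s*(\d{4}-\d{2}-\d{2})")


def days_since_update(doc_text: str, today: date) -> Optional[int]:
    """Return the doc's age in days, or None if it has no Last Updated line."""
    match = LAST_UPDATED.search(doc_text)
    if not match:
        return None
    return (today - date.fromisoformat(match.group(1))).days


def is_stale(doc_text: str, today: date, max_age_days: int = 180) -> bool:
    """Treat undated docs as stale: nobody can tell whether they are current."""
    age = days_since_update(doc_text, today)
    return age is None or age > max_age_days
```

A stale flag is a prompt for the owner to review the doc, not proof that it is wrong.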

4. Delete Outdated Docs

Zombie docs are worse than no docs. If a doc is outdated and no one will update it, delete it.

Better to have 5 accurate docs than 50 docs where 30 are wrong and no one knows which.

Closing: Documentation as Part of the Definition of Done

Documentation isn't bureaucracy. It's craftsmanship.

Good docs:

Speed up onboarding
Reduce support burden
Preserve knowledge when people leave
Prevent repeated mistakes

Bad docs (or no docs):

Tribal knowledge stays locked in people's heads
Same questions in Slack every week
New engineers blocked for days
Context lost forever when someone quits

Treat docs as part of shipping code. A feature isn't done until:

README updated (if relevant)
Runbook updated (if on-call needs to know)
ADR written (if major decision)
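One lightweight way to keep this in view is a review-time reminder when a change touches code but no docs. The sketch below takes a list of changed file paths (for example from a diff) and assumes docs are either `.md` files or live under `docs/`; both the rule and the function name are illustrative.

```python
def docs_reminders(changed_paths: list) -> list:
    """Return reminders for a change set that touches code but no docs."""

    def is_doc(path: str) -> bool:
        # Assumption: docs are markdown files or anything under docs/.
        return path.endswith(".md") or path.startswith("docs/")

    code_changed = any(not is_doc(p) for p in changed_paths)
    docs_changed = any(is_doc(p) for p in changed_paths)
    if code_changed and not docs_changed:
        return ["Code changed but no docs did: does the README, runbook, or an ADR need an update?"]
    return []
```

It returns a reminder, not a failure, because many code changes legitimately need no doc update.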

Checklist: Upgrade Docs for One Service This Week

Pick one service/repo and improve its docs:

README exists and covers: What it does, how to run locally, how to test, common issues
Architecture doc exists (high-level system design, key components, data flows)
Runbook exists (for production service): symptoms, immediate steps, escalation
ADRs exist for major technical decisions (at least 1-2 key choices documented)
Docs are discoverable: Linked from README, dashboards, and team wiki
Docs show last-updated date and owner
Outdated docs deleted (if any)

This takes 2-4 hours. It saves weeks of confusion.

The best docs are boring. They're short, focused, and answer one question clearly. They're updated when things change. They're linked from the places people actually look.

Write docs people will actually read. Your future self—and your teammates—will thank you.

About Ruchit Suthar
