Database migrations are your deployment's weakest link. Here's how to strengthen them. 💾
The Migration Nightmare Scenarios:
🔥 The Friday Night Horror • Migration runs for 6 hours on production • Locks critical tables • Users can't access the application • Weekend ruined, customers angry
💀 The Rollback Trap • New migration deployed • Application breaks unexpectedly • Try to rollback... migration is irreversible • Data lost, panic ensues
🐛 The Environment Drift • Migration works in dev and staging • Fails in production due to data differences • Production has edge cases dev didn't account for
Common Migration Mistakes:
❌ Not Testing with Production-Like Data
-- Works with 1000 test records
ALTER TABLE users ADD COLUMN preferences JSON;
-- Fails with 10M production records (takes hours)
❌ Breaking Changes Without Coordination
-- This breaks running application instances
ALTER TABLE orders DROP COLUMN legacy_status;
-- Better: Mark deprecated first, remove later
❌ No Rollback Strategy
-- Can't undo this easily
UPDATE users SET email = LOWER(email);
-- Better: Keep original data for rollback
❌ Ignoring Lock Duration
-- Locks table for entire migration
ALTER TABLE large_table ADD COLUMN new_field VARCHAR(255) NOT NULL DEFAULT '';
-- Better: Add nullable first, populate, then make NOT NULL
The Safe Migration Framework:
🎯 Phase 1: Additive Changes
-- Add new column as nullable
ALTER TABLE users ADD COLUMN new_email VARCHAR(255);
-- Create index concurrently (PostgreSQL)
CREATE INDEX CONCURRENTLY idx_users_new_email ON users(new_email);
🔄 Phase 2: Data Population
-- Populate in batches to avoid locks
UPDATE users SET new_email = email
WHERE id BETWEEN 1 AND 1000 AND new_email IS NULL;
📊 Phase 3: Application Update • Deploy application to use new column • Monitor for issues • Ensure dual-write if needed
🧹 Phase 4: Cleanup
-- Make NOT NULL after population
ALTER TABLE users ALTER COLUMN new_email SET NOT NULL;
-- Drop old column in separate deployment
ALTER TABLE users DROP COLUMN email;
Migration Best Practices:
⚡ Performance Considerations
Batch Large Updates:
-- Bad: Updates all rows at once
UPDATE large_table SET status = 'active' WHERE status = 'pending';
-- Good: Process in batches
WHILE @@ROWCOUNT > 0
BEGIN
UPDATE TOP(1000) large_table
SET status = 'active'
WHERE status = 'pending';
END
Use Online Schema Changes:
-- PostgreSQL: Add column without locks
ALTER TABLE users ADD COLUMN new_field TEXT;
-- MySQL: Use pt-online-schema-change for large tables
pt-online-schema-change --alter "ADD COLUMN new_field TEXT" D=mydb,t=users --execute
🛡️ Safety Checks
Pre-Migration Validation:
-- Check data constraints before migration
SELECT COUNT(*) FROM users WHERE email IS NULL;
-- If > 0, fix data first
-- Check table size
SELECT COUNT(*) FROM large_table;
-- Plan migration strategy based on size
Rollback Preparation:
-- Create backup before destructive changes
CREATE TABLE users_backup AS SELECT * FROM users;
-- Document rollback steps
-- ROLLBACK PLAN:
-- 1. Stop application
-- 2. DROP TABLE users;
-- 3. RENAME users_backup TO users;
-- 4. Restart application
🔧 Migration Tools & Techniques
Feature Flags for Schema Changes:
if (featureFlag.newUserSchema) {
// Use new column
user.email = user.new_email;
} else {
// Use old column
user.email = user.email;
}
Shadow Tables:
-- Create new table structure
CREATE TABLE users_v2 (
id SERIAL PRIMARY KEY,
email VARCHAR(255) NOT NULL,
-- new schema
);
-- Migrate data gradually
-- Switch application to new table
-- Drop old table
🚨 Emergency Procedures
Migration Monitoring:
-- Monitor long-running queries
SELECT pid, query, state, query_start
FROM pg_stat_activity
WHERE state = 'active' AND query_start < NOW() - INTERVAL '1 hour';
Quick Rollback Checklist:
- Can we stop the migration safely?
- Is the application still functional?
- Do we have a backup?
- What's the rollback procedure?
- How long will rollback take?
The Golden Rules:
- Never drop columns in the same deployment as code changes
- Always test migrations on production-sized data
- Have a rollback plan before you start
- Monitor lock duration and blocking queries
- Use feature flags for gradual rollouts
Remember: Migrations are code. Treat them with the same rigor as application code—testing, code review, and careful deployment.
Your database is your application's foundation. Don't let bad migrations crack it.
What's your worst migration story? 😅
