</>StackKit
</>StackKit

Developer tutorials & guides

Database server infrastructure
Database

Database Indexing Explained Simply (With Real Query Examples)

Understand what database indexes are, how they work under the hood, when to add them, and common indexing mistakes that hurt performance.

S
Sam Rivera
April 5, 20257 min read
#database#sql#postgres#performance#backend

What Is a Database Index?

An index is a separate data structure — usually a B-tree — that the database maintains alongside your table to speed up lookups. Think of it like a book's index: instead of reading every page to find "authentication," you jump straight to the page number listed in the index.

Without an index, a query like SELECT * FROM users WHERE email = 'alex@example.com' scans every row in the table — an O(n) operation. With an index on email, it's O(log n).


How Indexes Work Internally

PostgreSQL (and most databases) use a B-tree (balanced tree) as the default index structure:

  • All leaf nodes are at the same depth
  • Each lookup traverses from root to leaf in O(log n) time
  • The tree automatically stays balanced on inserts/updates

For a 10-million row table, that's about 23 comparisons instead of up to 10,000,000.


Creating Indexes

-- Single column index
CREATE INDEX idx_users_email ON users(email);

-- Composite index (column order matters!)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

-- Unique index (also enforces constraint)
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);

-- Partial index (only indexes rows matching condition)
CREATE INDEX idx_active_users ON users(email) WHERE active = true;

When to Add an Index

Good candidates:

  • Columns in WHERE clauses you run frequently
  • Columns used in JOIN conditions
  • Columns in ORDER BY when sorting large result sets
  • Columns with high cardinality (many unique values)

Poor candidates:

  • Low cardinality columns (gender, status with 2-3 values) — full scan is often faster
  • Columns on small tables (< 1,000 rows) — full scans are fast enough
  • Columns that are rarely queried

Composite Indexes: Column Order Matters

CREATE INDEX idx_orders ON orders(user_id, status, created_at);

This index can satisfy:

  • WHERE user_id = 5
  • WHERE user_id = 5 AND status = 'pending'
  • WHERE user_id = 5 AND status = 'pending' AND created_at > '2025-01-01'
  • WHERE status = 'pending' ✗ (doesn't start with user_id)
  • WHERE created_at > '2025-01-01'

The leftmost prefix rule: the index is usable for any query that specifies the leading columns in order.


Using EXPLAIN ANALYZE

Always verify your indexes are being used:

EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 123 AND status = 'pending'
ORDER BY created_at DESC;

Look for:

  • Index Scan — using the index ✓
  • Seq Scan — full table scan (check if you need an index)
  • Bitmap Heap Scan — partial index use ✓
  • cost and actual time numbers

Common Indexing Mistakes

1. Over-indexing

Every index speeds up reads but slows down writes. Each INSERT, UPDATE, DELETE must update all indexes on that table. A table with 15 indexes on a high-write workload will be slower than one with 5 well-chosen indexes.

2. Using functions on indexed columns

-- This WON'T use the index on email:
WHERE LOWER(email) = 'alex@example.com'

-- Fix: create a functional index
CREATE INDEX idx_users_lower_email ON users(LOWER(email));

-- Or store emails pre-lowercased

3. Implicit type casting

-- email is VARCHAR, this won't use the index:
WHERE email = 12345

-- Fix: match types
WHERE email = '12345'

Monitoring Index Usage

-- Find unused indexes (PostgreSQL)
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY tablename;

Drop indexes that never get used — they only cost write performance.


Conclusion

Indexes are one of the highest-leverage performance tools in a developer's toolkit. The formula is simple: identify slow queries with EXPLAIN ANALYZE, add indexes on the columns being filtered/joined/sorted, verify the index is used, and monitor for unused indexes over time. Most performance problems in database-backed apps are solved with the right indexing strategy.

#database#sql#postgres#performance#backend