
TimescaleDB for Logs

Why time-series matters for observability

I almost made a huge mistake.

When I started building Recall, I reached for what I knew: PostgreSQL. Regular tables. Regular indexes. Ship fast, optimize later.

Then I ran some numbers. And my stomach dropped.

The math that scared me

A medium-sized Rails app generates maybe 10,000 log entries per day. That's 3.6 million per year. Per app.

If Recall has 100 customers? 360 million rows.

I've managed tables with hundreds of millions of rows before. They're miserable. Queries slow down. Inserts back up. Vacuuming takes forever.

I was about to build a product designed to collect massive amounts of data... on a database architecture that hates massive amounts of data.

Not great.

The discovery

A friend mentioned TimescaleDB. "It's just PostgreSQL," he said. "But for time-series data."

I was skeptical. "Just PostgreSQL" usually means "PostgreSQL with extra problems."

But I tried it anyway. And holy shit.

What makes it different

TimescaleDB does one clever thing: it splits your table into chunks by time.

Instead of one giant logs table with 100 million rows, you get:

logs (hypertable)
├── last week (chunk 1)
├── week before (chunk 2)
├── two weeks ago (chunk 3)
└── ...

When you query "logs from the last hour," the planner excludes every old chunk and only touches the recent one. The old chunks don't exist as far as the query is concerned.
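You can poke at this yourself. Chunks are ordinary Postgres tables under the hood, so something like this shows what's going on (assuming a hypertable named logs with a timestamp column):

  -- List the chunks backing the hypertable:
  SELECT show_chunks('logs');

  -- EXPLAIN a time-bounded query; old chunks won't appear in the plan:
  EXPLAIN SELECT * FROM logs
  WHERE timestamp > NOW() - INTERVAL '1 hour';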

Simple idea. Massive impact.

The benchmark that sold me

I loaded 10 million log entries into regular PostgreSQL and TimescaleDB. Same data. Same hardware.

Query: "Give me error logs from the last hour, ordered by time, limit 100."

Regular PostgreSQL: 2,400ms

TimescaleDB: 12ms

I ran it again. Same results. Not a fluke.

200x faster. On the same hardware. Same SQL.

I was sold.

The features I didn't expect

Here's what I love about TimescaleDB:

Automatic cleanup. One command tells it "delete data older than 30 days." It just... does it. No cron jobs, no manual purging. The old chunks disappear automatically.
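That command, for the 30-day case, is TimescaleDB's built-in policy API:

  -- Drop chunks older than 30 days; a background worker handles the scheduling:
  SELECT add_retention_policy('logs', INTERVAL '30 days');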

This is perfect for tiered pricing:
- Free tier: 7 days retention
- Pro tier: 30 days
- Enterprise: 365 days

One config change. Done.

Compression. Old chunks get compressed automatically. My test data compressed 12x. 100GB becomes 8GB. Still queryable.
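Enabling it is two statements. The segmentby column should be something you filter on constantly; app_id here is just an example, not my exact schema:

  -- Mark the hypertable as compressible:
  ALTER TABLE logs SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'app_id'
  );

  -- Compress chunks automatically once they're a week old:
  SELECT add_compression_policy('logs', INTERVAL '7 days');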

Disk space was one of my biggest cost concerns. TimescaleDB made it a non-issue.

It's still PostgreSQL. This is the killer feature. All my Rails knowledge works. ActiveRecord works. Migrations work. pg_dump works.

I didn't have to learn Elasticsearch. I didn't have to manage a cluster. It's just Postgres with superpowers.

The migration

Moving from regular PostgreSQL to TimescaleDB took about an hour:

  1. Install the TimescaleDB extension
  2. Change one line in my migration: SELECT create_hypertable('logs', 'timestamp')
  3. That's it
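In raw SQL, steps 1 and 2 look like this. If your table already has rows, migrate_data moves them into chunks; the one real gotcha is that any unique index (including the primary key) has to include the time column:

  -- Step 1: the extension
  CREATE EXTENSION IF NOT EXISTS timescaledb;

  -- Step 2: convert the existing table into a hypertable
  -- (locks the table while existing rows move into chunks):
  SELECT create_hypertable('logs', 'timestamp', migrate_data => true);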

My Rails code didn't change at all. It's still Log.where(timestamp: 1.hour.ago..).order(timestamp: :desc).limit(100).

ActiveRecord doesn't know the difference. Neither does my application code.

What I'd do differently

I wish I'd started with TimescaleDB from day one. Not because migration was hard—it wasn't. But because I spent mental energy worrying about scale that I didn't need to spend.

If you're building anything with timestamps—logs, metrics, events, analytics—just use TimescaleDB. Don't think about it. Don't "wait until you need it."

You'll need it. And the migration is easier at the beginning than later.

The stack now

Recall's data layer is dead simple:

  • TimescaleDB for storage (chunks, compression, retention)
  • Regular ActiveRecord for queries
  • MCP interface for Claude

No Elasticsearch cluster to manage. No Kafka for log ingestion. No complex data pipeline.

Just PostgreSQL. With superpowers.

Sometimes the boring choice is the right choice.

— Andres
