Not Everything Can Be Vibe-Coded

Magesh Ravi on June 20, 2025

Two months ago, I set out to build what I assumed would be a simple pipeline: fetch a few hundred files from a specific Google Drive folder, process their contents, and load them into a database to support a RAG (Retrieval-Augmented Generation) system.

The folder had about 300 documents buried in a nest of subfolders — a mix of Google Docs, Slides, and Spreadsheets. I’d ignore everything else. Easy, right?

It wasn’t.

I sketched a basic DB schema and used GitHub Copilot to scaffold a Python CLI with typer. The core steps were simple in theory:

  1. Read the files,
  2. Generate vector embeddings from their contents,
  3. Store embeddings with metadata in the database.
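For context, the scaffold looked roughly like this. A minimal sketch only: the command and option names are placeholders, not the actual CLI.

```python
# Minimal sketch of the CLI scaffold (names are illustrative, not the real tool).
import typer

app = typer.Typer(help="Load a Google Drive folder into a RAG database.")

@app.command()
def ingest(
    folder_id: str = typer.Argument(..., help="Drive folder to crawl"),
    batch_size: int = typer.Option(10_000, help="Spreadsheet rows per read"),
) -> None:
    """Read files, embed their contents, store embeddings with metadata."""
    # 1. Read the files (Docs, Slides, Sheets; ignore everything else)
    # 2. Generate vector embeddings from their contents
    # 3. Store embeddings with metadata in the database
    typer.echo(f"Ingesting folder {folder_id} in batches of {batch_size} rows")

if __name__ == "__main__":
    app()
```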

Step 1: Reading the Files

The CLI worked fine on my local machine — at first. Docs and Slides went through cleanly. But some spreadsheets threw timeout errors.

❌ Problem 1: Timeouts on Large Spreadsheets

Some files had 160K to 200K rows. Reading them in one go was out of the question.

Fix: Batch reads in chunks of 10,000 rows. That solved the timeouts locally.
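The batching itself is nothing exotic. Here is a sketch of the idea against the Sheets API; the sheet name, column bound, and credentials handling are assumptions, not the original code.

```python
# Read spreadsheet rows in fixed-size batches via A1 ranges.
# Sheet name, column bound, and credential handling are assumptions.
from googleapiclient.discovery import build

def read_rows_in_batches(creds, spreadsheet_id, sheet="Sheet1",
                         last_col="Z", batch_size=10_000):
    service = build("sheets", "v4", credentials=creds)
    start = 1
    while True:
        end = start + batch_size - 1
        rng = f"{sheet}!A{start}:{last_col}{end}"   # e.g. Sheet1!A1:Z10000
        result = (service.spreadsheets().values()
                  .get(spreadsheetId=spreadsheet_id, range=rng)
                  .execute())
        rows = result.get("values", [])
        if not rows:
            break
        yield rows
        start = end + 1
```

Dropping batch_size turns out to be the only change Problem 2 needs.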

But when I tested the same script in production, inside a containerized Azure environment, it crashed. No errors in the logs. Just a silent failure.

❌ Problem 2: Memory Constraints

The container was limited to 1.5GB of memory — a limit I couldn’t change. Reading 10K rows at a time exhausted it.

Fix: Reduce batch size. Eventually settled on 1,000 rows per batch. It worked — until it didn’t.

❌ Problem 3: Rate Limits

At 1,000-row batches, Google Drive’s API rate limits (60 requests/sec) started kicking in.

Fix: Implemented a rolling-window rate limiter in code. Wait, retry, resume.
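The limiter itself is a few lines. A sketch of the rolling-window idea, with the 60-per-second budget from above; everything else is illustrative:

```python
# Sliding-window rate limiter: allow at most `max_calls` per `window` seconds.
import time
from collections import deque

class RollingWindowLimiter:
    def __init__(self, max_calls: int = 60, window: float = 1.0):
        self.max_calls = max_calls
        self.window = window
        self.calls: deque[float] = deque()   # timestamps of recent requests

    def wait(self) -> None:
        now = time.monotonic()
        # Drop timestamps that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest request exits the window, then re-check.
            time.sleep(self.window - (now - self.calls[0]))
            return self.wait()
        self.calls.append(now)

# Call limiter.wait() before every Drive/Sheets request; retry on 429 responses.
```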

Stable now.

Step 2: Creating Embeddings

Using OpenAI’s embedding API was straightforward. The call itself is only a couple of lines:
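```python
# Sketch of the embedding call (model name is illustrative, not prescriptive).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(texts: list[str]) -> list[list[float]]:
    response = client.embeddings.create(
        model="text-embedding-3-small",  # any embedding model; named for illustration
        input=texts,
    )
    return [item.embedding for item in response.data]
```

Until…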

❌ Problem 4: Token Limits

Even 1,000-row batches exceeded the model’s ~8K token limit.

Fix: Used tiktoken to estimate token size and split batches into chunks.
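Roughly like this; the encoding name is an assumption (it matches OpenAI’s current embedding models):

```python
# Estimate token counts with tiktoken and split a batch that exceeds the limit.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding name is an assumption

def split_by_token_limit(rows: list[str], max_tokens: int = 8_000) -> list[list[str]]:
    chunks, current, used = [], [], 0
    for row in rows:
        n = len(enc.encode(row))
        if current and used + n > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(row)
        used += n
    if current:
        chunks.append(current)
    return chunks
```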

But…

❌ Problem 5: Lost Context

Some rows got split mid-sentence or mid-table, losing coherence across chunks.

Fix: Introduced overlap between chunks (100 tokens) to preserve continuity.
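On top of the tiktoken encoding, the overlap is a small change. A sketch, with the chunk size picked for illustration (only the 100-token overlap comes from the actual fix):

```python
# Token-level chunks with a 100-token overlap between consecutive chunks.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # same encoding as above

def chunk_with_overlap(text: str, chunk_tokens: int = 2_000,
                       overlap: int = 100) -> list[str]:
    tokens = enc.encode(text)
    step = chunk_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(enc.decode(tokens[start:start + chunk_tokens]))
        if start + chunk_tokens >= len(tokens):
            break
    return chunks
```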


Reflections: More Than Code Completion

This pipeline ended up being one of the most complex pieces of software I’ve written in recent memory — and not because the logic was inherently difficult. It’s because robust software has to survive real-world constraints:

  - API timeouts on files with hundreds of thousands of rows,
  - hard memory ceilings inside containers,
  - third-party rate limits,
  - model token limits,
  - and failures that never show up in the logs.

And yet, from the outside, it’s just "read files → embed → save." This is the paradox of production systems: they look simple only when they’re built right.


The Myth of “Vibe-Coding”

There’s a trend in tech to assume that you can just prompt your way to a working solution. And while AI tools are powerful, they don’t replace engineering judgment.

You can’t prompt your way around:

  - a 1.5GB memory cap you don’t control,
  - a third-party API’s rate limits,
  - a model’s token window,
  - or a container that fails without writing a single log line.

These aren’t "AI problems." They’re engineering realities.


Closing Thought

You don’t need to write everything from scratch. But you do need people who can recognize complexity when it’s hiding behind a seemingly “simple” task.

Some systems can be vibe-coded.

But the systems that matter? They need engineers.