The most consistent failure mode of AI-built and vibecoded systems is the schema. It looks reasonable for three use cases and becomes unworkable the moment the fourth one arrives. Everything else — the auth, the deploys, the front-end — can be patched in place. The schema can't. It is the part of the system everything else is anchored to, and it is the part the model has the least incentive to get right.
We rebuild the same handful of patterns every time. None of them are clever. All of them are dull, finite, and the highest-leverage work in any pre-launch engagement. This post is the catalogue.
Foreign keys, not "we'll join in application code"
The prototype gets away without foreign keys because the prototype has one developer, one database, and a small number of rows that were inserted in a known order. Referential integrity is enforced by the fact that the same human wrote the inserts and the reads in the same afternoon. Nothing has had time to drift.
Production doesn't get this. Production has a worker that retries, a background job that runs out of order, a deploy that rolls back halfway, a support tool that does an UPDATE someone didn't think about. Without foreign keys the database will, eventually, contain a row that references a parent that no longer exists, or never existed. The bug surfaces three weeks later as a 500 on a page nobody owns, and it takes a day to track down because the data on disk is now lying to you about what's possible.
A foreign key is one line of DDL. It is also a contract the database enforces on every writer, including the writers you haven't built yet. We add them everywhere, even where the model assured us we wouldn't need them.
Real constraints, not vibes
NOT NULL, CHECK, UNIQUE. These three constraints carry most of the load of keeping a production database honest, and they are the ones AI-generated schemas treat as optional decoration.
The pattern we see most often: a column that is "always set in practice" but nullable in the schema, because the prototype's seed script happened to set it. Six months later there are rows where it is null, because a code path nobody remembered didn't set it, and now every query has to defensively handle the null case. Multiply that across a schema and you have a codebase where every read is wrapped in conditionals that exist because the database wasn't asked to do its job.
Worse than no constraints is vibes-coded constraints. A CHECK that allows three values where the application only ever produces two, "in case we need it later". A UNIQUE on the wrong column pair. A NOT NULL on a column the application sets to an empty string when it doesn't have a real value. These look like discipline and aren't. They give the next engineer false confidence about what's in the table.
We write constraints that match what the application actually does, and we write them tight. If the application produces two values, the CHECK allows two values. If a column is required, it is NOT NULL and the application is changed to provide a value, not the other way around.
Indexes designed for the queries you actually run
Models guess at indexes. They are not bad guesses — they will index the foreign keys, they will index the obvious lookup columns, they will sometimes add a composite that looks plausible. They are also, almost always, not the indexes the application ends up needing.
The cost of guessing wrong has two halves. The first is indexes that exist and aren't used: dead weight on every write, dead pages in cache, dead bytes in backups. The second is queries that should use an index and don't, because the index was built on the wrong column order or the wrong predicate, and the planner quietly does a sequential scan instead. The application is slow and nobody knows why.
We design indexes against the actual query patterns once we have them. That means looking at pg_stat_statements, reading EXPLAIN ANALYZE on the queries that matter, and deleting indexes that have zero scans after a week of real traffic. Partial indexes, where the workload is skewed, are often worth ten times what a full index costs. None of this is exotic; all of it requires sitting with the database for an afternoon.
Migrations that run forward and backward
Most rollback attempts fail. The reason is almost always the same: the forward migration changed data, and the down migration only knows how to undo the schema change, not the data change.
A migration that adds a column with a default, backfills the existing rows, then drops the old column is three operations. The forward path runs them in order. The backward path has to recreate the old column, copy the data back (now possibly transformed), and drop the new one. If the data was lossy in either direction — a string truncated, an enum collapsed, a timestamp coerced — the rollback can't restore it. The team finds this out at the worst possible moment.
We write down migrations that actually work, and we test them against a copy of production data before we trust them. Often the right answer is to make the forward migration non-destructive: add the new shape, run both shapes for a release, switch reads, then drop the old shape in a later migration. Boring, slow, and recoverable. The opposite of clever.
JSONB is a tool, not a victory lap
JSONB columns are the place AI-generated schemas hide the decisions they didn't want to make. "We'll figure out the structure later" becomes a data jsonb column that accumulates fields nobody documented, types nobody enforced, and access patterns nobody indexed.
JSONB is genuinely useful. It is the right answer for sparse, evolving, tenant-specific shapes where promoting every field to a real column would be ridiculous. It is the wrong answer for the core entity of the system, where the fields are known, the queries are known, and the only reason it's JSONB is that the model didn't want to commit.
Our rule: every JSONB column gets an honest accounting of what's in it, a documented expected shape, and a plan for which fields will be extracted to real columns once their access patterns settle. We add CHECK constraints with jsonb_typeof where the shape is stable. We add expression indexes where a specific path is queried frequently. And we revisit the column every quarter to see what should now be a real column with a real constraint.
What this costs and what it's worth
Getting the schema right before launch is roughly one engineering week on a system of moderate size. It is dull work. There is nothing to demo at the end of it. The screenshots look the same as they did before.
Getting it wrong is months of incremental pain after the first hundred customers arrive. It is the on-call page at 2am because a NULL slipped into a column that "couldn't" have one. It is the migration that takes the database down for forty minutes because nobody added a concurrent index. It is the support ticket that can't be resolved because two rows reference a tenant that was deleted in March and the foreign key wasn't there to stop it.
Schema is the part of the system you cannot refactor without affecting every other part. Adding an index is cheap. Changing a column type on a live table with a hundred million rows is a project. Splitting a JSONB blob into normalised columns once half the application reads from it is a project that takes a quarter.
The discipline of getting it right before launch is dull, finite, and the highest-leverage week in any pre-launch engagement. We don't enjoy this work more than anyone else does. We just know what it costs not to do it.


