Six Days

I want to be honest about what “built in six days” means, because most posts that use that framing are describing something much simpler than what I built, and they leave out the parts that actually matter.

WhereToAdvisor has a scoring engine across eight facets, 100+ destinations, personalized results based on user identity profiles, a dealbreaker system with real-time recalculation, a data pipeline pulling from 18 sources, and an ethical architecture that required as much design thinking as any technical feature.

Six days. Here is what that actually looked like, and what it tells me about where product development is heading.

Day one was not code

The first thing I did was write a PRD.

Not because I was following a process. Because I had learned, over years of building products with teams, that ambiguous specs produce ambiguous software. AI is faster than any engineering team I have worked with. It is also completely literal. Give it a clear spec and it executes. Give it a vague one and it fills the gaps with guesses, and you spend the time you saved on rework.

Day one was product decisions. A PRD. Three design specs: the dealbreaker system, the real estate sub-facet, and a combined climate zone framework and demographic data inventory. A responsible use section for the acceptance facet, given the same priority as a security requirement.

The spec work took roughly the first two days. It saved multiples of that in implementation time. That ratio is not obvious until you have experienced the alternative, which is prompting AI in real time and iterating your way to something coherent. That approach works for simple products. It does not work when the product has ethical constraints built into the architecture and a data model that has to hold together across eight facets and 100+ destinations.

Figure: Horizontal bar chart showing how the six days of build time were allocated. Spec and PRD 30%, data pipeline and normalization 20%, scoring engine 15%, quiz and results UI 20%, content pipeline and blog 10%, QA and launch 5%.

Nearly a third of the build was spec work. That ratio is intentional.

What AI handled well

I want to be specific here because the vague version of this claim is not useful to anyone.

The data pipeline. AI wrote the ingestion scripts, normalization functions, and Supabase integration once I had documented the methodology. What would have been several days of engineering work took hours.

The Next.js App Router structure. Component scaffolding, API route patterns, Vercel deployment configuration. Reliable and fast. I described what I needed and it built it.

The scoring engine implementation. Once I had specified the weighting logic, normalization rules, and persona multipliers, translating that spec into working code was straightforward. AI did not design the scoring model. It implemented the one I designed.

Debugging. Paste an error, get a diagnosis and a fix. This alone reclaims hours across a six-day build.

The blog content pipeline. This one is worth describing in detail because it illustrates what AI-assisted building actually looks like at its best. The pipeline queries Supabase for destination and scoring data, applies persona weight multipliers, calls the Anthropic API to generate narrative content per destination, and outputs Portable Text for direct entry into Sanity CMS. A process that would have taken most of a day manually is now a script that runs in minutes. I described what I needed. AI built it.
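To make the last stage of that pipeline concrete, here is a minimal sketch of turning generated narrative text into Portable Text blocks. It is not the production script. The function name `toPortableText` and the key scheme are assumptions, and the Supabase query and Anthropic API call are left out entirely, so the input is a plain string standing in for generated content.

```typescript
// Minimal Portable Text shapes: a block holds spans, a span holds text.
interface PortableTextSpan {
  _type: "span";
  _key: string;
  text: string;
  marks: string[];
}

interface PortableTextBlock {
  _type: "block";
  _key: string;
  style: string;
  markDefs: unknown[];
  children: PortableTextSpan[];
}

// Hypothetical helper: splits narrative text on blank lines and emits
// one "normal" block per paragraph, each with a single unmarked span.
function toPortableText(narrative: string): PortableTextBlock[] {
  return narrative
    .split(/\n\s*\n/)
    .map((paragraph) => paragraph.trim())
    .filter((paragraph) => paragraph.length > 0)
    .map((text, i) => ({
      _type: "block" as const,
      _key: `b${i}`,
      style: "normal",
      markDefs: [],
      children: [{ _type: "span" as const, _key: `b${i}s0`, text, marks: [] }],
    }));
}

// Placeholder text standing in for model output.
const blocks = toPortableText("First paragraph.\n\nSecond paragraph.");
console.log(JSON.stringify(blocks, null, 2));
```

Keeping the conversion as a pure function means it can be checked without touching the database, the model, or the CMS.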

What AI did not handle well

This is where the vibe coding posts usually go quiet. I am going to be specific here too.

The ethical architecture. AI cannot decide that demographic data should feed scoring algorithms but never surface as user-facing filters. That is a product decision with moral weight. It requires a human who has thought through the misuse cases and is willing to hold the line on a constraint that makes the product harder to build. I described that decision in Post 3. AI implemented it. It could not have generated it.
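One way a constraint like this can be held in code is to make it a type-level boundary. The sketch below is an illustration under assumptions, not the product's actual schema: every field name, the `FilterableField` allowlist, and the averaging formula are hypothetical.

```typescript
// All field names are hypothetical; the post does not describe the schema.
interface DestinationRecord {
  name: string;
  costIndex: number;
  climateZone: string;
  // Internal scoring input only. Never sent to the client.
  demographicIndicators: number[];
}

// Allowlist of fields the UI may filter on. Demographic data is deliberately
// absent, so filterBy(records, "demographicIndicators", ...) does not compile.
type FilterableField = "costIndex" | "climateZone";

// Shape sent to the browser, built from an explicit allowlist of fields.
interface PublicDestination {
  name: string;
  costIndex: number;
  climateZone: string;
}

function toPublicView(record: DestinationRecord): PublicDestination {
  return {
    name: record.name,
    costIndex: record.costIndex,
    climateZone: record.climateZone,
  };
}

function filterBy(
  records: DestinationRecord[],
  field: FilterableField,
  predicate: (value: string | number) => boolean
): DestinationRecord[] {
  return records.filter((r) => predicate(r[field]));
}

// Server-side scoring can still read the internal data.
// Hypothetical formula: a plain mean of the indicators.
function internalAcceptanceInput(record: DestinationRecord): number {
  const values = record.demographicIndicators;
  if (values.length === 0) return 0;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

const sample: DestinationRecord = {
  name: "Exampleland",
  costIndex: 55,
  climateZone: "temperate",
  demographicIndicators: [60, 80],
};
const publicView = toPublicView(sample);
console.log(publicView, internalAcceptanceInput(sample));
```

The compiler can enforce the boundary once it is written down. Deciding where the boundary belongs is still the human part.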

Data source evaluation. Which sources are authoritative. Which ones require commercial licenses. Which ones are too stale to use for the MVP, and which gaps are acceptable to ship as long as they are labeled. These are judgment calls that require understanding what the data actually measures, not just what it says it measures. AI can research options. It cannot evaluate them.

Normalization methodology. Deciding how to translate a Transparency International score into a 0–100 governance rating required understanding what the score measures, what its distribution looks like, and what trade-offs different normalization approaches introduce. The decision shapes every governance score in the product. It is not a technical problem. It is a product problem that requires human judgment.
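To show why the choice matters, here is a small sketch comparing two common normalization approaches on the same input. The raw values are invented for illustration and are not real index scores. The sketch does not say which method the product uses.

```typescript
// Min-max rescaling: stretches the observed range to 0-100 and
// preserves the relative size of the gaps between values.
function minMaxNormalize(values: number[]): number[] {
  const min = Math.min(...values);
  const max = Math.max(...values);
  if (max === min) return values.map(() => 50); // no spread: place all mid-scale
  return values.map((v) => ((v - min) / (max - min)) * 100);
}

// Percentile rank: position in the sorted order, scaled to 0-100.
// Spreads scores evenly but discards how far apart the raw values were.
function percentileRank(values: number[]): number[] {
  if (values.length < 2) return values.map(() => 50);
  return values.map((v) => {
    const below = values.filter((other) => other < v).length;
    return (below / (values.length - 1)) * 100;
  });
}

// Hypothetical raw scores for five countries (not real data).
const rawScores = [88, 85, 84, 40, 22];

const byMinMax = minMaxNormalize(rawScores);
const byRank = percentileRank(rawScores);

rawScores.forEach((raw, i) => {
  console.log(raw, byMinMax[i].toFixed(1), byRank[i].toFixed(1));
});
```

In this made-up sample, the countries with raw scores of 85 and 84 land about 1.5 points apart under min-max and 25 points apart under percentile rank. Both functions are correct. They answer different questions, and picking between them is the product decision.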

The scoring model weights. The acceptance facet defaults to 15%. Economics to 18%. Governance to 12%. Those numbers are deliberate product decisions informed by persona research. AI can implement any weights you give it. It has no basis for deciding what the weights should be.
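A minimal sketch of how weights like these could combine with persona multipliers. It uses only the three default weights quoted above. The other five facets, the destination scores, and the persona multiplier are left out or invented, and dividing by the total weight is an assumption made so that a three-facet example stays on a 0-100 scale.

```typescript
// Only the three facets whose default weights appear in the post.
type Facet = "acceptance" | "economics" | "governance";
type FacetValues = Record<Facet, number>;

const FACETS: Facet[] = ["acceptance", "economics", "governance"];

// Default weights as quoted: 15%, 18%, 12% of the full eight-facet model.
const DEFAULT_WEIGHTS: FacetValues = {
  acceptance: 0.15,
  economics: 0.18,
  governance: 0.12,
};

// Weighted average of 0-100 facet scores. Dividing by the total effective
// weight keeps the result on 0-100 even though these weights do not sum to 1.
function compositeScore(
  scores: FacetValues,
  weights: FacetValues,
  personaMultipliers: Partial<FacetValues> = {}
): number {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const facet of FACETS) {
    const multiplier = personaMultipliers[facet] ?? 1;
    const effectiveWeight = weights[facet] * multiplier;
    weightedSum += scores[facet] * effectiveWeight;
    totalWeight += effectiveWeight;
  }
  return totalWeight === 0 ? 0 : weightedSum / totalWeight;
}

// Hypothetical facet scores for one destination, and a hypothetical
// persona that doubles the weight of the acceptance facet.
const exampleScores: FacetValues = { acceptance: 80, economics: 60, governance: 70 };
const baseline = compositeScore(exampleScores, DEFAULT_WEIGHTS);
const personalized = compositeScore(exampleScores, DEFAULT_WEIGHTS, { acceptance: 2 });

console.log(baseline.toFixed(1), personalized.toFixed(1)); // 69.3 72.0
```

The code accepts any weights it is handed. Nothing in it can tell whether 15% is the right number for acceptance.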

Table: Six days. Two kinds of work.

What AI handled: data pipeline, normalization functions, component scaffolding, scoring engine implementation, debugging, blog pipeline, PDF generation.

What required human judgment: ethical architecture, data source evaluation, normalization methodology, scoring weights, misuse case identification, responsible use principles, data privacy requirements.

AI is a very capable builder. It is not a product manager.

The model

Here is what I think is actually happening when a product leader builds this way.

The constraint was never technical skill. Product managers who could not write code have always had to translate product thinking into engineering specifications and wait for the result. The AI layer collapses that gap. The translation is now faster and the feedback loop is tighter.

What does not change is the thinking that has to precede the building. The ability to spec clearly. To make ethical decisions early and document them with enough precision that they survive implementation. To hold the line on principles when a shortcut presents itself. To look at what AI produced and know whether it is right, even when it runs without errors.

That last one is more important than it sounds. AI-generated code that runs is not necessarily correct. A scoring normalization function can execute perfectly and produce outputs that are subtly wrong in ways that only become visible when you look at a full set of results and ask whether they make sense. Catching that requires product judgment, not technical skill.
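As an illustration of that failure mode, the sketch below pairs a normalizer that runs cleanly but treats a "lower is better" metric as "higher is better" with a check that encodes what a reviewer already expects from the results. All names and numbers are hypothetical. The check only surfaces candidates for a person to examine.

```typescript
interface ScoredDestination {
  name: string;
  score: number;
}

// Something the reviewer believes before seeing output:
// `higher` should outscore `lower`.
interface OrderingExpectation {
  higher: string;
  lower: string;
}

function scoreOf(results: ScoredDestination[], name: string): number | undefined {
  for (const r of results) {
    if (r.name === name) return r.score;
  }
  return undefined;
}

// Returns the expectations the results violate, including any that
// reference a destination missing from the results.
function findOrderingViolations(
  results: ScoredDestination[],
  expectations: OrderingExpectation[]
): OrderingExpectation[] {
  return expectations.filter((e) => {
    const high = scoreOf(results, e.higher);
    const low = scoreOf(results, e.lower);
    return high === undefined || low === undefined || high <= low;
  });
}

// Subtly wrong for a "lower is better" metric: no error, inverted meaning.
function normalizeHigherIsBetter(raw: number, min: number, max: number): number {
  return ((raw - min) / (max - min)) * 100;
}

// Hypothetical raw values where lower is better, so A should outscore B.
const rawMetric = [
  { name: "A", raw: 2 },
  { name: "B", raw: 9 },
];
const buggyScores: ScoredDestination[] = rawMetric.map((d) => ({
  name: d.name,
  score: normalizeHigherIsBetter(d.raw, 0, 10),
}));

const violations = findOrderingViolations(buggyScores, [{ higher: "A", lower: "B" }]);
console.log(violations); // one violation: A scored below B
```

The list of expectations is where the judgment lives. Someone has to know that A should beat B before the check can say anything.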

What got deferred

City-level scoring for 200+ destinations: targeted for v1.1. Country-level is sufficient to validate the model and get real user feedback.
Real estate sub-facet: fully specced, not yet built.
Multi-language capability.

These are decisions, not failures. Shipping a focused product that does one thing well is better than shipping an incomplete product that does several things poorly. The spec for each deferred feature exists. The build is sequenced. That is what a roadmap is for.

Next: What We Got Wrong. What is incomplete, what I would do differently, and what this series has convinced me about the future of product management.

Read the full series at wheretoadvisor.com/blog.