Base44 Trains Proprietary LLM to Escape Frontier Model Dependency

Base44, the vibe-coding platform Wix bought for $80 million when it was barely six months old and had a team of eight, is now rolling out its own LLM — a direct bet that owning the model beats renting GPT-4 or Opus.

Why a Vibe-Coding Startup Needed Its Own Model

Founder Maor Shlomo put it bluntly: “Training and owning the model as part of [our] entire stack allows us a lot more optimizations on latency, cost, and efficiency.” That’s a polite way of saying that paying per token for every user’s app-generation session is a terrible margin play at scale.

Competitor Lovable recently hit $500M ARR relying on external LLMs. Base44, at $100M ARR, is taking a different path. Shlomo expects other scaled players to follow — “at least the players that have gotten enough scale and velocity to have enough data.”

Data, Distribution, and the Cost of Inference

Jonathan Userovici, a general partner at Headline (portfolio includes Mistral AI), named three defensibility ingredients: data, distribution, and tech stack. Base44 now claims all three. The first iteration of its model, Base1, was trained on a dataset generated from “tens of millions of real user interactions on the platform.”

Userovici also flagged the cost pressure driving enterprise customers away from always using the latest frontier models. “An entire infrastructure is being set up to do orchestration and optimization to select the right models … so that costs don’t skyrocket.” Base44’s move directly addresses that: a model purpose-built for vibe coding, not general chat.

Specialization vs. Frontier Labs

Shlomo argues that frontier models will stay general, giving Base44 an edge in its niche. But Userovici warned against underestimating them, citing Harvey — the legal AI startup that abandoned plans to train its own model. The real competitive threat may come from Cursor, Claude Code, and xAI’s Grok, all encroaching on vibe-coding turf with their own data feedback loops.

Even so, Base44’s vertical integration (owning distribution, data, and inference) makes it what Userovici calls the “only vertically integrated vibe-coding application.” The payoff isn’t instant; Base44 noted that “ownership of the model gives direct control over compute and inference spend, expected to result in a structurally stronger margin profile over time.”

With Base1, Base44 is betting that a focused, optimized model trained on its own user data can undercut frontier model costs and lock in margins. That play only works if its data moat deepens faster than the frontier labs generalize.

Source: Vibe coding platform Base44 launches own model as AI startups seek defensibility
Domain: techcrunch.com
