This is the seventh in a series of posts dedicated to understanding defensibility in technology-driven markets.
If you’re new, start with the first post and subscribe to be notified as new posts are published.
Applications are the most important area of the AI stack because products must win before other layers of the stack can become durable businesses. If no applications exist, other layers will have no long-term customers; strategic innovation spend can only carry a market for so long.
Recalling the central topic of this letter, it’s not a question of whether AI will benefit customers, but “How will AI companies achieve durable value?” Put more bluntly: who will achieve power, and how will they achieve it?
In past market cycles, defensibility has been a theoretical exercise for startups; competition was rarely the cause of premature mortality. As is often said in the Y Combinator community, “more startups die from suicide than homicide”.
Given the speed of advancement and intense competition in the market, I believe homicide—death by competition—will be a significant cause of AI application startup failure.
With AI, this time is different: incumbents are stronger and disruption is more frequent
The “invest in products that are 10x better than alternatives” investment algorithm has historically made a lot of money for founders and investors for two major reasons, as we discussed in the opening of this letter:
Incumbents were either slow or nonexistent. The taxi industry was ill-prepared for Uber. Symantec was digesting Veritas as CrowdStrike rose to prominence. Neither Friendster nor MySpace was an entrenched incumbent during Facebook’s rise.
New disruptive tech stacks emerged more slowly than startup lifecycles. Historically, startups could build structural power before they faced their first Seldon crisis: the first time they needed to catch a new technology wave. Netflix’s switch from DVD to streaming occurred long after their IPO. Facebook’s focus on mobile (via their Instagram acquisition) happened right before their IPO.
Today, neither is true.
“Big tech” companies are much better at execution than incumbents from a decade ago and have quickly integrated AI into their products. For the most part, tech leaders have strong leadership, strategies that embrace disruptive innovation, and many talented individuals at all levels of the organization.
In their 12.3 update, Tesla was able to migrate their autonomy architecture from a traditional computer vision model with “300k lines of explicit C++ code” to a modern AI stack without needing to deploy any major hardware changes.1 Google has neatly integrated AI results at the top of their search results page. Microsoft has added Copilot features to their enterprise productivity suite. OpenAI (which I think now counts as big tech) has been able to successfully incorporate major AI advancements into ChatGPT for several years without changing their interaction model.
Some investors believe the vast majority of the value created by AI will go to incumbents, with little whitespace left over for startups. Regardless of who accumulates more value, I believe there are ripe opportunities for startup founders and investors to seize, as we’ll cover below.
Aside from incumbent competition, disruption timelines are compressed in this market cycle. AI is moving so fast that there may be a new tech stack every few years. As quickly as we remember technology moving in the past, AI is moving even faster.
Today, many AI products are “10x better” than the human equivalent. Enterprise co-pilots vs. legacy manual processes. AI marketing tools vs. agencies. AI-powered industrial robotics vs. manual work cells and expensive system integrators.
A 10x product today might take the form of a strong technical team building a carefully fine-tuned model or an exceptionally engineered RAG system that provides far better results than “off the shelf” models. A better AI legal assistant, a better AI software engineer, etc.
Here’s the problem: wave after wave of new AI tech stacks make today’s 10x into tomorrow’s table stakes. The next iteration of foundation models might be better than current 10x products out of the box. Once that happens, today’s mind-blowing products become commonplace. What happens when Adobe, Microsoft, Google, and OpenAI build the feature into their leading suites? What was once 10x value, worth the UX and procurement inconvenience of a new AI-first application, turns into 1.5x value that doesn’t clear a customer’s activation energy. Why leave Gmail, Photoshop, ChatGPT, or Excel when you don’t have to?
If the reason we have a 10x better product is not a component we control, beware!
To reiterate a point from earlier in this letter:
“If everyone else can use the component that enables 10x better product, there will be no resultant power.”
One way to avoid the trap of “point in time thinking” is to think backward, not forward. What does the world look like when all companies can build their products on top of superhuman AGI?
Founders and investors must be prepared for a world where AI products are not compared with humans, but with other AI products. In enterprise knowledge work, for example, we’re seeing the first wave of AI replace manual processes (“human is the loop”) with co-pilots (“human in the loop”). Soon, co-pilots will be the status quo and fully autonomous agents (“human on the loop”) will be the disruptive new stack. Over time, AI super-intelligence may be able to handle long-duration strategic projects that would have taken a whole department of humans. Super-intelligence would be yet another disruptive stack, replacing singular agent products.
Assuming a company can get enough momentum to be able to build structural power, what form should they invest in? Below, we’ll look at how AI impacts two common forms of power in technology companies: switching costs that disincentivize customers from leaving a product and data network effects that allow datasets to provide increased customer value with increased scale.
AI reduces switching costs.
Switching costs have been strong moats for technology companies: data, integration, and process lock-in make it difficult for customers to leave. Historically, when customers stored a large amount of data in one vendor’s product, it was expensive to “port” that data to a competitor’s. Large hospital systems spend years and millions of dollars customizing new electronic health record (EHR) systems. Enterprises of all sorts incur tremendous costs integrating ERP systems (e.g. SAP, Oracle, Microsoft, NetSuite) such that switching becomes a prominent feature of employees’ nightmares.2 CRMs, similarly, contain a lot of historical data and a multitude of integrations accumulated over the years. Adobe’s PDF format and Microsoft’s various Office document formats create both switching costs for preexisting data and a network effect for sharing data between individuals and organizations.
AI makes everything fungible, reducing structural switching costs.
AI makes it much easier and more reliable to migrate data, integrations, and processes between platforms. What used to be a slow, manual, and error-prone process necessitating a small army of consultants can now be fully automated. There’s one exception: any time a user interface is tightly integrated into people’s daily work life, switching times are longer because people change more slowly than computers. The product design principle of MAYA—“most advanced yet acceptable”—is a speed limit to the rate of change groups of humans can tolerate.
Incumbents fight back. We should expect to see legacy vendors put up technical and contractual barriers to customers slurping data out of their systems as new entrants build AI-driven migration tools. I speak with countless startups who, when integrating with legacy enterprise software products, have tough negotiations to access platform data or API functionality. Sometimes barriers take the form of prohibitive “API access” fees or even blanket prohibition on anything other than manual “point-and-click” usage. Fortunately for startups, customers have leverage and can push legacy vendors to open up. Some large vendors have recognized this effect and embraced openness. Salesforce executed on an “open system of record” strategy effectively in the last market cycle: instead of trying to own all use cases, they became the integration point other sales and marketing tools depended on.
AI can have strong or weak data network effects, depending on the use case.
Questioning the orthodoxy that data is always a strong network effect seems like heresy given how often companies talk about their “data moat”. This applies both to application vendors who customize models and to model-only vendors.
To understand why data is valuable for AI, it helps to first classify data into two key flavors:
World data is used to train models to better understand how our universe works. Internet content, experimental protein folding data, emails, labeled images, vehicle driving sessions, and human feedback on prior AI-generated responses all help AI models get more intelligent: to make their output less random and more desirable.
Context data is used to provide background information about a particular user, organization, or situation to produce better results than “one-size-fits-all” output. A user’s current location and historical reservations when asking for a restaurant recommendation. Prior emails when composing a new message. Existing marketing collateral when designing a new ad campaign.
To distinguish between world and context data, we must ask whether the data contains “new information” about the world as a whole or whether it tells us new information about a particular person, organization, or situation.
Data vs. new information is a subtle, yet important, concept. Any form of raw input is data, e.g. text, images, audio and video. Information is the meaning extracted from that raw data. Consider the following simple dataset:
1. Sally owns a red convertible.
2. Sally owns a black truck.
3. Sally owns a red convertible and a black truck.
If we have data point #1, then adding #2 tells us new information: that Sally also has a truck. If we add data point #3, we’re adding new data but no new information. Data points #1 and #2 improve the model’s understanding of the world, but #3 does not. When we add data but no new information, the additional data holds no value.3
Put another way, new information is the only form of valuable data. If data tells us something we already know, it’s likely useless.
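To make the distinction concrete, here’s a minimal sketch. The encoding of facts as strings is a made-up illustration, not a real pipeline; the point is that information gain is measured against what we already know:

```python
# A toy illustration of "data vs. new information": each data point is
# encoded as a set of atomic facts, and its information gain is just the
# facts we didn't already know. The fact encoding here is illustrative.

known_facts: set[str] = set()

def information_gain(data_point: set[str]) -> set[str]:
    """Return the facts in this data point that are new to us."""
    return data_point - known_facts

data_points = [
    ("#1", {"sally owns a red convertible"}),
    ("#2", {"sally owns a black truck"}),
    ("#3", {"sally owns a red convertible", "sally owns a black truck"}),
]

for label, facts in data_points:
    gained = information_gain(facts)
    known_facts |= facts
    print(f"{label}: {len(gained)} new fact(s) -> {gained or 'nothing new'}")

# #1: 1 new fact(s) -> {'sally owns a red convertible'}
# #2: 1 new fact(s) -> {'sally owns a black truck'}
# #3: 0 new fact(s) -> nothing new
```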
As this applies to world data: when we already know a lot about the world, it becomes increasingly difficult to find things we don’t already know. For randomly-gathered datasets, novelty becomes increasingly hard to come by. Such data has “decreasing marginal value”—each additional piece of data is less valuable than the last. We’ll get into a specific example as this applies to robotics below.
To understand context data, let’s consider another data point:
4. Sally is currently driving her red convertible with the top down.
While #4 doesn’t tell us much new information about the world, it does tell us what Sally is doing right now. When asked for a restaurant recommendation, a model can infer that it’s probably warm outside and Sally is more likely to want to eat outdoors than if the convertible’s top were up. This contextual information is very valuable in helping a model produce the right results without Sally having to explicitly ask for restaurants with outdoor seating.
Context data is only valuable in a limited scope. A month from now, knowing that Sally once had her convertible’s top down doesn’t tell us what sort of restaurants she might want then. Similarly, knowing what kind of car Sally is driving is unlikely to help a model make recommendations for Sam, a completely unrelated person in a different part of the world. For a source of context data to remain valuable, it must be fresh and relevant to the situation a model is reasoning about.
Both world and context data produce value for customers because they improve the quality of a model’s output. This implies that there is some form of network effect, but the network effect plays out differently for each:
World data has a global network effect: all users benefit from data generated by all other users. This can lead to power at a global scale: early leaders have the best data and therefore an opportunity to build the best product. They can attract more users, generate more data, build an even better product, and so forth.
Context data has a local network effect: only the user whose context is present benefits from the data. This tends to advantage incumbents who have existing customer relationships where they can build new products on top of existing context data. New entrants must invest not only in a new product but also a source of context data; however, when context can be gathered via API (e.g. Email, Calendar, financial transactions) or via a mobile device (e.g. location, activity tracking) that incumbent “cross-sell” data advantage becomes smaller.
In both cases, each additional data point generally carries diminishing new information and therefore diminishing additional value. The most important question when analyzing data defensibility is: when is a dataset sufficient? When do we reach the point where we stop caring about more data?
Let’s look at an example from AI for robotics, where data directly translates into improved operating characteristics. In systems engineering, one way to measure reliability is the “uptime percentage”. The shorthand engineers use is to “count the 9s”:
99% (two nines) = ~3.7 days of downtime per year
99.9% (three nines) = ~8.8 hours of downtime per year
99.99% (four nines) = ~53 minutes of downtime per year
99.999% (five nines) = ~5.3 minutes of downtime per year4
99.9999% (six nines) = ~32 seconds of downtime per year
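These figures are just arithmetic on the uptime percentage; a quick sketch (assuming a 365.25-day year) reproduces the table:

```python
# Downtime per year implied by each uptime level ("counting the 9s").
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

for nines in range(2, 7):
    downtime_fraction = 10 ** -nines          # e.g. 3 nines -> 0.001
    downtime_s = downtime_fraction * SECONDS_PER_YEAR
    uptime_pct = 100 * (1 - downtime_fraction)
    print(f"{uptime_pct:.4f}% ({nines} nines): "
          f"{downtime_s / 86400:6.2f} days = {downtime_s:9.1f} seconds/year")
```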
In AI for robotics, at what point does the reliability of a model reach sufficiency such that further improvements no longer meaningfully impact customer satisfaction? Assuming there are no safety issues5, the number of nines to reach sufficiency depends on the use case:
Fully autonomous (level 5) passenger vehicles probably need 5-6 nines. We’re willing to tolerate a few minutes of stoppage during a snowstorm, but hours would be infuriating.
Other robots out in the world probably need 2-3 nines. It’s easy for such robots to stop while remote pilots take over difficult situations: 0.1-1% human-piloted time adds very little to a cost structure.
Factory automation needs perhaps 3-5 nines, depending on the criticality of the process. Manufacturing lines are expensive to stop, but if stoppage is infrequent it doesn’t meaningfully impact the economics of the factory as a whole.
Now, given we have reliability targets, how much data do we need to achieve each? It depends on the complexity of the problem and how good our data is. If we’re taking a random sample of data from the real world, adding additional 9s requires increasing amounts of data. For example, moving from 99.9% to 99.99% might require 10x the data because adding that extra 9 requires training on very rare circumstances that, by definition, are infrequent in naturally occurring datasets.
Because data and training cost money, there’s a breakeven point where the cost of training on additional data exceeds the customer value unlocked by improved model reliability. If models can get to sufficiency with just a few 9s, this might not be much data at all!
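As a gut check on that breakeven logic, here’s a toy model. Every constant in it is an assumption I’ve invented for illustration: each additional nine requires ~10x the training data, while each additional nine unlocks less customer value than the last:

```python
# Toy breakeven model. All constants are illustrative assumptions:
# each extra nine needs ~10x the training data (rare events are, by
# definition, rare in natural datasets), while each extra nine unlocks
# less customer value than the last.

COST_PER_SAMPLE = 0.01                     # $ per training sample (assumed)
SAMPLES_FOR_TWO_NINES = 100_000            # data needed for 99% (assumed)
VALUE_OF_NINE = {2: 5e6, 3: 2e6, 4: 5e5, 5: 1e5, 6: 2e4}  # $ (assumed)

for nines in range(2, 7):
    samples = SAMPLES_FOR_TWO_NINES * 10 ** (nines - 2)
    training_cost = samples * COST_PER_SAMPLE
    value = VALUE_OF_NINE[nines]
    verdict = "worth it" if value > training_cost else "past breakeven"
    print(f"{nines} nines: {samples:>13,} samples, "
          f"cost ${training_cost:>12,.0f}, value ${value:>11,.0f} -> {verdict}")
```

Under these made-up numbers, the fifth nine is already past breakeven. Cheaper data or higher-value use cases move the crossover, but the shape of the curve is the point.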
Let’s not stop there: not all data is created equal. In circumstances where a system’s failure modes are predictable, we can supply carefully-chosen or synthetically-generated training data to capture edge cases and achieve many nines of reliability with even less data than a random sampling of real world events would require. With carefully chosen training data, data network effects in narrow applications might be even smaller than most would predict.
If a model can get to sufficiency with little data, then there’s no data network effect.
On the other hand, general purpose AI systems must know about everything in the world. Getting to “good enough” requires a colossal amount of data. General purpose AI applications seem to have a large data network effect.
As founders and investors, we had better know how much data we need to reach sufficiency. Betting on a data network effect where none exists, instead of investing in other forms of power, can be a terminal error. Once models reach sufficiency, power is not based on data, but on other effects: e.g. economies of scale, cost, distribution, brand, and execution.
$100 trillion of global GDP is up for grabs outside big tech’s blast radius.
I believe AI will be a disruptive tech stack in every industry. This is great news for startup founders and investors because it means there will be opportunities to build durably valuable companies outside of the immediate roadmaps of incumbent tech leaders.
Large technology companies who are very good at incorporating new advancements are scary competitors. In contrast, companies in “sleepy” industries accustomed to only gradual technological change present softer targets for startups to overtake.
When entering traditionally “non-tech” markets, startups need to decide how they approach the market and what type of offering to sell.
Sell to or compete with?
When entering an existing market, startups must decide whether incumbents are customers or enemies. Sometimes the answer is obvious because there’s only one viable path. Much of the time, though, there’s a complex series of tradeoffs. Some key questions to ask include:
What are the sales, marketing, and implementation costs of selling to incumbents? Do they have experience integrating technology? If not, a startup will probably face long sales and integration cycles leading to poor unit economics. It might be easier to compete.
Are there regulatory barriers? Healthcare, legal, and financial services, for example, are all markets that have so much regulation that it’s not realistic for a single company to dominate the entire market with a disruptive product. Unsurprisingly, startups in these sectors tend to either sell to incumbents when their product is broadly applicable (e.g. generalist AI legal tools) or compete with incumbents when it is narrowly focused (e.g. AI-powered specialty law firms).
Is the market fragmented or consolidated? At one extreme, selling into a consolidated monopoly or oligopoly market is dangerous because there are a tiny number of customers. Economists call such a monopoly of demand a “monopsony”. Airbus and Boeing are unlikely to cede power to suppliers because they’re the only games in town for large commercial passenger planes. Unsurprisingly, autonomous aircraft startups are competing with incumbents, starting with smaller aircraft. A fully fragmented market can also be difficult to sell disruptive products into because deal sizes are so small—often better to compete there as well. There’s a goldilocks zone for “sell to” where deal sizes are large but no individual customer can push us around.
Scale of customer impact? In industries that require hundreds of millions or billions of dollars of capital expenditures (e.g. steel, semiconductors, mining, aviation, railroads, telecom, chemicals), a company must have a transformative tech stack that massively impacts economics to raise the required capital. A startup that increases profit by a small amount in chemical manufacturing might have a large market selling across segments, but the improvement is probably not large enough to justify the risk of building its own plants.
What are end customers’ switching costs? Historically, these are high for any product with workflow integration or lots of data. AI lowers these costs, especially if users are willing and able to learn a new system. It’s an especially interesting opportunity when existing users abhor the user experience of legacy products.
How will we eventually gain power? Startups should expect incumbent customers to either try to duplicate their products internally or switch to a cheaper competitor if possible. Is there a path to power?
“Compete with” is a viable option in a lot of markets because AI can substantially reduce costs while providing more value to customers. Expect large rewards, outside of what we traditionally consider “technology”, for startups that successfully execute and build durable power.
Sell outcomes, not software.
While there are still a number of recalcitrant industries that use pen and paper or ancient software, many markets are software-saturated.
It’s tempting to dismiss new software companies as squeezing water from a rock. After years of skilled teams looking for opportunities in every corner of the economy, many buyers have software fatigue. Overbuying during the good times is leading to consolidation and cost cutting in the lean times.
At the same time, many companies are still a mess. Internal processes cost too much, take too long, fail frequently and provide little differentiated value to the organization. Instead of selling software to support these workflows, AI allows startups to sell workflows as a whole that adapt to each customer’s unique environment.
Just as the global economy benefits from the division of labor among humans, the unbundling of companies will lead to a division of labor among organizations. There’s no reason most companies need to run undifferentiated workflows when they can pay someone else to run the process better and cheaper. An early version of this is the migration from internally-run datacenters to cloud computing. Very few organizations can run a datacenter as effectively as Amazon, Google, or Microsoft, so they shouldn’t bother. As a result, companies can focus on applications “at the top of the stack” where they can provide differentiated value.
Already, there are a number of startups successfully selling AI-enabled outcomes across functions: IT, security, marketing, HR, legal, finance, sales, and engineering.
In consumer markets, I suspect a similar specialization of labor will occur: more situations where we can pay someone to do something we don’t want to do.
When selling outcomes, companies can create high switching costs by investing in human relationships (i.e. sales) and integrating into multiple workflows. Once embedded at multiple points with a customer, a company ceases selling software and starts selling “forever-ware”.
When selling outcomes, improvements in AI technology are likely “evolutionary” because the product we deliver the customer remains the same. We can cleanly integrate AI advancement into our offering. Transitioning from co-pilots to agents doesn’t change what we sell, it just improves our gross margin.
Will AI application revenue justify the hype?
Unlike some technology booms where the “story:reality ratio” never gets right side-up, I believe AI will deliver on its promise to transform the global economy. Even today’s embryonic AI products deliver tangible benefits (making things better, faster, cheaper, and/or more reliable).
Consider use cases where both startups and incumbents are successfully employing AI either in production or soon-to-be-production environments:
Industrial robotics
Autonomous vehicles
Software engineering
Legal services
Customer support
Travel
Marketing
Healthcare
Biology
Media
PsyOps
The AI cynics point to all sorts of use cases where AI isn’t yet good enough to supplant current solutions. Of course, there’s a very reasonable, clear-eyed argument against the blind optimism that some AI boosters profess: AI investments are currently outstripping sustainable revenue.
Ultimately, I believe it’s a matter of timing. Given the trajectory of progress, I believe there will be no such thing as an AI industry. All companies will need to embrace AI or they will get smoked by others that do.
To support the optimistic case, very few markets have reached “sufficiency” with respect to foundation models, so every model improvement makes AI products even better.
That said, value is created by particularly talented teams dedicated to serving their customers, not by simply showing up, waving our hands, and shouting “(something something) AI”.
“…rapid puncturing of a share price bubble is typical of what happens during a phase in which market valuations are driven more by themes and concept stocks than by profits, dividends and other fundamental considerations. It has been repeated many times…”
- Alasdair Nairn
The best companies will build something people want—catch a wave and keep surfing it—until there’s another wave. Then use their execution power to catch that one.
Speed, rainmaking, and taste. That’s power.
Defensibility
A series of posts dedicated to answering the question: Where will value accrue in AI?
"Execution power” is becoming more important than classical “structural power”.
Market revolutions occur when “critical” technology makes a new stack “viable”.
When multiple stacks become viable in rapid succession, companies must “AND” or “OR”.
Power within the AI stack—hardware, hosting, models, and infrastructure.
Power in AI applications—big tech, switching costs, network effects, and the $100 trillion of global GDP up for grabs.
In 2015, I was part of a research project where, with physical access to a Model S, we were able to install malware that could remotely disable the vehicle while driving. Even then, Tesla was able to fix the issue with an automatic software update. Meanwhile, other automakers facing similar issues had to fully recall their fleets.
There’s an old joke that modified SAP’s tagline from “The best-run businesses run SAP” to “Only the best can survive an SAP implementation.” To be fair, this is not an issue specific to SAP, but a recognition that migrating to any new ERP is a pain.
There’s a small asterisk here: reinforcing existing data can be important if the dataset has some uncertainty around it. For example, in intelligence gathering, all sources are suspect but if we hear the same information from two independent sources, we have higher confidence that it's credible.
In software, critical systems are often engineered for “five nines”. In the early days of Twitter’s “fail whale”, engineers used to joke that it had “five sevens” of uptime.
For manufacturing, this might mean an isolated “safety curtain” system that prevents failures from hurting people; for autonomous driving, an isolated “pull over and stop” system that prevents crashes.