Fundraising for Video AI and Sports Tech Founders
Lecture 2

Moats and Machines: Defining Your Data Advantage

Transcript

SPEAKER_1: Last time we landed on this idea: investors fund prescription, not description. Now I keep wondering: what actually makes one AI platform harder to displace than another?
SPEAKER_2: Almost universally, the answer is data, not the algorithm. Model architectures are increasingly commoditized. Most frontier models are accessible via cloud APIs. Durable advantage comes from proprietary data and the ability to deploy AI at scale.
SPEAKER_1: So the moat isn't the model itself; it's what the model was trained on.
SPEAKER_2: Exactly. A data moat is a defensible competitive advantage created when a company controls unique, hard-to-replicate data that improves its products as it accumulates. The key idea is that it compounds: better performance attracts more users, which generates more data.
SPEAKER_1: Can someone listening get a concrete example of what that loop looks like in sports tech?
SPEAKER_2: Think of a platform continuously capturing game footage across dozens of venues. Over time, that growing dataset can improve automated tagging and player tracking. A competitor entering later simply doesn't have that historical depth, and can't buy their way to it overnight.
SPEAKER_1: So venue access itself becomes part of the moat. That's not just a software advantage.
SPEAKER_2: Right. Companies like Spiideo and Fair Vision show that regulatory and venue-access constraints can become part of the moat. Exclusive rights to install capture systems inside arenas mean competitors simply can't gather equivalent footage. The hardware relationship locks in the entire data pipeline.
SPEAKER_1: Tomasz Tunguz puts four specific conditions on a real data moat. What are those?
SPEAKER_2: The data must be proprietary, must improve with scale, must be hard to copy, and must meaningfully raise switching costs for customers. If any one of those four is missing, the moat is probably shallower than it looks.
SPEAKER_1: What's the risk for a founder who thinks they have a moat but actually doesn't?
SPEAKER_2: Andreessen Horowitz makes a pointed argument: many claimed data moats are overstated. Data is often less unique than founders believe. Competitors can approximate it using adjacent sources. Synthetic data and transfer learning can also erode presumed advantages, because rivals may train on simulated datasets to reach comparable performance.
SPEAKER_1: So synthetic data is a threat, but it cuts both ways, right?
SPEAKER_2: It does. Synthetic data can be more valuable than real footage when real-world data is scarce, expensive to label, or legally restricted. For youth sports, privacy rules around minors make real footage hard to use. Synthetic data fills that gap, and that's an underexplored moat opportunity.
SPEAKER_1: That brings up labeling quality. I've seen this flagged as a hidden differentiator, not just footage volume.
SPEAKER_2: V7 Labs is direct on this. High-fidelity annotations directly improve model accuracy and are expensive to replicate. A competitor can scrape similar footage, but they can't easily replicate years of expert-labeled ground truth. Labeling quality, not data quantity alone, is what makes computer-vision systems defensible.
SPEAKER_1: Now, for someone building early-stage partnerships with sports teams, how do they protect the insights generated from that data?
SPEAKER_2: Contract structure matters enormously. Andreessen Horowitz recommends designing products that naturally generate unique labeled data through usage, so the startup owns derived insights, not just raw footage. That means negotiating IP rights over model outputs and annotations from day one, before the partnership scales.
SPEAKER_1: McKinsey adds something important about operationalizing AI for moat-building. What's the argument there?
SPEAKER_2: McKinsey emphasizes that embedding models into decision flows, training users, and building governance often matters more than marginal accuracy improvements. Greylock frames it similarly: the strongest moats are systems of intelligence sitting across multiple datasets, turning raw data into continuously improving decisions. The moat is organizational, not just technical.
SPEAKER_1: There's also an underexplored angle: lower-tier sports. Most attention goes to top professional leagues.
SPEAKER_2: And that's exactly where an unexpected moat can form. Youth and amateur sports data remains relatively under-digitized. Companies that can economically capture and structure that long-tail data may build advantages that are harder to displace, because no one else bothered. ADvantage VC explicitly looks for startups whose data capabilities scale across leagues, sports, and geographies.
SPEAKER_1: For builders in this space, the algorithm is usually not the defensible part.
SPEAKER_2: Investors evaluate the uniqueness, volume, and relevance of data assets. Define the data strategy early: what unique data gets collected, how it compounds, how it improves performance. The model is a commodity. The data pipeline, the labeling quality, the venue access, the workflow integration: that's the moat.