2025-11-28

The Era of Data-Driven Ideas

Legibility creates velocity


Anthony Lee Zhang categorizes ideas into two types:

  • Idea-driven ideas: an idea where you start off with a high-level philosophical frame and deduce a concrete insight using logical reasoning
  • Data-driven ideas: an idea derived from pure data with no preconceptions

Building on this framework, industries can be characterized as idea-driven or data-driven:

Idea-driven industries:

Industry | Reasoning
Early-stage venture capital | Limited data on unproven startups creating new markets
Traditional economics | A macro-level assumption about markets (e.g. markets are efficient) leads to various downstream conclusions
Policy-making | Frequently shaped by ideological principles before empirical testing
Brand marketing | Often relies on intuition about cultural trends and human psychology

Data-driven industries:

Industry | Reasoning
Private equity | Potential LBOs are modeled on a spreadsheet; if the investment is able to clear a given hurdle rate, then the investment is "good"
Quant trading | Strategies are thoroughly backtested against historical data
Insurance | Risk is quantified via actuarial tables
Growth engineering | Features are A/B tested until a local maximum is achieved
Poker | Computer solvers determine exact EV for every decision
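
To make the private equity decision rule above concrete, here is a minimal sketch of a hurdle-rate check. The deal, its cash flows, and the function names are hypothetical, and a real LBO model has far more moving parts (debt schedules, fees, multiple exit scenarios); this only shows how the verdict collapses into one comparison.

```python
def npv(rate, cash_flows):
    """Net present value of yearly cash flows; cash_flows[0] occurs today."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, lo=-0.99, hi=10.0, tol=1e-9):
    """Internal rate of return found by bisection.

    Assumes a conventional deal shape (one outflow followed by inflows),
    so NPV falls as the rate rises and crosses zero once in [lo, hi].
    """
    for _ in range(200):
        mid = (lo + hi) / 2
        if npv(mid, cash_flows) > 0:
            lo = mid  # NPV still positive: the IRR is higher than mid
        else:
            hi = mid  # NPV zero or negative: the IRR is at or below mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2


def clears_hurdle(cash_flows, hurdle_rate):
    """The investment is "good" if its IRR meets or exceeds the hurdle rate."""
    return irr(cash_flows) >= hurdle_rate


# Hypothetical deal: 100 of equity invested today, 250 returned at exit in year 5.
deal = [-100, 0, 0, 0, 0, 250]
deal_irr = irr(deal)

print(f"IRR: {deal_irr:.2%}")
print("Clears a 20% hurdle:", clears_hurdle(deal, 0.20))
print("Clears a 25% hurdle:", clears_hurdle(deal, 0.25))
```

The whole judgment reduces to a single number compared against a threshold, which is exactly what makes it easy to point at.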

Similarly, movements can be categorized as primarily idea-driven or data-driven:

Idea-driven movements:

Movement | Reasoning
Civil Rights | All people are created equal; segregation is inherently unjust
Crypto | Property rights should be digital-native, censorship-resistant, and permissionless
Longevity | Death is a problem to be solved, not an inevitability to be accepted; aging is a disease rather than a natural process

Data-driven movements:

Movement | Reasoning
Progress Studies | Compounding GDP growth is the most important factor in increasing human welfare
Climate Activism | If global temperatures increase by more than 1.5 degrees Celsius, it will cause irreversible negative externalities
Effective Altruism | Moral obligations and contributions can be quantified, and therefore optimized through utilitarian frameworks

Enabled by technology, data-driven ideas operate at higher velocities than their idea-driven counterparts across industries and movements. This is not because data-driven ideas are inherently more rigorous: many industries and movements have been built on falsified data.

In this post, I outline how the legibility of data-driven ideas creates coordination, which in turn creates velocity.

Data as a Schelling Point

The post-Moneyball era demonstrated that everybody must think in data or risk obsolescence. The current era of statistical models and graphs aggregating thousands to billions of data points is a very recent phenomenon that is only now gaining widespread cultural acceptance.

The key feature of data is that it is focal, not that it is necessarily better. People come to treat data as almost transcendent because it promises unbiased representation, even though data is often misleading at best and fabricated at worst.

Charts and statistics compress an arbitrary number of data points into something everyone can see and cite. These charts and statistics frequently transcend language barriers, requiring only very basic knowledge of English and often zero additional context.

Legibility enables quick ascents

Data-driven ideas convert attention into conviction faster than idea-driven ideas. This enables individuals, industries, and movements to experience extremely large growth rates in a short period of time:

  • Individuals: Aella is a scientist and sex worker who has leveraged her following to gather some of the most comprehensive datasets on relationships and sexuality in human history. One of her surveys, which takes over one hour to complete, has over 850,000 responses. Every single post by Aella is backed by evidence, typically a graph derived from her own survey data. Sex and relationship discourse is stale and vibes-based, while Aella's data provides legibility.

  • Industries: Quant funds out-fundraised discretionary managers not purely on returns, but on their ability to cite legible backtested data where discretionary managers could offer only vibes. Growth engineers beat product visionaries in internal debates by A/B testing every aspect of the product experience.

  • Movements: EA scaled within years based on the premise that impact can be numerically measured. Interventions and charities can be measured in impact per dollar, which makes it possible to compare them on a spreadsheet and optimize your altruism.
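
The spreadsheet comparison in the last bullet can be sketched in a few lines. Everything here is hypothetical: the intervention names, the costs, and the impact figures are made-up placeholders rather than real charity data, and "impact units" stands in for whatever outcome measure a movement agrees to count.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    name: str
    cost_usd: float
    impact_units: float  # hypothetical unit; whatever outcome is being counted

    @property
    def impact_per_dollar(self) -> float:
        return self.impact_units / self.cost_usd


def rank_by_impact_per_dollar(interventions):
    """Sort interventions from most to least impact per dollar."""
    return sorted(interventions, key=lambda i: i.impact_per_dollar, reverse=True)


# Made-up numbers for illustration only.
options = [
    Intervention("Intervention A", cost_usd=10_000, impact_units=50),
    Intervention("Intervention B", cost_usd=2_000, impact_units=30),
    Intervention("Intervention C", cost_usd=50_000, impact_units=100),
]

ranked = rank_by_impact_per_dollar(options)
for option in ranked:
    print(f"{option.name}: {option.impact_per_dollar:.4f} impact units per dollar")
```

Once everything is expressed in the same unit, the ranking requires no persuasion, only sorting. The hard and contestable part is choosing the unit, which the sketch takes as given.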

Idea-driven ideas have to continually persuade, while data-driven ideas just have to point.

Falsifiable claims create fragility

The same property that enables quick ascent also enables quick collapse:

  • Individuals: Public intellectuals who stake their reputation on data-driven claims rise and fall with those claims. The prediction market pundit who calls an election correctly becomes a genius; the next miss makes them a cautionary tale. The half-life of a data-driven pundit is much shorter because their credibility is indexed to something falsifiable.

  • Industries: The replication crisis called much of social psychology into question seemingly overnight. A quant fund faces the same dynamic: the backtest that attracted capital becomes a liability the moment the model breaks.

  • Movements: Climate activism coordinated globally around a specific number: 1.5 degrees. Websites now display countdown clocks to 1.5 degrees, which at the time of writing show 3 years and 235 days. This creates a failure mode: what happens if we pass 1.5 degrees and the predicted externalities don't materialize on the expected timeline?

Idea-driven ideas are low-beta: they don't face the same fragility because they rest on unfalsifiable premises.

Open Questions

  • How do the aesthetics of data-driven ideas relate to their popularity?
  • The half-life of many public intellectuals is quite short. How much can be attributed to staking their reputation on a specific data-driven idea and going down its respective rabbit hole?
  • How correlated are idea-driven and data-driven ideas with missionary and mercenary motivations, respectively?
  • Where do LLM-derived ideas fit?
  • What's the optimal ratio of data-driven to idea-driven claims for a movement that wants both velocity and durability?

Popularity prediction hash: 9fa5947ff5a5745a1291997d6f7d81edcf0913ff6f2239ac79fc43d992b08237