The Era of Data-Driven Ideas
Legibility creates velocity
Anthony Lee Zhang categorizes ideas into two types:
- Idea-driven ideas: ideas where you start from a high-level philosophical frame and deduce a concrete insight through logical reasoning
- Data-driven ideas: ideas derived purely from data, with no preconceptions
Building on this framework, industries can be characterized as idea-driven or data-driven:
Idea-driven industries:
| Industry | Reasoning |
|---|---|
| Early-stage venture capital | Limited data on unproven startups creating new markets |
| Traditional economics | A macro-level assumption about markets (e.g. markets are efficient) leads to various downstream conclusions |
| Policy-making | Frequently shaped by ideological principles before empirical testing |
| Brand marketing | Often relies on intuition about cultural trends and human psychology |
Data-driven industries:
| Industry | Reasoning |
|---|---|
| Private equity | Potential LBOs are modeled on a spreadsheet; if the investment clears a given hurdle rate, it is "good" |
| Quant trading | Strategies are thoroughly backtested against historical data |
| Insurance | Risk is quantified via actuarial tables |
| Growth engineering | Features are A/B tested until a local maximum is achieved |
| Poker | Computer solvers determine exact EV for every decision |
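To make the hurdle-rate test in the private equity row concrete, here is a minimal Python sketch. The cash flows, the 20% hurdle, and the function names are illustrative assumptions, not figures from a real deal, and an actual LBO model has far more moving parts (debt paydown, fees, exit multiples).

```python
def npv(rate, cash_flows):
    """Net present value of yearly cash flows, where cash_flows[0] occurs today."""
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=10.0, tol=1e-9):
    """Internal rate of return, found by bisection.

    Assumes one upfront outflow followed by inflows, so NPV falls as the rate rises.
    """
    if npv(low, cash_flows) <= 0 or npv(high, cash_flows) >= 0:
        raise ValueError("no IRR between low and high for these cash flows")
    while high - low > tol:
        mid = (low + high) / 2
        if npv(mid, cash_flows) > 0:
            low = mid   # still profitable at this rate, so the IRR is higher
        else:
            high = mid  # unprofitable at this rate, so the IRR is lower
    return (low + high) / 2


def clears_hurdle(cash_flows, hurdle_rate):
    """The 'good investment' test: does the IRR meet the hurdle rate?"""
    return irr(cash_flows) >= hurdle_rate


# Hypothetical deal: 100 of equity in today, 200 of equity out after five years.
deal = [-100, 0, 0, 0, 0, 200]
print(f"IRR: {irr(deal):.1%}")
print("Clears 20% hurdle:", clears_hurdle(deal, 0.20))
```

Doubling equity over five years works out to an IRR of roughly 15%, so this hypothetical deal fails a 20% hurdle. The whole judgment collapses into a single comparison that anyone can read off the spreadsheet.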
Similarly, movements can be categorized as primarily idea-driven or data-driven:
Idea-driven movements:
| Movement | Reasoning |
|---|---|
| Civil Rights | All people are created equal, segregation is inherently unjust |
| Crypto | Property rights should be digital-native, censorship-resistant, and permissionless |
| Longevity | Death is a problem to be solved, not an inevitability to be accepted; aging is a disease rather than a natural process |
Data-driven movements:
| Movement | Reasoning |
|---|---|
| Progress Studies | Compounding GDP growth is the most important factor in increasing human welfare |
| Climate Activism | A global temperature increase of more than 1.5 degrees Celsius will cause irreversible negative externalities |
| Effective Altruism | Moral obligations and contributions can be quantified, and therefore optimized through utilitarian frameworks |
Enabled by technology, data-driven ideas operate at higher velocities than their idea-driven counterparts across industries and movements. Data-driven ideas are not more rigorous in nature, as evidenced by the many industries and movements built on falsified data.
In this post, I outline how the legibility of data-driven ideas creates coordination, which, in turn, creates velocity.
Data as a Schelling Point
The post-Moneyball era demonstrated that everybody must think in data or risk obsolescence. The current era of statistical models and graphs aggregating thousands to billions of data points is a very recent phenomenon that is only now gaining widespread cultural acceptance.
The key feature of data is that it is focal, not that it is necessarily better. People come to treat data as almost transcendent because it promises unbiased representation, even though data is often misleading at best and fabricated at worst.
Charts and statistics compress an arbitrary number of data points into something everyone can see and cite. These charts and statistics often transcend language barriers, requiring only very basic knowledge of English and little or no additional context.
Legibility enables quick ascents
Data-driven ideas convert attention into conviction faster than idea-driven ideas. This enables individuals, industries, and movements to experience extremely large growth rates in a short period of time:
- Individuals: Aella is a scientist and sex worker who has leveraged her following to gather some of the most comprehensive datasets on relationships and sexuality in human history. One of her surveys, which takes over one hour to complete, has over 850,000 responses. Every single post by Aella is backed by evidence, typically a graph derived from her own survey data. Sex and relationship discourse is stale and vibes-based, while Aella's data provides legibility.
- Industries: Quant funds out-fundraised discretionary managers not purely on returns, but on their ability to cite legible backtested data where discretionary hedge fund managers could offer only vibes. Growth engineers beat product visionaries in internal debates by A/B testing every aspect of the product experience.
- Movements: EA scaled within years based on the premise that impact can be numerically measured. Charities and lives saved can be measured directly in impact per dollar, which lets you compare interventions on a spreadsheet and optimize your altruism.
Idea-driven ideas have to continually persuade, while data-driven ideas just have to point.
Falsifiable claims create fragility
The same property that enables quick ascent also enables quick collapse:
- Individuals: Public intellectuals who stake their reputation on data-driven claims rise and fall with those claims. The prediction market pundit who calls an election correctly becomes a genius; the next miss makes them a cautionary tale. The half-life of a data-driven pundit is much shorter because their credibility is indexed to something falsifiable.
- Industries: The replication crisis called the entire field of social psychology into question overnight. A quant fund faces the same dynamic: the backtest that attracted capital becomes a liability the moment the model breaks.
- Movements: Climate activism coordinated globally around a specific number: 1.5 degrees. Websites now display countdown clocks to 1.5 degrees, currently at 3 years and 235 days. This creates a failure mode: what happens if we pass 1.5 degrees and the predicted externalities don't materialize on the expected timeline?
Idea-driven ideas are low-beta: they don't face the same fragility because they rest on unfalsifiable premises.
Open Questions
- How do the aesthetics of data-driven ideas relate to their popularity?
- The half-life of many public intellectuals is quite short. How much can be attributed to staking their reputation on a specific data-driven idea and going down its respective rabbit hole?
- How correlated are idea-driven and data-driven ideas with being missionary and mercenary, respectively?
- Where do LLM-derived ideas fit?
- What's the optimal ratio of data-driven to idea-driven claims for a movement that wants both velocity and durability?