What do humans see that AI misses? What does AI see that humans ignore?

Public research on AI-human symbiosis in high-stakes decisions. We're building the methodology—and publishing everything we learn.

AI-Human Decision Systems
Active Research

The Thesis

AI has conquered the quantitative layer. Models parse documents in seconds, flag anomalies, calculate risk scores.

But experts still outperform. The radiologist who escalates a scan the algorithm scored as routine. The deal professional who kills an opportunity the spreadsheet approved. They're reading something the models miss.

We're researching where that gap lives—and building systems that close it without eliminating the human judgment that matters.

Our Approach

"Retrospective before predictive. We start with deals where the outcome is known, work backward to what was knowable, then build forward."

Current Focus


Deal Signal

WeWork/BowX SPAC Merger (October 2021)

Testing AI architectures against a deal with known outcome. What could systems have surfaced from the documents? What required human judgment?

The numbers said yes. Everyone said yes. They were all wrong. Let's figure out why.

Study Constraints

  • Documents: Only materials available before merger close
  • Contamination: LLMs trained on post-failure data; controls documented
  • Ground Truth: Established before AI testing, locked, published
  • Questions: 40 questions spanning extraction, calculation, inference, and judgment
  • Evaluation: Found, Accurate, Complete, Cited, Relevant, Actionable
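As an illustration only, the six-criterion rubric could be represented as a simple scoring structure. The criterion names come from the study; the binary pass/fail scheme, the class name, and the aggregation are assumptions, not the study's actual implementation:

```python
from dataclasses import dataclass, fields

# Hypothetical scoring record for one of the 40 answers, judged against
# the locked ground truth. Binary criteria are an assumption for illustration.
@dataclass
class AnswerScore:
    found: bool       # did the system locate the relevant material?
    accurate: bool    # does the answer match ground truth?
    complete: bool    # are all required elements present?
    cited: bool       # does the answer reference its source document?
    relevant: bool    # does the answer address the question asked?
    actionable: bool  # could a deal professional act on it?

    def total(self) -> int:
        # Count criteria met (True sums as 1).
        return sum(getattr(self, f.name) for f in fields(self))

score = AnswerScore(found=True, accurate=True, complete=False,
                    cited=True, relevant=True, actionable=False)
print(score.total())  # 4 of 6 criteria met
```

A real rubric would likely need graded scores and inter-rater checks rather than booleans; this sketch only shows the shape of the record.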

All Research

Project     | Domain               | Expert             | AI Role                             | Status
Deal Signal | Investment Decisions | Deal Professionals | Surface risk signals from documents | Active
MRI Triage  | Medical Imaging      | Radiologists       | Surface urgency signals from scans  | Archive

This research is early. The methodology is incomplete. I'm asking anyway.

If you are any of the following, I want to talk to you:

  • An investor who has killed a deal on instinct you couldn't fully justify
  • A radiologist who has escalated a scan the system scored as routine
  • A deal professional who's seen red flags ignored because they weren't quantifiable
  • An AI practitioner who thinks this framing is naïve or wrong

The Ask

30 minutes. Blunt feedback. Examples of where this breaks. No pitch. No deck. Just conversation.