
Top-performing VC firms made 16% more introductions year over year, while firms across the market sent and received 17% more emails in Q4 2025 and added 4% more new contacts year over year, according to Affinity's 2025 Venture Capital Benchmark Report, based on activity data from almost 3,000 firms across 68 countries. The pattern points to a simple operational reality: relationship volume is rising faster than many firms can manage manually, leaving important context scattered across inboxes, decks, meeting recaps, and partner memory. At the same time, PitchBook and NVCA logged more than 100 U.S. megadeals per quarter through 2025.
That is the problem Ujjwal Jain, a founding engineer at Originalis AI, has been solving from inside the engine room. A former Amazon Advertising engineer who left the Seattle giant in early 2025 to rewire how capital finds innovation, Jain is best known for the deal-intelligence pipeline that has processed more than 552,000 email threads, saved an estimated 50,000 hours of manual labor, and now runs at a throughput of roughly 68,000 new threads per rolling month. But the intake system was only the front door. The relationship-intelligence engine behind it, which scores, ranks, and narrates every contact on a portfolio, is where the research work has gone in 2026, and it is beginning to reshape how partners decide where to spend the one resource no AI can manufacture: their attention.
The Needle-in-Haystack Problem
Before any scoring is possible, something has to decide which emails are worth scoring at all. Jain's intake layer reads every forwarded intro, deck drop, portfolio update, and founder ping against a single question: is this an investment signal, or is it noise? The answer, measured across hundreds of thousands of threads, is unambiguous. On the Originalis production pipeline, 91.1% of all incoming email is classified as noise: conference invites, product newsletters, service notifications, and automated receipts. Only 6.6% is a genuine new pitch. Another 2%, spread across portfolio updates, deal follow-ups, founder opportunities, and material refreshes, is actionable but non-pitch context the partner still needs. The remaining sliver is clarifications and pings.
Across the roughly 552,000 threads the system has triaged, about 500,000 were removed from the partner's cognitive queue without a human touch. The 36,294 new pitches and the 11,800 pieces of actionable downstream email are what remain. Of the 11,864 deal memos now living in the platform, 10,557 (roughly 89%) entered through automated ingestion rather than manual submission. In other words, the engine is no longer an assistant to the partner's workflow; it is the workflow.
Why Relationships Can't Live in a Spreadsheet
In most venture firms, the contact database is still a flat list. A name, a title, a firm, maybe a last-touched date, usually manually entered and almost always stale. What the list cannot show is the thing that actually matters: whether the relationship is strengthening or decaying, which co-investor sent the best deal last quarter, which founder is quietly going cold, or which operator, three steps removed from the partner's inner circle, is the right person to call on a specific piece of diligence.
Jain's bet was that all of that information already lives inside the firm's unstructured data: email bodies, calendar attendees, meeting transcripts, DocSend links, PDF decks, and LinkedIn histories. With the right architecture, it could be compiled into a single, typed relationship knowledge graph. Formally, G = (V, E, w, τ, t), where vertices include investors, contacts, firms, deals, and source channels; edges carry a weight w ∈ [0, 1] representing relationship strength; labels τ distinguish typed relations between actors; and timestamps t anchor every edge in history. No edge is ever deleted. A relationship that has gone quiet is still visible; it has simply decayed in weight.
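The article does not publish Originalis code; as a concrete illustration, a typed, timestamped, append-only graph of this shape might be sketched as follows (all names and fields are hypothetical):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Edge:
    src: str        # vertex id: investor, contact, firm, deal, or source channel
    dst: str
    rel: str        # typed relation label tau, e.g. "emailed", "met", "introduced"
    weight: float   # relationship strength w in [0, 1]
    ts: float       # timestamp t anchoring the edge in history

@dataclass
class RelationshipGraph:
    vertices: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # append-only: no edge is ever deleted

    def add_edge(self, src: str, dst: str, rel: str, weight: float, ts: float) -> None:
        assert 0.0 <= weight <= 1.0, "edge weight must lie in [0, 1]"
        self.vertices.update((src, dst))
        self.edges.append(Edge(src, dst, rel, weight, ts))

    def edges_between(self, src: str, dst: str) -> list:
        # Full history of a pair; a quiet relationship is still visible here
        return [e for e in self.edges if e.src == src and e.dst == dst]
```

The append-only edge list mirrors the stated invariant: a relationship that has gone cold is never removed, it simply carries older timestamps and lower weights.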
The graph in production currently holds 669,646 email activity edges connecting 245 users to 211,089 contacts across 57,326 unique user–contact pairs. It is the substrate on which everything else runs.
"The inbox already contains the answer. We just treated it like a to-do list instead of a data source. Once you accept that every email, invite, and forwarded intro is an edge in a graph, scoring and ranking become functions you can write down and defend," Ujjwal Jain says.
The Formula Behind a Warm Reply
Once the graph exists, the obvious question is how to score it. Most relationship-intelligence products either punt to a black-box embedding that no partner can explain or fall back to a single vanity metric, the last-touched date, which misses everything interesting. Jain chose a different path. Every relationship between a user and a contact carries a deterministic, bounded strength score S* on a 0–100 scale, built out of interpretable components:
S* = Ψ(Φ(B_W), R) → [0, 100]
where Φ is a fixed composition of behavioural signals computed over a recent observable window (email volume, responsiveness, meetings, and warm introductions), and Ψ modestly amplifies that base score for structural role, such as a founder the user has actually backed or a co-investor at a firm with shared deal history. The final value is saturated at 100, so every score is bounded, auditable, and reproducible by hand.
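The exact composition Φ and amplifier Ψ are not published; the sketch below is a minimal illustration of the stated shape, with weights and caps that are pure assumptions rather than production values:

```python
def base_score(emails: int, responsiveness: float, meetings: int, warm_intros: int) -> float:
    """Phi: a fixed composition of behavioural signals over a recent window.
    All weights and caps here are illustrative assumptions."""
    return (0.4 * min(emails, 50)        # email volume, capped
            + 25 * responsiveness        # responsiveness in [0, 1]
            + 4 * min(meetings, 10)      # meetings held
            + 5 * min(warm_intros, 3))   # warm introductions

def final_score(phi: float, role_multiplier: float = 1.0) -> float:
    """Psi: modest amplification for structural role (e.g. a backed founder),
    saturated at 100 so S* is always bounded and reproducible by hand."""
    return min(100.0, phi * role_multiplier)
```

Because both functions are deterministic and closed-form, any individual score can be recomputed from the underlying signals, which is the auditability property the design claims.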
The most unusual piece of the formula, and the one Jain describes as his proudest research finding, is the responsiveness term. Most CRMs measure how fast the contact replies to the user. Originalis measures the opposite.
"The strongest behavioural signal of importance isn't the contact's latency. It's yours. If you are consistently replying within twelve hours, your calendar has already told you this relationship matters, regardless of how much they write back. We encoded that asymmetry into the score, and the rankings started looking the way a thoughtful partner would rank them, not the way an email client does," Ujjwal Jain shares.
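The asymmetry Jain describes, scoring the user's own reply latency rather than the contact's, could be encoded as simply as the following sketch (the twelve-hour threshold comes from the quote; the function itself is illustrative):

```python
def user_responsiveness(user_reply_latencies_hours: list, threshold: float = 12.0) -> float:
    """Fraction of the USER's own replies sent within the threshold.
    A high value means the user's calendar already treats this contact
    as important, regardless of how much the contact writes back."""
    if not user_reply_latencies_hours:
        return 0.0
    prompt = sum(1 for h in user_reply_latencies_hours if h <= threshold)
    return prompt / len(user_reply_latencies_hours)
```

Feeding this user-side term into the score is what makes the rankings track the partner's revealed priorities rather than the inbox's raw traffic.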
The production distribution of those scores is itself a story. Across 124,271 scored relationships, 82.06% score between 0 and 20; another 13.92% fall between 21 and 40. Only 3.12% reach the 41–60 band; 0.71% reach 61–80; and just 0.19% (a population of 234 relationships across the entire user base) cross 80. The median score is zero. What the math is saying, bluntly, is that a strong relationship is rare. Most of a partner's declared network is, in practical terms, inert, and the engine now surfaces exactly which contacts, for each partner, are not.
Finding the Right Expert
A score is necessary but not sufficient. The harder question a partner asks every day is: who in my network should I call about this specific deal? Jain's system routes over a two-dimensional plane whose axes are expertise fit and relationship strength. A naive top-k would produce fifteen experts all clustered in the same sub-domain, leaving zero coverage on adjacent topics that actually decide a deal. His solution is a coverage-aware two-phase algorithm: phase one fills a bucket of warm, high-expertise contacts; phase two surfaces cold experts only if they represent absolute excellence or fill a coverage gap the first bucket did not. The result is a ranked list that is both strong and diverse, with an explicit coverage guarantee.
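The two-phase logic can be sketched in a few lines. The thresholds, field names, and candidate shape below are assumptions for illustration, not the production algorithm:

```python
def route_experts(candidates: list, k: int = 15,
                  warm_cut: float = 60.0, excellence_cut: float = 90.0) -> list:
    """Coverage-aware two-phase expert routing (illustrative).
    candidates: dicts with 'name', 'expertise', 'strength', 'domain'."""
    ranked = sorted(candidates, key=lambda c: (c["expertise"], c["strength"]), reverse=True)
    picked, covered = [], set()

    # Phase 1: warm, high-expertise contacts fill the bucket first.
    for c in ranked:
        if len(picked) == k:
            break
        if c["strength"] >= warm_cut:
            picked.append(c)
            covered.add(c["domain"])

    # Phase 2: cold experts enter only for absolute excellence
    # or to fill a domain phase 1 left uncovered.
    for c in ranked:
        if len(picked) == k:
            break
        if c in picked:
            continue
        if c["expertise"] >= excellence_cut or c["domain"] not in covered:
            picked.append(c)
            covered.add(c["domain"])
    return picked
```

The second pass is what supplies the coverage guarantee: a cold expert in an otherwise unrepresented sub-domain still surfaces, while cold duplicates of an already-covered domain must clear a much higher bar.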
A smaller but telling win came from the interface. When top candidates cluster within two or three points of each other, users routinely conclude "the tool can't tell them apart." Jain wrote a rank-preserving spreading transformation that expands a tight cluster like 78, 77, 77, 76, 75 into a legible spread such as 85, 78, 70, 63, 55 whenever the span is narrow, while keeping the ordering identical. Nothing about the underlying ranking changes; the human simply gets a legible distribution.
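The spreading transformation is simple enough to state exactly from the example given: when the raw scores sit inside a narrow span, map them by rank onto an even spread between two display bounds. The bounds and threshold below are inferred from the article's example and should be read as assumptions:

```python
import math

def spread_scores(scores: list, hi: int = 85, lo: int = 55, min_span: int = 10) -> list:
    """Rank-preserving display spread. Input is assumed sorted descending.
    If the cluster is wider than min_span, scores pass through unchanged;
    otherwise they are spread evenly from hi down to lo by rank."""
    if not scores or max(scores) - min(scores) >= min_span:
        return list(scores)
    n = len(scores)
    if n == 1:
        return [hi]
    step = (hi - lo) / (n - 1)
    # round half up, so 77.5 -> 78 and 62.5 -> 63
    return [math.floor(hi - i * step + 0.5) for i in range(n)]
```

Because the mapping is monotone in rank, the ordering is untouched; only the numbers the human sees change, which is the whole point.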
The operating metrics tell the same story as the math. Since deployment, platform throughput has increased 16.5 times, and roughly 89% of new deals now enter the pipeline through automated ingestion rather than manual handling: 10,557 of 11,864 live memos. Median triage latency is 5.2 seconds end to end; median time to produce a first-pass memo from a freshly ingested pitch is 31.6 seconds. More importantly, every surfaced relationship, expert, or deal carries a provenance trail back to the specific email, meeting, or funding round that justified it, which means a partner challenged on a recommendation can reconstruct the reasoning rather than simply trust the tool.
What Comes After the Score
With the scoring engine in place, Jain designed three additional layers on top of the graph. Each one is implemented as a pure function over the graph state, so its output can always be reconstructed from the input rather than held in hidden caches. The first is narrative generation. For every scored relationship, the system writes a short plain-English summary of the current state: what the user and contact have been discussing, which threads are still open, which artifacts have been exchanged. Today, thousands of such narratives are live in production; those regenerated recently enough to reflect the current state are served directly, while the remainder are marked stale and scheduled for recomputation. Each line of prose links back to the specific message or meeting that supports it.
The second layer is signal classification. Jain tags each relationship as warming, cooling, dormant, or stable based on the slope of its recent activity rather than its absolute level. The design choice was deliberate: a warming mid-strength relationship is often more relevant to a partner's next call than a strong-but-cooling one, and a pure score would collapse that distinction. The signal surfaces it without overriding the score itself.
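A slope-based classifier of this kind might look like the following sketch, using an ordinary least-squares slope over a recent activity window; the thresholds are illustrative assumptions:

```python
def classify_signal(weekly_activity: list, eps: float = 0.5,
                    dormant_level: float = 0.2) -> str:
    """Tag a relationship by the SLOPE of recent activity, not its level.
    weekly_activity: touches per week, oldest first. Thresholds are assumed."""
    n = len(weekly_activity)
    mean = sum(weekly_activity) / n
    if mean < dormant_level:
        return "dormant"
    # least-squares slope of activity against week index
    xbar = (n - 1) / 2
    num = sum((i - xbar) * (y - mean) for i, y in enumerate(weekly_activity))
    den = sum((i - xbar) ** 2 for i in range(n))
    slope = num / den
    if slope > eps:
        return "warming"
    if slope < -eps:
        return "cooling"
    return "stable"
```

Note that a mid-strength contact with a steep positive slope is tagged "warming" even though its absolute level is modest, which is exactly the distinction a raw score would collapse.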
The third layer is long-tail expertise decay. An expert tagged years ago with a domain they have since walked away from is a worse recommendation than a current operator with thinner but active engagement. Jain placed the decay function on the expertise edge rather than the contact record, so the system distinguishes between "this person no longer works on X" and "this person is no longer active at all." The former is still useful for historical introductions; the latter is not.
"Putting the decay on the edge, not on the contact, was a small choice with a large consequence. A person doesn't stop being an expert—a specific piece of their expertise goes stale. Keeping those two concepts separate is what lets the graph age gracefully instead of forgetting people entirely," Jain says.
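Placing the decay on the expertise edge rather than the contact record can be illustrated with a simple exponential half-life; the half-life value here is an assumption, not a published parameter:

```python
def decayed_expertise(base_weight: float, months_since_domain_activity: float,
                      half_life_months: float = 18.0) -> float:
    """Exponential decay applied PER EXPERTISE EDGE, not per contact.
    One stale domain decays toward zero while the contact's other edges,
    and the contact itself, remain fully live."""
    return base_weight * 0.5 ** (months_since_domain_activity / half_life_months)
```

Under this scheme, "no longer works on X" is just one edge with a large elapsed time, while "no longer active at all" is every edge decayed at once, so the graph ages domain by domain instead of forgetting people wholesale.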
Jain describes the design philosophy of the whole stack as a single constraint: use language models where judgment is required, and deterministic functions everywhere else. Triage decisions, narrative generation, and expertise inference are model calls because they involve interpretation. The score, the graph structure, the routing algorithm, and the decay functions are fixed code because they have to be reproducible. When a partner asks why a particular contact surfaced, the deterministic layers return a finite audit trail; the model-call layers show which prompt and which evidence were used. The split is what lets the system be challenged without being rebuilt.
ⓒ 2026 TECHTIMES.com All rights reserved. Do not reproduce without permission.




