June 4, 2026

Bob Suh

Evidence of AI Payoff in Financial Services

A Gap Between Investor Expectations and Reality

The capital markets are betting on big economic returns from artificial intelligence, and anxiously awaiting the evidence. A recent article in the MIT Technology Review says AI is not yet affecting broader economic metrics (see their chart below). This isn’t surprising given that markets are forward-looking. Yet there are signs of success. Goldman Sachs pointed to the rapid adoption of AI among individuals, while the need for new data orchestration and deployment layers is causing delays in enterprises. Enterprise AI, like the horse Secretariat, is a “deep closer”: it will win the race, but only after a slower start.

Our experience shows that AI is ready to deliver, but execution challenges are slowing results. We’ve demonstrated that production agents can make a measurable impact on time savings in financial operations. Why has our experience been different? 

Here are three lessons we learned. 

  • Move to production early. It wasn’t easy, but we quickly learned how to feed agents the right data, monitor their reliability, and fine-tune them to boost performance. AI agents need real-life issues in order to learn and adapt. Daily production is a trial-by-fire environment where agents have to perform. Many teams are stuck in demo and POC cycles and can’t improve agent performance.
  • Task agents to perform deeper work. We created deep industry research agents, which we described in a recent Harvard Business Review article, to handle more end-to-end tasks. Many teams are deploying agents to do superficial work, and the small, incremental time savings from that work will not add up to much.
  • Target the largest cost pools. Our clients are mostly asset managers, and their spending with service providers far outweighs their internal labor costs. We therefore included our clients’ service providers in the daily use of our agents, so we could be sure we’d have a bigger impact.

The Process of Generating Tangible Returns From Agents

Generating tangible returns from AI agents is not easy. It requires complex problem solving, months of tracking processes, and fine-tuning to achieve higher performance levels. The first step was to develop reliable core metrics that could help us target opportunities for the highest ROI. In financial services operations, researching and resolving exceptions is the single largest labor category. Unfortunately, most reconciliation systems provide too little data to make these targeting decisions.

The chart below shows actual time spent on tasks for a global operations team. You can see in the baseline bar to the far left that 60 percent of total time is spent researching and resolving exceptions. This insight demonstrated that if we could train agents to do the deep work of researching and resolving exceptions, the impact would be many times greater than the savings from software alone.

Once deployed, our deep industry research agents began tracking the root causes of the exceptions. Below is a table of the root causes suggested by our agents (rows) and verified with human experts (columns). This data is created by agents and is not available in any traditional database.
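To make the rows-versus-columns idea concrete, here is a minimal Python sketch of how agent-suggested and expert-verified root causes could be tallied into such a table and summarized as an agreement rate. The function names and the root cause labels are hypothetical and chosen only for illustration; they are not our production categories or code.

```python
from collections import Counter

def root_cause_matrix(pairs):
    """Count (agent_suggested, expert_verified) root cause pairs.

    Each key is a (row, column) cell of the table: the row is the agent's
    suggestion and the column is the expert's verified root cause.
    """
    return Counter(pairs)

def agreement_rate(matrix):
    """Share of exceptions where the agent's suggestion matched the expert."""
    total = sum(matrix.values())
    agreed = sum(
        count
        for (suggested, verified), count in matrix.items()
        if suggested == verified
    )
    return agreed / total if total else 0.0

# Hypothetical labels, made up for this example.
sample = [
    ("timing difference", "timing difference"),
    ("timing difference", "timing difference"),
    ("pricing mismatch", "pricing mismatch"),
    ("pricing mismatch", "timing difference"),
]

matrix = root_cause_matrix(sample)
rate = agreement_rate(matrix)
print(f"Agreement rate: {rate:.0%}")
```

The off-diagonal cells (where the suggestion and the verified cause differ) are the ones worth reviewing, since they show which root causes an agent tends to confuse.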

Root cause data has two major benefits: 1) agents can use it to improve their accuracy in predicting the next root cause, and 2) it enables leaders to calculate the returns from auto-closing. For example, we found that a significant percentage of exceptions were so similar that they could be bulk-closed. In other words, when an expert validated the root cause of an exception with high confidence, the system could find other exceptions that matched it and auto-close them too.
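The bulk-closing idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our matching logic: it assumes each exception carries a hypothetical `signature` (for example fund and break type), treats "matching" as exact equality on that signature, and uses a made-up confidence threshold of 0.95.

```python
from dataclasses import dataclass

@dataclass
class ExceptionItem:
    item_id: str
    signature: tuple  # hypothetical matching key, e.g. (fund, break type)
    status: str = "open"

def bulk_close(validated, open_items, confidence, min_confidence=0.95):
    """Auto-close open items that match an expert-validated exception.

    Nothing is closed unless the expert's confidence meets the threshold.
    Returns the ids of the items that were closed.
    """
    if confidence < min_confidence:
        return []
    closed = []
    for item in open_items:
        if item.status == "open" and item.signature == validated.signature:
            item.status = "auto-closed"
            closed.append(item.item_id)
    return closed

# Hypothetical data: one validated exception and two open ones.
validated = ExceptionItem("E1", ("Fund A", "cash break"), status="closed")
items = [
    ExceptionItem("E2", ("Fund A", "cash break")),
    ExceptionItem("E3", ("Fund B", "cash break")),
]
closed_ids = bulk_close(validated, items, confidence=0.98)
print(closed_ids)
```

In practice the similarity test would likely be richer than exact equality, but the shape is the same: one expert validation fans out to every open exception that matches it.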

Here is a summary of our staged process for driving substantial returns with agents. 

  • The first stage is to build real performance and evaluation data. We know firms have data on who is doing the work and how much they cost. But they lack data on how people are spending their time. This turns out to be the key. Root cause data is critical to understanding how people spend their time because we can tell whether people are wasting time on false positives and problems that have already been solved. Many financial operations are caught in a “Groundhog Day syndrome” where they are repeating the same research work because their software can’t learn. We push learning agents into production to assist in the root cause identification of exceptions. We then use the root cause data to help agents improve their predictive accuracy.
  • The second stage is to set policies for auto-closing. We base these policies on data about the frequency of exceptions, the time to resolve them, the agent’s predictive accuracy, and the materiality (the cost of being wrong).
  • The third stage is to track the time savings and cost of the auto-closings against a baseline. This tracking is provided in a real-time dashboard illustrated below. It gives leaders a quantified return on time and cost savings, based on their prevailing labor hours and costs.
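The second and third stages can be illustrated with a rough Python sketch. It assumes one simple way to combine the four policy inputs (labor saved minus the expected cost of wrong closures) and one simple baseline comparison. The formula, the function names, and every number below are assumptions for illustration, not our actual policy or client data.

```python
def auto_close_net_value(monthly_count, minutes_to_resolve, accuracy,
                         cost_if_wrong, hourly_rate):
    """Estimated monthly value of auto-closing one category of exception.

    Labor saved: frequency x time to resolve x labor rate.
    Expected error cost: frequency x (1 - accuracy) x materiality.
    """
    labor_saved = monthly_count * (minutes_to_resolve / 60) * hourly_rate
    expected_error_cost = monthly_count * (1 - accuracy) * cost_if_wrong
    return labor_saved - expected_error_cost

def savings_vs_baseline(baseline_hours, current_hours, hourly_rate):
    """Hours and cost saved relative to a pre-agent baseline."""
    hours_saved = baseline_hours - current_hours
    return hours_saved, hours_saved * hourly_rate

# Hypothetical inputs for one exception category.
net = auto_close_net_value(
    monthly_count=200,
    minutes_to_resolve=30,
    accuracy=0.98,
    cost_if_wrong=500,
    hourly_rate=60,
)
allow_auto_close = net > 0

# Hypothetical baseline comparison for the dashboard.
hours_saved, cost_saved = savings_vs_baseline(
    baseline_hours=1000, current_hours=800, hourly_rate=60
)
print(f"Net value: {net:.0f}, auto-close allowed: {allow_auto_close}")
print(f"Hours saved: {hours_saved}, cost saved: {cost_saved}")
```

The design point is that a high-frequency, low-materiality category can justify auto-closing at a lower accuracy than a rare, high-materiality one, which is why all four inputs belong in the policy.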

In this particular system, which uses agents to research and resolve exceptions for a group of alternative investment (alts) funds, the time and cost savings delivered by agents have increased 520 percent in the last six months (see chart below).

Final Thoughts

While we can’t predict when broad-based economic returns will show up in the largest GDP sectors, we can remove the doubt that they will. It’s no surprise that execution is both critical and extremely difficult. To ensure faster execution, leaders must work with partners and teams experienced in deploying and supporting agents in daily production. To achieve the largest returns, leaders must target the largest cost pools across all their service providers. We’ve followed these steps in our work with world-class asset managers and service providers, and we know the approach works.
