Most enterprise AI deployments so far have focused on coding assistants and customer service bots. Morgan Stanley has deployed agents in one of banking's most accuracy-critical, deadline-driven workflows instead — profit and loss (P&L) reconciliation — and cut the work in half. The counterintuitive part: it got there by making the system less autonomous, not more.
Humans stay tightly in the loop, and their decisions are iteratively turned into repeatable rules the system can apply on its own.
“It's much more like a co-worker than a copilot,” Morgan Stanley Managing Director Todd Johnson said at a recent VB AI Impact event. The internal production agentic system, known as FIXR, goes beyond simple, straightforward "gen AI 1.0" tasks. “We think that's where the opportunity is to really unlock more complex work in the organization.”
FIXR behind the scenes
Every trading day, Morgan Stanley’s trade desks handle the important work around transactions such as cash equities or debt investments.
At the end of each of those days, controllers must reconcile P&L across the firm’s Finance, Risk, Operations, and Trade Capture systems. All that data must come together, and, perhaps not surprisingly, hundreds of thousands of attributes frequently fail to match.
Typically, controllers must manually investigate each mismatch (or “break”), decide on adjustments, then ideally sign off before the number goes to the desk, all against a hard morning deadline.
Previously, this could take up to six hours for a single book. Now, FIXR performs the task in two to three hours, Johnson said. Across the roughly 100 controllers who do this work, that adds up to about 1,500 hours saved per week.
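The weekly figure holds up under simple arithmetic. The sketch below is a back-of-envelope check, not Morgan Stanley's own calculation. It assumes each controller signs off one book per day and that a week has five trading days; neither assumption is stated in Johnson's remarks.

```python
# Back-of-envelope check of the "about 1,500 hours saved per week" figure.
# Assumptions (not from Morgan Stanley): one book per controller per day,
# five trading days per week.
CONTROLLERS = 100          # "roughly 100 controllers"
HOURS_BEFORE = 6           # "up to six hours for a single book"
TRADING_DAYS_PER_WEEK = 5  # assumption


def weekly_hours_saved(hours_after: int) -> int:
    """Hours saved per week across all controllers for a given new run time."""
    saved_per_book = HOURS_BEFORE - hours_after
    return CONTROLLERS * saved_per_book * TRADING_DAYS_PER_WEEK


low = weekly_hours_saved(3)   # slower end of "two to three hours"
high = weekly_hours_saved(2)  # faster end of "two to three hours"
print(low, high)
```

Under these assumptions the range is 1,500 to 2,000 hours per week, so the quoted figure matches the conservative end.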
After nightly P&L calculations complete, the system automatically analyzes “breaks” and proposes resolutions based on learned rules. Several agents work together:
One interprets past guidance to develop start-of-day resolutions.
One learns from controller behavior and documents the rules they apply.
One converts repeated patterns into durable, automated logic.
Over time, the system can auto-clear certain breaks it’s encountered before, suggest solutions for others that may be less familiar, ask for help when it’s unsure, and flag items for human investigation. When items are repeatedly resolved the same way, it can create firm rules.
Critically, humans stay fully in the loop, Johnson said. They review, approve or correct every recommendation, then feed those decisions back to improve the next run. The agent learns daily from controllers what it gets right and wrong and codifies that knowledge as it iterates.
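Those four outcomes can be pictured as a routing decision on each break. The sketch below is purely illustrative and is not FIXR's implementation: the `Proposal` fields, the confidence score, the 0.8 threshold, and the example patterns are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class Action(Enum):
    AUTO_CLEAR = "auto-clear"
    SUGGEST = "suggest"
    ASK_FOR_HELP = "ask for help"
    FLAG_FOR_HUMAN = "flag for human investigation"


@dataclass(frozen=True)
class Proposal:
    pattern: str               # hypothetical key for the kind of mismatch
    resolution: Optional[str]  # None when the system has no candidate fix
    confidence: float          # hypothetical score in [0, 1]


def triage(p: Proposal, firm_rules: Dict[str, str],
           suggest_threshold: float = 0.8) -> Tuple[Action, Optional[str]]:
    """Route one break to one of the four outcomes described above."""
    if p.pattern in firm_rules:
        # A firm rule already exists for this pattern: apply it directly.
        return Action.AUTO_CLEAR, firm_rules[p.pattern]
    if p.resolution is None:
        # Nothing to propose: hand the break to a controller.
        return Action.FLAG_FOR_HUMAN, None
    if p.confidence >= suggest_threshold:
        return Action.SUGGEST, p.resolution
    # A candidate exists but confidence is low: ask before proposing it.
    return Action.ASK_FOR_HELP, p.resolution


firm_rules = {"fx_rate_timing": "apply end-of-day FX rate"}
queue = [
    Proposal("fx_rate_timing", None, 0.0),
    Proposal("late_booking", "roll trade to next day", 0.9),
    Proposal("late_booking_variant", "roll trade to next day", 0.4),
    Proposal("unknown_source", None, 0.0),
]
for p in queue:
    action, resolution = triage(p, firm_rules)
    print(p.pattern, "->", action.value, "|", resolution)
```

The sketch covers routing only. The step in which a controller's decision is fed back into the next run is not modeled here.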
“You still preserve that element of human accountability even as you start to automate,” Johnson said. “Over time you'll see more and more of those items resolved in an automatic way.”
He emphasized that autonomy requires a great deal of trust; enterprises will not see efficiency gains if everyone's checking everything an agent does.
The human–agent feedback loop was critical to addressing the challenge of controlled, measured, and repeatable automation. “We recognized that all that intelligence that's sitting in the mind of a controller is gonna be difficult to get all into an agent on day one,” Johnson said.
Focus on process-first, extensibility
It was critical to establish processes first, before getting any AI involved, Johnson said. His team ran a “very thorough” process intelligence assessment that mapped and mined workflows to identify where automation would be the most advantageous: Was the answer agents, traditional automation, or simple re-engineering of an inefficient step?
“If we can fix that first before we add agents to the problem, then we really will be transforming the opportunity,” he said.
The P&L sign-off process was full of manual steps suitable for automation, and agents taking over some of these time-consuming tasks are freeing up controllers for “more value-added analysis” and “deeper risk consideration” work, he said.
Extensibility, though, was just as important as time savings. Johnson’s team chose this particular P&L reconciliation use case because hundreds of controllers were doing this work globally across the business (in the Americas, Europe, and Asia).
The approach is to start with a use case, prove it, and extend it, “and then ultimately the transformation will be as we roll this out more and more across the organization,” Johnson said.
Deterministic by design
Johnson said the team also deliberately limited how much of the workflow depended on the model's judgment at all. "If you have an opportunity to make things very prescribed and repeatable, that's cheaper in terms of token consumption, it's more repeatable in terms of controls — and have the LLM do the stuff where you don't need that kind of deterministic workflow," he said.
As the system sees more controller feedback on a given break type, Morgan Stanley converts that pattern into a fixed rule instead of leaving it to the model.
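One way to picture that conversion is a rule book that promotes a pattern to a fixed rule once controllers have approved the same resolution several times in a row, and that calls the model only when no rule exists. This is a minimal sketch under stated assumptions, not Morgan Stanley's design: the `RuleBook` class, the promotion threshold of three, and the stub model are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class RuleBook:
    """Promotes a break pattern to a fixed rule after repeated identical resolutions."""

    def __init__(self, promote_after: int = 3) -> None:
        self.promote_after = promote_after  # hypothetical threshold
        self.rules: Dict[str, str] = {}     # pattern -> fixed resolution
        self._history: Dict[str, List[str]] = defaultdict(list)

    def record_feedback(self, pattern: str, resolution: str) -> None:
        """Store a controller-approved resolution; promote it if it keeps repeating."""
        history = self._history[pattern]
        history.append(resolution)
        recent = history[-self.promote_after:]
        if len(recent) == self.promote_after and len(set(recent)) == 1:
            self.rules[pattern] = resolution

    def resolve(self, pattern: str, model: Callable[[str], str]) -> str:
        """Use the fixed rule when one exists; call the model only otherwise."""
        if pattern in self.rules:
            return self.rules[pattern]
        return model(pattern)


calls: List[str] = []


def stub_model(pattern: str) -> str:
    # Stand-in for an LLM call; records each invocation so the saving is visible.
    calls.append(pattern)
    return "needs review: " + pattern


book = RuleBook(promote_after=3)
first = book.resolve("stale_price", stub_model)   # no rule yet, so the model runs
for _ in range(3):
    book.record_feedback("stale_price", "reprice from vendor feed")
second = book.resolve("stale_price", stub_model)  # rule exists, so no model call
print(first, "|", second, "|", len(calls))
```

Once the rule is promoted, the second lookup consumes no tokens and returns the same answer every time, which is the cost and control benefit Johnson describes. The sketch does not handle demoting a rule when controllers later start resolving the pattern differently.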
Humans still own the behavior
An interesting (and perhaps fundamental) question being raised at the dawn of the agentic era is: Are agents code or digital employees?
Johnson argues that “they're probably a little bit of both,” and, as such, require nuance when it comes to governance and oversight. Technical teams must still be responsible for maintaining protections and guardrails like firewalls or encryption, for instance.
But there’s a new dynamic around the “performance element”: Humans using agents are responsible for them because the agents are aiding their business work. For instance, a senior controller working with a junior controller doesn’t relinquish responsibility just because someone is helping them out, Johnson noted.
“One of our strong principles in our AI governance generally is that there always has to be human accountability, even if there's a degree of automation,” he said.
But accountability typically doesn’t rest with “one single one person,” and the process is ultimately continuous. To this point, Johnson joked that one “depressing” thing about agentic AI is that it’s going to require ongoing training because models are ever-changing.
“You're never gonna be able to say: ‘We've done all the evaluation and testing that we need to do. Let's just let it go.’ You're going to have to have a constant view as it evolves over time.”
Morgan Stanley is aiming at real enterprise pain points
Morgan Stanley's experience mirrors patterns VentureBeat has uncovered across enterprise AI deployments.
In VentureBeat's recent VB Pulse survey, nearly three-quarters of respondents reported seeing little to no ROI from custom model fine-tuning, describing a "sandbox graveyard" of AI projects that proved too costly to maintain. This suggests that Morgan Stanley's process-first, buy-and-blend approach may be more sustainable than chasing bespoke models. The survey had 87 respondents and findings should be considered directional.
Governance emerged as another common challenge: 38% of respondents cited the lack of a single accountable owner as their biggest barrier to production AI, while only two of the 87 enterprises surveyed had active monitoring and alerting in place to detect model failures.
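Because the sample is small, it helps to convert between percentages and headcounts. The quick check below uses only the figures quoted above; the respondent count for the 38% figure is approximate because the percentage is itself rounded.

```python
RESPONDENTS = 87  # survey size stated above

# 38% citing the lack of a single accountable owner, as a headcount.
owner_gap = round(0.38 * RESPONDENTS)

# Two enterprises with active monitoring and alerting, as a percentage.
monitoring_share = round(2 / RESPONDENTS * 100, 1)

print(owner_gap, "respondents;", monitoring_share, "percent")
```

That works out to roughly 33 respondents and about 2.3 percent, respectively.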