Six Pillars of Trustworthy Financial AI – Conclusion
GenAI is a different kind of system
Simon Gregory | CTO & Co-Founder
Pillar 1: Auditability
When you can’t see how an answer was formed, you can’t trust it
Pillar 2: Authority
When AI can’t tell who is allowed to speak, relevance replaces legitimacy
Pillar 3: Provenance
When you can’t see the lineage, the system invents it
Pillar 4: Context Integrity
When the evidential world breaks, the model hallucinates the missing structure
Pillar 5: Temporal Integrity
When time collapses, financial reasoning collapses with it
Pillar 6: Determinism
When behaviour is unstable, trust must come from the architecture, not the model
Conclusion
We have spent decades building intuitions about how software works. How it scales. What a demo proves. What an MVP means. Those intuitions are correct for deterministic systems. Prove that it works in one scenario and it works in every equivalent scenario. Same inputs, same outputs. The scaling assumptions hold because the physics are linear and reproducible.
Non-deterministic systems have different physics. The scaling assumptions don’t transfer. A demo proves the system can produce a good output under optimised conditions. It says almost nothing about what happens at scale, under varied conditions, under production load, with real data, with edge cases.
This is the root cause that sits above the LLM Delusion. Most organisations evaluating financial AI are applying deterministic intuitions to a system with fundamentally different physics. It is like judging quantum mechanics with Newtonian assumptions. On the surface the results may look fine, but in reality they don’t hold at scale.
The relationship between demo performance and production performance is not linear or predictable, because the physics are different. The benchmark captures the peak. Production inherits the full distribution, including both tail failure types. A more capable model shifts the peak but doesn’t eliminate the tails.
The demo doesn’t prove what most evaluators think it proves. It proves the system can perform under optimised conditions. Whether it will perform in production is a different question entirely.
Imagine a sat nav that one in a hundred times took you to completely the wrong destination. Not just a slightly wrong route, but the wrong destination entirely. You’d never know which journey was the wrong one, as it wouldn’t tell you. It would say “you have arrived” with complete confidence.
Now apply that to a financial information system. One in a hundred times it returns a confident, well-formatted, completely wrong answer. No flag. No caveat. No way to know which one.
Would you use that sat nav for a journey that mattered?
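The one-in-a-hundred figure compounds quickly with usage. As a minimal sketch, assuming every answer fails independently and at the same rate (a simplification; the function name and the 1% figure are illustrative, taken from the sat nav analogy rather than from any measurement), the chance of meeting at least one silent failure can be worked out directly:

```python
def prob_at_least_one_failure(per_query_error_rate: float, num_queries: int) -> float:
    """Chance that at least one of `num_queries` answers is wrong.

    Assumes each answer fails independently with the same probability,
    which is an illustrative simplification, not a claim about any real system.
    """
    if not 0.0 <= per_query_error_rate <= 1.0:
        raise ValueError("per_query_error_rate must be between 0 and 1")
    if num_queries < 0:
        raise ValueError("num_queries must not be negative")
    # P(at least one failure) = 1 - P(every answer is correct)
    return 1.0 - (1.0 - per_query_error_rate) ** num_queries


if __name__ == "__main__":
    for n in (1, 10, 100, 500):
        p = prob_at_least_one_failure(0.01, n)
        print(f"{n:>4} queries -> {p:.1%} chance of at least one wrong answer")
```

Under those assumptions, a hundred queries at a 1% error rate carry roughly a 63% chance of at least one confident, wrong answer, and nothing in the output says which one it was.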
The only viable response
Deterministic systems fail predictably. You can find the failure, reproduce it, audit it, and fix it.
Non-deterministic systems fail unpredictably, at unpredictable times. You can only contain them.
In finance, predictable failure is an engineering asset and the foundation of trust. A system that fails predictably can be audited, replayed, and inspected. A system that fails unpredictably can’t be any of those things, so trust has to come from the architecture around it.
Containment isn’t a weakness; it is the only rational architectural response to non-deterministic physics. Every proposed alternative (testing, monitoring, fine-tuning, guardrails, prompt engineering) either gets absorbed into containment or fails to work on a system with no fixed character.
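To make the idea of containment concrete, here is one possible sketch. It is not the architecture described in this series. It is a hypothetical wrapper, with invented names (`ContainedAnswerer`, `AuditRecord`), that shows the general shape: the non-deterministic generator is only called when evidence is supplied, and every result is frozen into a record keyed by its inputs, so the same request can be replayed and inspected later.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class AuditRecord:
    """Immutable record of one request (hypothetical structure)."""
    request_key: str
    question: str
    evidence_ids: Tuple[str, ...]
    answer: Optional[str]  # None means the request was refused
    reason: str


class ContainedAnswerer:
    """Wraps a non-deterministic generator so its results are replayable."""

    def __init__(self, generate: Callable[[str, Dict[str, str]], str]):
        self._generate = generate  # stand-in for any model call
        self._log: Dict[str, AuditRecord] = {}

    @staticmethod
    def _key(question: str, evidence: Dict[str, str]) -> str:
        # Same question and same evidence always produce the same key.
        payload = json.dumps({"q": question, "e": evidence}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def answer(self, question: str, evidence: Dict[str, str]) -> AuditRecord:
        key = self._key(question, evidence)
        if key in self._log:
            # Replay: return the frozen record instead of asking the model again.
            return self._log[key]
        if not evidence:
            # No evidence supplied, so refuse rather than let the model improvise.
            record = AuditRecord(key, question, (), None, "refused: no evidence supplied")
        else:
            text = self._generate(question, evidence)
            record = AuditRecord(key, question, tuple(sorted(evidence)), text,
                                 "generated once, then frozen")
        self._log[key] = record
        return record
```

The design choice worth noticing is that the model's variability is never removed, only fenced in: the generator may behave differently on every call, but the system around it calls it once per distinct input and keeps the result where it can be audited.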
The three axioms are the engineering expression of that containment. Evidence Integrity, Provenance Integrity, and Determinism are the minimum requirements for a financial AI system that can be trusted, audited, and acted upon with confidence.
The Six Pillars are the map of what happens when any of them is violated. Read individually, each pillar identifies a failure mode. Together they identify a system, and the architecture required to make it trustworthy. The pillars shouldn’t be interpreted as just safeguards. They describe the physics of a system that can be trusted.
This framework emerged from over a decade of building production financial AI systems for tier-1 institutions, under real regulatory scrutiny, with real consequences for getting it wrong. The architecture came first. The principles followed. The company I co-founded exists because of them. The Six Pillars are what I found, and what I believe any trustworthy financial AI system must be built to prevent.
Financial AI earns trust only when its reasoning is constrained, inspectable, and replayable.
Outside that boundary, it isn’t really a system; it’s uncontrolled behaviour.