Six Pillars of Trustworthy Financial AI

GenAI is a different kind of system. The only viable response is deterministic architecture.

Simon Gregory | CTO & Co-Founder
MPhys Physics, University of Warwick


The Six Pillars: Foreword / What are the Six Pillars?
Where financial AI fails – the risks and challenges on the path to production

Pillar 1: Auditability
When you can’t see how an answer was formed, you can’t trust it

Pillar 2: Authority
When AI can’t tell who is allowed to speak, relevance replaces legitimacy

Pillar 3: Provenance
When you can’t see the lineage, the system invents it

Pillar 4: Context Integrity
When the evidential world breaks, the model hallucinates the missing structure

Pillar 5: Temporal Integrity
When time collapses, financial reasoning collapses with it

Pillar 6: Determinism
When behaviour is unstable, trust must come from the architecture, not the model


Conclusion

GenAI is a different kind of system

We have spent decades building intuitions about how software works. How it scales. What a demo proves. What an MVP means. Those intuitions are correct for deterministic systems. Prove it works in one scenario and it works in every equivalent scenario. Same inputs, same outputs. The scaling assumptions hold because the physics are linear and reproducible.

Non-deterministic systems have different physics. The scaling assumptions don’t transfer. A demo proves the system can produce a good output under optimised conditions. It says almost nothing about what happens at scale, under varied conditions, under production load, with real data, with edge cases.

This is the root cause that sits above the LLM Delusion. Most organisations evaluating financial AI are applying deterministic intuitions to a system with fundamentally different physics. It is like judging quantum mechanics with Newtonian assumptions: on the surface the results might look fine, but they don’t hold at scale.

The relationship between demo performance and production performance is not linear or predictable, because the physics are different. The benchmark captures the peak. Production inherits the full distribution, including both tail failure types. A more capable model shifts the peak but doesn’t eliminate the tails.

The demo doesn’t prove what those organisations think it proves. It proves the system can perform under optimised conditions. That’s a different question entirely.

The question to ask of any PoC: will it work the same in production?

Imagine a sat nav that, one time in a hundred, took you to completely the wrong destination. Not just a slightly wrong route, but the wrong destination entirely. You’d never know which journey was the wrong one, because it wouldn’t tell you. It would say “you have arrived” with complete confidence.

Now apply that to a financial information system. One in a hundred times it returns a confident, well-formatted, completely wrong answer. No flag. No caveat. No way to know which one.

Would you use that sat nav for a journey that mattered?
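The one-in-a-hundred figure can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes failures are independent from one query to the next, and the demo size and production volume are hypothetical numbers chosen for the example, not figures from any real system.

```python
def clean_run_probability(failure_rate: float, queries: int) -> float:
    """Probability that every one of `queries` independent queries is answered correctly."""
    return (1.0 - failure_rate) ** queries


def expected_wrong_answers(failure_rate: float, queries: int) -> float:
    """Expected number of wrong answers across `queries` queries."""
    return failure_rate * queries


FAILURE_RATE = 0.01          # the one-in-a-hundred figure from the text
DEMO_QUERIES = 20            # hypothetical demo size (assumption)
PRODUCTION_QUERIES = 10_000  # hypothetical production volume (assumption)

demo_clean = clean_run_probability(FAILURE_RATE, DEMO_QUERIES)
production_clean = clean_run_probability(FAILURE_RATE, PRODUCTION_QUERIES)
production_wrong = expected_wrong_answers(FAILURE_RATE, PRODUCTION_QUERIES)

print(f"Chance a {DEMO_QUERIES}-query demo shows no failure: {demo_clean:.1%}")
print(f"Chance {PRODUCTION_QUERIES} production queries show no failure: {production_clean:.2e}")
print(f"Expected confident wrong answers in production: {production_wrong:.0f}")
```

Under these assumptions, roughly four in five demos of that size would look flawless, while the same system in production would be expected to return around a hundred confident wrong answers.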

The only viable response

Deterministic systems fail predictably. You can find the failure, reproduce it, audit it, and fix it.

Non-deterministic systems fail unpredictably, at unpredictable times. You can only contain them.

In finance, predictable failure is an engineering asset and the foundation of trust. A system that fails predictably can be audited, replayed, and inspected. A system that fails unpredictably can be none of those things, so trust has to come from the architecture around it.

Containment isn’t a weakness; it is the only rational architectural response to non-deterministic physics. Every proposed alternative (testing, monitoring, fine-tuning, guardrails, prompt engineering) either gets absorbed into containment or fails to work on a system with no fixed character.

The three axioms are the engineering expression of that containment. Evidence Integrity, Provenance Integrity, and Determinism are the minimum requirements for a financial AI system that can be trusted, audited, and acted upon with confidence.
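As a rough illustration of what containment can look like in code, the sketch below wraps a non-deterministic generator in a deterministic gate. Everything here is hypothetical: the `Draft` shape, the `contain` function, and the rule that an answer is released only when every cited source resolves to known evidence are assumptions made for the example, not a description of any particular product.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Draft:
    """Output of the non-deterministic component (hypothetical shape)."""
    text: str
    cited_ids: tuple[str, ...]


@dataclass(frozen=True)
class Decision:
    """Result of the deterministic gate."""
    released: bool
    reason: str
    text: Optional[str] = None


def contain(generate: Callable[[str], Draft], question: str,
            evidence: dict[str, str]) -> Decision:
    """Release a draft only if it cites at least one source and every
    cited id exists in the evidence set. The check itself is deterministic:
    the same draft and evidence always produce the same decision."""
    draft = generate(question)
    if not draft.cited_ids:
        return Decision(False, "no sources cited")
    unknown = sorted(i for i in draft.cited_ids if i not in evidence)
    if unknown:
        return Decision(False, "unknown sources: " + ", ".join(unknown))
    return Decision(True, "all citations resolved", draft.text)


# Illustrative usage with stand-in generators and made-up document ids.
evidence = {"doc-001": "Example filing text."}
grounded = lambda q: Draft("Answer based on doc-001.", ("doc-001",))
ungrounded = lambda q: Draft("Confident but unsupported answer.", ("doc-999",))

print(contain(grounded, "example question", evidence))
print(contain(ungrounded, "example question", evidence))
```

The gate does not make the generator any more reliable. It only guarantees that whatever is released has passed a check that can itself be inspected and reproduced.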

The Six Pillars are the map of what happens when any of them is violated. Read individually, each pillar identifies a failure mode. Together they identify a system, and the architecture required to make it trustworthy. The pillars shouldn’t be interpreted as just safeguards. They describe the physics of a system that can be trusted.

This framework emerged from over a decade of building production financial AI systems for tier-1 institutions, under real regulatory scrutiny, with real consequences for getting it wrong. The architecture came first. The principles followed. The company I co-founded exists because of them. The Six Pillars are what I found, and what I believe any trustworthy financial AI system must be built to prevent.

Financial AI earns trust only when its reasoning is constrained, inspectable, and replayable.
Outside that boundary, it isn’t really a system; it’s uncontrolled behaviour.

Three Axioms

The Six Pillars are six expressions of three foundational requirements: axioms that any trustworthy financial AI system must satisfy. Violate any one of them and the others cannot hold.

Evidence Integrity
A system must maximise and preserve the integrity of the evidence it operates on. When evidence integrity collapses, every other pillar collapses with it.

Authority, Context Integrity, and Temporal Integrity are all dimensions of this axiom. They describe the ways in which the evidential world can be corrupt or incomplete before reasoning has even begun.
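One way to picture evidence being checked before reasoning begins is a deterministic admission filter. The sketch below is a minimal illustration under stated assumptions: the `Evidence` record, the `admissible` function, and the source names are hypothetical, and it covers only the authority and temporal dimensions. Context integrity is not modelled here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class Evidence:
    """A single evidence record (hypothetical shape)."""
    doc_id: str
    source: str
    published: date
    text: str


def admissible(records: Iterable[Evidence], authorised_sources: frozenset[str],
               as_of: date) -> list[Evidence]:
    """Keep only records from authorised sources that were published on or
    before the as-of date, in a stable order so repeated runs agree."""
    kept = [
        r for r in records
        if r.source in authorised_sources and r.published <= as_of
    ]
    return sorted(kept, key=lambda r: (r.published, r.doc_id))


# Illustrative usage with made-up records.
records = [
    Evidence("doc-002", "regulator", date(2024, 3, 1), "Example notice."),
    Evidence("doc-001", "issuer", date(2024, 1, 15), "Example filing."),
    Evidence("doc-003", "forum", date(2024, 2, 1), "Unverified commentary."),
    Evidence("doc-004", "issuer", date(2024, 6, 30), "Later filing."),
]
selected = admissible(records, frozenset({"issuer", "regulator"}), date(2024, 3, 31))
print([r.doc_id for r in selected])
```

The unauthorised record and the record published after the as-of date are both excluded before anything downstream sees them.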

Provenance Integrity
A system must preserve the lineage of its evidence and verify it architecturally, such that every output surfaces its authoritative sources for human review and action. Without this, nothing downstream can be trusted, audited, or acted upon with confidence.

Auditability and Provenance are the primary expressions of this axiom. But the integrity pillars carry a provenance dimension too: it is not enough to retrieve authoritative evidence if you cannot prove that its lineage survived the journey.
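Proving that lineage survived the journey can be sketched with content hashes. The example below is an assumption-laden illustration, not a prescribed design: the `Citation` record and the `cite` and `lineage_intact` helpers are hypothetical names, and SHA-256 from Python's standard `hashlib` stands in for whatever fingerprinting a real system would use.

```python
import hashlib
from dataclasses import dataclass


def fingerprint(text: str) -> str:
    """SHA-256 hex digest of the document text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Citation:
    """A source reference that carries the hash seen at citation time."""
    doc_id: str
    sha256: str


def cite(doc_id: str, store: dict[str, str]) -> Citation:
    """Record which document was used and what it contained when used."""
    return Citation(doc_id, fingerprint(store[doc_id]))


def lineage_intact(citations: list[Citation], store: dict[str, str]) -> bool:
    """True only if there is at least one citation and every cited document
    still exists and still hashes to the recorded value."""
    return bool(citations) and all(
        c.doc_id in store and fingerprint(store[c.doc_id]) == c.sha256
        for c in citations
    )


# Illustrative usage with a made-up document store.
store = {"doc-001": "Example filing text."}
citations = [cite("doc-001", store)]
print(lineage_intact(citations, store))

store["doc-001"] = "Example filing text, silently edited."
print(lineage_intact(citations, store))
```

An output with no citations fails the check by design, which mirrors the requirement that every output surface its sources.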

Determinism
Non-determinism is a property, not a feature. In finance, it must be contained by deterministic architecture.

A deterministic system operating on broken evidence is still broken. But a system with evidence integrity and provenance, operating within a deterministic architecture, becomes something financial institutions can actually trust: inspectable, replayable, and auditable by design.
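What "replayable" might mean in practice can be sketched as a record-and-replay check. This is a minimal sketch under assumptions: `run_and_record`, `replay`, and the `total_exposure` step are hypothetical names invented for the example, and inputs and outputs are assumed to be JSON-serialisable.

```python
import hashlib
import json


def digest(obj) -> str:
    """Hash of a canonical JSON encoding, so equal values hash equally."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_and_record(step, inputs, log: list) -> object:
    """Run a step and append its inputs and an output digest to the log."""
    output = step(inputs)
    log.append({"inputs": inputs, "output_digest": digest(output)})
    return output


def replay(step, log: list) -> list[int]:
    """Re-run every logged input and return the indices of entries whose
    output no longer matches what was recorded."""
    return [
        i for i, entry in enumerate(log)
        if digest(step(entry["inputs"])) != entry["output_digest"]
    ]


def total_exposure(inputs):
    """Hypothetical deterministic step: sums position sizes."""
    return {"total": sum(inputs["positions"])}


audit_log: list = []
run_and_record(total_exposure, {"positions": [100, 250, -75]}, audit_log)
mismatches = replay(total_exposure, audit_log)
print("replay mismatches:", mismatches)
```

A deterministic step replays with no mismatches. A step whose output drifts between runs would show up in the returned list, which is what makes the failure something that can be found and inspected.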


13th May 2026
