Six Pillars of Trustworthy Financial AI
Financial AI earns trust only when its reasoning is constrained, inspectable, and replayable. Outside that boundary, it isn’t really a system – it’s uncontrolled behaviour.
Simon Gregory | CTO & Co-Founder
Pillar 1: Auditability
When you can’t see how an answer was formed, you can’t trust it
Pillar 2: Authority
When AI can’t tell who is allowed to speak, relevance replaces legitimacy
Pillar 3: Provenance
When you can’t see the lineage, the system invents it
Pillar 4: Context Integrity
When the evidential world breaks, the model hallucinates the missing structure
Pillar 5: Temporal Integrity
When time collapses, financial reasoning collapses with it
Pillar 6: Determinism
When behaviour is unstable, trust must come from the architecture, not the model
Pillar 1: Auditability
When you can’t see how an answer was formed, you can’t trust it
Auditability is the discipline of being able to trace, verify, and justify how an AI-assisted outcome was produced. In traditional software, this is straightforward: deterministic code paths, logs, and reproducible behaviour give you a clear chain of causality. Generative models break that assumption. Their internal processes are opaque, their outputs are non-deterministic, and their explanations are narratives rather than evidence. That combination makes auditability one of the defining challenges of trustworthy financial AI.
LLMs and vector systems are black boxes. Their internal states, intermediate steps, and decision paths are not observable or reconstructable. You cannot inspect how a specific answer was formed, and you cannot replay the internal reasoning that led to it. This means auditability cannot rely on introspection; it must rely on external verification.
Because the model sits outside the trust boundary, its output must be treated as untrusted input. This is the same posture used in security engineering: anything that originates outside the system of record is untrusted until validated. Fluency, confidence, and coherence do not grant trust. Only verifiability does.
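As a minimal sketch of what that posture can look like in practice (the claim structure, source identifiers, and field names below are illustrative assumptions, not a description of any particular product), the generated answer is broken into discrete claims and each claim is checked against the system of record before anything is surfaced:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Claim:
        source_id: str      # the document the model says it relied on
        quoted_value: str   # the figure the model attributes to that document

    # Hypothetical system of record: approved sources and the values they actually contain.
    SYSTEM_OF_RECORD = {
        "10K-2024-ACME": {"net_revenue": "1.42bn"},
    }

    def validate(claims: list[Claim], field: str) -> list[tuple[Claim, bool]]:
        """Treat every claim as untrusted until it matches the system of record."""
        results = []
        for claim in claims:
            record = SYSTEM_OF_RECORD.get(claim.source_id)
            verified = record is not None and record.get(field) == claim.quoted_value
            results.append((claim, verified))
        return results

    # A fluent, confident answer is still rejected if its claims do not verify.
    checks = validate(
        [Claim("10K-2024-ACME", "1.42bn"), Claim("10K-2023-ACME", "1.10bn")],
        "net_revenue",
    )
    assert checks[0][1] and not checks[1][1]  # the second claim cites a source outside the record

The trust decision is made by the deterministic check, not by how convincing the answer sounds.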
Showing its working
When asked to “show its working,” an LLM generates a post-hoc reconstruction, which is not the same thing as an auditable trace. The explanation is produced after the answer, using the same generative mechanism that produces the answer itself.
The model has no access to its own weights, its activation patterns, or the computational path that led to the output. What it produces is a plausible narrative rather than a record, which is why explanations cannot be treated as audit evidence.
Hallucinations are a direct consequence of how generative models work, and it’s now widely accepted that they cannot be eliminated without fundamentally changing what the models can do. The same mechanism that enables generalisation, inference, and creativity also enables confident fabrication. This is why external validation is mandatory: the model is narrating a plausible explanation rather than exposing its actual reasoning.
Thinking Deeper
Extended reasoning models don’t just answer. They plan. They decompose the query into sub-problems, execute a sequence of internal steps, loop across retrievals, and synthesise across the results before producing a response.
From the outside, a single query went in and a single answer came out. What actually happened was an orchestrated agentic process. Unlike a standard LLM response, the reasoning trace broadly reflects the path taken, where the steps shown are generally the steps executed. But the trace is still a generated output, and the integrity of what happened at each step is not independently verifiable.
Reasoning models appear to solve the auditability problem by showing their thinking before the answer rather than after, producing a visible chain of deliberation, step following step with apparent rigour. Within that chain, the model also generates verification clauses (for example, “let me check this,” “this contradicts what I said earlier”) that appear to be quality controls. In reality, they are generated outputs, subject to the same faithfulness constraints as everything else in the trace.
The path is visible, but whether each point on that path produced a trustworthy output is not independently verifiable. A self-correction that appears to catch an error may be genuine, or it may be generated plausibility. The trace cannot tell you which, and neither can the model.
The core auditability problem remains. What changes is the order (before rather than after) and the convincingness. A reasoning trace that appears methodical makes the final output harder to question while doing nothing for verifiability. Early errors compound forward with increasing apparent rigour, and the answer arrives already defended.
The replayability requirement now fails at multiple compounding depths the user cannot see exist.
These models are marketed on the depth of their reasoning. That framing is accurate, but depth is also what makes them harder to audit. The deeper the reasoning, the higher the complexity, the further the output is from anything auditable. Capability and auditability are moving in opposite directions, and the gap between them is invisible by design.
The chain of thought shows you the plan, but whether the model followed it is a separate question entirely.
Respecting your boundaries
Models can be instructed to operate within defined constraints, such as a specific set of sources, a scoped domain, or a bounded context. However, those instructions are only requests, and there is no guarantee of adherence.
A model that stays within its defined boundaries most of the time has simply not yet found a reason to leave. Boundary violations arrive without warning, without an error state, fluent, confident, and indistinguishable from a correctly scoped response. The model doesn’t record what it chose to ignore, and there isn’t an audit trail.
In finance, where scope boundaries reflect regulatory jurisdictions, information barriers, and data entitlements, this is not a theoretical concern. A model reasoning across a boundary it was asked not to cross has breached a control, and the system doesn’t tell you it happened.
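By contrast, a scope control that sits outside the model cannot be talked out of its boundary. A minimal sketch, assuming a simple entitlement model (the field names and jurisdiction scheme are illustrative, not a reference to any specific system):

    import logging
    from dataclasses import dataclass

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("scope-boundary")

    @dataclass(frozen=True)
    class Document:
        doc_id: str
        jurisdiction: str
        entitlement: str

    def enforce_scope(candidates: list[Document],
                      allowed_jurisdictions: set[str],
                      user_entitlements: set[str]) -> list[Document]:
        """Deterministically drop out-of-scope material before the model ever sees it."""
        in_scope = []
        for doc in candidates:
            if doc.jurisdiction in allowed_jurisdictions and doc.entitlement in user_entitlements:
                in_scope.append(doc)
            else:
                # Unlike a prompt instruction, this exclusion is enforced and leaves a record.
                log.info("excluded %s (jurisdiction=%s, entitlement=%s)",
                         doc.doc_id, doc.jurisdiction, doc.entitlement)
        return in_scope

The model can still be wrong, but it can no longer reason across material it was never given, and the record of what was withheld exists independently of anything the model says.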
Marking its own homework
The natural response to untrustworthy output is verification. The natural temptation, given the fluency and apparent capability of these systems, is to use another model to do it.
An LLM used to validate another LLM’s output inherits every property that made the original output untrustworthy. It is non-deterministic, not replayable, and has no access to the internal state of the model it is evaluating. It will confidently assess plausibility, coherence, surface consistency. But it doesn’t verify whether the reasoning was sound, whether the sources were used correctly, or whether the output reflects what actually happened inside the generative layer.
LLM validation is a second opinion from a vantage point that could not observe the decision-making either.
The instability compounds. If the original output varies across runs, the validation of that output also varies. You now have two non-deterministic, non-replayable processes in sequence, and the result of the second is being used to grant trust to the first. The audit trail ends where it needs to begin.
Compound systems don’t distribute trust
This isn’t only a problem of obviously chained architectures. A reasoning model is itself a compound system: planning, retrieval, synthesis, all looped internally before a response surfaces. The opacity is identical, just with a more polished interface.
When multiple agents or models are chained together, the reasoning across stages becomes structurally unverifiable. The inputs and outputs at each node are observable. However, what happened between them, what the model weighted, what it discarded, why it transitioned from one conclusion to the next, is not.
A network of unvalidated agents risks distributing error. Each step introduces a source of discrepancy, weakens attribution, and erodes auditability. The uncertainties can compound rather than average out. Without controls, the system becomes a multiplier of its own weaknesses.
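A back-of-the-envelope illustration of that compounding (the per-stage figure is arbitrary and the independence assumption is generous, since real stages share failure modes):

    # Illustrative only: assume each stage is independently reliable 95% of the time.
    per_stage_reliability = 0.95
    stages = 6
    end_to_end = per_stage_reliability ** stages
    print(f"End-to-end reliability across {stages} stages: {end_to_end:.2f}")  # ~0.74

No individual stage looks like the problem, yet roughly a quarter of end-to-end runs now contain an unflagged failure somewhere in the chain.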
The final output of a chained system carries the accumulated opacity of every step that produced it, arriving with the same fluency and confidence as a single-model response. The causal chain that connects it to its sources is structurally absent.
The containment requirement
Auditability cannot be recovered from within the generative layer, as the model cannot trace its own reasoning. A reasoning trace is not a verified record, and another model cannot independently verify it. Prompt instructions cannot be relied on to enforce strict boundaries, and opacity accumulates across every stage in compound systems.
Nothing the model produces about its own reasoning constitutes evidence about itself. For consequential decisions, the only logical conclusion is that trust must be established from outside the model, from within a deterministic boundary that separates the verification machinery from the generative layer.
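In practical terms, what that boundary records is conventional engineering rather than anything model-specific. A minimal sketch of a replayable audit envelope captured outside the generative layer (the field names are illustrative assumptions):

    import hashlib
    import json
    from datetime import datetime, timezone

    def audit_envelope(query: str, retrieved_ids: list[str], context: str,
                       model_id: str, output: str) -> dict:
        """Capture what crossed the trust boundary, independent of the model's own narrative."""
        return {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "query": query,
            "retrieved_source_ids": retrieved_ids,
            "context_hash": hashlib.sha256(context.encode()).hexdigest(),
            "model_id": model_id,
            "output_hash": hashlib.sha256(output.encode()).hexdigest(),
        }

    record = audit_envelope("Q3 net revenue for ACME?", ["10K-2024-ACME"],
                            "...retrieved context...", "model-v1", "...generated answer...")
    print(json.dumps(record, indent=2))

Nothing in the envelope depends on the model explaining itself: the query, the evidential set, and the exact output are captured and hashable, so the question of what the system actually saw and said can be answered without asking the model.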
An output without a reproducible process is evidence of nothing.
Next > | Pillar 2: Authority




