Who says what is true?

An AI system produces a mathematical result through inference. A computer algebra system computes one. They may agree, they may not. Who decides which is correct? On what basis? And what would “correct” mean when every source is fallible?

The invention of certainty

We can say for sure that 1 + 1 = 2. Mathematics is where humans invented certainty: a formal system of abstractions where “true” and “false” are defined and computable. Outside mathematics, assertions rely on concepts that only partially represent reality, so they remain debatable.

Two layers of uncertainty

AI and Computer Algebra Systems (CAS) sit on top of mathematical certainty, each introducing its own uncertainty. AI reasons about mathematics but carries training gaps, hallucinations, and plausible-but-wrong inferences. CAS computes but is subject to bugs, incomplete algorithms, and scope limitations. Neither layer is the mathematical bedrock; both are fallible machinery operating above it.

ExaktAI focuses on these two layers: it asks whether the AI-inferred result and the CAS computation converge and withstand scrutiny.

Three principles

ExaktAI’s validation architecture rests on three ideas, none of them new.

Descartes

In order to seek truth, it is necessary once in the course of our life to doubt, as far as possible, of all things.

Popper

Since we can never know anything for sure, it is simply not worth searching for certainty; but it is well worth searching for truth; and we do this chiefly by searching for mistakes, so that we can correct them.^*

Feynman

The first principle is that you must not fool yourself - and you are the easiest person to fool.

These three principles map directly to ExaktAI’s flow: question AI outputs → validate through convergence → report what could not be validated.

The real landscape

“The AI proposed it, the CAS computed it, and they agree” is the easy sentence. Behind it lies a landscape where no single validation is infallible.

AI and CAS disagree

The AI infers one result; the CAS computes a different one. The discrepancy is caught. Checking for AI-CAS agreement is what most people imagine when they think of validation.

CAS cannot verify

The AI produces a result, but the CAS lacks the algorithm to check it, or the problem falls outside its scope. The result may be correct, but no CAS evidence supports it.

No AI inference at all

The problem is outside training data. No AI system produces a meaningful answer.

CAS has a bug

Computer algebra systems can have bugs. Cross-checking with an independent source (an AI, another CAS, or another method) makes the bug visible.

Cross-CAS verification

Checking one CAS against another sounds ideal, but one system may lack algorithms the other has, and decomposing the problem into finer steps for the weaker system can compound errors rather than help the verification.

Every validation method is partial and can fail. The point, then, is not whether any single check can be trusted, but that the composition of the appropriate independent checks can produce something significantly stronger than any one alone.
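As a rough illustration of why composition helps, consider a simplified model that the text above does not state: suppose check i lets a wrong result through with probability p_i (a hypothetical symbol), and suppose the checks fail independently of one another. Under those assumptions:

```latex
% Assumption: the n checks fail independently; p_i is the chance that
% check i misses a wrong result. Both are illustrative, not from the source.
\[
  P(\text{all } n \text{ checks miss the error}) = \prod_{i=1}^{n} p_i ,
  \qquad \text{e.g. } p_1 = p_2 = 0.1 \;\Rightarrow\; 0.1 \times 0.1 = 0.01 .
\]
```

Real checks often share failure modes (the same algorithm, the same misreading of the problem), so the true figure can be higher than this product. That is why the independence of the checks matters as much as their number.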

What validated means

ExaktAI runs independent validation procedures on the result the AI inferred, and reports one of three outcomes.

Validated means: the AI-inferred result passed every check we performed.

Not validated means: a check found that the AI-inferred result does not hold (an independent computation disagreed).

Inconclusive means: the checks could neither confirm nor refute the AI-inferred result. This is not failure; it leaves room for human review.
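The three outcomes can be read as a rule for combining individual check results. The sketch below is one possible reading, not ExaktAI's implementation: the names `CheckResult`, `Outcome`, and `combine` are hypothetical, and two choices are assumptions of this sketch: a single refutation outweighs any number of confirmations, and any undecided check (or no checks at all) yields an inconclusive outcome.

```python
from enum import Enum
from typing import Iterable


class CheckResult(Enum):
    """Result of one independent check (hypothetical names)."""
    CONFIRMED = "confirmed"   # the check agrees with the AI-inferred result
    REFUTED = "refuted"       # an independent computation disagreed
    UNDECIDED = "undecided"   # the check could not reach a conclusion


class Outcome(Enum):
    """The three reported outcomes described above."""
    VALIDATED = "validated"
    NOT_VALIDATED = "not validated"
    INCONCLUSIVE = "inconclusive"


def combine(results: Iterable[CheckResult]) -> Outcome:
    """Combine individual check results into one reported outcome."""
    results = list(results)
    # Assumption: one refutation is enough to reject the result.
    if any(r is CheckResult.REFUTED for r in results):
        return Outcome.NOT_VALIDATED
    # Validated only if at least one check ran and every check confirmed.
    if results and all(r is CheckResult.CONFIRMED for r in results):
        return Outcome.VALIDATED
    # Assumption: anything else (an undecided check, or no checks) is inconclusive.
    return Outcome.INCONCLUSIVE


if __name__ == "__main__":
    print(combine([CheckResult.CONFIRMED, CheckResult.UNDECIDED]).value)
```

Treating a mix of confirmed and undecided checks as inconclusive is the conservative choice: it keeps "validated" reserved for results that passed every check performed.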

The validation is auditable. Every step is present in an executable document (a Mathematica notebook or a Maple document) that you can inspect, re-run, and verify independently. The trust is not in ExaktAI. The trust is in the evidence the document contains.

AI-mathematics: weeding out its unreliability

No AI or CAS can, by itself, guarantee mathematical correctness. What ExaktAI can do, and no other tool in this space currently does, is validate AI-mathematics results. Where the ExaktAI validation procedures refute an AI-inferred result, the result computed independently with computer algebra is shown beside it, in a CAS document where you can reproduce the computation and edit it as you see fit.

^* Karl Popper, In Search of a Better World (from the 1982 Alpbach lecture “Knowledge and the Shaping of Reality”); quoted at The Marginalian.