Who says what is true?

An AI system produces a mathematical result. A computer algebra system computes one. They may agree, they may not. Who decides which is correct? On what basis? And what would “correct” mean when every source is fallible?

The invention of certainty

We can say for sure that 1 + 1 = 2. Mathematics is where humans invented certainty: a formal system of abstractions where “true” and “false” are defined and computable. Outside mathematics, assertions rely on concepts that only partially represent reality, so they remain debatable.

Two layers of uncertainty

AI and Computer Algebra Systems (CAS) sit on top of mathematical certainty, each introducing its own uncertainty. AI reasons about mathematics but carries training gaps, hallucinations, and plausible-but-wrong inferences. CAS computes but is subject to bugs, incomplete algorithms, and scope limitations. Neither layer is the mathematical bedrock; both are fallible machinery operating above it.

ExaktAI’s focus is on these two layers, questioning whether the AI’s claim and the CAS’s computation converge and withstand scrutiny.

Three principles

ExaktAI’s validation architecture rests on three ideas, none of them new.

Descartes

In order to seek truth, it is necessary once in the course of our life to doubt, as far as possible, of all things.

Popper

Since we can never know anything for sure, it is simply not worth searching for certainty; but it is well worth searching for truth; and we do this chiefly by searching for mistakes.

Feynman

The first principle is that you must not fool yourself - and you are the easiest person to fool.

These three principles map directly to ExaktAI’s flow: question AI outputs → validate through convergence → report what could not be validated.

The real landscape

“The AI proposed it, the CAS computed it, and they agree” is the easy sentence. Behind it lies a landscape where no single validation is infallible.

AI and CAS disagree

The AI infers one result; the CAS computes a different one. The discrepancy is caught. This kind of AI-CAS cross-check is what most people imagine when they think of validation.

CAS cannot verify

The AI reasons, but the CAS lacks the algorithm, or the problem falls outside its scope. The result may be correct, but no CAS evidence supports it.

No AI inference at all

The problem lies outside the training data, and no AI system produces a meaningful answer.

CAS has a bug

Computer algebra systems can have bugs. Cross-checking with an independent source (AI, other CAS, or other method) makes the bug visible.
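
As a minimal sketch of what an independent check by another method might look like, the Python below tests a claimed antiderivative by differentiating it numerically and comparing the result with the integrand at a few sample points. It is an illustration only, not ExaktAI's procedure: the names `derivative_matches`, `claimed_good`, and `claimed_bad`, the sample points, and the tolerance are all hypothetical.

```python
import math

def derivative_matches(F, f, points, h=1e-5, tol=1e-6):
    """Return True if the central-difference derivative of F agrees with f
    at every sample point, within a relative tolerance (hypothetical value)."""
    for x in points:
        approx = (F(x + h) - F(x - h)) / (2 * h)
        if abs(approx - f(x)) > tol * max(1.0, abs(f(x))):
            return False
    return True

# Integrand: f(x) = x * cos(x)
def integrand(x):
    return x * math.cos(x)

# Correct antiderivative: x*sin(x) + cos(x)
def claimed_good(x):
    return x * math.sin(x) + math.cos(x)

# Plausible-but-wrong antiderivative (sign error): x*sin(x) - cos(x)
def claimed_bad(x):
    return x * math.sin(x) - math.cos(x)

sample_points = [0.3, 1.1, 2.7, 4.2]  # arbitrary illustrative points

good_ok = derivative_matches(claimed_good, integrand, sample_points)
bad_ok = derivative_matches(claimed_bad, integrand, sample_points)
```

Here `good_ok` is true and `bad_ok` is false, so the sign error becomes visible. Passing such a check is evidence rather than proof: a numeric spot check can itself miss errors that happen to vanish at the sampled points.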

Cross-CAS verification

Checking one CAS against another sounds ideal, but one system may lack algorithms the other has, and decomposing the problem into finer steps for the weaker system can compound errors rather than help the verification.

Every validation method is partial and can fail. The point, then, is not whether any single check can be trusted, but that a composition of appropriate, independent checks can produce something significantly stronger than any one alone.
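
One idealized way to see why composition helps, under the strong assumption that the checks fail independently of one another: if check i misses an existing error with probability p_i, then the probability that all n checks miss it is the product of those probabilities. The numbers below are illustrative, not measured.

```latex
P(\text{all } n \text{ checks miss the error}) = \prod_{i=1}^{n} p_i,
\qquad \text{e.g. } p_1 = p_2 = 0.1 \;\Rightarrow\; p_1 p_2 = 0.01 .
```

When checks share a failure mode, for example two systems relying on the same underlying algorithm, the independence assumption breaks down and the gain is smaller. That is why the choice of checks matters as much as their number.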

What validated means

ExaktAI runs independent validation procedures as the problem and result require.

Validated means: the result passed every check we performed.

Not validated means: at least one check failed or could not run. This is not necessarily failure; it leaves room for human review.
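
The two definitions above can be sketched as a small decision rule. This is a hedged illustration, not ExaktAI's implementation: the `Outcome` enum, the `summarize` function, and the check names are hypothetical, and treating an empty set of checks as "not validated" is an assumption of this sketch.

```python
from enum import Enum

class Outcome(Enum):
    PASSED = "passed"
    FAILED = "failed"
    COULD_NOT_RUN = "could not run"

def summarize(checks):
    """checks: dict mapping a check name to its Outcome.

    Returns (verdict, unresolved), where unresolved lists the checks
    that failed or could not run, so they can go to human review.
    """
    unresolved = sorted(
        name for name, outcome in checks.items()
        if outcome is not Outcome.PASSED
    )
    # Assumption: with no checks performed, nothing counts as validated.
    validated = bool(checks) and not unresolved
    return ("validated" if validated else "not validated", unresolved)

all_passed = summarize({
    "numeric spot check": Outcome.PASSED,
    "symbolic differentiation": Outcome.PASSED,
})
one_skipped = summarize({
    "numeric spot check": Outcome.PASSED,
    "second CAS": Outcome.COULD_NOT_RUN,
})
```

In this sketch `one_skipped` comes back as not validated even though nothing failed, and the second element of the result names the check that could not run. Keeping "could not run" distinct from "failed" is what lets a report say exactly which part was left unvalidated.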

The validation is auditable. Every step is present in an executable document (a Mathematica notebook or a Maple document) that you can inspect, re-run, and verify independently. The trust is not in ExaktAI. The trust is in the evidence the document contains.

The honest limit

Neither an AI nor a CAS can, by itself, guarantee mathematical correctness. What ExaktAI can do, and what no other tool in this space currently does, is make the basis for each validation explicit, auditable, and reproducible.

The gap between “the AI said so” and “the result survived independent scrutiny” is the gap ExaktAI fills.
