ExaktAI uses AI
Claude, Gemini, and other AI systems provide the mathematical reasoning. ExaktAI’s context structures that reasoning into a sequence of declared steps, each one independently checkable.
An AI assistant inside a Computer Algebra System (CAS) document is not the same thing as AI whose reasoning is turned into an executable document where the computation is validated.
ExaktAI is AI-first. You ask a mathematical question in the ExaktAI App. The CAS runs as a backend: every step of the AI’s reasoning is translated into executable Maple or Mathematica. The CAS document is then generated, each step executed and validated, and the document automatically opened for you. This CAS document is where you can interact with the solving process step by step — inspect, reproduce, modify, experiment — until the result is valuable.
Wolfram’s Notebook Assistant and Maple’s AI Assistant are CAS-first. The notebook or worksheet is the environment the user was already working in; the AI sits beside it as an assistant, helping the Mathematica or Maple user understand topics or how to solve a problem. On request, the system may attempt to translate parts of the AI reasoning into Mathematica or Maple input. That input is not automatically executed, its result is not validated, it may contain syntactic or mathematical errors, and it often does not represent all the steps. As the CAS FAQs state, verifying correctness is the user’s responsibility.
In summary: ExaktAI delivers a validated answer with a CAS document where the answer can be reproduced step by step, while CAS systems with AI assistants deliver a solving narrative, possibly including untested computational instructions corresponding to parts of the reasoning.
The problem tackled with ExaktAI: Claude as the AI and Mathematica as the CAS. ExaktAI’s reasoning context guides Claude through a structured decomposition; each step is executed by Mathematica and validated against Claude’s result.
Three validated steps, result: −1/8 rad/s. The Mathematica notebook is the auditable validation document.
Wolfram (Mathematica) 14.1 is one of the world’s most capable computer algebra systems. Its Notebook Assistant combines an AI narrative with a Wolfram Language evaluator and different ‘personas’ for the AI. The AI model used is ‘Wolfram’, the default in Wolfram (Mathematica) 14.1.
With Code Assistant selected as the ‘persona’, the AI assistant presented an AI narrative followed by one evaluator section. The evaluator generated a code block covering three sections: computing y from the Pythagorean theorem, computing dy/dt, and solving for dθ/dt. Clicking “Insert and evaluate” inserts the block into the notebook and evaluates it.
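For reference, the three sections named above can be written out as a short worked derivation. The problem statement is not reproduced on this page, so the numbers here are assumptions chosen to agree with the values that are quoted (sin(θ) = 8/10 and the result −1/8 rad/s): a ladder of length 10 whose foot is at x = 6 from the wall and moving away at dx/dt = 1, with θ the angle between the ladder and the ground.

```latex
\begin{align*}
& y = \sqrt{10^2 - 6^2} = 8 \\
& x^2 + y^2 = 10^2
  \;\Rightarrow\; 2x\,\frac{dx}{dt} + 2y\,\frac{dy}{dt} = 0
  \;\Rightarrow\; \frac{dy}{dt} = -\frac{x}{y}\,\frac{dx}{dt} = -\frac{6}{8}\cdot 1 = -\frac{3}{4} \\
& \sin\theta = \frac{y}{10}
  \;\Rightarrow\; \cos\theta\,\frac{d\theta}{dt} = \frac{1}{10}\,\frac{dy}{dt}
  \;\Rightarrow\; \frac{d\theta}{dt} = \frac{-3/4}{10\cdot(6/10)} = -\frac{1}{8}
\end{align*}
```

Under these assumed values every intermediate quantity is rational, so each step can be checked exactly.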
The inserted and automatically evaluated code shows Out[8] = Solve[False, 0] followed by Out[11] = {{θ′[t] → −1/8}}.
How the computation goes from Solve[False, 0] to the correct result is not shown.
The closing conclusion reads: “the rate… is approximately −1/8 radians per second.” The correct answer is exactly −1/8, not approximately. No assertion about correctness is presented.
A second session using the Code Writer ‘persona’ instead of Code Assistant, with the same default AI model ‘Wolfram’, produced a different evaluator output: −(1/10) * (1/0.8), which evaluates to −0.125. The code block presented by the Code Writer is a trivial multiplication, and no Mathematica code is presented for the symbolic steps mentioned in the AI narrative: the Pythagorean step, the differentiation, or the substitution that produced sin(θ) = 8/10. No assertion about correctness is presented.
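The difference between an exact and an approximate result can be illustrated outside any CAS. The following is a minimal Python sketch, not code produced by either assistant: it evaluates the same product once with exact rationals and once in floating point. The variable names are illustrative, and writing 0.8 as the fraction 8/10 follows the sin(θ) = 8/10 value quoted above.

```python
from fractions import Fraction

# Exact rational version of -(1/10) * (1/0.8), with 0.8 written as 8/10.
sin_theta = Fraction(8, 10)
exact = -Fraction(1, 10) * (1 / sin_theta)

# Floating-point version of the same product, as in the evaluator output.
approx = -(1 / 10) * (1 / 0.8)

print(exact)          # -1/8
print(float(exact))   # -0.125
print(approx)
```

The rational form keeps the result as the fraction −1/8, which supports the statement that the answer is exact; the floating-point form only yields a decimal value.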
The same problem. ExaktAI with Gemini as the AI and Maple as the CAS. ExaktAI’s reasoning context guides Gemini through a structured decomposition; each step is executed by Maple and validated against Gemini’s result.
Each step validated. Result: −1/8 rad/s.
Maple is one of the world’s most capable computer algebra systems. Its built-in AI Assistant presented the AI narrative, a plausible solution to the ladder problem. The AI output was then inserted into the Maple document by clicking the last icon below the formula, then pressing Enter to evaluate it.
The AI narrative is complete, but Maple’s AI Assistant produced only one formula, for the last step. When inserted and evaluated, that formula returned an unexpected and incorrect ‘false’.
| System | Displays AI output | All steps as CAS instructions | Result validated |
|---|---|---|---|
| Wolfram Notebook AI Assistant | Yes | No | No |
| Maple AI Assistant | Yes | No | No |
| ExaktAI + Mathematica or Maple | Yes | Yes | Yes |
All systems display a narrative produced by an AI. For this particular problem, no system other than ExaktAI succeeded in presenting all steps as CAS instructions, and ExaktAI is the only one that validated the results, using both Mathematica and Maple.
In one run of the ladder problem above, DeepSeek produced a complete, well-structured derivation, but inferred the wrong final result: dθ/dt = −1/10 rad/s. The deeper issue is that, in higher mathematics, capable AI systems can return different mathematical answers depending on how the problem is presented, whether the same request is repeated, or which AI is asked. For a systematic benchmark documenting the reliability problem, see AI math reliability.
Claude, Gemini, and other AI systems provide the mathematical reasoning. ExaktAI’s context structures that reasoning into a sequence of declared steps, each one independently checkable.
Mathematica and Maple execute every step and provide a computational ground truth.
Each AI step is validated against the CAS result. The validation is explicit and auditable.
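The idea of checking each declared step against an independently computed value can be sketched in a few lines of Python. This is an illustration only and not ExaktAI’s implementation: the function names are hypothetical, exact rational arithmetic stands in for the CAS backend, and the problem data (ladder length 10, foot 6 from the wall, moving away at 1 unit per second) are assumptions consistent with the −1/8 rad/s result quoted on this page.

```python
from fractions import Fraction
from math import isqrt

# Assumed problem data (not stated in the text).
LADDER, X, DX_DT = Fraction(10), Fraction(6), Fraction(1)


def recompute_steps():
    """Recompute each declared step exactly; a stand-in for the CAS backend."""
    y_squared = int(LADDER**2 - X**2)
    y = Fraction(isqrt(y_squared))
    if y * y != y_squared:
        raise ValueError("height is not rational for these inputs")
    dy_dt = -(X / y) * DX_DT   # from x^2 + y^2 = LADDER^2
    dtheta_dt = dy_dt / X      # from sin(theta) = y / LADDER, cos(theta) = X / LADDER
    return {"y": y, "dy_dt": dy_dt, "dtheta_dt": dtheta_dt}


def validate(claimed):
    """Return, per step, whether the claimed value equals the recomputed one."""
    computed = recompute_steps()
    return {name: claimed.get(name) == value for name, value in computed.items()}


# A correct set of claimed steps, and one with the wrong final value -1/10.
good = {"y": Fraction(8), "dy_dt": Fraction(-3, 4), "dtheta_dt": Fraction(-1, 8)}
bad = dict(good, dtheta_dt=Fraction(-1, 10))

print(validate(good))
print(validate(bad))
```

In this sketch a wrong final value such as −1/10 is flagged at the last step, while the earlier steps still pass, so the report shows where the derivation went wrong and not just that it did.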
Not a smarter AI. Not a faster CAS. A validated, executable, editable document, where you can audit, reproduce, modify or extend the computation.
ExaktAI is in development, with a beta scheduled for late summer or fall 2026. We can reach out when it’s ready to try.
* These runs were performed on April 4, 2026, using Wolfram (Mathematica) 14.1 and Maple 2026.0. Both software platforms and their AI assistants are actively evolving, and AI narratives vary from run to run. The observations reported here reflect what each system produced on that date.