Codex: on the 100-problem benchmark

Model: gpt-5.3-codex · each problem submitted 6 times, asking only for the final mathematical answer. Below are the 3 problems where at least one run differed. Each analysis is written by an AI and audited by the ExaktAI team.
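One way to read the "k of 6 equivalent" labels used below is as the size of the largest group of mutually equivalent answers among the six runs. The following is a minimal sketch of that tally, not the benchmark's actual procedure: the function names are hypothetical, and plain string equality stands in for a real mathematical equivalence check, which the report does not describe.

```python
from typing import Callable, List, Sequence


def largest_agreeing_group(answers: Sequence[str],
                           equivalent: Callable[[str, str], bool]) -> int:
    """Return the size of the largest group of mutually equivalent answers.

    Assumes `equivalent` is an equivalence relation, so comparing each
    answer with one representative per group is sufficient.
    Assumes `answers` is non-empty.
    """
    groups: List[List[str]] = []
    for answer in answers:
        for group in groups:
            if equivalent(group[0], answer):
                group.append(answer)
                break
        else:
            # No existing group matched, so this answer starts a new one.
            groups.append([answer])
    return max(len(group) for group in groups)


def same_text(a: str, b: str) -> bool:
    """Hypothetical stand-in for a real mathematical equivalence check."""
    return a == b


# The six answers reported for Problem 20, written as plain strings.
problem_20_runs = ["6", "6", "6", "6", "6", "(6, oo)"]
size = largest_agreeing_group(problem_20_runs, same_text)
print(f"{size} of {len(problem_20_runs)} equivalent")
```

With string equality, answers that are mathematically equal but written differently (such as runs 1 and 6 of Problem 60) would land in separate groups, so a real checker would need symbolic or numerical comparison.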

Summary
97%: All 6 equivalent
2%: 5 of 6 equivalent
1%: 4 of 6 equivalent
0%: 3 of 6 equivalent
0%: 2 of 6 equivalent
0%: All 6 different

5 of 6 equivalent

Problem 20

Find the extreme values of \(f \! \left(x , y\right) = x^{2}+2 y^{2}\) subject to the constraint \(x +y = 3\).

Run 1
\(6\)
Run 2
\(6\)
Run 3
\(6\)
Run 4
\(6\)
Run 5
\(6\)
Run 6
\(\left(6, \infty \right)\)

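Analysis: Runs 1 through 5 give the single value \(6\), the minimum of \(f\) on the constraint line (attained at \((x, y) = (2, 1)\)). Run 6 gives the open interval \((6, \infty)\), a range of values with no maximum that also excludes the minimum \(6\) itself.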
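The value \(6\) can be checked by substituting the constraint into \(f\). The sketch below assumes the substitution \(y = 3 - x\), which turns \(f\) into the parabola \(3x^{2} - 12x + 18\); the function name `f_on_constraint` is illustrative only.

```python
def f_on_constraint(x: float) -> float:
    """f(x, y) = x^2 + 2*y^2 with y = 3 - x substituted from x + y = 3."""
    y = 3.0 - x
    return x * x + 2.0 * y * y


# Expanding gives 3x^2 - 12x + 18, an upward parabola with vertex x = 12 / (2*3).
x_min = 12.0 / (2.0 * 3.0)
y_min = 3.0 - x_min
f_min = f_on_constraint(x_min)
print(x_min, y_min, f_min)  # 2.0 1.0 6.0

# Coarse scan of the constraint line: no sampled point falls below the vertex value.
samples = [f_on_constraint(k / 100.0) for k in range(-1000, 1001)]
assert min(samples) >= f_min

# f keeps growing along the line, so there is no maximum.
print(f_on_constraint(1e3) > f_on_constraint(1e2) > f_min)  # True
```

The scan is only a sanity check; the vertex formula is what establishes the minimum.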

4 of 6 equivalent

Problem 38

Solve \(\frac{d^{2}}{d x^{2}}y \! \left(x \right) = \frac{x^{2} \left(\frac{d}{d x}y \! \left(x \right)\right)^{2}-2 x y \! \left(x \right) \left(\frac{d}{d x}y \! \left(x \right)\right)+y \! \left(x \right)^{2}}{x^{2}}\).

Run 1
\(x \left(\textit{\_C2} -{\mathrm e}^{-\textit{\_C1}} \mathrm{Ei}\! \left(\textit{\_C1} -\ln \! \left(x \right)\right)\right)\)
Run 2
\(\textit{\_C1} x -\frac{x \,\mathrm{Ei}\! \left(\textit{\_C2} -\ln \! \left({| x |}\right)\right)}{{\mathrm e}^{\textit{\_C2}}}\)
Run 3
\(x \left(\textit{\_C2} -{\mathrm e}^{-\textit{\_C1}} \mathrm{Ei}\! \left(\textit{\_C1} -\ln \! \left(x \right)\right)\right)\)
Run 4
\(x \left(\textit{\_C2} -{\mathrm e}^{-\textit{\_C1}} \mathrm{Ei}\! \left(\textit{\_C1} -\ln \! \left({| x |}\right)\right)\right)\)
Run 5
\(\textit{\_C2} x -x \,{\mathrm e}^{-\textit{\_C1}} \mathrm{Ei}\! \left(\textit{\_C1} -\ln \! \left(x \right)\right)\)
Run 6
\(x \left(\textit{\_C2} -{\mathrm e}^{-\textit{\_C1}} \mathrm{Ei}\! \left(\textit{\_C1} -\ln \! \left(x \right)\right)\right)\)

Analysis: Runs 1, 3, 5, and 6 use \(\ln(x)\), while runs 2 and 4 use \(\ln(|x|)\). The two forms have the same argument inside \(\mathrm{Ei}\) for \(x > 0\) and differ in their handling of negative \(x\).
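The difference can be seen by looking only at the argument passed to \(\mathrm{Ei}\). The sketch below is an illustration under assumptions: the constant `c1` is an arbitrary sample value, the function names are hypothetical, and Python's real and complex logarithms stand in for whatever convention each run intended.

```python
import cmath
import math


def real_log_argument(c1: float, x: float) -> float:
    """Argument C1 - ln(x), as written in runs 1, 3, 5, and 6 (real logarithm)."""
    return c1 - math.log(x)


def abs_log_argument(c1: float, x: float) -> float:
    """Argument C1 - ln(|x|), as written in runs 2 and 4."""
    return c1 - math.log(abs(x))


c1 = 1.0  # arbitrary sample value for the integration constant

# For x > 0 the two arguments are identical.
assert real_log_argument(c1, 2.0) == abs_log_argument(c1, 2.0)

# For x < 0 the real logarithm is undefined, while ln(|x|) is still real.
try:
    real_log_argument(c1, -2.0)
except ValueError as exc:
    print("ln(x) at x = -2 fails in real arithmetic:", exc)
print("C1 - ln(|x|) at x = -2:", abs_log_argument(c1, -2.0))

# On the principal complex branch, ln(-2) = ln(2) + i*pi, so the two
# arguments differ by i*pi for negative x.
print("complex ln(-2):", cmath.log(-2.0))
```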

5 of 6 equivalent

Problem 60

Express \(4 \sin \! \left(x \right) \sin \! \left(2 x \right) \sin \! \left(4 x \right)\) as a sum of cosines.

Run 1
\(\cos \! \left(\frac{\pi}{2}-5 x \right)+\cos \! \left(\frac{\pi}{2}-3 x \right)-\cos \! \left(\frac{\pi}{2}-7 x \right)-\cos \! \left(\frac{\pi}{2}-x \right)\)
Run 2
\(\cos \! \left(\frac{\pi}{2}-5 x \right)+\cos \! \left(\frac{\pi}{2}-3 x \right)-\cos \! \left(\frac{\pi}{2}-7 x \right)-\cos \! \left(\frac{\pi}{2}-x \right)\)
Run 3
\(\cos \! \left(\frac{\pi}{2}-5 x \right)+\cos \! \left(\frac{\pi}{2}-3 x \right)-\cos \! \left(\frac{\pi}{2}-7 x \right)-\cos \! \left(\frac{\pi}{2}-x \right)\)
Run 4
\(\cos \! \left(\frac{\pi}{2}-5 x \right)+\cos \! \left(\frac{\pi}{2}-3 x \right)-\cos \! \left(\frac{\pi}{2}-7 x \right)-\cos \! \left(\frac{\pi}{2}-x \right)\)
Run 5
\(\frac{\cos \! \left(\frac{\pi}{2}-5 x \right)}{4}+\frac{\cos \! \left(\frac{\pi}{2}-3 x \right)}{4}-\frac{\cos \! \left(\frac{\pi}{2}-7 x \right)}{4}-\frac{\cos \! \left(\frac{\pi}{2}-x \right)}{4}\)
Run 6
\(\cos \! \left(3 x -\frac{\pi}{2}\right)-\cos \! \left(x -\frac{\pi}{2}\right)-\cos \! \left(7 x -\frac{\pi}{2}\right)+\cos \! \left(5 x -\frac{\pi}{2}\right)\)

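Analysis: Run 5 divides every term by \(4\), while runs 1 through 4 do not, so run 5 differs from them by a factor of \(4\). Run 6 matches runs 1 through 4: its terms are reordered and its arguments are negated, which leaves each cosine unchanged because cosine is even.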
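A quick numerical spot check shows which answers reproduce the original product. This is a sketch, not a proof: it compares values at a handful of sample points with a tolerance chosen here for illustration, and the function names are hypothetical.

```python
import math


def product(x: float) -> float:
    """The expression from the problem statement."""
    return 4.0 * math.sin(x) * math.sin(2.0 * x) * math.sin(4.0 * x)


def runs_1_to_4(x: float) -> float:
    h = math.pi / 2.0
    return (math.cos(h - 5.0 * x) + math.cos(h - 3.0 * x)
            - math.cos(h - 7.0 * x) - math.cos(h - x))


def run_5(x: float) -> float:
    # Same terms as runs 1 to 4, each divided by 4.
    return runs_1_to_4(x) / 4.0


def run_6(x: float) -> float:
    h = math.pi / 2.0
    return (math.cos(3.0 * x - h) - math.cos(x - h)
            - math.cos(7.0 * x - h) + math.cos(5.0 * x - h))


points = [0.1 * k for k in range(1, 30)]

# Runs 1 to 4 and run 6 agree with the product at every sample point.
agree_1_to_4 = all(math.isclose(product(x), runs_1_to_4(x), abs_tol=1e-9) for x in points)
agree_6 = all(math.isclose(product(x), run_6(x), abs_tol=1e-9) for x in points)

# Run 5 does not: it equals one quarter of the product.
worst_gap_5 = max(abs(product(x) - run_5(x)) for x in points)
quarter_5 = all(math.isclose(product(x), 4.0 * run_5(x), abs_tol=1e-9) for x in points)

print(agree_1_to_4, agree_6, quarter_5)  # True True True
print(worst_gap_5 > 0.1)  # True
```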