math.gamma result slightly different on aarch64-apple-darwin #132763
Comments
It's a 2 ULP difference:

>>> 0.36222232384328096 .hex()
'0x1.72ea68ab26ecdp-2'
>>> 0.36222232384328107 .hex()
'0x1.72ea68ab26ecfp-2'

I don't consider this to be "a bug". CPython makes no promises about the accuracy of "advanced" math functions, and is typically at the mercy of the platform's C compiler and libm. They can't be expected to deliver identical results across platforms, compilers, or even across releases (of the compiler or of Python) on a single platform.

In this case, it looks like the macOS C compiler is exploiting the HW's fused multiply-add (FMA) support. There's nothing in our source code that asks for that, or that forbids it. It's up to the compiler, and that's fine. FMA is generally a good thing, running faster and with smaller error. Of course it can have tiny effects on the low-order bits of the final result (or, for algorithms that rely on it, which Python doesn't use, huge effects).
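For a concrete sense of how contraction changes low-order bits, here is a minimal Python sketch (assuming Python 3.13, whose math module provides fma(); the values are a classic textbook illustration, not taken from the gamma code):

import math

a = 1.0 / 3.0
# plain multiply-add: a*3.0 is rounded to a double first (giving exactly 1.0),
# so the tiny representation error of 1/3 is lost before the subtraction
plain = a * 3.0 - 1.0            # 0.0
# fma computes a*3.0 - 1.0 with a single rounding at the end,
# so the representation error of 1/3 survives in the low-order bits
fused = math.fma(a, 3.0, -1.0)   # about -5.55e-17
print(plain, fused)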
BTW, in the OP case the aarch64 result seems slightly better (rather, 1 ULP off).

It depends on how you do this. If you compare the computed value with some float by
I agree this is not necessarily a bug. Checking whether this is a policy of CPython will be good enough. But it is not because it is a 2 ULP difference, but because there is no promise about advanced math functions, right? Finding a bigger difference is possible.

$ clang -lm tgamma.c && ./a.out
The result of tgamma(-3.8510064710745118) is: 3.809507
Bit representation of result: 4615760666384231000

$ clang -lm tgamma.c && ./a.out
The result of tgamma(3.6215752811868267) is: 3.809507
Bit representation of result: 4615760666384231005

I am not comparing the float values but comparing the bit representations of the float values. Probably I didn't write the issue in enough detail. I am not insisting that the result of math.gamma is incorrect, only saying that the bit representation of the gamma function result differs only on aarch64 macOS.

For more context, I was going to write a port of those functions in Rust. I know float operation results can vary based on the platform, libm, or a few other reasons.
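A minimal Python sketch of this kind of bit-level comparison (the input is the value from the report; struct packing is just one way to get at the raw IEEE-754 bits):

import math
import struct

def bits(x):
    # raw binary64 representation of a double, as an unsigned 64-bit integer
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def ulps_apart(a, b):
    # for finite doubles of the same sign, the number of representable
    # doubles between a and b
    return abs(bits(a) - bits(b))

x = 3.6215752811868267
g = math.gamma(x)
print(g, bits(g))
print(ulps_apart(0.36222232384328096, 0.36222232384328107))  # 2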
Sorry, it seems you used different arguments in your latest examples: -3.8510064710745118 vs 3.6215752811868267. It's hard to say in which example the difference is bigger.

It's the same story. Rather, you should use something like
Sorry, the log message is a hardcoded one. (The source code is linked above.) @skirpichev I am not getting why you are emphasizing
Or leave a note at docs.python.org that it could show different bit results depending on the compiler options?
@corona10 Different bit representations for floating-point operations across platforms are expected behavior in computer science, so I feel that such a note would be redundant commentary on general computer science knowledge. If CPython wants to try its best to keep consistent cross-platform floating-point behavior, modifying the build configuration with appropriate compiler flags would be the solution. However, if the current default compiler options align with the project's policy of minimal compiler customization, then I believe we can simply close this issue as working as designed.
Thanks, I guess in the second case you have the result on aarch64 (on an x86_64 Linux box I got your first result). In this case, it is worse by 1 ULP (3 ULP vs 2 ULP) than the x86_64 result:
BTW, could you test libm's tgamma() on your systems?
It's a good idea to use something like this, unless you are sure that your computations are identical (i.e. differ only by transformations that are valid for all floats; e.g. you can't use associativity to reorder operands). Optimization using the HW's FMA is an example of such a transformation (permitted by default fp-contract settings).
I doubt that forbidding floating-point contraction is a good idea. Besides a performance boost, you will probably get more accurate results on average.
[@youknowone wrote]
Right! Although, to be fair, CPython makes no guarantees about the results of simple float arithmetic (+ - * /) either. Those results are also inherited from the platform C. It just so happens that almost all platforms today have HW that implements the IEEE-754 standard, which does define the precise results of simple arithmetic.

gamma is a little different in that CPython implements it itself. Our code doesn't rely on FMA, but it's up to the compiler whether to use it in lines like:

num = num * x + lanczos_num_coeffs[i];
den = den * x + lanczos_den_coeffs[i];

Results can differ. Why does CPython implement gamma itself? Because some platforms' implementations at the time were too sloppy to bear. From the comments:

/* Implementation of the real gamma function. Kept here to work around
   issues (see e.g. gh-70309) with quality of libm's tgamma/lgamma implementations
   on various platforms (Windows, MacOS). In extensive but non-exhaustive
   random tests, this function proved accurate to within <= 10 ulps across the
   entire float domain. Note that accuracy may depend on the quality of the
   system math functions, the pow function in particular. Special cases
   follow C99 annex F. The parameters and method are tailored to platforms
   whose double format is the IEEE 754 binary64 format. ...
*/

Note that errors as large as 10 ULP were seen at the time it was checked in. It's possible that the platforms in question have improved their implementations since then. The kind of testing done in gh-70309 could be done again to see whether we still "need" to supply our own.
Thank you all. I am understanding the situation better.
I confirmed this is the exact line that causes this. When I changed this operation to fma, tgamma always returns the "different" value seen on aarch64 macOS. lgamma seems to have more places like that. I also confirmed this only happens on aarch64 macOS, not on x86_64 macOS. Probably the compiler default is set to be more aggressive for aarch64.

Rather than changing compiler options, what do you think about making some operations in gamma either explicitly use fma, or be hostile to automatic fma application, depending on which one is more accurate to the real function? Explicitly using fma would make FMA-supporting architectures always return the same value while the other group keeps working as before. Making the code hostile to fma would make everything agree the other way.
The second part of lgamma is here: https://github.com/python/cpython/blob/main/Modules/mathmodule.c#L516
I confirmed that changing this to use fma makes lgamma return the value of the aarch64 macOS build.
libm tgamma vs py tgamma
on my aarch64 macos machine:
on my x86_64 linux machine:
on my x86_64 macos machine:
As I recall, CPython never uses the platform C's

I don't believe anyone has the time and patience to give to resolving that, so we don't use it. So I'm inclined to let this go.

If we absolutely have to "do something" about this, I'd favor a rule that compiler options to suppress FMA be used for compiling CPython. Nothing in our code base relies on it, and we have no internal algorithms that were designed with it in mind. So it's just a potential source of mysterious low-order-bit platform differences we really don't care about.
@tim-one, I'm not sure which PR you refer to, but I think the latest attempt to remove the custom code was ~2 years ago: #101678. Probably this was not even reported upstream (I've no idea how to do this) and/or they don't care (macOS/Window$, eh).
@youknowone, you now printed only 15 digits instead of 17. That's why your results now show no difference on aarch64 wrt x86_64 (i.e. for the original issue).
In fact, we do in (Ah, and the platform's libm
If it's not the only possible source of differences, we don't buy too much. I.e. CPython's tgamma/lgamma code relies on the platform's libm. Unless the same libc is used (e.g. glibc) on all platforms, the OP should still expect that results will not be bit-for-bit identical. It's relatively easy to find differences even on FOSS systems, especially for non-elementary functions, e.g.
@skirpichev I am sorry to confuse you about it. Since value correctness is not my interest, I usually only checked the uint64-transformed binary representations. Please check the last digit of the second lines' integer values. You will see a few bits of difference on x86. If you want to check the float values, I will share them with enough digits.

Thanks to you all, I now feel the comments on this issue contain enough explanation of what is going on, and I am good with it. Please feel free to close the issue if the conclusion is to change nothing. Otherwise I will do my best to help with whatever I can.
Sorry, it was given in the C comment I pasted, but GitHub didn't make it clickable in that context. So I copy/pasted it again, but dropped a trailing digit by mistake. I edited the msg to repair that, and here it is again:
Good catch! More reason to just leave this all alone.
I think that's irrelevant. We supply wrappers for all standard libm functions, and

So I would close this issue as "not planned". Heroic efforts to hide low-order bit differences are a rat hole CPython should stay out of. Unless we write our own libm from the ground up, we'll never succeed. And we don't have anywhere near the human resources needed to even contemplate that. And, if we did, users would complain that our libm sometimes gives different results than their platform C libm gives.

The best long-term approach is "benign neglect". Over time, libm authors move closer to correct rounding in ever more cases. For that reason, it was highly arguable whether Python should have taken over the gamma functions to begin with. Consider that if we had not, this issue would never have been opened (if we used the platform implementations, the options used to compile CPython would be irrelevant).
@youknowone, I see. It's my bad. BTW, next time you could use

It looks like, on aarch64 and for libm's tgamma(), results are within 1 ULP (and the same for libm's tgamma() and CPython's tgamma() - maybe glibc uses fma() explicitly?), while for CPython's version: ~3 ULP. Though one point is not enough to conclude that the CPython implementation is worse.

Ah, and that was an attempt ~5 years before mine. As you can see, there is no progress at all.
I'm closing this on that ground.
I hoped we were slowly moving away from this state (previously CPython had many more workarounds for buggy libm's).
In the absence of rigorous error analysis, the best that can be done is to compare an implementation to correctly rounded results across at least billions of points. That's very expensive, in effect requiring emulating arbitrary-precision float math. That was done at the time, and the comments noted that errors as large as 10 ULP were seen. Not disastrous by any means, but certainly room for improvement. By now, I'm sure some platforms' implementations are better than ours.
Not just libm! All sorts of C library functions have been problems. Cute: the only reason I started writing sorts for Python was that platform C |
I am not opposed to the conclusion itself, but to the decision-making flow. While I agree that a single data point is insufficient to conclude that the CPython implementation is inferior, the value is not the worst case, just a randomly picked one. I simply chose the first failing value to illustrate the difference, without focusing on which implementation is more correct (as I have repeatedly emphasized difference, not correctness).

From your response, I understand that the correctness of values appears to influence your decision. If correctness is indeed a factor in your decision-making, it would be problematic to conclude that one implementation is not worse than others based on comparing just a single value. If you are open to adopting FMA for its potential accuracy benefits, I believe we should conduct a more comprehensive statistical analysis of a properly sampled set of double values, or even the entire double type domain, before determining which approach is more correct. I would be happy to provide the necessary computing resources and machines to support such an analysis.

I previously indicated I am fine with closing this issue because I understood it not as an issue of correctness, but as something else: lower maintenance cost, betting on future library improvements, or something I don't know well - not correctness. However, I am uncomfortable with reaching conclusions based on correctness claims without properly measuring and establishing that correctness.
For another value CPython's result without fma() seems slightly better:
You can use mpmath to get precise results and compare them with CPython's tgamma() with or without fma(). The libm answer is also interesting (though using libm's tgamma() in CPython seems not to be an option).
It's not just correctness. As Tim noted, fma() may be unavailable in HW, and then this will slow down the gamma functions.
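A minimal sketch of such a comparison for a single point (assuming mpmath is installed; the input is the value from the original report):

import math
import mpmath

x = 3.6215752811868267               # value from the original report
with mpmath.workprec(100):           # plenty of guard bits beyond binary64
    exact = mpmath.gamma(x)
    got = math.gamma(x)
    # signed error of the double result, in ULPs of the computed value
    diff = (got - exact) / math.ulp(got)
print(float(diff))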
FMA makes little difference unless a numeric algorithm suffers massive cancellation without it. The algorithms Python uses do not--by design--suffer massive cancellation. FMA wouldn't help them except by accident depending on specific inputs. May help, may hurt, depending. And we don't care about low-order bit differences.

Python could theoretically use entirely different algorithms that exploit FMA, but there's essentially no chance anyone will do that. The code we have was written by a world-class numerical analyst (who is, alas, no longer active in the project), and as has been said several times now, his extensive testing found a maximum error of 10 ULP. "Good enough". There's nothing more to be done here with reasonable effort.

Testing numerical functions for accuracy is not easy. Doubles are a 64-bit type, and there are quintillions of them. Exhaustive testing will never be done. What can be done is to have an expert numerical analyst run some (mere!) billions of randomized tests, and "white box" tests based on deep understanding of the algorithm (to know in advance where it's likely to do worst). That was already done for the code we have. I doubt anyone would volunteer the effort to do it again. I know I won't 😉. For a start, I have scant understanding of the algorithms we're using, and "easy" randomized testing is insufficient. Good testing on a space so large requires bona fide expertise in the algorithm being tested.
I did and do (again) totally agree with the conclusion. I just pointed out that unmeasured correctness doesn't back the conclusion. (Whether another value is better or worse, another single value still shouldn't back the conclusion.)

Now I have learned another point: the algorithm is already carefully designed not to need fma. I didn't know that. That helps a lot in understanding the conservative view of those algorithms. I also realized that I should turn off fma for my ported code too because of that. (Well tested is better than general expectation.)

Thank you so much, and sorry for bothering you about it again.
Here are quick tests on the CPython data points for tgamma(). The second and fourth columns measure the difference wrt the value computed by mpmath; in the third column I take the expected value from the CPython tests (BTW, they look less precise). Without fma():
With fma() (see patch), this affects the second column:
As you can see, on my system libm's code seems better on the given set. The effect of fma() seems less impressive (max 4 ULP vs 3 ULP). Data:
Script:

def parse_mtestfile(fname):
    with open(fname, encoding="utf-8") as fp:
        for line in fp:
            # strip comments, and skip blank lines
            if '--' in line:
                line = line[:line.index('--')]
            if not line.strip():
                continue
            lhs, rhs = line.split('->')
            id, fn, arg = lhs.split()
            rhs_pieces = rhs.split()
            exp = rhs_pieces[0]
            flags = rhs_pieces[1:]
            yield (id, fn, float(arg), float(exp), flags)
import math
import ctypes
import mpmath
import statistics
mpmath_diff = []
libm_diff = []
expected_diff = []
libm = ctypes.CDLL('libm.so.6')
libm.tgamma.argtypes = [ctypes.c_double]
libm.tgamma.restype = ctypes.c_double
print(f'{"x":>25} {"mpmath diff (ULP)":>20} {"expected diff (ULP)":>20} '
f'{"libm diff (ULP)":>20}')
print('-'*(25+1+20+1+20+1+20))
for id, fn, arg, expected, flags in parse_mtestfile('gamma.txt'):
    if flags or not math.isfinite(expected):
        continue
    mp_arg = mpmath.mpf(arg)
    with mpmath.mp.workprec(100):
        mp_expected = mpmath.mp.gamma(mp_arg)
    mp_expected = +mp_expected
    with mpmath.mp.workprec(1000):
        mp_expected2 = mpmath.mp.gamma(mp_arg)
    mp_expected2 = +mp_expected2
    assert mp_expected == mp_expected2
    ulp_mp = math.ulp(mp_expected)
    ulp_expected = math.ulp(expected)
    mpmath_diff.append(float(abs(mp_expected - math.gamma(arg))/ulp_mp))
    expected_diff.append(float(abs(expected - math.gamma(arg))/ulp_expected))
    libm_diff.append(float(abs(mp_expected - libm.tgamma(arg))/ulp_mp))
    print(f'{arg:>25.17g} {mpmath_diff[-1]:>20.1f} '
          f'{expected_diff[-1]:>20.1f} {libm_diff[-1]:>20.1f}')
print('-'*(25+1+20+1+20+1+20))
print(f'{"max diff":>25} {max(mpmath_diff):>20.1f} {max(expected_diff):>20.1f} '
f'{max(libm_diff):>20.1f}')
print(f'{"mean diff":>25} {statistics.mean(mpmath_diff):>20.1f} '
      f'{statistics.mean(expected_diff):>20.1f} {statistics.mean(libm_diff):>20.1f}')

Patch:

diff --git a/Modules/mathmodule.c b/Modules/mathmodule.c
index 11d9b7418a..58363f1ac4 100644
--- a/Modules/mathmodule.c
+++ b/Modules/mathmodule.c
@@ -363,8 +363,8 @@ lanczos_sum(double x)
this resulted in lower accuracy. */
if (x < 5.0) {
for (i = LANCZOS_N; --i >= 0; ) {
- num = num * x + lanczos_num_coeffs[i];
- den = den * x + lanczos_den_coeffs[i];
+ num = fma(num, x, lanczos_num_coeffs[i]);
+ den = fma(den, x, lanczos_den_coeffs[i]);
}
}
 else {
Did I mention "rat hole"? 😉

A "proper" way to test is not to take some library's results as the expected results. The expected results have infinite precision - the "true answers". Then various library implementations' results are measured against those. Of course we don't have infinite precision, not even with

with mpmath.extraprec(same extra precision used to compute `expected`):
    diff = (got - expected) / math.ulp(got)
    diff = float(diff)  # round back to native float

The magnitude of

If the

Example:

>>> import math, mpmath
>>> arg = math.pi / 2
>>> got = math.tan(arg)
>>> got
1.633123935319537e+16
>>> with mpmath.extraprec(100):
...     expected = mpmath.tan(arg)
...     diff = (got - expected) / math.ulp(got)
...
>>> diff
mpf('0.12201613147923554')
>>> float(diff)
0.12201613147923554

So Python's result is about 0.12 ULP larger than the infinitely precise result. That's the best possible result in native precision (because it's < 0.5 ULP, it's the correctly rounded result).

Note that, as @youknowone discovered, FMA can also make a difference in our:

r += (absx - 0.5) * (log(absx + lanczos_g - 0.5) - 1);
I'll include the guts of what I consider to be "good" randomized testing. It's crude but effective. A pair of adjacent output lines:
says that there were

For Python's gamma, this run (the results depend on the random number generator's initial state) showed that "correctly rounded" is not the most likely outcome, but it's close. Errors seen were in the range -7.5 ULP to 7.5 ULP. About 70% were within 1 ULP (sum of the -1.0, -0.5, 0.0, and 0.5 bin counts), which is in "excellent" territory for portable and reasonably fast transcendental functions even much simpler to implement than gamma.

If there are several libraries you want to compare, I urge not trying to do them all at once. Do something like this for each one on its own, and save the bin counts (e.g., pickle). Of course then you also want to force the random number generator to start with the same seed.

Boosting
import math, mpmath
from collections import defaultdict
import random
from random import random
from math import ulp, floor
ULP_SCALE = 2.0
bins = defaultdict(int)
ref = mpmath.gamma
lib = math.gamma
## def lib(arg, base=ref):
##     with mpmath.workprec(53):
##         return base(arg)
count = 1
while True:
    arg = random() * 170.0
    with mpmath.extraprec(30):
        expected = ref(arg)
        got = lib(arg)
        diff = (got - expected) / ulp(got)
    diff = floor(float(diff) * ULP_SCALE)
    bins[diff] += 1
    if not count & 0xfffff:
        print(format(count, '_'))
        for k, v in sorted(bins.items()):
            print(k / ULP_SCALE, v)
    count += 1

Output from the first block of one run:
Interesting: uncomment the redefinition of

The results are horrid, with results hundreds of ULPs wrong. Which is fine for

If you leave the 53 as 53, the results are stellar. From one run:
So out of the 2**20 inputs, all results were correctly rounded except for 46, and those were within 1 ULP. However, there is a slight bias in that all of those were "too small". That's often a symptom of summing a power series with positive terms and cutting it off "too early". But hundreds of ULPs of error is consistent with that: 2**10 is about 1000, and the algorithm probably knows it doesn't really care about the last 10 bits (since it added 10 bits to absorb predictable errors).
No, it was taken not for the results, but for some set of data points from the gamma() domain. Probably it's representative enough to take into account specific behavior of the given function (big/small arguments, nearby poles, etc.). (Something you miss with just scaled

Expected results (from the data set) are checked too, but just for completeness.
Good point (both for the sign and for keeping extra precision). BTW, why do you take ULPs of
Yes, but I did no tests for lgamma().
Yes, this is not a good idea in general. But my point was to argue that using libm's version seems to be a better option than trying to "improve" CPython's algorithm with explicit fma().

BTW, here are results without/with fma() for a slightly adapted version of your script (including negative input and using autoprec):

>>> # with seed=1 & count=0xfffff
>>> [...]
>>> for k in sorted(set(bins_ref)|set(bins_patch)|set(bins_libm)):
...     v1 = bins_ref[k]
...     v2 = bins_patch[k]
...     v3 = bins_libm[k]
...     print(f"{k/ULP_SCALE:+3} {v1:7} {v1/count:6.2%} "
...           f"{v2:7} {v2/count:6.2%} {v3:7} {v3/count:6.2%}")
...
-7.0 6 0.00% 5 0.00% 0 0.00%
-6.5 13 0.00% 12 0.00% 0 0.00%
-6.0 38 0.00% 32 0.00% 1 0.00%
-5.5 114 0.01% 105 0.01% 6 0.00%
-5.0 293 0.03% 259 0.02% 36 0.00%
-4.5 768 0.07% 718 0.07% 145 0.01%
-4.0 1971 0.19% 1871 0.18% 453 0.04%
-3.5 4812 0.46% 4619 0.44% 1769 0.17%
-3.0 11521 1.10% 11240 1.07% 5911 0.56%
-2.5 25373 2.42% 25050 2.39% 18148 1.73%
-2.0 51834 4.94% 51586 4.92% 47917 4.57%
-1.5 95846 9.14% 95735 9.13% 101661 9.70%
-1.0 147871 14.10% 148090 14.12% 166747 15.90%
-0.5 186481 17.78% 187028 17.84% 211659 20.19%
+0.0 186207 17.76% 186613 17.80% 203320 19.39%
+0.5 147213 14.04% 147560 14.07% 148221 14.14%
+1.0 94756 9.04% 94953 9.06% 84277 8.04%
+1.5 50741 4.84% 50668 4.83% 37419 3.57%
+2.0 24463 2.33% 24378 2.32% 14023 1.34%
+2.5 10770 1.03% 10705 1.02% 4745 0.45%
+3.0 4556 0.43% 4511 0.43% 1497 0.14%
+3.5 1853 0.18% 1801 0.17% 465 0.04%
+4.0 698 0.07% 683 0.07% 120 0.01%
+4.5 257 0.02% 242 0.02% 29 0.00%
+5.0 83 0.01% 78 0.01% 6 0.00%
+5.5 25 0.00% 23 0.00% 0 0.00%
+6.0 10 0.00% 8 0.00% 0 0.00%
+6.5 1 0.00% 1 0.00% 0 0.00%
+7.0 1 0.00% 1 0.00% 0 0.00%

Again, almost no effect of fma().
No, the working precision is increased in a much more tricky way (see mpf_gamma - it's called with the current context precision & rounding).

# a.py
import math, mpmath
from collections import defaultdict
from random import random, seed
from math import ulp, floor
from sys import argv
from pickle import dump
ULP_SCALE = 2.0
bins = defaultdict(int)
ref = mpmath.gamma
lib = math.gamma
seed(argv[1])
count = 1
while count & 0xfffff:
    arg = 2*(random()-0.5) * 170.0
    if arg <= 0 and arg.is_integer():
        continue  # pole
    with mpmath.extraprec(100):
        expected = mpmath.autoprec(ref)(arg)
        got = lib(arg)
        diff = (got - expected) / ulp(got)
    diff = floor(float(diff) * ULP_SCALE)
    bins[diff] += 1
    count += 1
with open(argv[2], 'wb') as f:
    dump(bins, f)

# b.py
import math, mpmath, ctypes
from collections import defaultdict
from random import random, seed
from math import ulp, floor
from sys import argv
from pickle import dump
ULP_SCALE = 2.0
bins = defaultdict(int)
ref = mpmath.gamma
libm = ctypes.CDLL('libm.so.6')
libm.tgamma.argtypes = [ctypes.c_double]
libm.tgamma.restype = ctypes.c_double
lib = libm.tgamma
seed(argv[1])
count = 1
while count & 0xfffff:
    arg = 2*(random()-0.5) * 170.0
    if arg <= 0 and arg.is_integer():
        continue  # pole
    with mpmath.extraprec(100):
        expected = mpmath.autoprec(ref)(arg)
        got = lib(arg)
        diff = (got - expected) / ulp(got)
    diff = floor(float(diff) * ULP_SCALE)
    bins[diff] += 1
    count += 1
with open(argv[2], 'wb') as f:
    dump(bins, f)
Right, I was talking about "good randomized testing". You're also moving toward "white box" testing, which is also essential, but beyond what I was talking about.
The true expected result has conceptually infinite precision. The concept of "ULP" makes no sense for it. If you round it back to native precision, then ULP is defined for it, but as already noted the computed ULP difference will always be an exact integer. The only thing the user sees is
Cool! I never used that. I'll start to now 🥰. Thanks!
FMA is systematically better in every respect except for max error, but very slightly so - not enough to be worth any effort to pursue. And, yes, that platform's gamma is clearly better, and quite significantly so on the "max error" measure. But still lots of room for improvement.
No, I'm not about rounding back
autoprec is just another heuristic. So, probably, you will be disappointed.
Yes, this seems to be the worst 1-arg libm function on my system. With erfc and... cbrt.
Sorry, but then I have no idea what you have in mind.
Ya, gamma is notoriously hard to compute. It grows faster than exp, and its derivatives are also messy. Approximations based on polynomials struggle as a result. I'm surprised, though, to hear it has a sloppy cube root! That one shouldn't be particularly challenging.
Ah. I expect you have in mind keeping

The problem with that is that the user has no idea what
Let me illustrate with an example using 2-digit decimal floats:
The absolute difference is 0.005. That's 1/2 ULP of

EDIT: Or leave rounding out of it: if the infinitely precise result is 1.0, then the difference of 0.01 is 1 ULP of
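The same asymmetry shows up for binary64 doubles at a power-of-two boundary; a minimal Python sketch (just an illustration, not taken from the thread's data):

import math

below = math.nextafter(1.0, 0.0)  # largest double below 1.0
print(math.ulp(below))            # 1.1102230246251565e-16
print(math.ulp(1.0))              # 2.220446049250313e-16
# a fixed absolute difference therefore counts as twice as many ULPs
# when measured against the value just below 1.0 as when measured
# against 1.0 itself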
Bug report
Bug description:
The difference is observed in the distributions.

Because the result is changed by fp-contract, it may not strictly be a bug in the source code, but it can be a bug of the build & distribution.
On x86_64 linux:
On aarch64 macOS:
The tgamma.c file is separated from mathmodule.c to make testing easy: new tgamma.c
The result below is the output of tgamma.c, but the results are the same in the CPython distribution.
Test results:
tgamma.c on x86_64 Linux:
tgamma.c on aarch64 macOS:
The original test where I found this mismatch:
On aarch64-apple-darwin, x is the input. Left is CPython, right is pymath.
On x86_64-unknown-linux-gnu,
Turning off fp-contract will fix the aarch64 macOS build to match the x86 Linux/Windows builds.
CPython versions tested on:
3.13
Operating systems tested on:
macOS