Skip to content

Segfault in union_repr from list_repr_impl in free-threaded build #132713

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
devdanzin opened this issue Apr 19, 2025 · 14 comments
Closed

Segfault in union_repr from list_repr_impl in free-threaded build #132713

devdanzin opened this issue Apr 19, 2025 · 14 comments
Assignees
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading topic-typing type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@devdanzin
Copy link
Contributor

devdanzin commented Apr 19, 2025

Crash report

What happened?

Calling repr in many threads on a list containing a large typing.Union segfaults in a free-threaded build:

import abc
import builtins
import collections.abc
import itertools
import types
import typing
from functools import reduce
from operator import or_

abc_types = [cls for cls in abc.__dict__.values() if isinstance(cls, type)]
builtins_types = [cls for cls in builtins.__dict__.values() if isinstance(cls, type)]
collections_abc_types = [cls for cls in collections.abc.__dict__.values() if isinstance(cls, type)]
collections_types = [cls for cls in collections.__dict__.values() if isinstance(cls, type)]
itertools_types = [cls for cls in itertools.__dict__.values() if isinstance(cls, type)]
types_types = [cls for cls in types.__dict__.values() if isinstance(cls, type)]
typing_types = [cls for cls in typing.__dict__.values() if isinstance(cls, type)]

all_types = (abc_types + builtins_types + collections_abc_types + collections_types + itertools_types
             + types_types + typing_types)
all_types = [t for t in all_types if not issubclass(t, BaseException)]
BIG_UNION = reduce(or_, all_types, int)

from threading import Thread
from time import sleep

for x in range(100):
    union_list = [int | BIG_UNION] * 17  

    def stress_list():
        for x in range(3):
            try:
                union_list.pop()
            except Exception:
                pass
            repr(union_list)
            sleep(0.006)
        union_list.__getitem__(None, None)

    alive = []
    for x in range(25):
        alive.append(Thread(target=stress_list, args=()))

    for t in alive:
        t.start()

Example segfault backtrace:

Thread 60 "Thread-59 (stre" received signal SIGSEGV, Segmentation fault.

0x0000555555d21751 in _Py_TYPE (ob=<unknown at remote 0xdddddddddddddddd>) at ./Include/object.h:270
270             return ob->ob_type;

#0  0x0000555555d21751 in _Py_TYPE (ob=<unknown at remote 0xdddddddddddddddd>) at ./Include/object.h:270
#1  union_repr (self=<optimized out>) at Objects/unionobject.c:296
#2  0x0000555555b8949a in PyObject_Repr (v=<unknown at remote 0x7fffb48464b0>) at Objects/object.c:776
#3  0x0000555555cc06a1 in PyUnicodeWriter_WriteRepr (writer=writer@entry=0x7fff660907f0,
    obj=<unknown at remote 0x207c>) at Objects/unicodeobject.c:13951
#4  0x0000555555aeba03 in list_repr_impl (v=0x7fffb4a8dc90) at Objects/listobject.c:606
#5  list_repr (self=[]) at Objects/listobject.c:633
#6  0x0000555555b8949a in PyObject_Repr (v=[]) at Objects/object.c:776
#7  0x0000555555e07abf in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>,
    throwflag=<optimized out>) at Python/generated_cases.c.h:2306
#8  0x0000555555ddcf53 in _PyEval_EvalFrame (tstate=0x529000325210, frame=0x529000384328, throwflag=0)
    at ./Include/internal/pycore_ceval.h:119
#9  _PyEval_Vector (tstate=0x529000325210, func=0x7fffb4828dd0, locals=0x0, args=<optimized out>,
    argcount=1, kwnames=0x0) at Python/ceval.c:1917
#10 0x0000555555a4f42b in _PyObject_VectorcallTstate (tstate=0x529000325210,
    callable=<function at remote 0x7fffb4828dd0>, args=0x207c, nargsf=3, nargsf@entry=1,
    kwnames=<unknown at remote 0x7fff66090450>, kwnames@entry=0x0) at ./Include/internal/pycore_call.h:169
#11 0x0000555555a4ccbf in method_vectorcall (method=<optimized out>, args=<optimized out>,
    nargsf=<optimized out>, kwnames=<optimized out>) at Objects/classobject.c:72
#12 0x0000555555e8005b in _PyObject_VectorcallTstate (tstate=0x529000325210,
    callable=<method at remote 0x7fff660c0eb0>, args=0x7fff93df06d8, nargsf=0, kwnames=0x0)
    at ./Include/internal/pycore_call.h:169
#13 context_run (self=<_contextvars.Context at remote 0x7fffb4a8ebf0>, args=<optimized out>,
    nargs=<optimized out>, kwnames=0x0) at Python/context.c:728
#14 0x0000555555e0a3e7 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>,
    throwflag=<optimized out>) at Python/generated_cases.c.h:3551

Once it aborted with message:

python: Objects/unionobject.c:296: PyObject *union_repr(PyObject *): Assertion `PyTuple_Check(alias->args)' failed.

Found using fusil by @vstinner.

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.14.0a7+ experimental free-threading build (heads/main:741c6386b86, Apr 18 2025, 15:04:45) [Clang 19.1.7 (++20250114103320+cd708029e0b2-1exp120250114103432.75)]

Linked PRs

@devdanzin devdanzin added the type-crash A hard crash of the interpreter, possibly with a core dump label Apr 19, 2025
@ZeroIntensity ZeroIntensity added topic-free-threading interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.13 bugs and security fixes 3.14 new features, bugs and security fixes labels Apr 19, 2025
@sobolevn sobolevn self-assigned this Apr 19, 2025
@devdanzin
Copy link
Contributor Author

Here's a reduced repro in case it helps:

import abc
from threading import Thread
from typing import Union

class NonUniqueMeta(abc.ABCMeta):
    i = 0
    def __hash__(self):
        self.i += 1
        return self.i
class NonUnique(metaclass=NonUniqueMeta): pass

nonun = (NonUnique,) * 350
big_union = Union[*nonun]

for x in range(100):
    alive = []
    obj = []
    alive.append(Thread(target=obj.__str__, args=()))
    alive.append(Thread(target=obj.__str__, args=()))
    alive.append(Thread(target=obj.clear, args=()))
    obj.append([big_union] * 100)

    for obj in alive:
        obj.start()

@sobolevn
Copy link
Member

Looks like the whole Union object is not thread-safe :(

@sobolevn
Copy link
Member

cc @colesbury and @JelleZijlstra

@ZeroIntensity
Copy link
Member

Looks like the whole Union object is not thread-safe :(

Hmm. I don't see how the issue could be with Union, because it's an immutable type. I guess it could be something related to a borrowed reference?

@colesbury
Copy link
Contributor

Two different issues:

List can be modified concurrently or reentrantly during PyUnicodeWriter_WriteRepr/PyObject_Repr invocation, making the v->ob_item[i] pointer invalid.

if (PyUnicodeWriter_WriteRepr(writer, v->ob_item[i]) < 0) {
goto error;
}

Lazy initialization of alias->parameters isn't thread-safe (and may leak memory; probably won't crash):

// Populate __parameters__ if needed.
if (alias->parameters == NULL) {
alias->parameters = _Py_make_parameters(alias->args);
if (alias->parameters == NULL) {
return NULL;
}
}

vstinner added a commit to vstinner/cpython that referenced this issue Apr 22, 2025
Hold a strong reference to the item while calling repr(item).
vstinner added a commit to vstinner/cpython that referenced this issue Apr 22, 2025
Hold a strong reference to the item while calling repr(item).
vstinner added a commit to vstinner/cpython that referenced this issue Apr 22, 2025
Hold a strong reference to the item while calling repr(item).
@vstinner
Copy link
Member

List can be modified concurrently or reentrantly during PyUnicodeWriter_WriteRepr/PyObject_Repr invocation, making the v->ob_item[i] pointer invalid.

I wrote #132801 to fix this issue.

vstinner added a commit to vstinner/cpython that referenced this issue Apr 22, 2025
Use a critical section in union_getitem() to initialize the
'parameters' member.
@vstinner
Copy link
Member

Lazy initialization of alias->parameters isn't thread-safe (and may leak memory; probably won't crash)

Oh, right, I wrote #132802 to fix this other (minor) issue.

vstinner added a commit that referenced this issue Apr 22, 2025
Hold a strong reference to the item while calling repr(item).
vstinner added a commit to vstinner/cpython that referenced this issue Apr 22, 2025
Hold a strong reference to the item while calling repr(item).

(cherry picked from commit a4ea80d)
vstinner added a commit to vstinner/cpython that referenced this issue Apr 22, 2025
vstinner added a commit that referenced this issue Apr 23, 2025
Hold a strong reference to the item while calling repr(item).

(cherry picked from commit a4ea80d)
vstinner added a commit that referenced this issue Apr 23, 2025
Add union_init_parameters() helper function. Use a critical section
to initialize the 'parameters' member.
@vstinner
Copy link
Member

Fixed. Thanks for the bug report @devdanzin.

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Apr 23, 2025
)

Add union_init_parameters() helper function. Use a critical section
to initialize the 'parameters' member.
(cherry picked from commit dc3e963)

Co-authored-by: Victor Stinner <vstinner@python.org>
vstinner added a commit that referenced this issue Apr 23, 2025
…132839)

gh-132713: Fix typing.Union[index] race condition (GH-132802)

Add union_init_parameters() helper function. Use a critical section
to initialize the 'parameters' member.
(cherry picked from commit dc3e963)

Co-authored-by: Victor Stinner <vstinner@python.org>
@devdanzin
Copy link
Contributor Author

Fusil has hit a very similar crash while running an outdated commit of main, also pointing to union repr, but this time from a classmethod repr. It makes be wonder whether the same issue we had in list repr is present for classmethod.

Here's the traceback. Should a new issue be created for this, or are the current fixes enough? I can try to get a MRE if it's interesting.

Thread 399 "Thread-398 (__r" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffecc0886c0 (LWP 2395450)]
0x0000555555b3970f in _Py_TYPE (ob=<unknown at remote 0xdddddddddddddddd>)
    at ./Include/object.h:270
270             return ob->ob_type;
(gdb) bt
#0  0x0000555555b3970f in _Py_TYPE (ob=<unknown at remote 0xdddddddddddddddd>)
    at ./Include/object.h:270
#1  union_repr (self=<unknown at remote 0x7fffb94dd750>)
    at Objects/unionobject.c:296
#2  0x0000555555a1f01a in PyObject_Repr (v=<unknown at remote 0x7fffb94dd750>)
    at Objects/object.c:776
#3  0x0000555555afbbc4 in unicode_fromformat_arg (
    writer=writer@entry=0x7ffec9b84220, f=0x555555f1b26e "R)>",
    f@entry=0x555555f1b26d "%R)>", vargs=vargs@entry=0x7ffec9b842a0)
    at Objects/unicodeobject.c:3097
#4  0x0000555555afc23d in unicode_from_format (
    writer=writer@entry=0x7ffec9b84220,
    format=format@entry=0x555555f1b260 "<classmethod(%R)>",
    vargs=vargs@entry=0x7ffec9b841a0) at Objects/unicodeobject.c:3216
#5  0x0000555555afc5ef in PyUnicode_FromFormatV (
    format=format@entry=0x555555f1b260 "<classmethod(%R)>",
    vargs=vargs@entry=0x7ffec9b841a0) at Objects/unicodeobject.c:3250
#6  0x0000555555afc78f in PyUnicode_FromFormat (
    format=format@entry=0x555555f1b260 "<classmethod(%R)>")
    at Objects/unicodeobject.c:3264
#7  0x00005555559b1a7b in cm_repr (
    self=self@entry=<classmethod at remote 0x7fffb94bd1c0>)
    at Objects/funcobject.c:1491
#8  0x0000555555a7437c in wrap_unaryfunc (
    self=<classmethod at remote 0x7fffb94bd1c0>, args=<optimized out>,
    wrapped=0x5555559b1a32 <cm_repr>) at Objects/typeobject.c:9220
#9  0x00005555559752ab in wrapperdescr_raw_call (descr=0x7fffb40a63d0,
    self=<classmethod at remote 0x7fffb94bd1c0>, args=args@entry=(),
    kwds=kwds@entry={}) at Objects/descrobject.c:532

@JelleZijlstra
Copy link
Member

Is it still reproducible on latest main? If classmethods are mutable they may need a similar fix to list.

@devdanzin
Copy link
Contributor Author

It's reproducible in latest main, just checked (will double check when I get on a computer).

@vstinner
Copy link
Member

Would you mind to open a new issue?

@devdanzin
Copy link
Contributor Author

devdanzin commented Apr 24, 2025

I reduced the test case and it depends on calling __init__ concurrently on classmethod (or staticmethod) objects, which is not supported. So I guess no new issue needed?

Sorry for the noise.

import abc
import builtins
import collections.abc
import itertools
import types
import typing
from functools import reduce
from operator import or_

abc_types = [cls for cls in abc.__dict__.values() if isinstance(cls, type)]
builtins_types = [cls for cls in builtins.__dict__.values() if isinstance(cls, type)]
collections_abc_types = [cls for cls in collections.abc.__dict__.values() if isinstance(cls, type)]
collections_types = [cls for cls in collections.__dict__.values() if isinstance(cls, type)]
itertools_types = [cls for cls in itertools.__dict__.values() if isinstance(cls, type)]
types_types = [cls for cls in types.__dict__.values() if isinstance(cls, type)]
typing_types = [cls for cls in typing.__dict__.values() if isinstance(cls, type)]
all_types = (abc_types + builtins_types + collections_abc_types + collections_types + itertools_types
             + types_types + typing_types)
all_types = [t for t in all_types if not issubclass(t, BaseException)]
BIG_UNION = reduce(or_, all_types, int)

from threading import Thread
union_cm = classmethod(int | BIG_UNION)

def stress_cm():
    for x in range(3):
        repr(union_cm)
        union_cm.__init__(str|BIG_UNION)
        repr(union_cm)
        
alive = []
for x in range(10):
    alive.append(Thread(target=stress_cm, args=()))
for t in alive:
    t.start()
for t in alive:
    t.join()

vstinner added a commit to vstinner/cpython that referenced this issue Apr 25, 2025
Hold a strong reference to the callable while calling
repr(classmethod).
@vstinner
Copy link
Member

I reduced the test case and it depends on calling init concurrently on classmethod (or staticmethod) objects, #127192 (comment). So I guess no new issue needed?

I wrote a basic fix for repr(classmethod): #132899. But this fix was not enough, the reproducer still crashed randomly. A full fix would require to harden classmethod constructor (cm_init()). It may slow down the code whereas we don't want to support calling __init__() concurrently. So I prefer to give up and ignore this crash found by fuzzing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes 3.14 new features, bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-free-threading topic-typing type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

6 participants