-
-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CPython 3.12 embedded in WeeChat causes segfault on subsequent calls to Py_EndInterpreter #116510
Comments
The same thing happens in Kodi in various situations (e.g. xbmc/xbmc#24440 and reports on https://forum.kodi.tv/), the stack trace is basically the same: ASAN output
GDB (with debug symbols)
Are there plans to address this in the 3.12 release cycle? |
I tried unsuccessfully backporting 7a7bce5 as threading tests failed. If someone could attach a 3.12 backport patch I could test Kodi. |
I have a backport of the patch mostly done. I can finish it and then you can test. I think backporting this change would be a good idea given that it seems to work without issue in 3.13 and would solve a few problems for users of Python 3.12. |
ping @Yhg1s |
Hi, |
I can investigate but I need some direction on how to compile weechat with a specific version of Python. E.g. if I have Python installed with prefix /usr/local/python-3.12.4, how do I build weechat that uses it? I'm not familiar with CMake and my attempts at making it use that Python failed. It uses /usr/bin/python3.11 from my OS. |
@nascheme: you must compile Python with
|
I managed to get weechat running with the Python plugin today and did some debugging. The problem occurs during the teardown of the Python interpreter (within
The gc_next and gc_prev pointers on that object (a module method for the "weechat" module) are not valid. Using
The type object being deallocated doesn't seem to be associated with weechat. It is https://gist.github.com/nascheme/e807f03fd15312bbae52595f21ad0957 The idea is to break the reference cycles in the This is a band-aid and not fixing the real problem. More investigation would be needed to find it. It could be a bug in Python or perhaps a reference counting bug in weechat. |
3.12 is going to go security-only relatively soon (sometime in the next few months, according to PEP 693), and this is still causing many problems with subinterpreters (namely, things like lists are completely broken under multithreaded isolated interpreters). If we don't backport the fix, 3.12 will remain with this problem forever. Re-pinging @Yhg1s |
I did a lot of debugging on this crash and here's what I found.
It seems this bug can be avoided by setting According to some discussion with Eric Snow, if |
Fix a bug that can cause a crash when sub-interpreters use "basic" single-phase extension modules. Shared objects could refer to PyGC_Head nodes that had been freed as part of interpreter shutdown.
Fix a crash caused by immortal interned strings being shared between sub-interpreters that use basic single-phase init. In that case, the string can be used by an interpreter that outlives the interpeter that created and interned it. For interpreters that share obmalloc state, also share the interned dict with the main interpreter.
FTR, this case of single-phase init modules with |
gh-124541) Fix a crash caused by immortal interned strings being shared between sub-interpreters that use basic single-phase init. In that case, the string can be used by an interpreter that outlives the interpreter that created and interned it. For interpreters that share obmalloc state, also share the interned dict with the main interpreter.
Fix a bug that can cause a crash when sub-interpreters use "basic" single-phase extension modules. Shared objects could refer to PyGC_Head nodes that had been freed as part of interpreter shutdown.
…ythongh-124646) (cherry picked from commit 98b2ed7e239c807f379cd2bf864f372d79064aac) Co-authored-by: Neil Schemenauer <[email protected]>
Fix a bug that can cause a crash when sub-interpreters use "basic" single-phase extension modules. Shared objects could refer to PyGC_Head nodes that had been freed as part of interpreter shutdown.
…124645) Fix a bug that can cause a crash when sub-interpreters use "basic" single-phase extension modules. Shared objects could refer to PyGC_Head nodes that had been freed as part of interpreter shutdown. (cherry picked from commit 6f9525d) Co-authored-by: Neil Schemenauer <[email protected]>
… (#124649) gh-116510: Fix crash during sub-interpreter shutdown (gh-124645) Fix a bug that can cause a crash when sub-interpreters use "basic" single-phase extension modules. Shared objects could refer to PyGC_Head nodes that had been freed as part of interpreter shutdown. (cherry picked from commit 6f9525d) Co-authored-by: Neil Schemenauer <[email protected]>
Crash report
What happened?
WeeChat embeds CPython in order to run Python scripts inside WeeChat. It can load multiple scripts and they each get their own interpreter. When a script is loaded
Py_NewInterpreter
is called, and when it's unloadedPy_EndInterpreter
is called.With CPython 3.12 loading two scripts and then unloading them in the same order causes a segmentation fault. Interestingly, the segmentation fault doesn't happen if the script that was loaded last is unloaded first.
I bisected this and found it was introduced in commit de64e75. I also noticed that the crash doesn't occur in the main branch, and did another bisect and found it was fixed in commit 7a7bce5.
This issue seems similar to the one reported in #115649 which is also introduced by the same commit, but that one still crashes on the main branch (commit 735fc2c).
I haven't been able to reproduce this outside of WeeChat unfortunately, but here is a backtrace from the crash with WeeChat, with commit de64e75 of CPython and commit ec56a1103f47b15a641ff93528fd6f50025dd524 of WeeChat.
This was produced by creating these two python scripts:
dummy1.py
:dummy2.py
And then running
weechat -t -r '/script load dummy1.py; /script load dummy2.py; /quit'
.Also, here is the issue report for WeeChat: weechat/weechat#2046
Since it's fixed in main it seems there won't be a problem with 3.13, but I wonder if the fix can be backported to 3.12?
CPython versions tested on:
3.12, CPython main branch
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
Python 3.12.0a7+ (tags/v3.12.0a7-340-gde64e75616:de64e75616, Mar 8 2024, 19:43:39) [GCC 13.2.1 20230801]
Linked PRs
The text was updated successfully, but these errors were encountered: