The Wayback Machine - https://web.archive.org/web/20240106171254/https://github.com/python/cpython/issues/111798
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_super_deep() for test_call triggers a stack overflow protection under a debug build for WASI #111798

Open
brettcannon opened this issue Nov 6, 2023 · 24 comments
Labels
tests Tests in the Lib/test dir

Comments

@brettcannon
Copy link
Member

brettcannon commented Nov 6, 2023

Tested using WASI-SDK 20 and wasmtime 14.

Linked PRs

@brettcannon brettcannon added topic-WebAssembly tests Tests in the Lib/test dir and removed tests Tests in the Lib/test dir labels Nov 6, 2023
@brettcannon
Copy link
Member Author

cpython/Lib/test/test_call.py

Lines 992 to 1026 in 804575b

@skip_on_s390x
def test_super_deep(self):
def recurse(n):
if n:
recurse(n-1)
def py_recurse(n, m):
if n:
py_recurse(n-1, m)
else:
c_py_recurse(m-1)
def c_recurse(n):
if n:
_testcapi.pyobject_vectorcall(c_recurse, (n-1,), ())
def c_py_recurse(m):
if m:
_testcapi.pyobject_vectorcall(py_recurse, (1000, m), ())
depth = sys.getrecursionlimit()
sys.setrecursionlimit(100_000)
try:
recurse(90_000)
with self.assertRaises(RecursionError):
recurse(101_000)
c_recurse(100)
with self.assertRaises(RecursionError):
c_recurse(90_000)
c_py_recurse(90)
with self.assertRaises(RecursionError):
c_py_recurse(100_000)
finally:
sys.setrecursionlimit(depth)

Looks like the C stack blows up past a depth of 85.

@brettcannon
Copy link
Member Author

@markshannon what's the best way to update this test so it passes with a more limited stack size? Should I update Py_C_RECURSION_LIMIT in Include/cpython/pystate.h to something like 75 when under a debug build for WASI based on the "85" value I found to work?

@brettcannon
Copy link
Member Author

I can also simply skip it much like on x360. I tried setting Py_C_RECURSION_LIMIT to pass various tests that were exhsusting their stacks (e.g., #111803 and #111802), but the limit was getting ridiculously low, to the point that test_pickle was failing on non-deep tests.

@vstinner
Copy link
Member

I also saw this issue on Linux (Fedora). But I failed to reproduce it in a reliable way.

cc @hroncok

@vstinner
Copy link
Member

When Python is built in debug mode, -Og or event -O0 optimization level is used. It's possible that Python consumes more stack memory and so crash earlier in test_super_deep(): with less deep recursive call.

We already had issues with other tests which rely on how much stack memory is available: https://pythondev.readthedocs.io/unstable_tests.html#unlimited-recursion

Well, I suppose that here the purpose of the test is to prove that Python no longer consumes any stack memory. But maybe this assumption is wrong when Python is built in debug mode?

I propose to reduce the maximum depth is test.support.Py_DEBUG is true. How much? I don't know.

@vstinner
Copy link
Member

I can reproduce the issue by building Python with gcc -O0 (disable all compiler optimizations):

./configure --cache-file=../python-config.cache --with-pydebug CFLAGS=-O0
make
./python -m test -v test_call -m test_super_deep 

Output:

vstinner@mona$ ./python -m test -v test_call -m test_super_deep 
== CPython 3.13.0a1+ (heads/main:e5dfcc2b6e1, Nov 15 2023, 19:17:40) [GCC 13.2.1 20231011 (Red Hat 13.2.1-4)]
== Linux-6.5.11-300.fc39.x86_64-x86_64-with-glibc2.38 little-endian
== Python build: debug
== cwd: /home/vstinner/python/main/build/test_python_worker_674715æ
== CPU count: 12
== encodings: locale=UTF-8 FS=utf-8
== resources: all test resources are disabled, use -u option to unskip tests

Using random seed: 1935379936
0:00:00 load avg: 1.79 Run 1 test sequentially
0:00:00 load avg: 1.79 [1/1] test_call
test_super_deep (test.test_call.TestRecursion.test_super_deep) ... Fatal Python error: Segmentation fault

Current thread 0x00007fcd6b199740 (most recent call first):
  File "/home/vstinner/python/main/Lib/test/test_call.py", line 1007 in c_recurse
  File "/home/vstinner/python/main/Lib/test/test_call.py", line 1007 in c_recurse
  File "/home/vstinner/python/main/Lib/test/test_call.py", line 1007 in c_recurse
  File "/home/vstinner/python/main/Lib/test/test_call.py", line 1007 in c_recurse
  File "/home/vstinner/python/main/Lib/test/test_call.py", line 1007 in c_recurse
  (...)

I tested commit e5dfcc2b6e1c8450e47acbc6e7d3fd3380ef0f03.

@vstinner
Copy link
Member

Using gcc -O1, I cannot reproduce the issue: so it seems like Python uses more and less stack memory depending on the compiler optimization level.

@vstinner
Copy link
Member

Measure with c_limit=90 py_limit=1,000 recursion_limit=100,000 on Fedora 39 (gcc 13.2.1 on x86-64):

  • gcc -O0 + --with-pydebug: c_py_recurse: 13 618.7 bytes/call
  • gcc -O1 + --with-pydebug: c_py_recurse: 689.8 bytes/call
  • gcc -Og + --with-pydebug: c_py_recurse: 705.6 bytes/call
  • gcc -O3 + --with-pydebug: c_py_recurse: 609.6 bytes/call
  • gcc -O0: c_py_recurse: 24 690.7 bytes/call
  • gcc -Og: c_py_recurse: 577.1 bytes/call
  • gcc -O3: c_py_recurse: 464.9 bytes/call

I suppose that the fact that -O0 disable function inlining has a high cost on stack memory consumption.

Script to measure c_py_recurse() stack memory consumption:

import _testcapi
import sys

sys.setrecursionlimit(100_000)
py_limit = 1000
c_limit = 90

lowest = None

def py_recurse(n, m):
    global lowest
    sp = _testcapi.stack_pointer()
    if lowest is None or sp < lowest:
        lowest = sp

    if n >= 1:
        py_recurse(n-1, m)
    else:
        c_py_recurse(m-1)


def c_py_recurse(m):
    global lowest
    sp = _testcapi.stack_pointer()
    if lowest is None or sp < lowest:
        lowest = sp

    if m >= 1:
        _testcapi.pyobject_vectorcall(py_recurse, (py_limit, m), ())


def measure(func, depth):
    global lowest
    lowest = None

    high = _testcapi.stack_pointer()
    func(depth)
    low = lowest
    per_call = (high - low) / depth
    print(f"{func.__name__}: {per_call:.1f} bytes/call "
          f"-- measured with {c_limit=:,} {py_limit=:,} "
          f"recursion_limit={sys.getrecursionlimit():,}")

measure(c_py_recurse, c_limit)

@vstinner
Copy link
Member

Latest major change related to recursion:

This change added test_super_deep() and skips it on s390x architecture.

vstinner added a commit to vstinner/cpython that referenced this issue Nov 15, 2023
Reenable test_call.test_super_deep() on s390x architecture.
vstinner added a commit to vstinner/cpython that referenced this issue Nov 15, 2023
Reenable test_call.test_super_deep() on the s390x architecture.
vstinner added a commit to vstinner/cpython that referenced this issue Nov 15, 2023
Reenable test_call.test_super_deep() on the s390x architecture.
@brettcannon
Copy link
Member Author

I'm personally tempted to skip this on debug builds under WASI since 85 is not a reasonable stack depth for Py_C_RECURSION_LIMIT when it's 500 otherwise. But maybe test.support.infinite_recursion(25) would also work? Issue is I don't know if the test is more about sys.getrecursionlimit() and how that interacts with C Code or more about C stack consumption.

@hroncok
Copy link
Contributor

hroncok commented Nov 15, 2023

I submitted 5 debug builds and 5 normal builds to the Fedora builds system.

All normal builds succeeded on each architecture.

All debug builds failed on x86_64/aarch64/ppc64le and succeeded on i686/s390x.

All failures were test_super_deep (test.test_call.TestRecursion.test_super_deep) ... Fatal Python error: Segmentation fault

@brettcannon
Copy link
Member Author

@markshannon could you shed some light as to the purpose of the failing test so we can update it appropriately? This no longer seems to be a WebAssembly issue.

@brettcannon
Copy link
Member Author

brettcannon commented Nov 15, 2023

Maybe we should skip the test on all debug builds?

@hroncok
Copy link
Contributor

hroncok commented Nov 15, 2023

Did you mean on the debug builds?

@vstinner
Copy link
Member

Maybe we should skip the test on all non-debug builds?

My PR #112124 does the opposite: it enables again test_call.test_super_deep() on the s390x architecture and it enables again test_ast_recursion_limit() on WASI platform. I'm now running buildbots on my change to see how it goes.

I prefer to reduce limits rather than skipping tests on debug builds. Most of our CIs are running tests on Python built in debug mode. Also, if a test crash on a debug build, it also means that end users can be affected by the issue. My change should fix the crash on deep recursive C function calls.

@brettcannon
Copy link
Member Author

Did you mean on the debug builds?

Yes. 😅 Fixed my comment.

@vstinner
Copy link
Member

@markshannon what's the best way to update this test so it passes with a more limited stack size? Should I update Py_C_RECURSION_LIMIT in Include/cpython/pystate.h to something like 75 when under a debug build for WASI based on the "85" value I found to work?

My PR adds _testinternalcapi.get_c_recursion_remaining() function and uses it in some tests to use lower limits.

vstinner added a commit to vstinner/cpython that referenced this issue Nov 16, 2023
* Run again test_ast_recursion_limit() on WASI platform.
* Add _testinternalcapi.get_c_recursion_remaining().
* Fix test_ast and test_sys_settrace: test_ast_recursion_limit() and
  test_trace_unpack_long_sequence() now adjust the maximum recursion
  depth depending on the the remaining C recursion.
vstinner added a commit that referenced this issue Nov 16, 2023
* Run again test_ast_recursion_limit() on WASI platform.
* Add _testinternalcapi.get_c_recursion_remaining().
* Fix test_ast and test_sys_settrace: test_ast_recursion_limit() and
  test_trace_unpack_long_sequence() now adjust the maximum recursion
  depth depending on the the remaining C recursion.
@vstinner
Copy link
Member

On Fedora, I cannot reproduce the test_call crash on Python 3.12 when I build Python with ./configure CLFAGS=-O0 nor ./configure CLFAGS=-O0 --with-pydebug. So for now, I don't plan to backport my change bd89bca

@vstinner
Copy link
Member

@brettcannon: How do you build Python for WASI in debug mode?

I made the following quick & dirty changes:

diff --git a/Tools/wasm/wasm_assets.py b/Tools/wasm/wasm_assets.py
index ffa5e303412..f198782f48f 100755
--- a/Tools/wasm/wasm_assets.py
+++ b/Tools/wasm/wasm_assets.py
@@ -99,6 +99,7 @@
     "_sysconfigdata__emscripten_wasm32-emscripten",
     "_sysconfigdata__wasi_wasm32-wasi",
     "_sysconfigdata__wasi_wasm64-wasi",
+    "_sysconfigdata_d_wasi_wasm32-wasi",
 )
 
 
diff --git a/Tools/wasm/wasm_build.py b/Tools/wasm/wasm_build.py
index c0b9999a5da..1cb7b510597 100755
--- a/Tools/wasm/wasm_build.py
+++ b/Tools/wasm/wasm_build.py
@@ -329,7 +329,7 @@ def _check_wasi() -> None:
         # workaround for https://github.com/python/cpython/issues/95952
         "HOSTRUNNER": (
             "wasmtime run "
-            "--env PYTHONPATH=/{relbuilddir}/build/lib.wasi-wasm32-{version}:/Lib "
+            "--env PYTHONPATH=/{relbuilddir}/build/lib.wasi-wasm32-{version}-pydebug:/Lib "
             "--mapdir /::{srcdir} --"
         ),
         "PATH": [WASI_SDK_PATH / "bin", os.environ["PATH"]],
@@ -475,7 +475,7 @@ def configure_cmd(self) -> List[str]:
         # use relative path, so WASI tests can find lib prefix.
         # pathlib.Path.relative_to() does not work here.
         configure = os.path.relpath(CONFIGURE, self.builddir)
-        cmd = [configure, "-C"]
+        cmd = [configure, "-C", "--with-pydebug"]
         platform = self.host.platform
         if platform.configure_wrapper:
             cmd.insert(0, os.fspath(platform.configure_wrapper))

I'm using ./Tools/wasm/wasm_build.py wasi build command in a podman run --rm -ti -v $(pwd):/python-wasm/cpython:Z -w /python-wasm/cpython quay.io/tiran/cpythonbuild:emsdk3 container.

On the current main branch (at commit bd89bca9e2a57779c251ee6fadf4887acb364824), the test_call test runs successfully:

root@eb26f779f55c:/python-wasm/cpython/builddir/wasi# make test TESTOPTS="test_call --header"
== CPython 3.13.0a1+ (heads/main-dirty:bd89bca9e2a, Nov 16 2023, 15:22:22) [GCC 9.4.0]
== Linux-6.5.11-300.fc39.x86_64-x86_64-with-glibc2.31 little-endian
== Python build: debug
== cwd: /python-wasm/cpython/builddir/wasi/build/test_python_worker_22601æ
== CPU count: 12
== encodings: locale=UTF-8 FS=utf-8
== resources: all,-cpu
== cross compiled: Yes
== host python: wasmtime run --env PYTHONPATH=/builddir/wasi/build/lib.wasi-wasm32-3.13-pydebug:/Lib --mapdir /::/python-wasm/cpython -- python.wasm
== host platform: wasi-0.0.0-wasm32-32bit

Using random seed: 264547689
0:00:00 load avg: 0.34 Run 1 test in parallel using 1 worker process (timeout: 10 min, worker timeout: 15 min)
0:00:01 load avg: 0.34 [1/1] test_call passed

== Tests result: SUCCESS ==

10 slowest tests:
- test_call: 1.2 sec

1 test OK.

Total duration: 1.5 sec
Total tests: run=183
Total test files: run=1/1
Result: SUCCESS

@brettcannon
Copy link
Member Author

How do you build Python for WASI in debug mode?

I'm developing https://github.com/brettcannon/cpython/blob/wasi.py/Tools/wasm/wasi.py to make this all easier, so that's how I have been doing this personally.

I made the following quick & dirty changes

That works and essentially what my script does.

On the current main branch (at commit bd89bca9e2a57779c251ee6fadf4887acb364824), the test_call test runs successfully

I'm still seeing a failure under WASI SDK 20 and wasmtime 14 after merging up to 762eb58 .

@vstinner
Copy link
Member

Alright, i'm not surprised to see a different behavior on this specific recursion stress test depending on the compiler version and flags. I'm always using the same container, i don't know which versions are used inside.

@markshannon
Copy link
Member

could you shed some light as to the purpose of the failing test so we can update it appropriately?

It is to test that the interpreter doesn't crash when the recursion limit is set to some silly number.
Rather than disable the test we should probably set the C recursion limit to different values for different builds and platforms.

@vstinner
Copy link
Member

I'm still seeing a failure under WASI SDK 20 and wasmtime 14 after merging up to 762eb58 .

Tools/wasm/README.md says:

WASI builds require the WASI SDK 16.0+

I'm only using quay.io/tiran/cpythonbuild:emsdk3 container to build Python on WASI. I have no idea what is the WASI SDK version of this container. It would be nice to document supported WASI SDK versions and wasmtime versions.

So the problem is that the test_call fails if Python is built in debug mode with WASI SDK 20 and wasmtime 14?

@brettcannon
Copy link
Member Author

I'm only using quay.io/tiran/cpythonbuild:emsdk3 container to build Python on WASI. I have no idea what is the WASI SDK version of this container.

Neither do I, hence why we install the requisite tools into the devcontainer.

It would be nice to document supported WASI SDK versions and wasmtime versions.

I'm working on it (and it's technically WASI-SDK 16+ and any version of wasmtime that fully implements WASI preview1, but in general just some recent version).

So the problem is that the test_call fails if Python is built in debug mode with WASI SDK 20 and wasmtime 14?

Correct as that's how I discovered the failure, but I don't know how strict those versions are in terms of causing the issue. But since those are the common/preferred tools for working on WASI and (nearly) the latest versions it's at least worth my time to try and make sure they work (and it's what the WASI buildbots use today, so fixing this means I can make them do pydebug builds).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests Tests in the Lib/test dir
Projects
Status: Todo
Development

No branches or pull requests

4 participants