Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupleAndKeywords #50333

billm · 2009-05-22T08:18:55Z

BPO	6083
Nosy	@loewis, @birkenfeld, @gpshead, @abalkin, @taleinat, @serhiy-storchaka, @imz, @ananthan-123
Dependencies	bpo-20191: resource.prlimit(int, int, str) crashs
Files	python-bug-01.patch: Patch to fix the problem test-resource.py: Test for Modules/resource.c test-ctypes.py: Test for Modules/_ctypes/_ctypes.c test-functools.py: Test for Modules/_functoolsmodule.c (py3k only) issue6083.diff PyArg_ParseTuple_refcount.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2009-05-22.08:18:54.577>
labels = ['interpreter-core', 'type-crash']
title = 'Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupleAndKeywords'
updated_at = <Date 2020-02-20.07:51:55.428>
user = 'https://bugs.python.org/billm'

bugs.python.org fields:

activity = <Date 2020-02-20.07:51:55.428>
actor = 'taleinat'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation = <Date 2009-05-22.08:18:54.577>
creator = 'billm'
dependencies = ['20191']
files = ['14040', '19616', '19617', '19618', '20400', '27567']
hgrepos = []
issue_num = 6083
keywords = ['patch']
message_count = 25.0
messages = ['88181', '88204', '88215', '121285', '126226', '126234', '172912', '181297', '181316', '181327', '181592', '181600', '182108', '182110', '201454', '314264', '314268', '314273', '314285', '314286', '314287', '314289', '322381', '362299', '362302']
nosy_count = 11.0
nosy_names = ['loewis', 'georg.brandl', 'gregory.p.smith', 'belopolsky', 'taleinat', 'billm', 'abacabadabacaba', 'python-dev', 'serhiy.storchaka', 'imz', 'Ananthakrishnan']
pr_nums = []
priority = 'high'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'crash'
url = 'https://bugs.python.org/issue6083'
versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']

Linked PRs

gh-50333: Deprecate support of non-tuple sequences in PyArg_ParseTuple() #128374

billm · 2009-05-22T08:18:50Z

The code for resource_setrlimit in Modules/resource.c does not handle
reference counting properly. The following Python code segfaults for me
on Ubuntu 8.10 in Python 2.5.2 and also a custom-built 2.6.1.

--

import resource

l = [0, 0]

class MyNum:
    def __int__(self):
        l[1] = 20
        return 10

    def __del__(self):
        print 'byebye', self

l[0] = MyNum()
l[1] = MyNum()
resource.setrlimit(resource.RLIMIT_CPU, l)

--

The problem is that setrlimit gets its arguments by calling:

   PyArg_ParseTuple(args, "i(OO):setrlimit", 
                    &resource, &curobj, &maxobj)

The references curobj and maxobj are borrowed. The second argument can
be passed as a mutable list rather than a tuple, so it's possible to
update the list in the middle of setrlimit, causing maxobj to be
destroyed before setrlimit is done with it.

I've attached a patch that INCREFs both variables immediately after
parsing them to avoid this problem.

In my opinion it seems dangerous to allow format strings with the 'O'
specifier appearing in parentheses. You normally expect that objects
returned from PyArg_ParseTuple are pretty safe, but the fact that the
inner sequence may be mutable violates this assumption. Might it make
sense to ban this use case? I only found one other instance of it in the
Python source tree, inside ctypes. This one may also be a crashing
bug--I didn't look at it carefully enough.

birkenfeld · 2009-05-22T17:09:01Z

That is a good point. IMHO we'll be fine with a warning in the docs,
and fixing our own two instances. Martin, what do you think?

loewis · 2009-05-22T20:14:45Z

IMO, any refcounting bug has the potential as a security risk. So I
think we should deprecate this with a warning, and eventually remove it,
as billm proposes.

It's probably debatable whether to backport the warning to 2.6 or
earlier; I think we shouldn't, as many applications are probably valid.

abacabadabacaba · 2010-11-16T12:33:46Z

Actually, this can't be fixed without modifying C API methods PyArg_ParseTuple and PyArg_ParseTupleAndKeywords, because it's possible to make an object deallocated before PyArg_ParseTuple returns, so Py_INCREF immediately after parsing would be already too late.

Here are my test cases:
test-resource.py - in Modules/resource.c, and python-bug-01.patch won't work against it.
test-ctypes.py - in Modules/_ctypes/_ctypes.c.
test-functools.py - in Modules/_functoolsmodule.c (py3k only).

abalkin · 2011-01-14T04:55:18Z

Let me summarize the issue: the PyArg_ParseTuple format code 'O' returns a borrowed reference. However, when the 'O' code appears inside parenthesis, there may not be an object to hold the reference to borrow from. This is what happens in the test-functools.py crasher: partial.__setstate__() takes a 4-tuple argument that is unpacked using a "(OOOO)" format. The test case passes an instance instead of a tuple that supports the sequence methods, but does not hold the reference to the "items" that its []-operator returns. This is not a problem at the top level because args argument to PyArg_ParseTuple is always a real tuple.

I think that rather than deprecating the use of 'O' format inside parentheses, "(..O..)" unpacking should reject to unpack arguments other than tuples or maybe lists.

abalkin · 2011-01-14T06:42:07Z

Attached patch passes the regrtest and makes test-functools.py raise an exception rather than crash. The proposed change will make functions like partial.__setstate__ require tuple argument even though currently it would accept any container. This is not an issue with __setstate__ because it should only be called with arguments produced by __reduce__ and in the case of partial, __reduce__ produces state as a tuple. Other functions may need to be modified if they need to continue to accept arbitrary sequences.

serhiy-storchaka · 2012-10-14T20:18:26Z

Here is a patch which get rid of all three PyArg_ParseTuple usage with parsing nested sequences. Thanks Evgeny for reproducers.

gpshead · 2013-02-03T20:31:35Z

Serhiy's patch looks good to me.

python-dev · 2013-02-04T11:05:36Z

New changeset a4c85f9b8f58 by Serhiy Storchaka in branch '2.7':
Issue bpo-6083: Fix multiple segmentation faults occured when PyArg_ParseTuple
http://hg.python.org/cpython/rev/a4c85f9b8f58

New changeset 4bac47eb444c by Serhiy Storchaka in branch '3.2':
Issue bpo-6083: Fix multiple segmentation faults occured when PyArg_ParseTuple
http://hg.python.org/cpython/rev/4bac47eb444c

New changeset e0ee10f27e5f by Serhiy Storchaka in branch '3.3':
Issue bpo-6083: Fix multiple segmentation faults occured when PyArg_ParseTuple
http://hg.python.org/cpython/rev/e0ee10f27e5f

New changeset 3e3a7d825736 by Serhiy Storchaka in branch 'default':
Issue bpo-6083: Fix multiple segmentation faults occured when PyArg_ParseTuple
http://hg.python.org/cpython/rev/3e3a7d825736

serhiy-storchaka · 2013-02-04T13:33:24Z

I do not have possibility and desires blind-repair a test on alien platform, so just temporarily disable a new test in Lib/ctypes/test/test_returnfuncptrs.py on Windows. If someone has a desire to fix it fell free to do this.

I do not close this issue because committed patch only fix existing crashes in Python. There should be plenty of such bugs in third-party code. We have to deprecate this unsafe feature or reject any sequences except tuple as Alexander proposed.

skrah · 2013-02-07T11:41:03Z

The FreeBSD 6.4 bot is failing, too. Note that the other functions
in test_returnfuncptrs.py do this in order to get strchr():

    dll = CDLL(_ctypes_test.__file__)
    get_strchr = dll.get_strchr
    get_strchr.restype = CFUNCTYPE(c_char_p, c_char_p, c_char)
    strchr = get_strchr()

serhiy-storchaka · 2013-02-07T13:05:08Z

There are 6 different ways to get a function (see comment around PyCFuncPtr_new() in Modules/_ctypes/_ctypes.c). The other tests just use other ways.

I'm more carefully read ctype code and found my mistake. Need to import "my_strchr", and not "strchr".

serhiy-storchaka · 2013-02-14T16:11:02Z

FreeBSD 6.4 and Windows test failures was fixed in changesets 8fb98fb758e8 and ec70abe8c886.

serhiy-storchaka · 2013-02-14T16:16:05Z

Oh, I shouldn't close this until this dangerous feature will be deprecated.

serhiy-storchaka · 2013-10-27T12:49:10Z

Accepting an arbitrary sequence when "(...)" is used in the format string was introduced in changeset 0ef1071cb7fe.

imz · 2018-03-22T14:54:50Z

New changeset a4c85f9b8f58 by Serhiy Storchaka in branch '2.7':
Issue bpo-6083: Fix multiple segmentation faults occured when PyArg_ParseTuple
http://hg.python.org/cpython/rev/a4c85f9b8f58

This test has a problem: though it tests not the ability to set a CPU hard limit, it fails if the hard limit is limited. Perhaps, ignore any exception there? Could you please help me re-write it correctly, so that I can run it on gyle--ALT's builder host--successfully):

    # Issue 6083: Reference counting bug
    def test_setrusage_refcount(self):
        try:
            limits = resource.getrlimit(resource.RLIMIT_CPU)
        except AttributeError:
            self.skipTest('RLIMIT_CPU not available')
        class BadSequence:
            def __len__(self):
                return 2
            def __getitem__(self, key):
                if key in (0, 1):
                    return len(tuple(range(1000000)))
                raise IndexError

        resource.setrlimit(resource.RLIMIT_CPU, BadSequence())

The failure:

[builder@team ~]$ python /usr/lib64/python2.7/test/test_resource.py
test_args (main.ResourceTest) ... ok
test_fsize_enforced (main.ResourceTest) ... ok
test_fsize_ismax (main.ResourceTest) ... ok
test_fsize_toobig (main.ResourceTest) ... ok
test_getrusage (main.ResourceTest) ... ok
test_setrusage_refcount (main.ResourceTest) ... ERROR

======================================================================
ERROR: test_setrusage_refcount (main.ResourceTest)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/usr/lib64/python2.7/test/test_resource.py", line 117, in test_setrusage_refcount
    resource.setrlimit(resource.RLIMIT_CPU, BadSequence())
ValueError: not allowed to raise maximum limit

Ran 6 tests in 0.085s

FAILED (errors=1)
Traceback (most recent call last):
  File "/usr/lib64/python2.7/test/test_resource.py", line 123, in <module>
    test_main()
  File "/usr/lib64/python2.7/test/test_resource.py", line 120, in test_main
    test_support.run_unittest(ResourceTest)
  File "/usr/lib64/python2.7/test/support/__init__.py", line 1577, in run_unittest
    _run_suite(suite)
  File "/usr/lib64/python2.7/test/support/__init__.py", line 1542, in _run_suite
    raise TestFailed(err)
test.support.TestFailed: Traceback (most recent call last):
  File "/usr/lib64/python2.7/test/test_resource.py", line 117, in test_setrusage_refcount
    resource.setrlimit(resource.RLIMIT_CPU, BadSequence())
ValueError: not allowed to raise maximum limit

[builder@team ~]$

serhiy-storchaka · 2018-03-22T15:59:24Z

What does resource.getrlimit(resource.RLIMIT_CPU) return?

imz · 2018-03-22T16:59:09Z

>>> import resource
>>> resource.getrlimit(resource.RLIMIT_CPU) 
(7200, 7260)

serhiy-storchaka · 2018-03-22T18:39:18Z

The simplest way is to try passing the limit as a tuple

    resource.setrlimit(resource.RLIMIT_CPU, (1000000, 1000000))

and skip the test if it failed.

imz · 2018-03-22T18:47:34Z

Thanks! I also thought about this simplest way. What about this:

diff --git a/Python/Lib/test/test_resource.py b/Python/Lib/test/test_resource.py
index de29d3b..bec4440 100644
--- a/Python/Lib/test/test_resource.py
+++ b/Python/Lib/test/test_resource.py
@@ -102,16 +102,21 @@ class ResourceTest(unittest.TestCase):
 
     # Issue 6083: Reference counting bug
     def test_setrusage_refcount(self):
+        howmany = 1000000
         try:
             limits = resource.getrlimit(resource.RLIMIT_CPU)
         except AttributeError:
             self.skipTest('RLIMIT_CPU not available')
+        try:
+            resource.setrlimit(resource.RLIMIT_CPU, (howmany, howmany))
+        except _:
+            self.skipTest('Setting RLIMIT_CPU not possible')
         class BadSequence:
             def __len__(self):
                 return 2
             def __getitem__(self, key):
                 if key in (0, 1):
-                    return len(tuple(range(1000000)))
+                    return len(tuple(range(howmany)))
                 raise IndexError
 
         resource.setrlimit(resource.RLIMIT_CPU, BadSequence())

What should I write instead of _?

imz · 2018-03-22T18:48:20Z

And will the next call be effective (do anything), if we have already set the limit with the testing call?

serhiy-storchaka · 2018-03-22T20:23:31Z

LGTM.

What should I write instead of _?

(ValueError, OSError)

And will the next call be effective (do anything), if we have already set the limit with the testing call?

This doesn't matter. We test that it doesn't crash when parse arguments.

taleinat · 2018-07-25T18:16:01Z

Ivan, can you supply a PR or would you like someone else to do so?

ananthan-123 · 2020-02-20T07:20:49Z

I want to do a PR,if this is still needeed.

taleinat · 2020-02-20T07:51:55Z

Please do, Ananthakrishnan!

arhadthedev · 2023-04-29T19:01:11Z

No longer relevant because resource.setrlimit is now ported to Argument Clinic.

serhiy-storchaka · 2024-12-31T08:36:23Z

While the issue was fixed for resource.setrlimit and others (before it was ported to Argument Clinic), the root issue has not yet been fixed.

…seTuple() Non-tuple sequences are deprecated as argument for the "(items)" format unit in PyArg_ParseTuple() and other argument parsing functions if items contains format units which store borrowed buffer or reference (e.g. "s" and "O"). str and bytearray are no longer accepted as valid sequences.

serhiy-storchaka · 2024-12-31T13:49:30Z

#128374 implements a more lenient variant of the initial deprecation plan. Non-tuple sequences are only deprecated if the nested format units store borrowed buffer or reference (e.g. "s" and "O"). If "(items)" only contains format units like "i" or "d" (or "s*"), general sequences are still accepted.

billm mannequin added extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump labels May 22, 2009

birkenfeld assigned loewis May 22, 2009

loewis mannequin removed their assignment May 22, 2009

abacabadabacaba mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed extension-modules C modules in the Modules dir labels Nov 16, 2010

abacabadabacaba mannequin changed the title ~~Reference counting bug in setrlimit~~ Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupleAndKeywords Nov 16, 2010

serhiy-storchaka self-assigned this Dec 29, 2012

serhiy-storchaka removed their assignment Feb 4, 2013

serhiy-storchaka closed this as completed Feb 14, 2013

serhiy-storchaka self-assigned this Feb 14, 2013

serhiy-storchaka reopened this Feb 14, 2013

serhiy-storchaka removed their assignment Feb 14, 2013

ezio-melotti transferred this issue from another repository Apr 10, 2022

arhadthedev closed this as not planned Won't fix, can't repro, duplicate, stale Apr 29, 2023

serhiy-storchaka reopened this Dec 31, 2024

bedevere-app bot mentioned this issue Dec 31, 2024

gh-50333: Deprecate support of non-tuple sequences in PyArg_ParseTuple() #128374

Open

serhiy-storchaka mentioned this issue Dec 31, 2024

Deprecating support for nested non-tuple sequences in PyArg_ParseTuple() capi-workgroup/decisions#52

Open

6 tasks

erlend-aasland added the topic-C-API label Jan 2, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupleAndKeywords #50333

Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupleAndKeywords #50333

billm mannequin commented May 22, 2009 •

edited by bedevere-app bot

Loading

billm mannequin commented May 22, 2009 •

edited by arhadthedev

Loading

birkenfeld commented May 22, 2009

loewis mannequin commented May 22, 2009

abacabadabacaba mannequin commented Nov 16, 2010

abalkin commented Jan 14, 2011

abalkin commented Jan 14, 2011

serhiy-storchaka commented Oct 14, 2012

gpshead commented Feb 3, 2013

python-dev mannequin commented Feb 4, 2013

serhiy-storchaka commented Feb 4, 2013

skrah mannequin commented Feb 7, 2013

serhiy-storchaka commented Feb 7, 2013

serhiy-storchaka commented Feb 14, 2013

serhiy-storchaka commented Feb 14, 2013

serhiy-storchaka commented Oct 27, 2013

imz mannequin commented Mar 22, 2018

serhiy-storchaka commented Mar 22, 2018

imz mannequin commented Mar 22, 2018

serhiy-storchaka commented Mar 22, 2018

imz mannequin commented Mar 22, 2018

imz mannequin commented Mar 22, 2018

serhiy-storchaka commented Mar 22, 2018

taleinat commented Jul 25, 2018

ananthan-123 mannequin commented Feb 20, 2020

taleinat commented Feb 20, 2020

arhadthedev commented Apr 29, 2023

serhiy-storchaka commented Dec 31, 2024

serhiy-storchaka commented Dec 31, 2024

Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupleAndKeywords #50333

Reference counting bug in PyArg_ParseTuple and PyArg_ParseTupleAndKeywords #50333

Comments

billm mannequin commented May 22, 2009 • edited by bedevere-app bot Loading

Linked PRs

billm mannequin commented May 22, 2009 • edited by arhadthedev Loading

birkenfeld commented May 22, 2009

loewis mannequin commented May 22, 2009

abacabadabacaba mannequin commented Nov 16, 2010

abalkin commented Jan 14, 2011

abalkin commented Jan 14, 2011

serhiy-storchaka commented Oct 14, 2012

gpshead commented Feb 3, 2013

python-dev mannequin commented Feb 4, 2013

serhiy-storchaka commented Feb 4, 2013

skrah mannequin commented Feb 7, 2013

serhiy-storchaka commented Feb 7, 2013

serhiy-storchaka commented Feb 14, 2013

serhiy-storchaka commented Feb 14, 2013

serhiy-storchaka commented Oct 27, 2013

imz mannequin commented Mar 22, 2018

serhiy-storchaka commented Mar 22, 2018

imz mannequin commented Mar 22, 2018

serhiy-storchaka commented Mar 22, 2018

imz mannequin commented Mar 22, 2018

imz mannequin commented Mar 22, 2018

serhiy-storchaka commented Mar 22, 2018

taleinat commented Jul 25, 2018

ananthan-123 mannequin commented Feb 20, 2020

taleinat commented Feb 20, 2020

arhadthedev commented Apr 29, 2023

serhiy-storchaka commented Dec 31, 2024

serhiy-storchaka commented Dec 31, 2024

billm mannequin commented May 22, 2009 •

edited by bedevere-app bot

Loading

billm mannequin commented May 22, 2009 •

edited by arhadthedev

Loading