oss-sec mailing list archives

Re: CPython: CVE-2024-8088: Infinite loop when iterating over zip archive entry names


From: Fay Stegerman <flx () obfusk net>
Date: Fri, 23 Aug 2024 17:38:16 +0200

* Fay Stegerman <flx () obfusk net> [2024-08-22 23:12]:
* Alan Coopersmith <alan.coopersmith () oracle com> [2024-08-22 20:56]:
-------- Forwarded Message --------
Subject:    [Security-announce][CVE-2024-8088] Infinite loop when iterating
over zip archive entry names
Date:       Thu, 22 Aug 2024 13:40:20 -0500
From:       Seth Larson <seth () python org>
Reply-To:   security-sig () python org
To:         security-announce () python org

There is a HIGH severity vulnerability affecting the CPython "zipfile" module.

When iterating over names of entries in a zip archive (for example, methods
of "zipfile.ZipFile" like "namelist()", "iterdir()", "extractall()", etc)
the process can be put into an infinite loop with a maliciously crafted
zip archive. This defect applies when reading only metadata or extracting
the contents of the zip archive. Programs that are not handling
user-controlled zip archives are not affected.

Please see the linked CVE ID for the latest information on affected versions:

* https://www.cve.org/CVERecord?id=CVE-2024-8088
* https://github.com/python/cpython/pull/122906
* https://github.com/python/cpython/issues/122905

A small correction/addendum based on reading the vulnerability report and the PR
that fixes this (as well as being quite familiar with Python zipfile.ZipFile
internals and confused how this would affect it): it's not zipfile.ZipFile and
its methods that are affected, at least not directly, but zipfile.Path.  The
issue being this code in zipfile._path._ancestry():

  path = path.rstrip(posixpath.sep)
  while path and path != posixpath.sep:
      yield path
      path, tail = posixpath.split(path)

Which results in an infinite loop because for example posixpath.split("//") ==
("//", "") but "//" != posixpath.sep:

  >>> it = zipfile._path._parents("//foo")
  >>> next(it)
  '//'
  >>> next(it)
  '//'
  >>> next(it)
  '//'

The infinite loop has been fixed by sanitising the paths.

Forgot to mention this: the infinite loop is triggered when zipfile.Path adds
"implied directories" -- using _parents(), which calls _ancestry() -- in the
overridden .namelist() for the custom zipfile.ZipFile subclass it wraps.  Which
is (indirectly) used by almost all of the zipfile.Path methods like .iterdir(),
.glob(), .exists(), .joinpath() etc.

  >>> zf = zipfile.ZipFile(io.BytesIO(), "w")
  >>> zf.filename = "foo.zip"
  >>> zf.writestr("a/b/c", "abc")
  >>> zf.writestr("d/e", "de")
  >>> zf.namelist()
  ['a/b/c', 'd/e']
  >>> zf.__class__
  <class 'zipfile.ZipFile'>

  >>> p = zipfile.Path(zf)
  >>> p.root.namelist()
  ['a/b/c', 'd/e', 'a/b/', 'a/', 'd/']
  >>> list(p.iterdir())
  [Path('foo.zip', 'a/'), Path('foo.zip', 'd/')]
  >>> zf.__class__
  <class 'zipfile._path.CompleteDirs'>

  >>> zf.writestr("//oops", "oops")
  >>> # infinite loop via joinpath -> resolve_dir -> _name_set -> namelist ->
  >>> # _implied_dirs -> _ancestry
  >>> p / "foo"

As zipfile.Path modifies the class of the original ZipFile, calling .namelist()
or .extractall() on the original ZipFile used to create the Path afterwards is
also affected even though zipfile.ZipFile as such is not.

- Fay


Current thread: