oss-sec mailing list archives

Re: CVE-2025-54988: Apache Tika PDF parser module: XXE vulnerability in PDFParser's handling of XFA


From: Hanno Böck <hanno () hboeck de>
Date: Thu, 21 Aug 2025 07:51:06 +0200

On Wed, 20 Aug 2025 15:45:33 -0400
Tim Allison <tallison () apache org> wrote:

> Critical XXE in Apache Tika (tika-parser-pdf-module) in Apache Tika

Probably this commit:
https://github.com/apache/tika/commit/bfee6d5569fe9197c4ea947a96e212825184ca33

I recently looked into XXE vulnerabilities, and I believe this is
primarily a vulnerability in Java's standard library, not in any
single piece of software. I also consider it to be a flaw in the XML
spec itself.

XXE vulnerabilities are a well-known problem, and the overwhelming
majority of XML libraries and APIs have adopted safer defaults, which is
the right way to address this. Java is the exception: XML parsing there
is still insecure by default. (That XXE and other XML security flaws
aren't addressed in the XML spec itself is also a problem.)

That any parsing of an untrusted XML file automatically opens a can of
security vulnerability worms, and that every piece of software using an
XML parsing API is expected to do something extra to avoid it, is an
absurd security footgun.
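
To illustrate the kind of "something extra" a Java caller currently has
to do, here is a minimal sketch. It assumes the JDK's built-in JAXP
implementation (derived from Xerces), which recognizes the
disallow-doctype-decl feature; other parser implementations may not.
The class and method names (HardenedXml, newBuilder, isRejected) are
hypothetical, and this is not meant to reproduce the actual Tika fix.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: builds a DOM parser with the settings commonly
// recommended against XXE. Assumes the JDK's default JAXP implementation.
class HardenedXml {

    static DocumentBuilder newBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Standard JAXP switch that enables implementation-defined limits.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Xerces-specific feature: reject any document containing a DOCTYPE,
        // so no entity (internal or external) can be declared at all.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Additional hardening on top of the DOCTYPE ban.
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder();
    }

    // Returns true if the hardened parser refuses the document.
    static boolean isRejected(String xml) {
        try {
            newBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            // Thrown, among other cases, when a DOCTYPE is encountered.
            return true;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this configuration a document that declares an external entity is
refused when the parser reaches the DOCTYPE, before any entity is
resolved, while a plain document without a DOCTYPE still parses. Every
caller of the default factory has to remember to do this, which is the
point of the complaint above.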

-- 
Hanno Böck - Independent security researcher
https://itsec.hboeck.de/
https://badkeys.info/

