Overview
CVE-2025-66516 describes a critical XML External Entity (XXE) injection vulnerability in Apache Tika. The affected modules are tika-core (versions 1.13 through 3.2.1), tika-pdf-module (versions 2.0.0 through 3.2.1), and tika-parsers (versions 1.13 through 1.28.5). An attacker can exploit the flaw with a malicious XFA file embedded in a PDF, which can expose sensitive data on the server and, in some configurations, lead to arbitrary code execution.
This CVE covers the same vulnerability as CVE-2025-54988 but expands its scope in two ways. First, it clarifies that the underlying vulnerability and its fix are in tika-core. Second, it notes that Tika 1.x releases ship the PDFParser in the tika-parsers module, so those releases are equally susceptible.
Technical Details
The vulnerability stems from improper handling of XML external entities when Tika processes XFA forms inside PDF documents. When Tika parses a PDF containing a specially crafted XFA file, it can be tricked into resolving external entities declared in the XML. This potentially allows an attacker to:
- Read arbitrary files from the server’s file system.
- Perform Server-Side Request Forgery (SSRF) attacks.
- In some cases, achieve remote code execution (RCE), depending on the server’s configuration and Tika’s environment.
The crucial point is that the fix requires upgrading tika-core. Simply updating tika-pdf-module is insufficient, as the core parsing logic resides in the former.
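To make the class of flaw concrete, the following is a minimal sketch of how a generic JAXP parser can be configured to refuse external entities. It is not Tika's actual patch; the class name, the helper methods, and the sample payload are hypothetical and exist only for illustration. The feature URIs are the standard ones recognized by the JDK's built-in parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Illustrative only: a generic hardened XML parser, NOT the code used inside Tika.
final class XxeHardeningSketch {

    // Hypothetical XFA-like payload: the DOCTYPE declares an external entity
    // pointing at a local file, which a permissive parser would try to resolve.
    static final String MALICIOUS_XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE xdp [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>\n"
        + "<xdp>&xxe;</xdp>";

    // Builds a parser that rejects DOCTYPE declarations and external entities.
    static DocumentBuilder hardenedBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Refusing DOCTYPE entirely removes the place where entities are declared.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth in case DOCTYPE has to be re-enabled later.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder();
    }

    // Returns true if the hardened parser refuses the document.
    static boolean isRejected(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            hardenedBuilder().parse(new ByteArrayInputStream(bytes));
            return false;
        } catch (SAXException e) {
            return true; // the parser raised a fatal error, e.g. for the DOCTYPE
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("payload rejected: " + isRejected(MALICIOUS_XML));
        System.out.println("benign rejected:  " + isRejected("<xdp>ok</xdp>"));
    }
}
```

With this configuration the parser fails on the DOCTYPE declaration before any entity is resolved, while ordinary XML without a DOCTYPE still parses. Application-level hardening like this does not replace upgrading tika-core, because the vulnerable parsing happens inside Tika itself.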
CVSS Analysis
At the time of writing, a formal CVSS score has not been assigned. However, given the potential for data leakage, SSRF, and even RCE, this vulnerability should be considered Critical.
Possible Impact
Successful exploitation of this XXE vulnerability can have severe consequences:
- Data Breach: Attackers can access sensitive data stored on the server’s file system, including configuration files, credentials, and confidential documents.
- Server Compromise: In vulnerable configurations, attackers could potentially execute arbitrary code on the server, gaining complete control.
- Denial of Service: An attacker could potentially trigger resource exhaustion or application crashes through carefully crafted XXE payloads.
- SSRF: Attackers could use the vulnerable server as a proxy to access internal resources or external services that are not directly accessible from the internet.
Mitigation and Patch Steps
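The recommended mitigation is to upgrade the affected Apache Tika modules:
- tika-core: Version 3.2.2 or later.
- tika-pdf-module: Upgrade to the same release as tika-core. Upgrading this module alone does not fix the vulnerability, and it follows tika-core automatically only if your build pins all Tika artifacts to a single shared version.
- tika-parsers: On Tika 1.x releases, the PDFParser lives in this module, so update it together with tika-core.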
Ensure that tika-core is upgraded, even if you are primarily using the tika-pdf-module. Pay close attention to the version of Tika being used (1.x or 2.x/3.x) and update the relevant modules accordingly.
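As a quick triage aid, the sketch below checks whether a given tika-core version string falls inside the affected range stated above (1.13 through 3.2.1). The class and method names are hypothetical, and the version string is assumed to come from your own build report (for example, the output of `mvn dependency:tree`). Qualifiers such as "-SNAPSHOT" are simply ignored, which is a simplification.

```java
import java.util.Arrays;

// Illustrative helper, not part of Apache Tika.
final class TikaCoreVersionCheck {

    // Parses "major.minor.patch" into three ints; missing parts default to 0.
    // Any qualifier after '-' (e.g. "-SNAPSHOT") is dropped: a simplification.
    static int[] parse(String version) {
        String numeric = version.trim().split("-", 2)[0];
        String[] parts = numeric.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < Math.min(parts.length, 3); i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // Lexicographic comparison of the numeric components.
    static int compare(String a, String b) {
        return Arrays.compare(parse(a), parse(b));
    }

    // Affected tika-core range per the advisory text: 1.13 through 3.2.1 inclusive.
    static boolean isAffectedTikaCore(String version) {
        return compare(version, "1.13") >= 0 && compare(version, "3.2.1") <= 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.12", "1.13", "2.9.4", "3.2.1", "3.2.2"}) {
            System.out.println(v + " affected: " + isAffectedTikaCore(v));
        }
    }
}
```

Run the check against the resolved tika-core version rather than the declared one, since a transitive dependency can pull in an older tika-core even when tika-pdf-module has been bumped.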
References
- CVE Record (CVE-2025-54988): https://cve.org/CVERecord?id=CVE-2025-54988
- Apache Tika Mailing List: https://lists.apache.org/thread/s5x3k93nhbkqzztp1olxotoyjpdlps9k
