Overview
CVE-2025-62372 describes a vulnerability in vLLM, an inference and serving engine for large language models (LLMs). Versions 0.5.5 through 0.11.0 (inclusive) are susceptible. The vulnerability allows a malicious actor to crash the vLLM engine when serving multimodal models by providing malformed multimodal embedding inputs. Specifically, inputs with the correct number of dimensions (ndim) but an incorrect shape (e.g., a wrong hidden dimension size) trigger the crash. This occurs regardless of whether the model is explicitly designed to support such inputs. A fix is available in version 0.11.1 of vLLM.
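To illustrate the trigger condition, the sketch below builds two tensors with the same number of dimensions, only one of which matches a hypothetical model's expected hidden size. The shapes and the hidden size of 4096 are illustrative assumptions, not values taken from the advisory.

```python
import torch

EXPECTED_HIDDEN_SIZE = 4096  # hypothetical hidden size of the target model

# Well-formed embedding: (num_image_tokens, hidden_size) matches the model.
valid_embeds = torch.zeros(576, EXPECTED_HIDDEN_SIZE)

# Malformed embedding: same ndim (2), but the hidden dimension is wrong.
# Under the affected versions, an input like this could pass the ndim check
# and crash the engine later in the pipeline.
malformed_embeds = torch.zeros(576, 1024)

print(valid_embeds.ndim == malformed_embeds.ndim)    # True  (ndim matches)
print(valid_embeds.shape == malformed_embeds.shape)  # False (shape does not)
```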
Technical Details
The vulnerability stems from insufficient input validation in vLLM’s handling of multimodal embeddings. The engine processes the input without properly verifying the embedding’s shape against the shape the model expects, which leads to errors during processing and ultimately crashes the vLLM service. The issue arises because the check focuses on the number of dimensions but does not scrutinize the size of each dimension. This lax validation lets attackers supply crafted inputs that pass the initial checks but trigger errors further down the processing pipeline.
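As a rough illustration of the difference between the two kinds of checks, the sketch below contrasts an ndim-only validation with one that verifies every dimension against an expected shape. The function names, the expected shape, and the use of -1 as a wildcard are assumptions for illustration only; they are not the actual vLLM internals or the contents of the 0.11.1 patch.

```python
import torch

def lax_validate(embeds: torch.Tensor, expected_shape: tuple[int, ...]) -> bool:
    # Only checks the number of dimensions -- the kind of check that lets a
    # wrong hidden size slip through and fail deeper in the pipeline.
    return embeds.ndim == len(expected_shape)

def strict_validate(embeds: torch.Tensor, expected_shape: tuple[int, ...]) -> bool:
    # Checks every dimension; -1 marks dimensions that may vary (e.g. token count).
    if embeds.ndim != len(expected_shape):
        return False
    return all(exp in (-1, actual)
               for actual, exp in zip(embeds.shape, expected_shape))

expected = (-1, 4096)                  # hypothetical: any token count, hidden size 4096
bad = torch.zeros(576, 1024)           # correct ndim, wrong hidden size

print(lax_validate(bad, expected))     # True  -- accepted, crashes later
print(strict_validate(bad, expected))  # False -- rejected up front
```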
CVSS Analysis
Currently, both the Severity and CVSS score for CVE-2025-62372 are listed as N/A (Not Available). This may reflect the specific conditions required to trigger the vulnerability, or the fact that its impact is limited to a denial-of-service (DoS) condition. While a DoS can be disruptive, the absence of data exposure or privilege escalation may explain why no severity score has been assigned yet. A more complete CVSS assessment will depend on further analysis and real-world exploitation data.
Possible Impact
The primary impact of this vulnerability is denial of service (DoS). An attacker can intentionally crash the vLLM engine, making the model unavailable to legitimate users and disrupting applications that rely on the LLM for critical functionality. While the vulnerability doesn’t appear to allow data exfiltration or remote code execution, repeated crashes of the service can still cause significant operational disruption and potential financial losses.
Mitigation or Patch Steps
The recommended mitigation is to upgrade to vLLM version 0.11.1 or later. This version includes a patch that addresses the vulnerability by implementing stricter input validation for multimodal embeddings. Ensure you test the upgrade in a non-production environment before deploying it to production systems.
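As a small aid when auditing deployments, the sketch below checks the installed vLLM version against the affected range stated above (0.5.5 through 0.11.0, fixed in 0.11.1). It assumes the `packaging` library is available, which is typically installed alongside pip.

```python
# Minimal sketch: flag installations that fall inside the affected version range.
from importlib.metadata import version
from packaging.version import Version

installed = Version(version("vllm"))

if Version("0.5.5") <= installed <= Version("0.11.0"):
    print(f"vLLM {installed} is in the affected range; upgrade to 0.11.1 or later.")
else:
    print(f"vLLM {installed} is outside the affected range.")
```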
