CVE-2025-40261: Critical Race Condition in Linux Kernel NVMe-FC Could Lead to Data Corruption

Overview

CVE-2025-40261 is a vulnerability identified in the Linux kernel’s Non-Volatile Memory Express over Fibre Channel (NVMe-FC) subsystem. This flaw arises from a race condition during the deletion of NVMe-FC controllers, potentially leading to a “list_del corruption” error and subsequent kernel panic, which can result in data corruption. A fix has been implemented to address this issue by ensuring proper synchronization during controller deletion.

Technical Details

The vulnerability stems from the order of operations within the nvme_fc_delete_ctrl() function. The function called cancel_work_sync() on ->ioerr_work before calling nvme_fc_delete_association(), which waits for pending I/O to complete before returning. If an error occurs during that wait, ->ioerr_work can be queued after cancel_work_sync() has already returned. The work item can then still be queued or running when the nvme_fc_ctrl object is freed, a use-after-free that corrupts the workqueue's linked lists and causes the kernel to panic.

The original sequence was problematic because ->ioerr_work could run, or remain queued, after the nvme_fc_ctrl object had been freed. The fix moves the call to cancel_work_sync() so that it occurs after the call to nvme_fc_delete_association(). This ensures that ->ioerr_work is not running when the nvme_fc_ctrl object is freed.
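
The effect of the reordering can be illustrated with a small, deterministic model. The Python sketch below is not kernel code: Ctrl, Work, and the helper functions are hypothetical stand-ins named after their kernel counterparts, the free is collapsed into the delete path for brevity, and cancel_work_sync() is reduced to clearing a pending flag. It only shows why cancelling before the association teardown leaves a window, assuming an I/O error arrives during teardown.

```python
class Work:
    """Hypothetical work item, embedded in its owner like ->ioerr_work."""

    def __init__(self):
        self.pending = False


class Ctrl:
    """Hypothetical stand-in for the nvme_fc_ctrl object."""

    def __init__(self):
        self.ioerr_work = Work()
        self.freed = False


def cancel_work_sync(work):
    # Model only: drop the item if it is queued. The real kernel call
    # also waits for a running instance to finish.
    work.pending = False


def delete_association(ctrl, io_error):
    # Model only: teardown waits for pending I/O. An error seen during
    # that wait queues the error-handling work item.
    if io_error:
        ctrl.ioerr_work.pending = True


def free_ctrl(ctrl):
    # Returns True if a work item is still queued on the freed object,
    # which is the condition behind the crash.
    ctrl.freed = True
    return ctrl.ioerr_work.pending


def delete_ctrl_old(ctrl, io_error):
    # Old order: cancel first, then tear down the association.
    cancel_work_sync(ctrl.ioerr_work)
    delete_association(ctrl, io_error)
    return free_ctrl(ctrl)


def delete_ctrl_fixed(ctrl, io_error):
    # Fixed order: tear down the association, then cancel.
    delete_association(ctrl, io_error)
    cancel_work_sync(ctrl.ioerr_work)
    return free_ctrl(ctrl)
```

In this model, delete_ctrl_old(Ctrl(), io_error=True) reports a work item still queued on the freed object, while delete_ctrl_fixed(Ctrl(), io_error=True) does not. Without an error during teardown, both orderings are clean, which is why the bug only appears under certain error conditions.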

The observed crash was a list_del corruption error, which indicates that the kernel attempted to remove an entry from a doubly-linked list whose pointers were no longer valid. The following kernel log excerpt shows the error and part of the call stack leading to the crash:

[ 1135.911754] list_del corruption, ff2d24c8093f31f8->next is NULL
[ 1135.917705] ------------[ cut here ]------------
[ 1135.922336] kernel BUG at lib/list_debug.c:52!
[ 1135.926784] Oops: invalid opcode: 0000 [#1] SMP NOPTI
...
[ 1135.954673] RIP: 0010:__list_del_entry_valid_or_report.cold+0xf/0x6f
...
[ 1136.120806] move_linked_works+0x4a/0xa0
[ 1136.124733] worker_thread+0x216/0x3a0
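
To check saved kernel logs for this signature, a short script can look for the list_del corruption line. The sketch below is a minimal example that assumes log lines carry the bracketed timestamp format shown above; the function name find_list_del_corruption and the sample text are illustrative. A match is only a hint, since the same message can come from list corruption unrelated to this CVE.

```python
import re

# Matches lines such as:
# [ 1135.911754] list_del corruption, ff2d24c8093f31f8->next is NULL
LIST_DEL_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+list_del corruption[,.]?\s*(.*)$"
)


def find_list_del_corruption(log_text):
    """Return (timestamp, detail) tuples for each list_del corruption line."""
    hits = []
    for line in log_text.splitlines():
        match = LIST_DEL_RE.match(line.strip())
        if match:
            hits.append((float(match.group(1)), match.group(2)))
    return hits


# Sample taken from the log excerpt in this article.
SAMPLE_LOG = """\
[ 1135.911754] list_del corruption, ff2d24c8093f31f8->next is NULL
[ 1135.917705] ------------[ cut here ]------------
[ 1135.922336] kernel BUG at lib/list_debug.c:52!
"""

if __name__ == "__main__":
    for timestamp, detail in find_list_del_corruption(SAMPLE_LOG):
        print(f"{timestamp}: {detail}")
```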

CVSS Analysis

No severity rating or CVSS score has been reported for this CVE (both are listed as N/A). Given the potential for a kernel panic and data corruption, it is reasonable to treat it as a high-severity issue until an official score is assigned, particularly in environments where data integrity is critical.

Possible Impact

Triggering this race condition can have several serious consequences:

  • Data Corruption: The most critical impact is the potential for data corruption, since kernel memory is corrupted and the system then crashes abruptly.
  • Kernel Panic: The vulnerability can trigger a kernel panic, resulting in system instability and downtime.
  • Service Disruption: Systems experiencing kernel panics become unavailable, leading to service disruption.

Mitigation and Patch Steps

The recommended mitigation is to update to a kernel version that contains the fix. The patch reorders the calls within nvme_fc_delete_ctrl() so that cancel_work_sync() is called after nvme_fc_delete_association(). The specific commit hashes are listed in the references of the CVE record for CVE-2025-40261.
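
When prioritizing which hosts to patch first, it can help to know whether the NVMe-FC driver is in use at all. The sketch below assumes the driver is built as a loadable module that appears as nvme_fc in /proc/modules; the helper names are illustrative. A kernel with the driver built in will not show it there, so a negative result does not prove that a host is unaffected.

```python
def module_loaded(proc_modules_text, name="nvme_fc"):
    """Return True if `name` is the first field of any /proc/modules line."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def read_proc_modules(path="/proc/modules"):
    """Read the module list, returning an empty string if it is unavailable."""
    try:
        with open(path, encoding="ascii", errors="replace") as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    if module_loaded(read_proc_modules()):
        print("nvme_fc module is loaded; prioritize the kernel update")
    else:
        print("nvme_fc module not listed (it may still be built in)")
```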
