Researchers at HiddenLayer discovered a vulnerability (CVE-2024-27322) in R that allows attackers to execute arbitrary code through malicious RDS files. The attack exploits R’s deserialization process by injecting code into a promise object and relying on lazy evaluation to run it.
R is an open-source language and environment for statistical computing, data visualization, and machine learning that offers a core language and a rich library ecosystem, making it a popular choice for statistics and data science.
Its strength in statistical analysis of large datasets has extended its use to healthcare, finance, government, and AI/ML, and R conferences regularly feature speakers from leading organizations in these sectors.
CRAN, the main repository for R packages, hosts over 20,000 packages with millions of downloads, and this ecosystem contributes to R’s popularity in scientific computing.
R’s extensive use across government agencies, medical institutions, and financial institutions makes it an attractive target for code execution attacks. Exploiting a vulnerability such as the recently discovered deserialization flaw (CVE-2024-27322) can therefore have far-reaching consequences.
R uses the RDS (R Data Serialization) format to save and load objects, storing data structures efficiently for later use or network transfer. Like other serialization methods, RDS converts objects into a transferable format (serialization) and reconstructs them when needed (deserialization).
R packages also use RDS when they are compiled. A compiled package contains two files: an .rdb file that stores serialized object representations and an .rdx file that holds metadata for the .rdb binary data. When loading the package, R uses the .rdx index to locate the RDS-format serialized data in the .rdb file and load it.
The exploit focuses on manipulating the RDS header and the bytecode instructions (handled by the ReadItem function) to achieve code execution without disrupting the object structure. This is challenging because RDS lacks explicit termination instructions and relies on object boundaries.
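To make the header and item structure more concrete, the following Python sketch reads the start of an RDS stream and reports the type code of the first serialized item. It assumes the common gzip-compressed XDR layout (the bytes `X\n`, three big-endian integers for the format version, writer version, and minimum reader version, then an encoding string in format version 3). The function names are hypothetical, and the sketch inspects only the top-level item, so it is not a detector for crafted files.

```python
import gzip
import struct

PROMSXP = 5  # type code R assigns to promise objects


def decode_r_version(packed: int) -> str:
    """Unpack R's major*65536 + minor*256 + patch version encoding."""
    return f"{packed >> 16}.{(packed >> 8) & 0xFF}.{packed & 0xFF}"


def read_rds_header(raw: bytes) -> dict:
    """Parse an XDR-format RDS header and the type of the first item.

    Assumption: only gzip-compressed or uncompressed XDR streams are
    handled; bzip2/xz compression and the ASCII format are not.
    """
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        raw = gzip.decompress(raw)
    if raw[:2] != b"X\n":
        raise ValueError("not an XDR-format RDS stream")

    version, writer, min_reader = struct.unpack(">iii", raw[2:14])
    offset = 14
    encoding = None
    if version >= 3:
        # Format version 3 adds a length-prefixed native encoding name.
        (name_len,) = struct.unpack(">i", raw[offset:offset + 4])
        offset += 4
        encoding = raw[offset:offset + name_len].decode("ascii")
        offset += name_len

    # Each item starts with a flags word; the low 8 bits hold the type code.
    (flags,) = struct.unpack(">i", raw[offset:offset + 4])
    return {
        "version": version,
        "writer": decode_r_version(writer),
        "min_reader": decode_r_version(min_reader),
        "encoding": encoding,
        "top_level_type": flags & 0xFF,
    }
```

Calling `read_rds_header(open(path, "rb").read())` on a file and comparing `top_level_type` with `PROMSXP` would flag a file whose outermost object is a promise. A promise nested deeper in the object would need a full walk of the item stream, which this sketch does not attempt.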
The researchers found that the deserialization code can be abused through lazy evaluation. The payload uses a promise object, which pairs a symbol with an expression; accessing the symbol triggers evaluation of the expression.
By crafting a promise with an unbound value instead of a symbol, they created a payload whose expression executes when the deserialized object is used, so no code needs to run during the deserialization process itself.
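The lazy-evaluation behaviour described above can be illustrated with a toy model in Python. This is only an analogy: the `Promise` class and its `force` method are hypothetical names, and R's internal promise structure is more involved. The point it shows is that constructing the object runs nothing, while the first access to a value that is not yet bound runs the stored expression.

```python
class Promise:
    """Toy model of a lazily evaluated promise (not R's real structure)."""

    _UNBOUND = object()  # stand-in for a value that has not been bound yet

    def __init__(self, expr, value=_UNBOUND):
        self.expr = expr    # zero-argument callable standing in for an expression
        self.value = value  # cached result, or the unbound marker

    def force(self):
        """Return the value, evaluating the expression only if still unbound."""
        if self.value is Promise._UNBOUND:
            self.value = self.expr()
        return self.value


calls = []


def side_effect():
    # Stands in for attacker-chosen code hidden in the expression slot.
    calls.append("expression evaluated")
    return 42


lazy = Promise(side_effect)            # "deserialization": nothing runs yet
ran_at_load = len(calls)               # 0: constructing the promise is inert
result = lazy.force()                  # first use triggers the expression
prebound = Promise(side_effect, value=7)
prebound_result = prebound.force()     # already bound, expression is skipped
```

In this model, a promise that arrives with a bound value never evaluates its expression, which is why the crafted payload leaves the value unbound.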
The vulnerability can be exploited by compromising user-provided data or by infecting R packages, which store data in .rdb files with metadata in .rdx files.
When a package is loaded, the .rdx file tells R how to extract data from the .rdb file. An attacker can exploit this by replacing the .rdx file or by injecting malicious code directly into the .rdb file, which makes detection difficult.
The vulnerability affects a wide range of users because code that calls readRDS is widespread and because attacks can be delivered through popular R package repositories.
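One way to gauge exposure in a codebase is to list where readRDS is called. The Python sketch below is a rough heuristic: it walks a directory, looks only at `.R` and `.Rmd` files, and skips lines that are entirely comments. The function name and the choice of file extensions are assumptions for illustration. A hit shows only that an RDS file is loaded at that point, not that the file comes from an untrusted source.

```python
import re
from pathlib import Path

# Matches a call such as readRDS("file.rds") or readRDS (path)
READRDS_CALL = re.compile(r"\breadRDS\s*\(")


def find_readrds_calls(root: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, line text) for each readRDS call found."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in {".r", ".rmd"}:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if line.lstrip().startswith("#"):
                continue  # skip whole-line comments; trailing comments are kept
            if READRDS_CALL.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Each reported location can then be reviewed by hand to check whether the file being loaded could have been supplied or modified by someone else.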