GitHub allows perpetual access to data from deleted and private forks via Cross-Fork Object Reference (CFOR) vulnerabilities. Attackers can exploit this by supplying commit hashes to directly access otherwise inaccessible commit data.
A CFOR vulnerability occurs when one fork can access data from another fork, including forks that are private or deleted. Code committed to a fork remains retrievable through the original repository even after the fork is deleted, which is a significant security risk for organizations using GitHub.
Data from deleted forks can be recovered surprisingly often. A limited survey of three popular public repositories from a large AI company yielded 40 valid API keys embedded within example files of deleted forks.
The pattern is straightforward: a user forks a repository, hardcodes an API key into an example file, and deletes the fork when finished. The key remains retrievable after the deletion. The same exposure also works in reverse: data from a deleted upstream repository stays reachable through its forks.
Deleting a public GitHub repository does not permanently erase its data. As long as at least one fork exists, any code committed to the repository remains accessible through that fork, including commits the fork never synced.
Sensitive information, such as private keys, accidentally committed to a public repository can potentially be exposed forever through a fork, regardless of subsequent deletion of the original repository.
Forking can also expose confidential code when a private repository is later made public. A typical case: an organization keeps a private repository it plans to open-source, creates a private internal fork for features that will stay closed, and then makes the upstream repository public. Commits made to the private fork before the upstream became public are accessible through the public repository because of GitHub’s repository network structure.
This exposes organizations to data leakage, since private features and related code may be visible to the public.
Anyone who knows a commit hash can still access the commit after it disappears from the GitHub UI and from normal Git operations. The commit data still exists and can be reached by constructing a URL that ends with the commit hash.
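The URL construction described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the familiar `https://github.com/<owner>/<repo>/commit/<hash>` web URL layout; the function name `commit_url` and the `example-org`/`example-repo` values are hypothetical placeholders, not names from the original report. It can help when auditing whether a commit you believed was removed is still reachable through a repository you own.

```python
import re

# A full SHA-1 commit hash is 40 hex characters; abbreviated hashes
# of 4 or more characters are also accepted here.
_HEX_HASH = re.compile(r"[0-9a-f]{4,40}")


def commit_url(owner: str, repo: str, commit_hash: str) -> str:
    """Build the web URL for a commit from its (possibly short) hash.

    owner and repo are placeholders for a repository you are auditing.
    """
    normalized = commit_hash.strip().lower()
    if not _HEX_HASH.fullmatch(normalized):
        raise ValueError(f"not a 4-40 character hex hash: {commit_hash!r}")
    return f"https://github.com/{owner}/{repo}/commit/{normalized}"


if __name__ == "__main__":
    # Hypothetical repository and hash, for illustration only.
    print(commit_url("example-org", "example-repo", "07F0"))
```

The sketch only builds the URL and makes no network request; whether a given hash resolves depends on the repository network it belonged to.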
Attackers can brute-force short commit hashes (as few as 4 characters) to reach hidden commits. Commit hashes can also be obtained from GitHub’s public events API and from third-party archives of past GitHub events, which retain hashes even for deleted repositories.
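To see why short hashes make brute-forcing feasible, it helps to count the candidates. The sketch below is plain arithmetic, assuming hashes are written in hexadecimal (16 possible values per character); the function name `short_hash_space` is a hypothetical helper introduced for illustration.

```python
def short_hash_space(length: int) -> int:
    """Number of distinct hex prefixes of the given length."""
    if not 4 <= length <= 40:
        raise ValueError("length must be between 4 and 40 characters")
    return 16 ** length


if __name__ == "__main__":
    for n in (4, 5, 6, 7):
        print(f"{n} characters: {short_hash_space(n):,} candidates")
    # 4 characters: 65,536 candidates
```

A 4-character prefix leaves only 65,536 possibilities, a small space to enumerate, which is why the hash alone should not be treated as a secret.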
Truffle Security reported the issue through GitHub’s Vulnerability Disclosure Program (VDP). GitHub responded that the behavior is intentional and documented.
Although GitHub documents this behavior, there is a disconnect between what users expect and what actually happens. Users believe that private repositories act as security barriers that isolate their data from public access.
The findings show that data from private and deleted repositories, including commits, can still be reached through other repositories in the same fork network. This contradicts the common assumption that deleting a repository destroys its data.