A recent analysis of GitHub repositories has highlighted how heavily the top 100 AI projects depend on external code. The study found that these projects reference, on average, 208 direct and transitive dependencies. More concerning, 11% of the projects rely on 500 or more dependencies.
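To make the difference between direct and transitive dependencies concrete, here is a minimal sketch that counts both for one project. The package names and the graph are invented for illustration; a real count would read the graph from a package manager's lock file or resolver output.

```python
from collections import deque


def transitive_dependencies(graph, root):
    """Return every package reachable from root, excluding root itself."""
    seen = set()
    queue = deque(graph.get(root, ()))
    while queue:
        pkg = queue.popleft()
        if pkg in seen or pkg == root:
            continue  # already counted, or a cycle back to the root
        seen.add(pkg)
        queue.extend(graph.get(pkg, ()))
    return seen


# Hypothetical graph: package -> packages it declares directly.
GRAPH = {
    "app": ["http-client", "ml-lib"],
    "http-client": ["tls", "url-parser"],
    "ml-lib": ["tensor-core", "http-client"],
    "tensor-core": ["blas"],
}

direct = set(GRAPH["app"])
everything = transitive_dependencies(GRAPH, "app")
print(f"direct: {len(direct)}, direct + transitive: {len(everything)}")
```

In this toy graph the project declares two dependencies but ends up pulling in six, which is the same effect, at a much smaller scale, as the averages reported above.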
Numerous dependencies bring their own risks, however. The analysis found that 15% of these GitHub repositories contain 10 or more known vulnerabilities. One notable example is the Transformers package distributed by Hugging Face, a widely used library that implements the transformer architecture, the same family of models that underlies ChatGPT. This package alone has over 200 dependencies, which include four known vulnerabilities. Such vulnerabilities can pose significant security threats if exploited by malicious actors.
Dependencies also widen a project's exposure to security-sensitive code. Endor Labs, whose platform tracks applications and their dependencies, found that 55% of the applications it examined call security-sensitive APIs directly. These are programming interfaces that give access to critical resources, so a compromise could affect the security of the whole asset. When the dependencies of those applications' component packages are included, the figure jumps to 95%.
“Every considerable application includes dependencies that call into a big share of JCL’s sensitive APIs,” explained Plate, a researcher involved in the study. JCL is the Java Class Library, which comprises the core APIs provided by the Java runtime. The finding shows how widely these sensitive APIs are used across applications and underlines the need for robust security measures.
The research also examines how Java packages use security-sensitive APIs. It found that 71% of Census II Java packages call five or more categories of security-sensitive APIs once all their dependencies are taken into account. This indicates how far these APIs reach and how much risk they can carry with them.
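The jump from direct to dependency-inclusive figures can be illustrated with a small sketch: a package's effective set of sensitive API categories is the union of what it calls itself and what everything beneath it calls. The package names, category labels, and data below are assumptions made up for the example, not taken from the study.

```python
def effective_categories(graph, own_calls, root):
    """Union of API categories called by root and by all of its dependencies."""
    categories = set()
    stack, seen = [root], set()
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue  # shared dependency already visited
        seen.add(pkg)
        categories |= own_calls.get(pkg, set())
        stack.extend(graph.get(pkg, ()))
    return categories


# Hypothetical dependency graph and per-package API usage.
GRAPH = {
    "report-tool": ["pdf-writer", "fetcher"],
    "fetcher": ["tls-wrapper"],
}
OWN_CALLS = {
    "report-tool": set(),  # calls no sensitive API itself
    "pdf-writer": {"filesystem"},
    "fetcher": {"network", "filesystem"},
    "tls-wrapper": {"cryptography", "network"},
}

own = OWN_CALLS["report-tool"]
effective = effective_categories(GRAPH, OWN_CALLS, "report-tool")
print(f"own categories: {len(own)}, with dependencies: {len(effective)}")
```

Here the top-level package touches no sensitive API on its own, yet it inherits three categories through its dependencies, which is why an audit limited to first-party code understates exposure.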
The study highlights a crucial issue in the software development industry: a lack of awareness and understanding of cascading dependencies. Developers often integrate open-source components without fully understanding what those components depend on in turn. This reduces transparency and can expose organizations to security risks.
To address these concerns, organizations need to go beyond basic Software Bills of Materials (SBOMs) and adopt more comprehensive practices. An SBOM lists the components used in a software application, but it may not capture the full extent of the dependencies and their associated vulnerabilities. A more robust approach is required to ensure transparency and protect brand reputation.
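One way to go beyond a flat component list is to keep the dependency relationships and use them to explain how a vulnerable component got into the application. The sketch below assumes a simplified structure loosely modelled on the `components` and `dependencies` sections of a CycloneDX JSON document; the component names and the advisory identifier are invented placeholders.

```python
from collections import deque


def path_to(sbom, root, target):
    """Shortest dependency path from root to target, or None if unreachable."""
    edges = {d["ref"]: d.get("dependsOn", []) for d in sbom["dependencies"]}
    queue = deque([[root]])
    seen = {root}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical, simplified SBOM-like document.
SBOM = {
    "components": [
        {"bom-ref": "app"},
        {"bom-ref": "web-framework"},
        {"bom-ref": "logger"},
        {"bom-ref": "yaml-parser"},
    ],
    "dependencies": [
        {"ref": "app", "dependsOn": ["web-framework", "logger"]},
        {"ref": "web-framework", "dependsOn": ["yaml-parser"]},
    ],
}
# Placeholder advisory ID, not a real vulnerability record.
KNOWN_VULNERABLE = {"yaml-parser": "EXAMPLE-0001"}

for ref, advisory in KNOWN_VULNERABLE.items():
    path = path_to(SBOM, "app", ref)
    if path is not None:
        print(advisory, "reached via", " -> ".join(path))
```

A flat list would only say that `yaml-parser` is present; the path shows which direct dependency introduced it, which is the information a team needs to decide what to upgrade or replace.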
The analysis of GitHub repositories sheds light on the complex web of dependencies that underpins the top 100 AI projects. While these projects demonstrate remarkable technological advances, they also expose the vulnerabilities that come with extensive reliance on external code libraries. As the software development landscape continues to evolve, organizations need to prioritize security, thoroughly understand their dependencies, and take proactive measures to safeguard their applications and assets.
