critical dependency data for secure supply chains

Today, we’re excited to announce the API, which offers free access to the dataset of security metadata, including dependencies, licenses, advisories, and other critical health and security signals for more than 50 million open source package versions.

Software supply chain attacks are increasingly common and harmful, with high-profile incidents such as Log4Shell, Codecov, and the recent 3CX hack. The overwhelming complexity of the software ecosystem causes trouble for even the most diligent and well-resourced developers.

We hope the API will help the community make sense of complex dependency data so that they can respond to, or even prevent, these types of attacks. By integrating this data into tools, workflows, and analyses, developers can more easily understand the risks in their software supply chains.

As part of Google’s ongoing efforts to improve open source security, the Open Source Insights team has built a reliable view of software metadata across 5 packaging ecosystems. The data set is continuously updated from a range of sources: package registries, the Open Source Vulnerability database, code hosts such as GitHub and GitLab, and the software artifacts themselves. This includes 5 million packages, more than 50 million versions, from the Go, Maven, PyPI, npm, and Cargo ecosystems, and you’d better believe we’re counting them!

We collect and aggregate this data and derive transitive dependency graphs, advisory impact reports, OpenSSF Security Scorecard information, and more. Where the website allows human exploration and examination, and the BigQuery dataset supports large-scale bulk data analysis, this new API enables programmatic, real-time access to the corpus for integration into tools, workflows, and analyses.

The API is used by a number of teams internally at Google to support the security of our own products. One of the first publicly visible uses is the GUAC integration, which uses the data to enrich SBOMs. We have more exciting integrations in the works, but we’re most excited to see what the greater open source community builds!

We see the API as being useful for tool builders, researchers, and tinkerers who want to answer questions like:

  • What versions are available for this package?
  • What are the licenses that cover this version of a package, or all the packages in my codebase?
  • How many dependencies does this package have? What are they?
  • Does the latest version of this package include changes to dependencies or licenses?
  • What versions of what packages correspond to this file?

Taken together, this information can help answer the most important overarching question: how much risk would this dependency add to my project?

The API can help surface critical security information where and when developers can act. This data can be integrated into:

  • IDE plugins, to make dependency and security information immediately available.
  • CI/CD integrations, to prevent rolling out code with vulnerability or license problems.
  • Build tools and policy engine integrations, to help ensure compliance.
  • Post-release analysis tools, to detect newly discovered vulnerabilities in your codebase.
  • Tools to improve inventory management and mystery file identification.
  • Visualizations, to help you discover what your dependency graph actually looks like.

    The API has a couple of great features that aren’t available through the website.

    Hash queries

    A unique feature of the API is hash queries: you can look up the hash of a file’s contents and find all the package versions that contain that file. This can help figure out what version of which package you have even absent other build metadata, which is useful in areas such as SBOMs, container analysis, incident response, and forensics.
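
    As a rough illustration of the first step of a hash query, the sketch below computes a digest of a file's contents in Python. The choice of SHA-256 and base64 encoding is an assumption made for the example, and `file_digest` is a hypothetical helper name; check the API documentation for the hash types and encoding it actually accepts.

```python
import base64
import hashlib
import sys
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256") -> str:
    """Return the base64-encoded digest of a file's contents.

    The algorithm and the base64 encoding are assumptions for this
    sketch, not a statement of what the API requires.
    """
    hasher = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # Read in chunks so large artifacts (e.g. JAR files) are not
        # loaded into memory all at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return base64.b64encode(hasher.digest()).decode("ascii")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python digest.py path/to/some-file
    print(file_digest(Path(sys.argv[1])))
```

    The resulting string is what a tool would then send as the lookup key; the request itself is left out here because its exact form depends on the API.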

    Real dependency graphs

    The dependency data is not just what a package declares (its manifests, lock files, etc.), but rather a full dependency graph computed using the same algorithms as the packaging tools (Maven, npm, Pip, Go, Cargo). This gives a real set of dependencies similar to what you would get by actually installing the package, which is useful when a package changes but the developer doesn’t update the lock file. With the API, tools can assess, monitor, or visualize expected (or unexpected!) dependencies.
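
    To make the difference between declared and transitive dependencies concrete, here is a minimal sketch that walks an already-resolved graph. It is not the resolution algorithm of any packaging tool (real resolvers also choose versions); the graph shape, the package names, and the function `transitive_dependencies` are all made up for illustration.

```python
from collections import deque


def transitive_dependencies(graph: dict, root: str) -> set:
    """Return every package reachable from root, excluding root itself.

    graph maps a package to the list of packages it depends on directly.
    """
    seen = set()
    queue = deque(graph.get(root, ()))
    while queue:
        package = queue.popleft()
        if package in seen:
            continue
        seen.add(package)
        queue.extend(graph.get(package, ()))
    # A cycle could lead back to the root; do not report it as its own dependency.
    seen.discard(root)
    return seen


# Hypothetical resolved graph: my-app declares two dependencies,
# but installing it pulls in four packages.
example_graph = {
    "my-app": ["web-framework", "json-lib"],
    "web-framework": ["http-core", "logging-lib"],
    "json-lib": [],
    "http-core": ["logging-lib"],
    "logging-lib": [],
}

if __name__ == "__main__":
    print(sorted(transitive_dependencies(example_graph, "my-app")))
```

    Comparing this set against the directly declared list is one simple way a tool could flag unexpected dependencies.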

    API in action

    For a demonstration of how the API can help software supply chain security efforts, consider the questions it could answer in a situation like the Log4Shell discovery:

    • Am I affected? – A CI/CD integration powered by the free API would automatically detect that a new, critical vulnerability is affecting your codebase, and alert you to act.
    • Where? – A dependency visualization tool pulling transitive dependency graphs from the API would help you identify whether you can update one of your direct dependencies to fix the issue. If you were blocked, the tool would point you at the package(s) that are yet to be patched, so you could contribute a PR and help unblock yourself further up the tree.
    • Where else? – You could query the API with hashes of vendored JAR files to check if vulnerable log4j versions were unexpectedly hiding therein.
    • How much of the ecosystem is impacted? – Researchers, package managers, and other observers could use the API to understand how their ecosystem has been affected, as we did in this blog post about Log4Shell’s impact.
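
    The "Where?" question can be sketched in a few lines once a transitive graph is in hand: find which direct dependencies lead to the vulnerable package. The graph format, package names, and the helpers `reaches` and `direct_deps_pulling_in` are hypothetical, chosen only to illustrate the idea.

```python
from collections import deque


def reaches(graph: dict, start: str, target: str) -> bool:
    """Breadth-first search: is target reachable from start (or equal to it)?"""
    seen = {start}
    queue = deque([start])
    while queue:
        package = queue.popleft()
        if package == target:
            return True
        for dependency in graph.get(package, ()):
            if dependency not in seen:
                seen.add(dependency)
                queue.append(dependency)
    return False


def direct_deps_pulling_in(graph: dict, root: str, target: str) -> list:
    """List the direct dependencies of root through which target is reached."""
    return [dep for dep in graph.get(root, ()) if reaches(graph, dep, target)]


# Hypothetical graph in which a logging library arrives via one direct dependency.
example_graph = {
    "my-app": ["web-framework", "json-lib"],
    "web-framework": ["logging-facade"],
    "logging-facade": ["vulnerable-logger"],
    "json-lib": [],
}

if __name__ == "__main__":
    print(direct_deps_pulling_in(example_graph, "my-app", "vulnerable-logger"))
```

    The direct dependencies returned are the ones to try upgrading first; if no fixed release exists, the intermediate packages on the path are the candidates for a patch.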

    The API service is globally replicated and highly available, meaning that you and your tools can depend on it being there when you need it.

    It is also free and immediately available, with no need to register for an API key. It is just a simple, unauthenticated HTTPS API that returns JSON objects:

    # List the advisories affecting log4j 1.2.17
    $ curl 
            | jq '.advisoryKeys[].id'

    A single API call to list all the GHSA advisories affecting a specific version of log4j.
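
    For tools that consume the response in code rather than on the command line, the same extraction as the `jq` filter `.advisoryKeys[].id` might look like this in Python. Only the `advisoryKeys` and `id` field names come from the command above; the sample payload and its IDs are placeholders, not real advisories, and `advisory_ids` is a hypothetical helper.

```python
import json


def advisory_ids(response_body: str) -> list:
    """Extract the id of every entry under advisoryKeys in a JSON response body."""
    payload = json.loads(response_body)
    # Treat a missing advisoryKeys field as "no advisories" rather than an error.
    return [key["id"] for key in payload.get("advisoryKeys", [])]


# Placeholder response body for illustration only; these are not real advisory IDs.
sample_body = '{"advisoryKeys": [{"id": "GHSA-aaaa-bbbb-cccc"}, {"id": "GHSA-dddd-eeee-ffff"}]}'

if __name__ == "__main__":
    for advisory_id in advisory_ids(sample_body):
        print(advisory_id)
```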

    Check out the API Documentation to get started, or jump straight into the code with some examples.

    Software supply chain security is hard, but it’s in all our interests to make it easier. Every day, Google works hard to create a safer internet, and we’re proud to be releasing this API to help do just that, and make this data universally accessible and useful to everyone.

    We look forward to seeing what you might do with the API, and would appreciate your feedback. (What works? What doesn't? What would make it better?) You can reach us at, or by filing an issue on our GitHub repo.
