The secret to a successful exit: Software audit


Have you ever thought about the legal implications of excessive use of npm? Project A’s tech team shares insights gleaned from numerous due diligence processes and offers best practices for working with open-source and third-party libraries.

By Ronny Shani

Security, performance, dependency hell: working with third-party software can be exhausting. We thought it would be helpful to highlight a lesser-known aspect of working with packages, one that is rarely a priority for the individual developer: legal implications.

Part of our job at Project A’s tech team is to perform due diligence and evaluate the codebases of our portfolio companies. One of the critical-yet-often-missed topics we go over in this process is compliance issues related to third-party software in general and open-source in particular.

1. THE SITUATION

Your dependency’s dependency is your responsibility

Many developers swear by the mantra of Don’t Repeat Yourself (DRY), coupled with the “don’t reinvent the wheel” practice. But what if the pursuit of abstraction and modularity comes back to haunt you? The risk grows when you consider how Agile software development methods, which tout fast, iterative, and incremental cycles, can create the sort of technical debt that eventually allows 11 lines of code to break the internet, as happened with left-pad in 2016.

[Image: GIF from the TV show The IT Crowd, in which a man presents “the Internet” (a small black box) to a woman. Caption: The IT Crowd knows best (source: Giphy)]

As we mentioned in our previous article on dependency hell, a developer named Brandon Nozaki Miller (aka RIAEvangelist) recently injected malicious code into their npm package, node-ipc, in protest against the war in Ukraine.

A quick recap: RIAEvangelist’s so-called protestware was live for less than 24 hours, but with over a million weekly downloads, it became a major incident. The current iteration of node-ipc imports a module named peacenotwar, a multilingual call for world peace and responsible code architecture:

“If you do not like what this module does, please just lock your dependencies […] Also, please code-review your other modules for vulnerabilities”.

One of the libraries listed as a dependency in the compromised versions of node-ipc was colors.js, the star of another incident that happened earlier this year. Like RIAEvangelist, developer Marak Squires, the maintainer of several popular libraries, also poisoned his own dish.

In January, he deliberately introduced an infinite loop into two of his “best sellers” — colors.js and faker.js — breaking builds worldwide. When it landed, colors.js had over 23 million weekly downloads and more than 19k projects using it, while faker.js had more than 2.8 million weekly downloads and 2,500 dependents.

A longtime open-source contributor with 177 npm packages under his name, Squires wasn’t trigger-happy; he was triggered, having apparently had enough of big corporations freeloading on his work.

The Census II of Free and Open Source Software (FOSS) report, mapping the most popular packages, echoed these grievances with a concise observation:

“136 developers were responsible for more than 80% of the lines of code added to the top 50 packages. In the open-source world, time and talent may indeed be the most important investments”.

2. THE REGULATION

Multi-billion dollar connective tissue

The tension between open-source developers and the commercial organizations using their work isn’t new, and the stakes are considerable: a comprehensive study that examined the economic impact of open-source software in the EU found that it contributes €65–€95 billion to the bloc’s GDP.

Still, the needle appears to be moving lately, as governments are actively trying to streamline open-source adoption across the private and public sectors.

Dozens of industry players, including developers, software companies, and US government officials, gathered in Washington, DC, in mid-May for the Open Source Software Security Summit II to discuss how to secure open-source software. The outcome was a 10-step plan backed by significant financial investments: Amazon, Ericsson, Google, Intel, Microsoft, and VMware have already pledged $30M of the $150M in funding needed over the next two years.

All parties stress the urgent need to standardize and fund open-source projects to make them more sustainable and secure:

“Open source software is a form of digital public good, creating wealth and capability for society as a whole in a continuously-renewing form […] Like any critical infrastructure, we must invest in open source if it is to continue to support the demands made upon it”.

And the demand has never been higher. According to the 2019 State of Software Supply Chain report:

“analysis of over 500 applications revealed the average application contains over 460 software component releases, of which 85% were open source […] it was not uncommon to see applications assembled from 2,000–4,000 OSS component releases”.

GitHub’s definition is more valid than ever — open-source is “the connective tissue for much of the information economy”.

3. THE TOOLING

Standard procedures

The European Commission (EC) study mentioned above recommends addressing another elephant in the room: the crucial aspects of licensing, IP, and compliance.

The latest version of node-ipc, for example, is licensed under GPLv3. This copyleft license potentially requires users to distribute any derivative work under the same terms. Such requirements can compromise companies that are seeking funding or negotiating M&A deals, botch an otherwise promising exit, and have grave ramifications for IP.

Check licenses

If you’re focused on IP, license-checker can help automate the otherwise tedious process of checking licenses across the dependency tree. This handy tool scans your node_modules directory, identifies licenses documented in every package.json, and lists them one by one.
Note that the tool no longer seems to be maintained and, more importantly, doesn’t list the licenses of sub-dependencies. It can still give you a general overview of potential weak points.
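
To cover nested sub-dependencies as well, a short script can walk the whole tree. The sketch below is an illustration under stated assumptions, not license-checker's implementation: it assumes a conventional npm layout (packages under node_modules, optionally inside an @scope folder, with nested node_modules directories), and the helper names collect_licenses, read_license, and is_copyleft are hypothetical. The copyleft check is a crude heuristic that only looks for "GPL" in the declared license string, so it will miss other copyleft licenses and says nothing about whether a given use actually triggers any obligation.

```python
import json
from pathlib import Path


def read_license(manifest: dict) -> str:
    """Return the license declared in a parsed package.json, or "UNKNOWN"."""
    declared = manifest.get("license")
    if isinstance(declared, str) and declared:
        return declared
    # Older packages sometimes declare {"type": "MIT"} or a "licenses" list.
    if isinstance(declared, dict) and declared.get("type"):
        return str(declared["type"])
    legacy = manifest.get("licenses")
    if isinstance(legacy, list):
        types = [str(item["type"]) for item in legacy
                 if isinstance(item, dict) and item.get("type")]
        if types:
            return " OR ".join(types)
    return "UNKNOWN"


def is_package_root(directory: Path) -> bool:
    """True for node_modules/<pkg> and node_modules/@scope/<pkg> folders."""
    parent = directory.parent
    if parent.name == "node_modules":
        return True
    return parent.name.startswith("@") and parent.parent.name == "node_modules"


def collect_licenses(project_root) -> dict:
    """Map "name@version" to the declared license for every installed package."""
    found = {}
    for manifest_path in sorted(Path(project_root).rglob("package.json")):
        if not is_package_root(manifest_path.parent):
            continue  # skips the project's own manifest and test fixtures
        try:
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest
        if not isinstance(manifest, dict):
            continue
        name = manifest.get("name", manifest_path.parent.name)
        version = manifest.get("version", "UNKNOWN")
        found[f"{name}@{version}"] = read_license(manifest)
    return found


def is_copyleft(license_id: str) -> bool:
    """Crude heuristic: matches GPL, LGPL, and AGPL style identifiers only."""
    return "GPL" in license_id.upper()


if __name__ == "__main__" and Path("node_modules").is_dir():
    for package, license_id in collect_licenses("node_modules").items():
        marker = "  <-- review" if is_copyleft(license_id) else ""
        print(f"{package}: {license_id}{marker}")
```

Run from a project root, the script prints one line per installed package and marks the ones whose declared license looks like a GPL variant, which is a reasonable starting list to hand to whoever reviews licensing.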

Comply with standards

When it comes to apps or digital services that handle payments and process credit card data, there’s another crucial detail to consider: In January 2019, the Payment Card Industry Security Standards Council introduced

  • PCI Secure Software Standards and
  • PCI Secure Software Lifecycle (Secure SLC).

These guidelines require companies to ensure that any open-source software components and third-party libraries they or their vendors use are handled as securely as their proprietary code.

Codebase inventory

Although it’s optional unless you work with the US government, consider generating a Software Bill of Materials (SBOM): an inventory of all your codebase’s software components and dependencies.

  • Snyk, a security company focused on scanning codebases, provides an overview of this “digital nutrition label” and how to produce and maintain it, highlighting both the regulatory and security aspects.
  • Google Cloud Platform (GCP) users can leverage the company’s recently launched Open Source Insights dataset. The tool scans npm, Go, Maven, PyPI, and Cargo packages, compiling security advisories, license information, popularity metrics, and other metadata related to your codebase’s dependency graph. Check out Google’s announcement post for more information and code examples.
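
To get a feel for what such an inventory contains, here is a minimal sketch that lists components from an npm lockfile. It assumes a package-lock.json with lockfileVersion 2 or 3, where a top-level "packages" map is keyed by install path; the function name inventory_from_lockfile and the output fields are hypothetical. It is not a standards-compliant SBOM: formats such as SPDX and CycloneDX define many more fields, and the license value is only as good as what the lockfile recorded.

```python
import json
from pathlib import Path


def inventory_from_lockfile(lock: dict) -> list:
    """Build a flat component list from a parsed package-lock.json (v2/v3)."""
    components = []
    for path, entry in lock.get("packages", {}).items():
        if path == "" or entry.get("link"):
            continue  # skip the root project and symlinked workspace entries
        # Fall back to the install path when the entry has no explicit name.
        name = entry.get("name") or path.rpartition("node_modules/")[2]
        components.append({
            "name": name,
            "version": entry.get("version", "UNKNOWN"),
            "license": entry.get("license", "UNKNOWN"),
            "integrity": entry.get("integrity"),  # content hash, if recorded
            "dev": bool(entry.get("dev", False)),
        })
    return sorted(components, key=lambda c: (c["name"], c["version"]))


if __name__ == "__main__":
    lock_path = Path("package-lock.json")
    if lock_path.exists():
        lock = json.loads(lock_path.read_text(encoding="utf-8"))
        print(json.dumps(inventory_from_lockfile(lock), indent=2))
```

Because the lockfile already records transitive dependencies, this approach needs no installed node_modules directory, which makes it easy to run in CI and to diff between releases.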

The bottom line

As precarious as all this may sound, a closer look reveals a fascinating finding: most software vulnerabilities are presumably caused by mistakes, not malicious intent. GitHub’s analysis of 521 random advisories suggests that only 17% involved explicitly malicious behavior, and most of those “were generally in seldom-used packages”.

Nonetheless, third-party code has long-term implications. As an enlightening post on Socket’s blog reminds us,

“if you’re shipping code to production that includes open-source code, then you must treat the open source code as part of your app. You are ultimately responsible for the behavior of that code.”

More than a few of our portfolio companies have learned first-hand that prevention is usually better than cure. At the end of the day, the best way to ensure you won’t be surprised during future due diligence is to get in front of it and perform your own independent third-party code audit.