Phylum is a cybersecurity company focused on software supply chain vulnerabilities. The company helps developers ensure that the software they depend on is free of malicious additions. It does this by analyzing open-source software packages for indicators of risk, which allows Phylum to protect software developers and software products from vulnerabilities, malicious code, and bad actors.
The company works to quantify open-source risk in an automated way through its platform, which ingests all software packages in a given ecosystem and uses graph theory, machine learning, and other analytical techniques to produce a risk score and an understanding of the security of each open-source package. Phylum also ingests and mines large datasets from around the web for risk analysis.
The Phylum Package Score is a single score that shows at a glance how safe or unsafe a piece of software is. It can help developers understand where the dangers are in the ecosystem, identify potential upstream problems, and find abandoned or poorly maintained packages and the vulnerabilities they may contain. The score is also meant to help developers understand the dependencies in a given software environment and who is able to make changes to the code in a given repository.
The Package Score is derived from five key domains of risk: malicious code, technical debt, license, author, and software vulnerability. It is calculated by ingesting and processing information about a package and its dependencies, then analyzing that information with analytics, heuristics, and machine learning models. Ingested datasets include:
- Static analysis of package source code
- File analysis of all files in a package
- History analysis of any attached source code repositories
- Metadata analysis of all artifacts captured from the package manager and hosting repository
- Known vulnerabilities for a package-version iteration
- Author reputation from previous activities and behaviors
- Full composition analysis of all dependencies required for package use
The data collected is maintained and curated over the lifetime of the package, and changes made over time trigger updates to the Phylum Package Score.
The analysis layer of Phylum's platform works through the package data to identify low-level indicators of risk and combines them with associated information to derive higher-level indicators. The techniques vary, but they work together to extract meaningful indicators of the risk in an open-source package. Once identified, these indicators are weighted to create the Phylum Package Score.
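The idea of weighting indicators into a single score can be illustrated with a minimal sketch. The weights, the 0-100 scale, the "higher is safer" direction, and the names `DOMAIN_WEIGHTS`, `Indicator`, and `package_score` are all assumptions made for illustration; the source does not describe how Phylum actually weights its indicators.

```python
from dataclasses import dataclass

# Hypothetical weights for the five risk domains named in the text.
# The real weighting used by Phylum is not given in the source.
DOMAIN_WEIGHTS = {
    "malicious_code": 0.35,
    "vulnerability": 0.25,
    "author": 0.15,
    "technical_debt": 0.15,
    "license": 0.10,
}


@dataclass
class Indicator:
    domain: str      # one of the keys in DOMAIN_WEIGHTS
    severity: float  # 0.0 (no concern) to 1.0 (maximum concern)


def package_score(indicators):
    """Return a 0-100 score; higher means safer (an assumed convention)."""
    # Keep only the worst severity observed in each domain.
    worst = {domain: 0.0 for domain in DOMAIN_WEIGHTS}
    for ind in indicators:
        worst[ind.domain] = max(worst[ind.domain], ind.severity)
    # Combine the per-domain severities into one weighted risk value.
    risk = sum(DOMAIN_WEIGHTS[d] * worst[d] for d in DOMAIN_WEIGHTS)
    return round(100 * (1 - risk), 1)
```

In this sketch a package with no indicators scores 100, and each new indicator can only lower the score, which mirrors the text's point that changes over time trigger score updates.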
One analysis detects the use of libraries and packages that have been, or appear to have been, abandoned by their authors. Open-source libraries and packages are often maintained by a single developer or a small group of developers, and the work is generally unpaid and done in addition to a full-time job. This can result in abandoned or poorly maintained libraries that are unlikely to receive updates, bug fixes, or feature improvements. In this analysis, packages are deemed abandoned if:
- they have not been updated in two or more years,
- open issues exist in the issue tracker without a response from package maintainer(s), and/or
- unmerged PRs exist that are largely ignored by the package author(s).
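The three criteria above could be checked roughly as follows. This is a sketch under stated assumptions: the `PackageActivity` record and its field names are hypothetical, "two years" is approximated as 730 days, and any single criterion is treated as a signal because the list is joined by "and/or".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PackageActivity:
    # Hypothetical summary of a package's repository activity.
    last_release: datetime
    unanswered_issues: int      # open issues with no maintainer response
    ignored_pull_requests: int  # unmerged PRs with no maintainer activity


def abandonment_signals(pkg, now):
    """Return the list of abandonment criteria that the package meets."""
    signals = []
    if now - pkg.last_release >= timedelta(days=2 * 365):
        signals.append("no update in two or more years")
    if pkg.unanswered_issues > 0:
        signals.append("open issues without maintainer response")
    if pkg.ignored_pull_requests > 0:
        signals.append("unmerged pull requests ignored by authors")
    return signals
```

Returning the matched criteria rather than a single boolean leaves it to the caller to decide how many signals are enough to call a package abandoned.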
Another analysis checks software packages for deep source structures. Well-constructed software should produce fairly wide syntax trees, while malware attempts to hide its behavior by making numerous extraneous function calls, which tends to produce syntax trees that are deep rather than wide. At best, a deep syntax tree indicates technical debt that needs to be addressed; at worst, it is a sign that the underlying machinery is unravelling a tangle of code before executing a malicious payload. Abstract syntax tree (AST) depth analysis is generally considered a weak indicator of risk; however, Phylum includes it as one more indicator of possible risk.
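To make the deep-versus-wide distinction concrete, the sketch below measures both dimensions of a syntax tree using Python's standard `ast` module. It only handles Python source, and the function name `tree_shape` is an assumption for illustration; the source does not say how Phylum measures tree shape or which languages it parses.

```python
import ast


def tree_shape(source):
    """Return (max_depth, max_width) of the syntax tree for Python source."""
    depth = 0
    width = 0
    level = [ast.parse(source)]
    # Walk the tree one level at a time (breadth-first).
    while level:
        depth += 1
        width = max(width, len(level))
        level = [
            child
            for node in level
            for child in ast.iter_child_nodes(node)
        ]
    return depth, width


# Several independent statements give a shallow, wide tree.
flat = "a = 1\nb = 2\nc = 3\nd = 4\n"
# A chain of nested calls gives a deep, narrow tree.
nested = "f(g(h(i(j(k(1))))))"
```

Comparing `tree_shape(flat)` with `tree_shape(nested)` shows the nested call chain producing the greater depth and the flat statements producing the greater width.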
A further analysis identifies encrypted or encoded data in dependencies that may indicate attempts to hide data and activity. Malware often hides its activities in obfuscated code, which masks the critical mechanisms the malicious code uses to operate. Common encodings are typically easy to unravel. The presence of high-entropy or encoded blocks of data is not by itself indicative of malware, but nearly all malware includes some encrypted or encoded data blocks, and large numbers of obfuscated strings tend to be uncommon in benign software.
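High entropy is usually measured with Shannon entropy over the bytes of a string. The sketch below shows one way to flag candidate strings; the threshold of 4.5 bits per byte, the minimum length of 20, and the function names are assumptions chosen for illustration, not values taken from Phylum.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of the data in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_strings(strings, threshold=4.5, min_length=20):
    """Return strings long enough and random-looking enough to review."""
    return [
        s
        for s in strings
        if len(s) >= min_length and shannon_entropy(s.encode()) > threshold
    ]
```

Because high entropy alone is a weak signal, a check like this is better suited to counting suspicious strings per package than to judging any single string.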
Phylum also checks the licenses in use across the dependency tree and identifies those that may pose a risk to commercial use. Most open-source software has an associated license, which may permit commercial use or may mandate the release of internal source code as a condition of using the package. Checking licenses helps ensure an organization does not adopt an open-source package whose license conditions it cannot adhere to in a commercial setting.
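A license check of this kind can be sketched as a lookup against SPDX license identifiers. The two sets below are short illustrative samples, not a complete policy, and the function name and messages are assumptions; an organization would supply its own lists.

```python
# Illustrative samples of SPDX identifiers; real policies are much longer.
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only"}
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-3-Clause", "ISC"}


def license_risks(dependencies):
    """Map each risky dependency name to the reason it was flagged.

    `dependencies` maps a package name to its SPDX identifier, or to
    None when no license was declared.
    """
    flagged = {}
    for name, license_id in dependencies.items():
        if license_id is None:
            flagged[name] = "no license declared"
        elif license_id in COPYLEFT:
            flagged[name] = "copyleft: may require source release"
        elif license_id not in PERMISSIVE:
            flagged[name] = "unrecognized license: needs review"
    return flagged
```

Unknown licenses are flagged for review rather than assumed safe, since a license that is missing from both lists says nothing about its terms.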
Another analysis identifies projects in a dependency tree that have received contributions from a new developer. A package is typically controlled and maintained by a small number of authors, but open-source projects generally welcome contributions from any author who improves the project. In some instances, contributors do not make clear what impact their contributions will have, and a malicious contributor may be able to infect a large number of the files in an open-source project. By following new commits, Phylum works to identify contributions from new authors and to understand, where possible, whether a contribution could be malicious.
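Spotting a first-time contributor can be reduced to tracking which authors have been seen before. The sketch below assumes commits arrive as `(sha, author)` pairs in chronological order; that input shape and the function name are hypothetical, and deciding whether a flagged commit is malicious would need further analysis that is not shown here.

```python
def first_time_author_commits(commits, known_authors=()):
    """Return the first commit made by each previously unseen author.

    `commits` is an iterable of (sha, author) pairs, oldest first.
    `known_authors` lists authors already trusted for this project.
    """
    seen = set(known_authors)
    flagged = []
    for sha, author in commits:
        if author not in seen:
            flagged.append((sha, author))
            seen.add(author)  # later commits by this author are not new
    return flagged
```

For example, with `known_authors=["alice"]`, a history containing commits by alice, then mallory, then alice again would flag only mallory's commit.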
Another analysis identifies dependencies that may expose users to repo jacking. The exposure arises when a package manager allows dependencies to be set directly on resources outside the package manager ecosystem, such as links to GitHub, GitLab, or other version control systems. If a dependency is linked to a user who later deletes their account, the username becomes available for anyone to register, including a malicious actor. Phylum monitors these dependencies because, the company suggests, users of the software could find themselves executing code from a malicious actor, especially as malicious code may be introduced without any change to the user's codebase.
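The first step in such monitoring is finding dependencies that point outside the package registry. The sketch below scans dependency specifiers for Git URLs and host shorthands of the kind some package managers accept; the regular expressions, the list of hosts, and the function name are assumptions for illustration and do not cover every specifier format.

```python
import re

# Full URLs such as git+https://github.com/owner/repo.git
VCS_URL = re.compile(
    r"^(?:git\+)?(?:https?|ssh|git)://(?:[^@/]+@)?"
    r"(github\.com|gitlab\.com|bitbucket\.org)/([^/]+)/",
    re.IGNORECASE,
)
# Shorthands such as github:owner/repo
VCS_SHORTHAND = re.compile(r"^(github|gitlab|bitbucket):([^/]+)/", re.IGNORECASE)


def external_vcs_dependencies(dependencies):
    """Map each dependency hosted on a VCS service to (host, owner).

    `dependencies` maps a package name to its version specifier string.
    """
    flagged = {}
    for name, spec in dependencies.items():
        match = VCS_URL.match(spec) or VCS_SHORTHAND.match(spec)
        if match:
            flagged[name] = (match.group(1).lower(), match.group(2))
    return flagged
```

The extracted owner name is the piece worth monitoring, since it is the account that could be deleted and re-registered by someone else.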
Phylum also checks customer dependencies for leaked secrets. Secrets are often used to authenticate against a system or service, and they may be accidentally added to a source code repository and subsequently leaked onto the internet. Phylum identifies leaked secrets in open-source packages by performing a variety of pattern-matching operations across the package source code.
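Pattern matching for secrets typically means running a set of regular expressions over each line of source. The two patterns below cover widely recognized formats (an AWS access key ID prefix and a PEM private key header) and are only a small illustrative sample; the pattern set and function name are assumptions, not Phylum's rules.

```python
import re

# A small illustrative sample; production scanners use many more rules.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def find_secrets(source):
    """Return (line_number, label) for each line matching a secret pattern."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings
```

Reporting the line number and rule label, rather than the matched text, avoids copying the secret itself into scan output.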
A final analysis checks open-source dependencies to ensure they do not rely on a malicious package because of an accidental misspelling of a legitimate package name, an attack commonly known as typosquatting. A developer may make a typo when adding an open-source package name to a project. A malicious actor can take advantage of this by releasing a package under the misspelled name, and can even provide the real package's functionality to make the issue more difficult to detect.
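One common way to catch misspelled package names is to compare each dependency against a list of well-known names using edit distance. The sketch below uses Levenshtein distance; the choice of metric, the default threshold of 1, and the function names are assumptions, and the source does not state which technique Phylum uses.

```python
def edit_distance(a, b):
    """Return the Levenshtein distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,                       # deletion
                    current[j - 1] + 1,                    # insertion
                    previous[j - 1] + (char_a != char_b),  # substitution
                )
            )
        previous = current
    return previous[-1]


def possible_typosquats(dependencies, popular, max_distance=1):
    """Map each suspicious dependency to the popular names it resembles."""
    popular = set(popular)
    hits = {}
    for dep in dependencies:
        if dep in popular:
            continue  # exact match with a known package is not a typo
        near = sorted(p for p in popular if edit_distance(dep, p) <= max_distance)
        if near:
            hits[dep] = near
    return hits
```

For instance, checking `["requets", "numpy"]` against the popular names `["requests", "numpy"]` flags only `requets`, which is one deletion away from `requests`.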
Phylum offers an API that, the company suggests, was built to help developers, along with tools and integrations intended to make Phylum easier to use. The company also offers CI/CD and IDE integrations with Jenkins, GitHub, GitLab, and Visual Studio Code.