Static program analysis is a set of practices and technologies for analyzing application source code, bytecode, and binaries for features and behaviors without running the code. It has been used since the 1960s in optimizing compilers, and has more recently been applied to finding software bugs, verifying and validating code, and supporting program development in integrated development environments (IDEs).
In the context of application security, wherein static program analysis techniques are used to detect security vulnerabilities, the practice is often referred to as static application security testing (SAST).
Static program analysis can be applied across several different domains of software development practice. The ability to predict program behavior can aid in program optimization, security audits, automatic parallelization, and correctness verification.
Software testing typically occurs in one of two basic ways: static analysis and dynamic analysis. Static analysis evaluates the correctness and soundness of code without running it. Dynamic analysis, by contrast, evaluates software based on its runtime behavior and outputs, which requires the tester to run the program, sometimes in a virtual environment. Each approach has its advantages and disadvantages.
- Static analysis can scale well across a large software development organization. Automated static analysis can be run repeatedly, such as when shipping nightly builds or as part of a continuous integration workflow.
- Dynamic analysis checks that a software project achieves its specified functionality.
- Static analysis can decrease the amount of testing and debugging required to consider a piece of software ready to ship. To pass most static analysis tests, code must be written in a way that is reliable, readable, and less prone to errors on future tests.
- The results of dynamic analysis do not generalize to future executions. There is no certainty that the set of inputs over which the program was run is representative of all possible program executions.
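- Static analysis can be sound, meaning it does not miss any instance of the class of defect it checks for, because it evaluates the logic and structure of code rather than a sample of executions. A sound static analysis tool will occasionally produce false positives, suggesting that a piece of code may be buggy when it is actually fine.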
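The trade-off between the two approaches can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: the functions `mean` and `scale`, the checker `flag_divisions`, and its rule ("report any division whose divisor is not a nonzero numeric literal") are all hypothetical and are not taken from any real tool. It uses Python's standard `ast` module (and `ast.unparse`, available from Python 3.9) simply as a convenient way to inspect code without running it.

```python
import ast

# Hypothetical code under analysis, kept as a string so it can be both run and parsed.
SOURCE = '''
def mean(values):
    return sum(values) / len(values)

def scale(x, y):
    return x / (abs(y) + 1)
'''

def flag_divisions(source):
    """Report every division whose divisor is not a nonzero numeric literal."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Div, ast.FloorDiv)):
            divisor = node.right
            is_nonzero_literal = (
                isinstance(divisor, ast.Constant)
                and isinstance(divisor.value, (int, float))
                and divisor.value != 0
            )
            if not is_nonzero_literal:
                # Conservative rule: anything not provably nonzero is reported.
                findings.append((node.lineno, ast.unparse(divisor)))
    return sorted(findings)

# Dynamic check: run the code on a few sample inputs.
namespace = {}
exec(SOURCE, namespace)
assert namespace["mean"]([1, 2, 3]) == 2.0  # passes, but says nothing about mean([])
assert namespace["scale"](6, -2) == 2.0     # passes

# Static check: inspect the code without running it.
for lineno, divisor in flag_divisions(SOURCE):
    print(f"line {lineno}: divisor '{divisor}' may be zero")
```

The sample inputs all pass, yet `mean([])` would still raise `ZeroDivisionError`; the dynamic run simply never exercised that input. The static check reports both divisions. The report for `len(values)` is a genuine finding, while the report for `abs(y) + 1` is a false positive, since that expression is never zero for a real number `y`. This conservative behavior mirrors, in miniature, what a sound analysis does, although a checker this simple is far from sound for Python as a whole.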
When choosing a static analysis tool, several factors are worth weighing:
- Support for the programming languages or frameworks that need to be analyzed.
- Types of errors and vulnerabilities the static analysis testing tool can detect.
- The accuracy of the tool, including its rate of false positives.
- Ease of use and integration into existing development tools and deployment workflows.
- Cost and licensing, in the case of commercial solutions, which present their own set of decisions and trade-offs.
Static analysis is frequently used in evaluating code validity and code security. Static analysis tools, including those designed for security testing, have evolved over time.
The first generation of static analysis software tools emerged in the 1970s.
In the context of security, these early tools primarily consisted of homegrown scripts which would apply 'grep'-like functions to source code to determine whether the code employed unsafe functions which could expose sensitive information or crash the program.
These first-generation tools produced many false positives, which required manual review by the programmer or security analyst.
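A rough sketch of this kind of 'grep'-like scan is shown below. It is an illustration only, not a reconstruction of any particular historical tool: the deny-list in `UNSAFE_CALL`, the helper `grep_unsafe`, and the C fragment in `C_SOURCE` are all made up for this example.

```python
import re

# Hypothetical deny-list of C library functions often regarded as unsafe.
UNSAFE_CALL = re.compile(r"\b(gets|strcpy|strcat|sprintf)\s*\(")

# Made-up C fragment to scan; it is treated purely as text and never compiled.
C_SOURCE = '''\
#include <stdio.h>
#include <string.h>

/* TODO: never call strcpy(dst, src) here */
void copy(char *dst, const char *src) {
    strcpy(dst, src);
    puts("use gets() at your own risk");
}
'''

def grep_unsafe(source):
    """Return (line number, function name) for every textual match."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in UNSAFE_CALL.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits

for lineno, name in grep_unsafe(C_SOURCE):
    print(f"line {lineno}: possible use of {name}()")
```

The scan reports lines 4, 6, and 7, but only line 6 is a real call. Line 4 is a comment and line 7 is a string literal, and a purely textual search cannot tell the difference. These are the kinds of false positives that had to be reviewed by hand.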
Second-generation tools took advantage of increased processing power and system memory to perform more complex forms of code analysis. These tools emerged in the 1990s and early 2000s. Moving beyond spot checks and what was essentially simple keyword search within source code, second-generation tools implemented more complex methods like path analysis to reason about how a program would behave at runtime.
Open source security testing tools like RATS from Secure Software, David Wheeler's Flawfinder, and ITS4 from Cigital are part of this second generation.
The third generation of static analysis tools emerged in the 2000s and facilitated deeper code evaluation and analysis, as well as testing for more languages and frameworks.
Abstract syntax tree modeling helped give many of these tools a leg up in terms of accuracy and completeness. An abstract syntax tree represents the structure of a program rather than its raw text, so an analysis can distinguish, for example, a real function call from the same words appearing in a comment or string. This makes predicting the codebase's behavior more efficient and more accurate.
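To make the idea concrete, here is a minimal sketch using Python's standard `ast` module. The helper `find_eval_calls` and the sample text in `SNIPPET` are hypothetical and chosen only for illustration; real tools build much richer models on top of the tree.

```python
import ast

# Made-up code to analyze; it is parsed but never executed.
SNIPPET = '''
# eval(user_input) is mentioned here only in a comment
message = "never call eval(x) on untrusted data"
result = eval(user_input)
'''

def find_eval_calls(source):
    """Return the line numbers of genuine calls to the built-in name eval."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)          # a call expression...
        and isinstance(node.func, ast.Name)    # ...whose callee is a plain name...
        and node.func.id == "eval"             # ...spelled "eval"
    ]

# A plain text search, shown for comparison with the tree-based check.
text_hits = [
    lineno
    for lineno, line in enumerate(SNIPPET.splitlines(), start=1)
    if "eval(" in line
]

print("text search:", text_hits)
print("AST search: ", find_eval_calls(SNIPPET))
```

The text search matches lines 2, 3, and 4, while the tree-based search reports only line 4, the one actual call. Comments never reach the tree, and the string literal appears in it as a constant rather than as a call.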
In the 2010s and later, viewing code as data became more of a norm.