How to Perform Static Code Analysis in PHP

  1. Use the lint Mode to Perform Static Code Analysis in PHP
  2. Use the PHPMD or PHP Depend Project to Perform Static Code Analysis in PHP
  3. Use the pfff Tool to Perform Static Code Analysis in PHP
  4. Use HHVM to Perform Static Code Analysis in PHP

A vital part of development is identifying errors and eliminating them from your codebase quickly, and static code analysis helps you do this in PHP. This tutorial shows how to perform static code analysis in PHP with lint mode and a few other tools.

Static code analysis helps you detect bugs, improves developer productivity, and powers editor features such as auto-completion and refactoring, which rely on the type information in your PHP code. Because it inspects source code without executing it, you can catch syntax errors, enforce PHP coding standards and styles, and detect security vulnerabilities before the code runs.

Lint mode is the simplest way to start, and you will also learn about PHPMD, pfff, and HHVM so that you can adopt the approach that suits your needs best. Static analysis leans heavily on PHP's type system: the more information you provide, the better the results, and declaring types in your code is one way to add that information.

For example, a function that gets a list of featured posts, function exp_funct ($args) {}, can be declared as function exp_funct (array $args) : array {} to give static analysis more information. Alternatively, you can document the function's input and output types in a PHPDoc block placed before the function, with a tag such as @return array<exp_var> inside a /** ... */ comment.
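To make the difference concrete, the sketch below declares the same function three ways. The function names, the sample titles, and the array shape in the PHPDoc block are hypothetical and only illustrate the idea.

```php
<?php
declare(strict_types=1);

// No type information: an analyzer can only guess what $args and the result are.
function exp_funct_untyped($args)
{
    return array_slice($args, 0, 2);
}

// Native type declarations: checked by PHP at run time and visible to analyzers.
function exp_funct_typed(array $args): array
{
    return array_slice($args, 0, 2);
}

/**
 * PHPDoc adds detail that native types cannot express, such as element types.
 *
 * @param  array<int, string> $args list of post titles (hypothetical shape)
 * @return array<int, string>       the first two titles
 */
function exp_funct_documented(array $args): array
{
    return array_slice($args, 0, 2);
}

print_r(exp_funct_documented(['First post', 'Second post', 'Third post']));
```

All three functions behave the same at run time; what changes is how much a static analyzer can verify about their callers.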

Use the lint Mode to Perform Static Code Analysis in PHP

PHP's lint mode checks a file for syntax errors without executing it. Run php -l FILENAME from a shell or any other command line to validate the syntax. Lint mode stops at syntax: finding unused variable assignments, arrays used without initialization, code style problems, and similar issues requires one of the higher-level analyzers mentioned below.
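If you want to run the lint check from a script, for example over several files, a minimal sketch could look like the following. The lint_file() helper and the temporary files are hypothetical; the sketch assumes the exec() function is enabled and that PHP_BINARY points at a command-line PHP interpreter.

```php
<?php
declare(strict_types=1);

/**
 * Run `php -l` on one file and report whether the syntax check passed.
 * lint_file() is a hypothetical helper, not part of PHP itself.
 */
function lint_file(string $path): bool
{
    // PHP_BINARY is the interpreter running this script; -l only parses the file.
    $command = escapeshellarg(PHP_BINARY) . ' -l ' . escapeshellarg($path) . ' 2>&1';
    exec($command, $output, $exitCode);

    // php -l exits with 0 when no syntax errors were detected.
    return $exitCode === 0;
}

// Two throwaway files: one valid, one with a missing semicolon.
$good = tempnam(sys_get_temp_dir(), 'lint');
$bad  = tempnam(sys_get_temp_dir(), 'lint');
file_put_contents($good, "<?php echo 'ok';\n");
file_put_contents($bad, "<?php echo 'broken'\n echo 'x';\n");

foreach ([$good, $bad] as $file) {
    echo basename($file), ': ', lint_file($file) ? 'no syntax errors' : 'syntax error', PHP_EOL;
}

unlink($good);
unlink($bad);
```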

There are many higher-level and lower-level static analysis tools that go beyond lint mode. For example, php-sat, PHPStan, PHP-CS-Fixer, and phan are higher-level analyzers; PHP Parser and the built-in token_get_all function are lower-level building blocks on which such analyzers can be built.

The token_get_all function has the signature token_get_all(string $code, int $flags = 0): array. It parses the given code string into PHP language tokens using the Zend engine's lexical scanner and returns them as an array. Passing the TOKEN_PARSE flag makes the scanner recognize reserved words that are used as identifiers in specific contexts, such as a class constant named PUBLIC.

Each token returned by this function is either a single-character string or a three-element array containing the token identifier, the string content of the original token, and the line number in elements 0, 1, and 2, respectively. The two examples below show token_get_all in general use and on a class that uses a reserved word as a constant name.

<?php
    $userQuota_getToken = token_get_all('<?php echo; ?>');

    foreach ($userQuota_getToken as $get_tokenQ) {
        // array tokens carry an identifier, the token text, and a line number
        if (is_array($get_tokenQ)) {
            echo "Line {$get_tokenQ[2]}: ", token_name($get_tokenQ[0]), " ('{$get_tokenQ[1]}')", PHP_EOL;
        }
    }
?>

Output:

Line 1: T_OPEN_TAG ('<?php ')
Line 1: T_ECHO ('echo')
Line 1: T_WHITESPACE (' ')
Line 1: T_CLOSE_TAG ('?>')

// 2nd example on class

<?php
$token_quota_source = <<<'code'
<?php

class A
{
    const PUBLIC = 1;
}
code;

// TOKEN_PARSE lets the reserved word PUBLIC be read as a constant name
$userQuota_getToken = token_get_all($token_quota_source, TOKEN_PARSE);

foreach ($userQuota_getToken as $get_tokenQ) {
    if (is_array($get_tokenQ)) {
        echo token_name($get_tokenQ[0]) , PHP_EOL;
    }
}
?>

Its output will be similar to the following (abbreviated):

T_OPEN_TAG
T_WHITESPACE
T_CLASS
T_WHITESPACE
T_STRING
...
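Because every array token carries its line number, token_get_all is enough to build a very small checker of your own. The following is a minimal sketch under stated assumptions: the find_discouraged_tokens() helper and the choice of eval, global, and goto as constructs to flag are hypothetical, not a recommended rule set.

```php
<?php
declare(strict_types=1);

/**
 * Return "line N: TOKEN_NAME" for each flagged construct found in $code.
 * find_discouraged_tokens() and the token list are illustrative assumptions.
 */
function find_discouraged_tokens(string $code): array
{
    $discouraged = [T_EVAL, T_GLOBAL, T_GOTO];
    $findings = [];

    foreach (token_get_all($code) as $token) {
        // Single-character tokens are plain strings and carry no line number.
        if (is_array($token) && in_array($token[0], $discouraged, true)) {
            $findings[] = sprintf('line %d: %s', $token[2], token_name($token[0]));
        }
    }

    return $findings;
}

$source = <<<'code'
<?php
function run($input) {
    global $config;
    eval($input);
}
code;

print_r(find_discouraged_tokens($source));
```

For this sample source, the helper reports T_GLOBAL on line 3 and T_EVAL on line 4. Real analyzers work on a full syntax tree rather than a flat token list, so treat this only as an illustration of what the token stream contains.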

Additionally, runtime analyzers complement lint mode and are more useful for some checks because of PHP's dynamic nature. Xdebug is a runtime analyzer that provides code coverage and function traces.

phpweaver works from Xdebug function traces and uses a combined static/dynamic approach to code analysis. If you need runtime analysis on production servers, xhprof is a profiler similar to Xdebug, but lighter, and it includes a PHP-based interface.

Use the PHPMD or PHP Depend Project to Perform Static Code Analysis in PHP

PHPMD stands for PHP Mess Detector. It is a spin-off project of PHP Depend and aims to be the PHP equivalent of the well-known Java tool PMD. You can install PHP_Depend with Composer: download Composer with curl -s http://getcomposer.org/installer | php and then run php composer.phar require pdepend/pdepend:2.12.0, or run composer require pdepend/pdepend:2.12.0 if you have Composer installed globally.

PHPMD is often preferable to using PHP_Depend directly, as it is a user-friendly and easy-to-configure front-end for the raw metrics measured by PHP_Depend. Its working principle is straightforward: it takes a given PHP source code base and looks for potential problems within that source.

It can detect possible bugs, suboptimal code, overcomplicated expressions, and unused parameters, methods, and properties. As a mature PHP project, PHP Mess Detector offers a large library of pre-defined rules to analyze PHP source code.

// Type phpmd [filename|directory] [report format] [ruleset file]

hassan@kazmi ~ $ phpmd PHP/Depend/DbusUI/ xml rulesets/codesize.xml

<?xml version="1.0" encoding="UTF-8" ?>
<pmd version="0.0.1" timestamp="2009-12-19T22:17:18+01:00">
  <file name="/projects/pdepend/PHP/Depend/DbusUI/PHPMD.php">
    <violation beginline="54"
               endline="359"
               rule="TooManyProperties"
               ruleset="Code Size Rules"
               package="PHP_Depend\DbusUI"
               class="PHP_Depend_DbusUI_ResultPrinter"
               priority="1">
      This class has too many properties; consider refactoring it.
    </violation>
  </file>
</pmd>

Output:

This class has too many properties; consider refactoring it.
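The XML report is convenient for tools but harder to read by eye. As a minimal sketch, the report shown above can be condensed into one line per violation with PHP's SimpleXML extension. The summarize_phpmd_report() helper is hypothetical, and the XML is a shortened copy of the sample report.

```php
<?php
declare(strict_types=1);

/**
 * Turn a PHPMD XML report into "file:line [rule] message" strings.
 * summarize_phpmd_report() is a hypothetical helper built on SimpleXML.
 */
function summarize_phpmd_report(string $xml): array
{
    $report = simplexml_load_string($xml);
    if ($report === false) {
        throw new InvalidArgumentException('Report is not valid XML.');
    }

    $lines = [];
    foreach ($report->file as $file) {
        foreach ($file->violation as $violation) {
            $lines[] = sprintf(
                '%s:%d [%s] %s',
                (string) $file['name'],
                (int) $violation['beginline'],
                (string) $violation['rule'],
                trim((string) $violation)
            );
        }
    }

    return $lines;
}

// Shortened copy of the sample report.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8" ?>
<pmd version="0.0.1" timestamp="2009-12-19T22:17:18+01:00">
  <file name="/projects/pdepend/PHP/Depend/DbusUI/PHPMD.php">
    <violation beginline="54" endline="359" rule="TooManyProperties" priority="1">
      This class has too many properties; consider refactoring it.
    </violation>
  </file>
</pmd>
XML;

print_r(summarize_phpmd_report($xml));
```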

The command-line usage of PHP Mess Detector is phpmd [filename|directory] [report format] [ruleset file], so you can pass either a file name or a directory name to PHPMD as the container of the PHP source code to analyze. The rulesets/codesize.xml parameter looks like a filesystem reference, but it does not have to be one, because the Phar distribution of PHPMD includes the rule set files inside its archive.

Furthermore, PHPMD lets you use shortened names to refer to built-in rule sets, as in phpmd Depend xml codesize. The command-line interface of PHPMD also accepts optional arguments such as --minimumpriority, --reportfile, --suffixes, --strict, and more.

You can also apply multiple rule sets to the source code under test in one call by passing a comma-separated list of set names to the CLI tool, following the pattern of ~ $ phpmd /path/to/source text codesize. Furthermore, you can mix custom rule set files with built-in rule sets: the ~ $ phpmd /path/to/source text codesize,/my/rules.xml command applies both the built-in codesize rules and your custom rule set to the source code.

Use the pfff Tool to Perform Static Code Analysis in PHP

pfff is a set of APIs and tools for static code analysis. It can index, search, navigate, visualize, and refactor source code, and it supports style-preserving source-to-source transformation of PHP code.

It is easy to compile and install pfff. Its scheck checker reports findings in a compact file:line:column format, such as go-automatic.php:14:77: CHECK: Use of undeclared variable $goUrl or login-now.php:7:4: CHECK: Unused Local variable $title. After building pfff from its GitHub source, you can run the checker on a code directory with $ ~/sw/pfff/scheck ~/code/github/sc/.
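If you want to post-process these findings, each line can be split into its parts with a regular expression. The sketch below is only an illustration: parse_scheck_line() is a hypothetical helper, not part of pfff, and it assumes every finding follows the file:line:column: CHECK: message shape of the two sample lines.

```php
<?php
declare(strict_types=1);

/**
 * Parse one "file:line:column: CHECK: message" line into an array,
 * or return null when the line does not match that shape.
 * parse_scheck_line() is a hypothetical helper, not part of pfff.
 */
function parse_scheck_line(string $line): ?array
{
    if (!preg_match('/^(.+?):(\d+):(\d+): CHECK: (.+)$/', trim($line), $m)) {
        return null;
    }

    return [
        'file'    => $m[1],
        'line'    => (int) $m[2],
        'column'  => (int) $m[3],
        'message' => $m[4],
    ];
}

// The two sample lines quoted in the text.
$output = [
    'go-automatic.php:14:77: CHECK: Use of undeclared variable $goUrl',
    'login-now.php:7:4: CHECK: Unused Local variable $title',
];

foreach ($output as $line) {
    $finding = parse_scheck_line($line);
    if ($finding !== null) {
        printf("%s (line %d): %s\n", $finding['file'], $finding['line'], $finding['message']);
    }
}
```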

Furthermore, you can embed the parsing library in your own OCaml application by copying the commons/ and parsing_php/ directories into your project directory and adding a recursive make; finally, link the application with the parsing_php/parsing_php.cma and commons/commons.cma libraries. See pfff/demos/Makefile for a working example. Once the source is compiled, you can test pfff with the following:

$ cd demos/
$ ocamlc -I ../commons/ -I ../parsing_php/ \
../commons/commons.cma ../parsing_php/parsing_php.cma \
show_function_calls1.ml -o show_function_calls
$ ./show_function_calls foo.php

Afterward, you should see on stdout some information about the function calls in foo.php, as produced by the code in show_function_calls1.ml from the pfff project in the Facebook archives. The pfff parser is quite robust, and you can test it on the phpBB source code.

// download the phpBB source code and run the pfff parser on it
$ cd /tmp
$ wget http://d10xg45o6p6dbl.cloudfront.net/projects/p/phpbb/phpBB-3.0.6.tar.bz2
$ tar xvfj phpBB-3.0.6.tar.bz2
$ cd <pfff_src_directory>
$ ./pfff -parse_php /tmp/phpBB3/

The pfff program should then iterate over all the .php source files, run the parser on each one, and output statistics such as NB total files = 265; perfect = 265; =========> 100% and nb good = 183197, nb bad = 0 =========> 100.000000%, which means pfff was able to parse 100% of the PHP source code.

pfff ships several command-line programs. The pfff command itself tests the language parsers, scheck finds bugs and works like lint, and stags is an Emacs tag generator that aims to be more precise than generic tag tools.

sgrep is a syntactical grep that makes it easy to find precise code patterns, and spatch is a syntactical patch that makes it easy to refactor PHP code. codemap, pfff_db, codegraph, and codequery are further additions to the pfff toolset that perform global analysis on a set of source files or query information about the structure of your PHP codebase.

Use HHVM to Perform Static Code Analysis in PHP

HHVM has built-in Proxygen and FastCGI server-type support. It is a fully functional web server with Proxygen built directly into it, and its ease of use when processing source code is why it is suggested here for static code analysis.

It serves web requests quickly and provides a high-performance web server roughly equivalent to FastCGI and nginx combined. Run hhvm -m server -p 8080 to use Proxygen when running HHVM in server mode. You can also set the port with -d hhvm.server.port=7777 on the command line, or by putting hhvm.server.port=7777 in your server.ini file.

You can pass -d hhvm.server.type=proxygen to define the server type explicitly, although this is not required because Proxygen is the default. The init scripts in the HHVM packages start HHVM in FastCGI mode by default, so they require configuration changes before HHVM is automatically started as a Proxygen server.

The following is an example HHVM configuration with several options that you can set in server.ini or pass with -d at the command line. Some of these options are optional because the values shown are the defaults, but they are included for illustration.

; initialize a server port
hhvm.server.port = 60
; the default server type is `proxygen`
hhvm.server.type = proxygen
hhvm.server.default_document = source.php
hhvm.server.error_document404 = source.php
hhvm.server.source_root = /edit/source/php

Setting optional configuration options explicitly is good for documentation purposes; hhvm.server.source_root and hhvm.server.port are the ones most likely to need explicit values. HHVM (HipHop Virtual Machine) is open source, executes programs written in Hack, and uses JIT (just-in-time) compilation to achieve high performance while maintaining development flexibility.

By default, hhvm.server.source_root is the directory in which the HHVM binary was launched, and you can change it to suit your server. After installing HHVM on your OS, you can use the sudo update-rc.d hhvm defaults and sudo service hhvm restart commands to set HHVM to start at boot as a server.
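Before restarting the server, it can help to confirm that your configuration sets the values you expect. The sketch below is a hedged illustration: missing_hhvm_settings() is a hypothetical helper that reads the text with PHP's own ini parser and knows nothing about HHVM itself, and the sample ini text deliberately omits the port.

```php
<?php
declare(strict_types=1);

/**
 * Return the settings from $required that are missing in an ini string.
 * missing_hhvm_settings() is a hypothetical helper; it only inspects keys.
 */
function missing_hhvm_settings(string $ini, array $required): array
{
    $settings = parse_ini_string($ini, false, INI_SCANNER_RAW);
    if ($settings === false) {
        throw new InvalidArgumentException('Could not parse the ini text.');
    }

    return array_values(array_diff($required, array_keys($settings)));
}

$ini = <<<'INI'
; the default server type is proxygen
hhvm.server.type = proxygen
hhvm.server.default_document = source.php
hhvm.server.source_root = /edit/source/php
INI;

// The two settings the text singles out as most likely to need explicit values.
$required = ['hhvm.server.port', 'hhvm.server.source_root'];

print_r(missing_hhvm_settings($ini, $required));
```

For this sample, the only missing setting reported is hhvm.server.port.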

Syed Hassan Sabeeh Kazmi

Hassan is a Software Engineer with a well-developed set of programming skills. He uses his knowledge and writing capabilities to produce interesting-to-read technical articles.
