PSR guidelines: The original version of this document was written long before the PHP FIG project was started. You may wish to follow their PSR-12 guide for coding style, most of which I agree with.
These are the guidelines that I follow when writing my PHP scripts, unless a coding standard already exists for the project I’m working on. It can be helpful to have something like this if you’re working on a joint project.
These are only the guidelines that I personally chose to follow for the code I write. I’m not suggesting that any other coding styles, guidelines or standards are wrong. Feel free to use this document as a template to manage your own coding guideline and change whatever you wish as you see fit.
Why are guidelines important?
First of all, it doesn’t matter what your guidelines are, so long as everyone understands and sticks to them. I’ve worked on a team where the person in charge preferred to put braces following the expression, rather than on a line all by themselves. I prefer the latter method, but I stuck to the established convention because it made maintaining the whole project much easier.
It cannot be emphasised enough that guidelines are only useful if they are followed. It’s no use having a definitive set of coding guidelines for a joint programming project if everyone continues to write in their own style regardless. You can’t get away with different coding styles even if every team member works on a different section which is encapsulated, because at some point that person will go on holiday, leave, fall under a bus etc. and someone else will have to maintain their code.
If you are running a joint project, you might consider putting your foot down and refusing to accept any code that does not conform to your published guidelines, regardless of its technical merit. This may sound somewhat draconian and off-putting to developers at first, but once everyone begins to code to the guidelines you’ll find it a lot easier to manage the project and you’ll get more work done in the same time. It will require a lot of effort from some of your developers who don’t want to abandon their coding habits, but at the end of the day different coding styles will cause more problems than they’re worth.
Editor settings
Tabs v. spaces
Ahh, the endless debate of tabs v. spaces. I used to be a fan of tabs, but I’ve come round to the argument that spaces are better — apart from anything else you can guarantee that they will look the same regardless of editor settings. The other benefit to using two spaces (which is the number I work with) is that code doesn’t start to scroll off the right side of the screen after a few levels of indentation.
Linefeeds
The three major operating system families (Unix, Windows and classic Mac OS) represent the end of a line in different ways. Unix systems, including Linux and modern macOS, use the newline character (\n), classic Mac OS (before OS X) used a carriage return (\r), and Windows systems use a carriage return followed by a line feed (\r\n). If you’ve ever opened a file created in Windows on a Unix system, you will probably have seen lots of odd characters (possibly represented by ^M) where you would expect to see a clean line break.
I use simple newlines all the time, because I develop on Linux and my deployment target is almost always a Linux system.
If you develop on Windows (and many people do), you can set up most editors to save files in Unix format.
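If a file or a piece of user input has already arrived with mixed line endings, it can also be normalised in code. The following is a minimal sketch; the function name normalise_line_endings is hypothetical and not part of PHP or of the original guidelines.

```php
<?php
// Hypothetical helper: convert Windows (\r\n) and classic Mac (\r)
// line endings to Unix newlines (\n).
function normalise_line_endings($text)
{
  // The order matters: replace the two-character sequence first,
  // otherwise "\r\n" would be turned into two newlines.
  return str_replace(array("\r\n", "\r"), "\n", $text);
}

$mixed = "first\r\nsecond\rthird\n";
$clean = normalise_line_endings($mixed);
```

Double quotes are needed here because PHP only interprets \r and \n as control characters inside double-quoted strings.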
Naming conventions
Variable names
A lot of textbooks (particularly those about Visual C++) will try to drum Hungarian notation into your head. Basically, this means having rules such as prepending g_ to global variables, i to integer data types etc. Not only is a lot of this irrelevant to PHP (being a dynamically typed language), it also produces variable names such as g_iPersonAge which, to be honest, are not easy to read at a glance and often end up looking like a group of random characters strung together without rhyme or reason.
Variable names should be all lowercase, with words separated by underscores. For example, $current_user is correct, but $currentuser, $currentUser and $CurrentUser are not.
There’s no hard and fast rule when it comes to the length of a variable name, so just try and be as concise as possible without affecting clarity too much. Generally speaking, the smaller the scope of a variable, the more concise you should be, so global variables will usually have the longest names (relative to all others) whereas variables local to a loop might have names consisting only of a single character.
Constants should follow the same conventions as variables, except use all uppercase to distinguish them from variables. So USER_ACTIVE_LEVEL is correct, but USERACTIVELEVEL or user_active_level would be incorrect. In some projects I’ve seen a single c character used as a prefix for project-defined constants – as opposed to those provided by PHP – in which case an example name would be: cUSER_ACTIVE_LEVEL. This helps distinguish project-defined constants, as well as avoiding name clashes.
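As a short illustration of the naming rules for variables and constants, here is a sketch in which every name and value is hypothetical and chosen only to show the conventions:

```php
<?php
// Constants: all uppercase, words separated by underscores.
define('USER_ACTIVE_LEVEL', 2);
// The optional "c" prefix marks a project-defined constant.
define('cUSER_BANNED_LEVEL', 0);

// Variables: all lowercase, words separated by underscores.
$current_user = 'alice';
$current_user_level = USER_ACTIVE_LEVEL;

$is_active = ($current_user_level == USER_ACTIVE_LEVEL);
```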
Loop indices
This is the only occasion where single character names are permitted. Unless you already have a specific counting variable, use $i as the variable for the outermost loop, then go on to $j for the next loop in, and so on. However, do not use the variable $l (lowercase ‘L’) in any of your code as it looks too much like the number ‘one’.
Example of nested loops using this convention:
for ( $i = 0; $i < 5; $i++ )
{
  for ( $j = 0; $j < 4; $j++ )
  {
    for ( $k = 0; $k < 3; $k++ )
    {
      for ( $m = 0; $m < 2; $m++ )
      {
        foo($i, $j, $k, $m);
      }
    }
  }
}
If, for some reason, you end up nesting loops so deeply that you get to $z, consider re-writing your code. I’ve written programs (in Visual Basic, for my sins) with loops nested four levels deep and they were complicated enough. If you use these guidelines in a joint project, you may want to impose an additional rule that states a maximum nesting of x levels for loops and perhaps for other constructs too.
Function names
Function names should follow the same guidelines as variable names, although they should include a verb somewhere if at all possible. Examples include get_user_data() and validate_form_data(). Basically, make it as obvious as possible what the function does from its name, whilst remaining reasonably concise.
Function arguments
Since function arguments are just variables used in a specific context, they should follow the same guidelines as variable names.
It should be possible to tell the main purpose of a function just by looking at the first line, e.g. get_user_data($username). By examination, you can make a good guess that this function gets the data of a user with the username passed as the $username argument.
Function arguments should be separated by a comma followed by a single space, both when the function is defined and when it is called. However, there should not be any spaces between the arguments and the opening/closing brackets.
Some examples of correct/incorrect ways to write functions:
get_user_data( $username, $password ); // incorrect: spaces next to brackets
get_user_data($username,$password); // incorrect: no spaces between arguments
get_user_data($a, $b); // ambiguous: what do variables $a and $b hold?
get_user_data($username, $password); // correct
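The same spacing applies to the definition of a function, not just to calls. The following sketch shows one possible definition; the function body and the returned array keys are hypothetical and exist only so the example runs.

```php
<?php
// Hypothetical definition: verb in the name, descriptive arguments,
// a comma and one space between arguments, no spaces inside the brackets.
function get_user_data($username, $password)
{
  // Placeholder body for illustration; a real function would look the user up.
  return array(
    'username' => $username,
    'password_length' => strlen($password)
  );
}

$user_data = get_user_data('alice', 'secret');
```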
Code layout
Including braces
Braces must always be included when writing code using if, for, while etc. blocks. There are no exceptions to this rule, even if the braces could be omitted. Leaving out braces makes code harder to maintain in the future and can also cause bugs that are very difficult to track down.
Some examples of correct/incorrect ways to write code blocks using braces:
/* These are all incorrect */
if ( condition ) foo();

if ( condition )
  foo();

while ( condition )
  foo();

for ( $i = 0; $i < 10; $i++ )
  foo($i);

/* These are all correct */
if ( condition )
{
  foo();
}

while ( condition )
{
  foo();
}

for ( $i = 0; $i < 10; $i++ )
{
  foo($i);
}
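To see how a missing pair of braces can hide a bug, consider this sketch, which deliberately breaks the rule. The variable names are hypothetical. The second increment is indented as if it belonged to the if statement, but PHP only attaches the first statement to it.

```php
<?php
$is_admin = false;   // hypothetical flag
$audit_entries = 0;  // hypothetical counter

// Deliberately wrong: no braces, so only the first statement is conditional.
if ( $is_admin )
  $audit_entries++;
  $audit_entries++;  // looks conditional, but always runs
```

After this runs, $audit_entries is 1 even though $is_admin is false. With braces around both statements it would have stayed at 0.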
Where to put the braces
Braces should always be placed on a line on their own; again there are no exceptions to this rule. Braces should also align properly (use two spaces to achieve this) so a closing brace is always in the same column as the corresponding opening brace. For example:
if ( condition )
{
  while ( condition )
  {
    foo();
  }
}
I know that a lot of programmers prefer to put the opening brace on the first line of the block it encloses, to avoid wasting a line. This is sometimes called K&R Style, after the style used in The C Programming Language book. I personally disagree with this style, although I’ve adhered to it in projects where it has been part of the coding guidelines.
Spaces between tokens
There should always be one space on either side of a token in expressions, statements etc. The only exceptions are commas (which should have one space after, but none before) and semi-colons (which should have no space before them, and one space after unless they are at the end of a line). Functions should follow the rules laid out already, i.e. no spaces between the function name and the opening bracket and no space between the brackets and the arguments, but one space between each argument.
Control statements such as if, for, while etc. should have one space on either side of the opening bracket, and one space before the closing bracket. However, individual conditions inside these brackets (e.g. ($i < 9) || ($i > 16)) should not have spaces between their conditions and their opening/closing brackets.
In these examples, each pair shows the incorrect way followed by the correct way:
$i=0;
$i = 0;

if(( $i<2 )||( $i>5 ))
if ( ($i < 2) || ($i > 5) )

foo ( $a,$b,$c )
foo($a, $b, $c)

$i=($j<5)?$j:5
$i = ($j < 5) ? $j : 5
Operator precedence
I doubt very much that any developer knows the exact precedence of all the operators in PHP. Even if you think you know the order, don’t guess because chances are you’ll get it wrong and cause an unexpected bug that will be very difficult to find. Also, it will make maintaining your program a living nightmare for anyone who doesn’t know the precedence tables in so much depth. Always use brackets to make it absolutely clear what you are doing.
$i = $j < 5 || $k > 6 && $m == 9 || $n != 10 ? 1 : 2; // What *is* going on here?!?
$i = ( ($j < 5) || (($k > 6) && ($m == 9)) || ($n != 10) ) ? 1 : 2; // Much clearer: && binds more tightly than ||
N.B. If you are using expressions like the one above you should split them up into smaller chunks if possible, even if you are already laying them out in the correct format. No one should have to debug code like that; there are too many conditions and logic operators.
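One way to split such an expression into smaller chunks is to store each condition in a well-named variable first. This is only a sketch: the values and the names $j_is_small, $k_and_m_match and $n_is_unusual are hypothetical, and the grouping follows how PHP parses the unbracketed expression above (&& is evaluated before ||).

```php
<?php
// Hypothetical values, chosen only so the example runs.
$j = 7;
$k = 8;
$m = 9;
$n = 10;

// Each condition gets a name that says what it means.
$j_is_small = ($j < 5);
$k_and_m_match = ( ($k > 6) && ($m == 9) );
$n_is_unusual = ($n != 10);

if ( $j_is_small || $k_and_m_match || $n_is_unusual )
{
  $i = 1;
}
else
{
  $i = 2;
}
```

With these values only $k_and_m_match is true, so $i ends up as 1.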
SQL code layout
When writing SQL queries, capitalise all SQL keywords (SELECT, FROM, VALUES, AS etc.) and leave everything else in the relevant case. If you are using WHERE clauses to return data corresponding to a set of conditions, enclose those conditions in brackets in the same way you would for PHP if blocks, e.g. SELECT * FROM users WHERE ( (registered = 'y') AND ((user_level = 'administrator') OR (user_level = 'moderator')) ).
General guidelines
Quoting strings
Strings in PHP can either be quoted with single quotes ('') or double quotes (""). The difference between the two is that the parser will use variable-interpolation in double-quoted strings, but not with single-quoted strings. So if your string contains no variables, use single quotes and save the parser the trouble of attempting to interpolate the string for variables, like so:
$str = "Avoid this - it just makes more work for the parser.";
$str = 'This is much better.';
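The difference in interpolation is easy to check for yourself. In this small sketch the variable names and values are hypothetical:

```php
<?php
$name = 'world';

$double_quoted = "Hello $name";  // interpolated: the value of $name is inserted
$single_quoted = 'Hello $name';  // not interpolated: the text $name stays as it is
```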
Likewise, if you are passing a variable to a function, there is no need to use double quotes:
foo("$bar"); // No need to use double quotes
foo($bar); // Much better
Finally, when using associative arrays, you should include the key within single quotes to prevent any ambiguities, especially with constants:
$foo = $bar[example]; // Wrong: what happens if 'example' is defined as a constant elsewhere?
$foo = $bar['example']; // Correct: no ambiguity as to the name of the key
However, if you are accessing an array with a key that is stored in a variable, you can simply use:
$foo = $bar[$example];
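The ambiguity with unquoted keys can be demonstrated with a short sketch. The array contents and the constant are hypothetical; the constant stands in for something defined elsewhere in a larger project.

```php
<?php
$bar = array('example' => 'expected', 'other' => 'surprise');

// Hypothetical constant, imagined to be defined in some other file.
define('example', 'other');

$unquoted = $bar[example];    // the key is the constant's value, 'other'
$quoted = $bar['example'];    // the key is always the string 'example'
```

Here $unquoted holds 'surprise' while $quoted holds 'expected', which is exactly the kind of mix-up that quoting the key avoids.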
Shortcut operators
The shortcut operators ++ and -- should always be used on a line of their own, with the exception of for loops. Failure to do this can cause obscure bugs that are incredibly difficult to track down. For example:
$foo[$i++] = $j; // Wrong: relies on $i being incremented after the expression is evaluated
$foo[--$j] = $i; // Wrong: relies on $j being decremented before the expression is evaluated

$foo[$i] = $j;
$i++; // Correct: obvious when $i is incremented

$j--;
$foo[$j] = $i; // Correct: obvious when $j is decremented
Optional shortcut constructs
As well as the useful increment and decrement shortcuts, there are two other ways in which you can make your PHP code easier to use. The first is to replace if statements where you are assigning one of two values to a variable based on a conditional. You may be tempted to write something like this:
if ( isset($_POST['username']) )
{
  $username = $_POST['username'];
}
else
{
  $username = '';
}

if ( isset($_POST['password']) )
{
  $password = md5($_POST['password']);
}
else
{
  $password = '';
}
Whilst the above code works and makes it obvious what you are doing, it’s not the easiest or clearest way if you want to run through a list of different variables and do a similar thing to all of them. A more compact way would be to use the ternary operator ? : like so:
$username = isset($_POST['username']) ? $_POST['username'] : '';
$password = isset($_POST['password']) ? md5($_POST['password']) : '';
I would recommend using the latter notation wherever you are assigning one of two values to a number of variables depending on a boolean expression, simply because it makes the code easier to scan and also makes it obvious what you are doing without being unnecessarily verbose.
Use constants where possible
If a value is not going to change throughout the execution of your script, then use a constant rather than a variable to define it. That way, if you do change the value by accident, the PHP parser will output an error and allow you to fix the problem, without it causing any unforeseen side effects.
Remember that constants should never be enclosed in quoted strings, whether single or double, because PHP does not interpolate them. You must always use concatenation if you wish to include a constant’s value in a string.
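A brief sketch of the difference, using a hypothetical constant SITE_NAME:

```php
<?php
define('SITE_NAME', 'Example Site');  // hypothetical constant

// Inside quotes the constant's name is just text.
$literal = "Welcome to SITE_NAME";

// Concatenation inserts the constant's value.
$joined = 'Welcome to ' . SITE_NAME;
```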
Turn on all error reporting
A lot of code I’ve downloaded from the web and tried to use has failed on my machines because the developers switched off the E_NOTICE flag in their PHP configuration for some reason or another. As soon as I bring it onto my system, where error reporting is set to E_ALL (i.e. show all errors, including references to variables being used without being initialised), the PHP interpreter spews out dozens of error messages which I then have to fix before I can use the script.
What you need to remember as a developer is that the person who uses your script may not have exactly the same php.ini configuration as you, so you should aim for the lowest common denominator, i.e. all error reporting enabled. If your code works with E_ALL set, then it will also work with any other error reporting configuration, including when all error reporting is turned off (e.g. on sites where PHP errors are logged to a file instead).
Of course, on a production site you might want to turn off all errors, or at least redirect them to a file, to avoid admitting to users that your scripts are broken in some way. That’s perfectly fine and in many cases the recommended action to take. So long as your scripts work with all error reporting turned on, it doesn’t matter where they are deployed.
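If you cannot change php.ini, the same settings can usually be applied at the top of a script. This is a minimal sketch of one possible setup, assuming the host allows these settings to be changed at run time; the production lines are left commented out as an alternative.

```php
<?php
// Development: report every error, including notices, and show them on screen.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Production alternative (sketch): keep reporting on, but log instead of displaying.
// ini_set('display_errors', '0');
// ini_set('log_errors', '1');
```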
Side effects of short-circuit logic evaluation
One feature of PHP that can catch even expert developers out, as well as being hard to track down, is the shortcuts taken when evaluating boolean expressions. This is when PHP stops evaluating a boolean expression part way through because it already knows the result. For example, if you have two expressions combined with the && (AND) operator, you know that if the first expression is false then the whole thing must be false because anything AND’d with false is false.
This short-circuit evaluation can catch you out if one or more of the expressions performs an operation as part of being evaluated. For example, $a = 5 sets the value of $a to 5 and evaluates to 5, so if it appears on the right-hand side of && and the left-hand side is false, the assignment never happens. The safest route is to perform all of your operations first, store any results in variables and then evaluate them.
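The effect can be seen in a short sketch. The variable $is_ready and the two snapshot variables are hypothetical names used only for this illustration.

```php
<?php
$a = 0;
$is_ready = false;  // hypothetical flag

// Risky: because $is_ready is false, PHP never evaluates ($a = 5).
if ( $is_ready && ($a = 5) )
{
  // not reached
}
$a_after_short_circuit = $a;

// Safer: perform the assignment first, then test the result.
$a = 5;
if ( $is_ready && ($a == 5) )
{
  // not reached either, but $a has definitely been set
}
$a_after_explicit_assignment = $a;
```

After the first if statement $a is still 0. After the second it is 5, regardless of the value of $is_ready.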