PHP Security Anti-Patterns


Free chapter from book on common coding patterns to avoid

PHP Security Anti-Patterns

Reasons Security Holes Exist

This chapter looks at many various contributors to insecure code. These range from cases of simple misinformation to cases of simple forgetfulness. Below are many common scenarios that can be identified and changed into better practices and habits.

Anti-Pattern #1

Not Matching Data Character Set To Filter Character Set

智慧的開始是用正確的名字叫事物 -The beginning of wisdom is to call things by their right names.

Mismatches between the character set of the data being parsed, and the functions performing the parsing is a systemic, root level problem. If web security, based in the scripting environment of PHP, Javascript, MySQL, and HTML is based on how characters are interpreted, then care must be taken from the very start to ensure that a string of user supplied text is comprised of the expected character encoding, and care must be taken to ensure that every filter and sanitizer operating on that data must be set to use the expected character encoding. The rules and the data have to match before anything else, or the rest doesn’t matter.

Security wrecks effect many
Security wrecks effect many

The practice of not specifying and ensuring character set uniformity is both widespread and largely ignored. It is a bad practice perpetuated by numerous poor examples which rely on function default parameters that do not match actual use. Because of character set differences, code that works correctly and properly in one environment will not necessarily work correctly in a different environment of different data sets with different defaults. An immediate example would be when the character set of the web page does not match the default settings of the PHP environment as set by php.ini. A second example is when attacker supplied data specifically alters a character set to bypass filters that do not enforce character set matching.

This book emphasizes UTF-8, but the security demand remains the same no matter which character encoding needs to be used. If ISO 8859-1 needs to be used in the application, then ensure that the text is valid IS0 8859-1 and that all application data filters are set to internally process ISO 8859-1. If Windows-1250, then likewise ensure all text and filters conform to Windows-1250.

The foundational basis of secure programming rests on properly parsing and examining data. This can only be done when the character set/encoding of the supplied data matches the character set of the filters used. Mismatched data is the root of much evil.

Take explicit action and explicitly set the character sets used by the application processes.


1. Decide 
    character encoding
2. Configure ALL 
    internal functions, 
    filters, and structures
3. Ensure 
    incoming data is chosen encoding
4. Convert or Drop 
    data not conforming


Misinformation Anti-Patterns

Misinformation is a subtly popular and common contributor to security problems. Answers to questions get posted based on opinion, or “I think this works”. These examples get copied into production applications over and over until becoming defacto standards.

聰明的人做出自己的決定,個無知的人跟隨公眾輿論 -A wise man makes his own decisions, an ignorant man follows public opinion.

This presents a real problem for at least two reasons. First, the advice is a poor solution. A security solution is something that needs to actually be tested and verified. Opinion is not good enough. Second, a bad habit, once formed is difficult to break. Copying bad examples leads to bad habits, which keep perpetuating.

One important example is this. A simple Google search found several available instances, ie page 1-3 listings, of advice explicitly advising to “Turn off SSL peer verification via the false parameter in order to make curl SSL work”, with no easily available advice on how to make it work with a secure setting of true. This is troubling because SSL Verify Peer is a critical confirmation action. It is not a convenience. SSL is not only about encryption, it is also about proper identity verification.

It is unwise and insecure to turn off peer verification and confirmation of who your application is talking to.

As a result of the previous bad and incomplete advice, the following line of code has been copied, reposted, and re-implemented many, many times.


curl_setopt($curl, 
            CURLOPT_SSL_VERIFYPEER, 
            FALSE);

The negatives effects of this are several. First, it cancels peer end-point verification altogether, which is a main purpose of SSL certificate verification, and allows for man in the middle interception attack. But because it is encrypted, this fact is easily forgotten. The fact that there even is a TRUE setting, and that it is needed, gets forgotten. The mind stops seeing it as a problem. It becomes a bad habit to just use this code, and just switch it off without checking.

The Mantra Anti-Pattern

Mantras take the form of “Always use….” A common new mantra is “Always use PDO Prepared Statements”. While there is some truth, and some benefit to that saying, it isn’t completely helpful. Implementing PDO prepared statements wherever possible is undeniably a best practice. The reality is simply that it is not always possible to use them in every situation. For example, PDO prepared statements cannot accommodate variable columns and cannot be implemented in this case. Legacy code cannot use PDO statements. Without understanding the problems solved by PDO prepared statements, or alternative defenses, then security problems will still persist when workarounds are needed or required. It would be nice if security problems could be put in a box, and one “always use” solution worked. It just isn’t the case. Mantras simply promote complacency and work against understanding.

A second common mantra is “Always use a framework”. This one borders on useless, and is the equivalent advice to build all homes the same way, with the same architect, and the same blueprints. The magnificent freedom of development does not rest on working with or depending on a 3rd party library. Frameworks simply cannot be leveraged in all cases. Also, a current project cannot be retrofitted with a framework simply to overcome an immediate problem. Frameworks are not perfect, and frameworks must be well understood and implemented properly to benefit from whatever protection they might provide. That task is not trivial. There is a learning curve involved. That said, frameworks are indeed powerful and highly useful libraries of reusable tools. It is advisable to learn at least one framework well because they solve many common implementation issues, which makes them valuable and rewards the time invested learning them. This fact however does not excuse a developer from understanding the issues solved, and knowing how to proceed correctly without one.

Another reason for learning a framework is that recently, Zend Frameworks, Yii, Symphony, WordPress, and others have implemented very capable, context aware security filters that are organized by name and context type, which allows them to be leveraged appropriately in securing an application. These filters are very powerful tools worth knowing about.

The last bad example of a mantra takes the form of “Always use function X” with either bad or insecure default parameters given. A very common example is “You only need to use htmlspecialchars(“$data”) instead of htmlentities($data) to avoid encoding all characters”, and very commonly this advice is taken as is, never being adjusted to use the correct parameters for the actual data type used in the environment. Both these functions, depending on environment, can be very insecure with default settings as shown. Depending on application, the default of ENT_COMPAT may not be sufficient since it will leave single quotes unencoded, and the default character set of PHP may not match the actual supplied data passed in. The result is improper filtering leading to an incorrect result.


Anti-Pattern #2

Not Having a Checklist

A checklist is a crucial element to any project. No one can remember it all.

We have a PHP Security Checklist here.

Anti-Pattern #3

Not Using a Content Security Policy

CSP is one of the strongest architectural enforcement tools you can employ. Use it right from the start. Not using it leaves you open to a variety of Cross-Site Attacks

We have a PHP Content Security Policy Guide here.


Critical Data Type Understanding and Analysis AntiPatterns

One of the most widely spread and perpetuated misunderstanding is the hope that there is a singular method to clean data. There is not.

Instead, consider the following case of a poor, single data type sanitization process, and an explicit, multi-data type sanitization process applied to incoming data.

Single Data Type Anti-Pattern

There are many postings on the web advising how to clean data by treating all data as the same, both in type and in purpose, and not many are correct. The truth is that when all data is treated the same way, a developer loses control over the process. Here is one example of a poor solution posted on the web. It is poor because it treats data as the same and applies improper filters for the task at hand.

A Poor Security Filter Application

function cleanID(){
    $id = 
        mysql_real_escape_string( 
            intval( 
                strip_tags($_GET['id'])
                )
            );
    return $id;
}
$id = cleanID();
$result = 
        mysql_query("SELECT name 
                FROM users 
                WHERE id = $id");

The cleanID() function applies three filters to a variable meant to be an integer in an effort to make it safe. Because $id is meant to be an integer, implied by the application of intval(), then only the intval() function is necessary in this case because actual number characters are not dangerous and do not need to be escaped.

If $id was by design comprised of alphanumeric characters, such as 456BBC, then $id would need to be treated as a string, and intval() could not be applied.

mysql_real_escape_string() works on strings, hence the name, and escapes potential SQL command characters according to the character set interpretation of the open MySQL connection. It makes sure ISO interpretation is applied to ISO data, or that UTF-8 interpretation is applied to UTF-8 data.

mysql_real_escape_string() has no effect whatsoever on integer values, and is open to exploitation if the variable value is not additionally quoted inside the SQL string. See example below.

strip_tags() is useful for removing tags from a variable when the variable, by design, is not supposed to contain HTML tags. It cannot be counted on to guarantee security, and has no effect on the integer ID value in this example.

As developers, there is a responsibility to understand how and why a mechanism works, so let’s break it down.

First, because the variable, $id, is intended for use as an integer, the sanitization process is quite simple, as we see below. The following code, using just one sanitization technique, is completely safe because the untrusted input is converted into a integer, therefore no interpretation anomalies exists, and the SQL will be unaffected. This explicitly converted integer can be safely inserted into the query without escaping because it is now a completely benign set of number characters (0-9).

A Surprisingly Safe Implementation

$id = intval($_GET['id']); 
if ($id > 0)
$result = pdo->query ("SELECT name 
                    FROM users 
                    WHERE id = $id"); 

This statement is secure. No escaping or quoting needed because of the explicitly converted integer. An actual integer does not need to be escaped for MySQL.

While the two above statements are secure if used together, as in this one case, this is NOT recommended, NOR a best practice, because it is open to mistakes. Anytime intval() is forgotten, a very large security hole is opened because the $id variable is not quoted, or escaped, inside the SQL statement.

The important point is to understand why it is safe in this case. The value is made safe because it is explicitly converted to an integer, consisting of characters (0-9), and the underlying integer bits cannot harm the SQL by causing the statement to be misinterpreted by the SQL engine compiler. Again, for the sake of repetition, an actual integer does not need to be escaped for MySQL.

This knowledge can be useful when it needs to be wielded by necessity for performance reasons in high transaction environments because the above statement is the fastest implementation of a query. A speedy legacy equivalent to the PDO implementation would be


$id = intval($_GET['id']); 
$result = mysql_query("SELECT name 
                FROM users 
                WHERE id = $id"); 

this.


$id = (int)$_GET['id']; 
$result = mysql_query("SELECT name 
        FROM users 
        WHERE id = $id"); 

The output is:
        SELECT name 
        FROM users 
        WHERE id = 55 

After the cast, $id is a numeric integer and no longer a string representation. Any part that is not numeric is removed. Quoting and escaping are not needed as long as the parameter is indeed an actual integer. Again, not a best practice, but important to know and understand. The lack of quotes here is a security hole whenever the value is a string. Adding quotes here is defense in depth, which is a best practice.

你不能阻止悲傷的鳥飛過你的頭,但你可以防止他們在你的頭髮上築巢 -You cannot prevent the birds of sorrow from flying over your head, but you can prevent them from building nests in your hair.

The latest best practice, when applicable and possible, is to instead use prepared statements. The reason is not because it is a more secure escape method, but because they automate the process of escaping. It is the automation which greatly helps prevent security holes through preventing accidents of forgetfulness. If all queries are implemented as prepared statements, and one is singled out and converted to a straight query for performance reasons, then that is the benefit of an applied best practice, an optimization easily made without compromising security elsewhere.

PHP String Security

STRINGS CHANGE EVERYTHING
Poor PHP Security
Don't be him

Manage your string security context.

A developer should consider the functions as they were meant to be used in a given context, as the following demonstrates. Consider the case where the parameter is an actual string value, and not an integer value.

For example, a name will most likely not have a need for HTML tags, so remove them with strip_tags(). This is done not for security purposes, but because the design specification says HTML tags shouldn't be part of the name.


A Legacy Best Practice

//The name string should not contain 
//HTML tags, remove them per design spec
//this is not a security measure
//this is design enforcement
$name = strip_tags($_GET['name']);

//This string now needs to be 
//properly escaped for output into database
//preserving the name as is
$name = 
    mysql_real_escape_string($name);

//make sure variable 
//is quoted as well as escaped
$result = mysql_query("SELECT id 
                FROM users 
                WHERE name = 
                ‘{$name}’");

The output is:
SELECT id FROM users 
            WHERE name = ‘BumbleBee’    


A PDO Query Best Practice

//The name string should not contain 
//HTML tags, remove them per design spec
//this is not a security measure
//this is design enforcement
$name = strip_tags($_GET['name']);

//The name string is quoted and escaped
$quotedName = “SELECT id 
                FROM users 
                WHERE name = 
                {$pdo->quote($name)}";

$result = $pdo->query($quotedName);

The output with a quoted parameter:
    SELECT id 
    FROM users 
    WHERE name = ‘OptimusPrime’    

The name string is quoted and escaped via PDO quote(). Visual inspection is easier because of no embedded quotes in SQL string. The quotes are produced in the output as part of what PDO::quote() does for you.


A Current Best Practice Approach

//The name string should not contain 
//HTML tags, remove them per design spec
//this is not a security measure
//this is design enforcement
$name = strip_tags($_GET['name']);

//ensure only safe ASCII characters
if(ctype_alnum($_GET['name']))
{
    $name = $_GET['name'];
    pdo->prepare(SELECT id 
                FROM members 
                HERE name = :name");
    pdo->bindvalue(":name", 
                    $name, 
                    PDO_STR);
    pdo->execute();
}

The above code has three levels of defense. First, ctype_alnum() assures that the untrusted string, $_GET parameter, name, only a contains the given characters a-z, A-Z, 0-9, which are safe. Second, a prepared statement is used, and third, the explicit treatment of the data as a string type in the bound parameter.

Alternatively, the code below is NOT SAFE, even though it is very similar.


$_GET['id']  =  
    “46; DELETE FROM members”; 

$id = mysql_real_escape_string(
    $_GET['id']);  
$result = mysql_query(
    "SELECT name 
    FROM members WHERE id = $id");   

First, note that the usage of $id remains a string, and not an explicit integer like in the previous example. Here, the BAD SQL IS NOT ESCAPED by mysql_real_escape_string(), which properly, but ineffectively filters the id variable. The root problem is the treatment of $id as a string and as an integer. Since $id is unquoted, and the DELETE keyword is allowed through, the resulting SQL statement is now turned into two SQL statements.


     'SELECT name 
        FROM users 
        WHERE id = 
        0; DELETE FROM users';

Notice that mysql_real_escape_string(), by design, was unable to eliminate this threat. There was nothing to escape. The input submitted was valid SQL.

For direct comparison, if $_GET['id'] = “46; DELETE FROM members”;

This is safe:


$id = (int)$_GET['id']; 
$result = pdo->query(
            "SELECT name 
            FROM members 
            WHERE id = $id");  

This is not:


$id = mysql_real_escape_string(
            $_GET['id']);  
$result = mysql_query(
            "SELECT name 
            FROM members 
            WHERE id = $id");

The difference is lies in the explicit casting to an integer, the ineffectiveness of mysql_real_escape_string() for this attack, and the lack of quotes surrounding the variable $id in the SQL statement.

The following defense in depth is safe because PDO::quote() escapes and quotes the variable, which in this case was converted to an actual integer via the cast.


$id = (int)$_GET['id']; 
$result = pdo->query(
        "SELECT name 
        FROM members 
        WHERE id 
        = pdo->quote($id)"); 

To prevent a security hole in this case, two things have to always be remembered, explicitly convert the value to an integer, and quote the value inside the SQL statement to prevent statement alteration. The quoting is not necessary for explicit integers, but mandatory for strings, or the string representation of an integer. PDO prepared statements solve these constant implementation problems. This is why it is a best practice to use them wherever possible. Not everyone gets to work with a codebase that supports the latest tools.

If the variable, $id, had been quoted, the resulting single SQL statement would have been:


    'SELECT name 
        FROM users 
        WHERE id = 
        "0; DELETE FROM users"';    

which would not have matched an ID. Internally, MYSQL would have converted the string "0; DELETE FROM users" to an integer for comparison.

There is one area where prepared statements offer a higher degree of protection. In true prepared statements, where two server calls are made, where the actual SQL statement is compiled first without user supplied variables, it is safer because untrusted input cannot ever then alter the SQL logic. In emulated prepared statements, where the untrusted input is automatically escaped first, then compiled with the SQL statement, mysql_real_escape_string() is equal to pdo->prepare() n terms of actual protection. Again, the real advantage here is the automation PDO offers. Proper automation is one of the best security tools, because forgetfulness is one of the worst offenders which can happen to anyone, at any time, and frequently does.

Emulated prepared statements serve an important purpose. They prevent dual round trips to the SQL server. It takes two trips to trips to the server to implement true prepared statements, a first trip to compile the SQL, and a second trip to actually execute the statement with the variables. In some cases, a single trip is needed to preserve high traffic performance. A developer needs to know when each type of prepared statement is required.

All Incoming HTTP Data Are Strings

It is important to understand the following basic fact.


    ALL INCOMING DATA 
            FROM HTTP REQUESTS 
                    ARE STRINGS

The data captured in $_post, $_get, and $_request super global arrays are strings, and not any other type. Two main categories of data are strings which represent everything, names, text, dates, alphanumeric IDs, etc… and strings which represent numbers. A simple, but important fact that seems to lead to a great deal of confusion regarding what type of filter to apply is that a string representation of an integer is not an integer. It is a string, and needs to be treated as such. After a string representation of an integer is explicitly converted to an integer, via intval(), or cast via (int), it needs to be treated as an actual integer type.

It is the job of the application logic to determine their actual value usage in the application as either strings, integers, or other types, and convert to explicit types as needed.

Treating Variables By Type

When $_POST['id'], as a string, is meant to be an integer, we explicitly make it one.


    $id = intval($POST['id']);

and treat it as an integer throughout the remainder of its life.


//defense in depth 
//safe 
//but not necessary database escaping
pdo->bindvalue(":id",
                $id, 
                PDO_INT);

//defense in depth 
//safe 
//but not necessary output escaping
echo htmlentities($id, 
                ENT_QUOTES, 
                "UTF-8");
//could also safely do the following
//there are no unsafe characters
//only integer
echo $id

Because $id is now an actual integer after conversion, and not a string, it can be safely output to HTML or the database without escaping. There are no dangerous characters present. The defense in depth, provided by the prepared statement function PDO::bindValue() and by specifying the INT integer type, handles errors of omission for when intval() is forgotten. This is important.

When $_POST['name'], as a string, is meant to be a string, we preserve it as a string,


$name = $_POST['name'];

and continue to treat it as an unsafe string throughout its life.


$name = strip_tags($name);
pdo->bindvalue(":name", 
                $name, 
                PDO_STR);

//safe AND necessary 
//output escaping
echo htmlentities($name, 
                    ENT_QUOTES, 
                    "UTF-8");

//this output not safe because 
//the string could contain 
//dangerous characters
echo $name;

Variable Type And Filter Relationship

There is a very important relationship between the SQL table column types that are defined in the database and the variables used in the application. They are tied together. When they do not match, data is irrecoverably lost. Variable data types and filtering should be applied based on table column type mapping.


For example:

User IDs and timestamps are integers, 
    and can have ranges

User names and comments are strings:
    A character set (ASCII or UTF-8)
    A defined length 
            (CHAR(20) or VARCHAR) 
    Allowable characters:  
        Names may have 
            underscores and dashes
            but not HTML tags or quotes 
        Comments may have 
            HTML BOLD tag only

This gives very clear and specific information for defining and applying the correct filtering to variables. In fact, one could say that security begins with good database planning. The filtering strategy flows from the database decisions and column construction.

Good technique recognizes that not all application variables are strings to be 'cleaned' by a single global function, mysql_real_escape_string( ( striptags($_GET['var']))).

Security is made more difficult by treating all undefined types equally. Defensive coding is made easier by proper identification and treatment of actual data types.

Validation By Type Process

A Complete Validation By Type

//remove possibility of vague and 
//unintended processing
unset($_REQUEST);
//remove GET this script 
//processes POST only
unset($_GET);

if(ctype_alnum($_POST['userName'])) 
{
$userName = $_POST['userName'];

$passHash = hash('sha256', 
            $_POST['password']); 

$pageID   = intval(
            $_POST['pageID']);

$email  = filter_var($_POST['email'], 
            FILTER_SANITIZE_EMAIL));

//immediately delete 
//clear text password
unset($_POST['password']); 

//remove possibility of future access 
//to raw data
unset($_POST);
}
else
    exit();

//update database with unescaped, 
//unquoted variables
pdo->query(‘INSERT 
            INTO users ($userName, 
                $passHash, pageID)’
            VALUES ($userName, 
                $passHash, $pageID));

//update database with escaped and 
//quoted email variable
if( filter_var($email, 
            FILTER_VALIDATE_EMAIL)))
{
    pdo->query(‘INSERT 
        INTO users (email)
        VALUES (pdo->quote($email)’);
}

//print to HTML without output escaping
echo $userName;
echo $passHash;
echo $pageID;
//print to HTML with output escaping
echo htmlspecialchars($email, 
                    ENT_QUOTES, 
                    “UTF-8”, 
                    false);

Surprisingly, $userName, $passHash and $pageID are safely inserted into the database without escaping, and safely echoed to HTML without escaping while $email is not. $email must be escaped in both context cases. Why is this so?

Validation Analysis

The resulting state of the variables after sanitization would be the following. The first three variables, $userName, $passHash, and $pageID, are quite harmless. If $userName passes the test, $userName will be guaranteed to only contain (0-9, A-Z, a-z). The result is that this could be echoed to HTML safely without escaping, or input directly into a database without escaping. While doing so would be a poor security practice, would not provide a double measure of defense in depth, and would not be recommended, it would be safe. $passHash would also only contain the harmless lower case hexit characters (0-9, a-f). It would not matter what the user entered. The most dangerous attack strings would be hashed into a completely different and harmless ASCII string containing only (0-9, a-f). For example SHA256 hash of the following dangerous string


//code sent from web to PHP input array
$_POST[‘password’]  = 
                “; DELETE FROM users; 
                -- ^#alert(1);”;

$pass       = $_POST[‘password’];
$passHash   = hash('sha256', $pass);

//would produce the following benign 
//and harmless 64 character string  

‘0e2e13c20cd1d80248cfd64b241fb976008bdb019eba32082f199857cd3adef1’   

The transformed string contains no possible control characters which would require escaping in order to be safe for either HTML output, or insertion into a SQL query statement.

The $pageID variable would be guaranteed to be an integer. An actual integer is safe to echo directly to HTML or insert directly into a SQL statement safely without escaping. This is both a scary and important concept to understand.

The $email variable is different and more complex to process. $email must undergo four separate processes:


    1. Email sanitization
    2. Email validation
    3. Database escaping
    4. HTML escaping.   

After sanitizing with filter_var(), $email will have had illegal email characters removed. That is an important step, but not complete. At this point, it is not guaranteed to be safe for insertion into a database, or echoed to HTML. It also will not be guaranteed to be a valid email yet. filter_var($_POST['email'], FILTER_VALIDATE_EMAIL)) will need to check that. If valid, $email will need to be protectively escaped for the database context. PDO::quote() then escapes and quotes the email string for safe insertion into the database.

All variables should be escaped according to context. There is no harm done to the data itself in escaping for context, at the point of context change. This assists in preventing disastrous errors of omission, while preserving the data.

Filter input, escape output. This provides security in depth, as well as greater understanding.

NOTE: $_POST[‘password’] is deleted by unsetting it with unset() after hashing. The application never knows, nor needs to know the original password. This will be utilized later in example code that stores and compares only hashes of user passwords.

Input Same As Output Anti-Pattern

Input filtering and Output Escaping are two different and critical aspects of Web Application security. Each one has to be dealt with and each has to be handled differently. The anti-pattern is when these two are treated either as the same, or simply making no distinction between the two. For example, here is a real example of the only filtering mechanism from an actual PHP application.


$cleanArray = array();
foreach($_REQUEST as $key => $value)
{
    $key = addslashes( 
        trim( strip_tags($key)));
    $value = addslashes(
        trim( strip_tags($value)));

    $cleanArray[$key] = $value;
}
mysql_query(“INSERT 
            INTO users (name) 
            VALUES($clean['key'])”);

//echo into HTML tag
echo  $_cleanArray[$key];

Obviously there is a sincere effort to sanitize the data and be ‘CLEAN’. The assumption here is that data is cleaned of any dangerous characters. The problems with this are many.


No differentiation between 
        GET, POST, and COOKIE
No accounting for variable type
No specific character set encoding
Not matching char set to DB char set
Only HTML tags are removed
JavaScript functions not filtered out
        ie: ‘onmouseover’ 
Only ‘, “, NULL and \ are escaped
No accounting of variable quoting
HTML attribute values not quoted
 

First, the SQL input is still not correctly escaped for the environment. Second, the variable being echoed to HTML has both data removed and embedded slashes which alter the original data.

By not differentiating between GET, POST and COOKIE, the application is open to GET request attack vectors when it might not need to be. Other escaped characters or escape characters in different character encodings are missed. addslashes() may or may not match the character set of the database client connection. If it does not match, it is open to encoding attack. While strip_tags() can remove many dangerous HTML script tags, other types of Javascript code can pass through and become active via the unquoted HTML attributes.

The Assumed Clean Anti-Pattern

Another side effect of assuming the variables are clean is that a SQL string like the following might be used.


SELECT * FROM Users 
            WHERE id = $cleanID;

The lack of quotes around $cleanID
still leaves it open to attack.
If $cleanID were to equal 

“22; DELETE FROM Users” 

There is nothing for addslashes() to do in this case, and because of the semi-colon, the query would now become two separate queries.


SELECT * 
    FROM Users 
    WHERE id = 22; DELETE FROM Users;

Lastly, there is no accounting for Output Context. Variables are output in any fashion because they are assumed to be clean.

Improper mysql_real_escape_string() Usage

First, it must be said, mysql_real_escape_string() filters strings, not integers. There is no need to escape integers in a SQL statement. mysql_real_escape_string() has no effect on integers, and can lead to the exact same problem above if the escaped variable is not quoted.


SELECT * FROM Users 
        WHERE id = 
        mysql_real_escape_string(
                $cleanID);

Since the variable is neither quoted, nor converted to an integer, the SQL string is open to this kind of attack:


$cleanID = “22 OR 1 = 1”;
SELECT * FROM Users 
        WHERE id = 22 OR 1 = 1; 

This statement is one statement that returns all user records. This occurred because the lack of quotes allowed the “OR 1 – 1” allowed it to become part of the SQL logic. If the variable in the statement had been quoted like this,


SELECT * FROM Users 
        WHERE id = 
        ‘{mysql_real_escape_string(
            $cleanID)}’;

Then the result would have been

SELECT * FROM Users 
        WHERE id = 
        ‘22 OR 1 = 1’; 

Where, because of the quotes surrounding the {}’s, the variable ‘id’ becomes the string “22 OR 1 = 1” that doesn’t match anything, and does not become part of the SQL logic. mysql_real_escape_string() is an improvement over addslashes() because it escapes based on the character set encoding of the current database client connection.

It is very important to match character encoding sets or data will get parsed incorrectly which obviously is not good. However, the function still must be used correctly as in the above example. Again, mysql_real_escape_string() is for strings and not for integers.

The better method in this case would be to not use mysql_real_escape_string() in this case, and use something like intval() instead.


$query = “SELECT * 
            FROM Users 
            WHERE id = "
            . intval($cleanID);

mysql_real_escape_string() is deprecated in favor of mysqli_real_escape_string() and PDO quote() and PDO prepared statements. PHP 5.2 is also now officially deprecated and no longer being updated. Security maintenance is at an end. We deal with it here because legacy code will be around for while and it’s important to understand it.

The rest of this book will focus exclusively on PDO prepared statements and PDO quote(). Example code use PDO prepared statements whenever it can, In cases where PDO prepared statements do not accommodate the type of query needed, such as variable columns names, the code will use the PDO quote() function to filter input to be used with PDO query(), along with white listing to provide secure SQL logic.

There is nothing wrong with the new mysqli library of database functions. It just won’t be used in the code base for this book. Use mysqli, and its prepared statements, if you prefer.

Filtering vs Escaping vs Encoding

When it comes to preventing attacks, and preserving user data, it is important to understand the difference. Filtering usually implies removing data from a stream. The function strip_tags() does this. Data is thrown out. The PDO::quote() function is not actually a filter. It is an escaper for the database context. It escapes the characters passed into. Nothing is removed. The function htmlentities() is an encoder. Data is physically altered when returned from this function. This can be destructive or non destructive based on when it is used. If you save data after being HTML encoded, the data is altered. If you send HTML encoded data to a browser, altered data is sent, but the browser decodes it back, preserving the data. If data is entity encoded, saved to the database, and then entity encoded again for output into HTML, then the data is double encoded, which looks terrible, and in some cases illegible.

Misunderstanding these effects has an effect on both the preservation of user data and the security of your application. A developer needs to be aware of how data needs to flow into and out of these transitions without being destroyed, or opening a security hole.

Only One Output Context Anti-Pattern

echo '<tr>';
foreach($row as $key=>$value)
{
    // value could be data or hyperlink
    echo '<td>'.$value.'</td>';  
}
echo '</tr>'

It is quite common to treat all output the same. This is no longer an acceptable practice. In the case above, the assumption is that the output is HTML, when it could have other contexts, such as a hyperlink that might need URL parameters escaped. Every effort needs to be made to be aware of the output context, and filter, escape or encode for the context.

Lack of Planning Ant-Patterns

A few general points here are ones that every developer knows well. Time is money, unreleased apps do not make money, and the longer something takes, the lower the profit margin. One of the first things to go is time spent on for security. It takes two things to implement security, planning, and coding. Both take time.

In practical terms, one of the main ways to combat this is create a reusable framework for issues you know are going to appear and be ready ahead of time so that lack of planning and lack of time have as minimal a negative impact as possible to your project.

Because security usually isn’t planned for from the beginning of a project, it becomes an add-on process. A process usually done last if there is time. From experience trying to review code and identity all attack vectors, proper input filters, and proper output filters, is neither fun, nor 100% accurate. There is too much to miss.

Lack of Consistency Anti-Patterns


Ln 4  $id = addslashes($_GET[‘id’]);
Ln 5  query = “SELECT * 
                FROM Users 
                WHERE id =  $id”;
…
Ln 46 $name = mysql_real_escape_string(
                $link, 
                (($_GET[‘name’]);

Ln 47 $query = “SELECT * 
        FROM Users 
        WHERE name = ” . $name . ”;

Here, in the same application, there are two different styles of escape operations, disguised as filtering operations, being performed. The four problems are the inconsistencies of addslashes() vs mysql_real_escape_string(), addslashes() does not negate SQL keywords such as DELETE, lack of character set recognition, and the inconsistencies of the SQL string construction. Is it escaped properly? Visual inspection is more difficult.

When there isn’t a template of some kind, when there isn’t a base of reusable functions that actually get reused, not just parts copied and pasted, then lack of consistency occurs, which negatively impacts security.

Lack of Testing Anti-Patterns

“It should work”. The fact is without testing, one just doesn’t know. If there is not a repeatable test that compares a result, chances are highly likely that small things will be missed. Not having tested it, one should not post, nor accept, “I think….” statements, yet the web is full of these kinds of statements.

Mistakes of Process Omission

When doing the quick, “I just want to get it working…..” it is startling to realize how much production code results from these simple quick implementations, and once it works, no one remembers to go back and review it.

Parameter Omission Anti-Pattern

It is safe to say that a majority of function examples on the web use default function parameters, and omit powerful parameter examples that could make the function secure. In each case below, an insecure function call is shown first, and then compared to a secure function call made with non-default secure parameters.

Common HTMLSpecialChars() Default Quoting Problem

htmlspecialchars($name);

        VS

htmlspecialchars($name, 
                ENT_QUOTES, 
                “UTF-8”);

This is a particularly bad set of defaults because it works so well most of the time. The reasons why it is a bad example are many. 1) The default usage visually reinforces a subtle, but strong mental acceptance that all data is equal, and encourages a seriously bad practice of data type avoidance. This kind of habit, once engrained, can be hard to overcome when coding a solution and it one of the leading causes of retrofitting code later in a project because the habitually easy thing was done first. This author confesses to still being bit by this bad habit. 2) The default second parameter is ENT_COMPAT, which does not encode single quotes. This opens a security hole that could allow a single quote to break out of an HTML attribute context. 3) The third parameter specifies character set. If this is not explicitly set, then there is a potential and likely mismatch between the incoming text to be filtered and how the filter is going to detect characters, which undermines the entire security process. Correct processing can only succeed when the character set of the text matches the character set the filter is using.

Once the mind accepts seeing things a certain way, it becomes difficult to see things any other way.

Common Default JSON Construction

json_encode($data);

        VS

json_encode($data, 
            JSON_FORCE_OBJECT);

Most examples on the web encourage the use of json_encode() using default settings, without examining how the array of data passed into json_ecode() is actually constructed. This construction method is important. json_encode() will return either,


an exploitable, top level array 

[ [1,2,3] ]

OR

safe, top level JSON object 

{"0":{"0":1,"1":2,"2":3}}

depending on how the PHP data is assembled before encoding.

Sending a JSON array to Javascript is a known security risk. Two options remain, properly construct the data before passing to json_encode(), in which case the second parameter is not needed, or use the JSON_FORCE_OBJECT parameter. This is discussed in detail with more specific examples in the JSON chapter. Also see OWASP JSON Guidelines.


$exploitable = json_encode( 
    array( 
        array("city" => "New York", 
        "state" => "NY"),
        array("city" => "Chicago", 
            "state" => "IL" )));


       
$safe = json_encode( array( "cities"=> 
    array( 
        array("city" => "New York", 
            "state" => "NY"),
        array("city" => "Chicago", 
            "state" => "IL" )));

To create a JSON object, safe for JavaScript consumption and not a JSON array, using json_encode() with default parameters, a named array element must be used when constructing the array.

Common Turn off Cookie Protections by Default

setcookie("cookieID", "", 0);

        VS

setcookie("cookieID", 
                "SecureUser", 
                1, 
                “/private”, 
                “www.test.com”, 
                true, 
                true );

Common Prevent Escaping with Correct Character Set

htmlentities($name);

    VS 

htmlentities($name, 
            ENT_QUOTES, 
            “UTF-8”, 
            false);

Common Example of Insecure SSL Practices

curl_setopt($curl, 
    CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, 
    CURLOPT_SSL_VERIFYHOST, FALSE);

            VS

curl_setopt($curl, 
    CURLOPT_SSL_VERIFYPEER, TRUE);
curl_setopt($curl, 
    CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($curl, 
    CURLOPT_CAINFO, 
    '/private/cacert.pem');

Common Example of PDO Connection Without Character Set

new PDO('mysql:host=localhost;
        dbname=myDB', 
            $user, 
            $pass);

                VS

new PDO('mysql:host=localhost;
            dbname=myDB ;
            charset=utf8', 
            user, 
            $pass);

                VS

new PDO('mysql:host=local;dbname=myDB',
        $user, 
        $pass,
        array(
        PDO::ATTR_ERR_MODE 
        => PDO::ERRMODE_EXCEPTION,
        PDO::ATTR_DEFAULT_FETCH_MODE 
        => PDO::FETCH_ASSOC,
        PDO::MYSQL_ATTR_INIT_COMMAND 
        => 'set names utf8'));

Common HTML MetaTag Without Character Set

meta http-equiv="Content-Type" 
        content="text/html" 

        VS

meta http-equiv="Content-Type" 
        content="text/html; 
        charset=UTF-8" 

Take it upon yourself to reverse the trend. Post function examples with explicit parameter settings for the required implementation. Doing this, over time, will contribute to the spread of more accurate knowledge, and higher levels of secure code being implemented everywhere.

No Clear Separation of HTML and PHP Code Anti-Pattern

The problem with the following code is that it becomes difficult to track ‘the quote’. By this I mean there can end up being so many concatenated strings that it becomes difficult to know if data is actually being escaped properly. There are just too many quotes to count.

 
if(mysql_num_rows($result))
 {
    echo '<table cellpadding="1" 
                    cellspacing="1" 
                    class="db-table">';

    echo '<tr><th>Post</th>
                <th>Date</th>
                <th>Info</th>
            </tr>';
    echo '<td>'.$value1.'</td>
         '.'<td>'.$value2.'</td>
         '.'<td>'.$value3.'</td>';
}

Too Many Database Function Calls

The code below is a code pattern to avoid. Code like this is common in tutorials. The basic problems with this example are that there are to many holes to plug. Too many SQL statements to protect, output context and data content is more difficult to determine. Code changes become more laborious. There are just too many places that output data needs to be filtered and escaped, and so a developer loses control of security measures.


echo '<h2>Blog List</h2>';
$result = mysql_query('SELECT * 
                        FROM Blogs”);
if(mysql_num_rows($result)) {
        echo '<table cellpadding="0" 
        cellspacing="0" 
        class="db-table">';
        echo '<tr><th>Blog</th>
        <th>Date</th>
        <th>Info</th></tr>';
while($row = mysql_fetch_row($result))
 {
    echo '<tr>';
    foreach($row as $key=>$value) {
        echo '<td>',$value,'</td>';
    }
    echo '</tr>';
}
echo '</table><br />';
echo '<h2>Post List</h2>';
$result2 = mysql_query('SELECT * 
                        FROM Post”);   
if(mysql_num_rows($result2)) {
        echo '<table 
        cellpadding="1" 
        cellspacing="1" 
        class="db-table">';
        echo '<tr><th>Post</th>
        <th>Date</th>
        <th>Info</th></tr>';
while($row2 = mysql_fetch_row($result))
 {
    echo '<tr>';
    foreach($row2 as $key=>$value) {
        echo '<td>',$value,'</td>';
    }
    echo '</tr>';
}
echo '</table><br />';

The solution to this is Separation of Concerns. There is an entire chapter on this topic.

Misleading Filtering Anti-Pattern

Avoid code that makes misleading security claims, and/or cleans filters incorrectly, and avoid names that imply a false sense of security.


function makeSafe($input) {

$safe = addslashes(
            strip_tags($input)
            );
return $safe;
}
$user   = makeSafe($_POST['user']);
$pass   = makeSafe($_POST['pass']);
mysql_query("SELECT name, pass 
                FROM users 
                WHERE 
                name = '".$user."' 
                AND 
                pass = '".$pass."'");

$safeName = makeSafe($row['name']);
echo "Hello,  ".$safeName;

There are a few problems here. The first is the heavy implication that the code is safe. The code itself says it is safe, so one might believe it. The second is destruction of password characters from the filters. A user should be able to use whatever character one wants within a password. This filtering method removes character choice possibilities for no reason. The third is that the data is double escaped, going in and going out without knowing why. The fourth is that the variable destination is unknown, therefore escaping requirement is unknown. addslashes() does not know about the underlying database character encoding set and cannot securely deal with attack characters. The fifth is that original, clear text passwords should not be used at all.

Better techniques are to use prepared statements, and to hash the password for storage. Hashing removes the need to destroy password characters.

Too Many Quotes Anti-Pattern

The battle over quotes is hard enough without adding additional difficultly. Code such as below, which mixes single and double quotes, increases the difficulty of visually inspecting and tracking quotes correctly.

The first example below is safe, just difficult to read.

CUMBERSOME
Multiple string concatenation mixes quotes, difficult to track visually.

echo "<input type='hidden' 
                name='key' 
                value='".$key."' />";

The following code samples are a visually cleaner way of presenting the same value.

CLEANER
No string concatenation, fewer quotes, easier to track visually.
 
echo "<input type='hidden' ' 
                name='key' 
                value='{$key}' />";  

EVEN CLEANER
Straight html using only single quotes, simplest to track visually.

 <input type='hidden' 
            name='key' 
            value='<?php _H($key); ?>' />     
 

Using the first ‘cleaner’ method, enclosing the PHP variable $key in brackets, makes it easier to see that HTML attributes are quoted properly within the string.

Using the second, ‘even cleaner’ method, where HTML is output direct without echo statements is very helpful. Note that the only quotes in the entire line are the single quotes surrounding the attribute values. This is why it is easier to visually examine this line of code. Note also that the variable $key is output escaped inline via _H() as it is being output into the UTF-8 HTML context.

By moving HTML out of PHP and avoiding echo statements, HTML can be formatted, structured better, and highlighted by the editor, which greatly improves visual clarity.

NOTE* _H() is a facade function wrapping htmlentities($data, ENT_QUOTES, ‘UTF-8’) which escapes and echoes the output.

Raw Request Variables As Application Variables


    $id = $_GET['name']);

    if (isset($_GET['name'])) 
    {
        $update = $_GET[$data];
    }
    ……..
    else 
    if (isset($_POST['page'])) {
        ...
    }

Code like this is very hard to ensure that all places are being validated properly. Two problems which arise from code like this is that variable usage is spread throughout the script, and not localized. Mixing POST and GET for data adds processing and intent complexity. If a page is depending on data from both input arrays at the same time, then there is some refactoring to be done, especially in terms of intent. The other problem is that it simply becomes cumbersome to track processing intent or make changes this way. Raw data needs to be abstracted away. It can then be easily cleaned and filtered according to need in one place, at the top of the script.

Poor Error Management Practices

Improper handling of error messages creates two categories of problems. First, they create user satisfaction problems. Second, they create security problems. This may sound odd coming from a security book, but user satisfaction should always come first. Error message annoy users, and are useless from a users standpoint. Users cannot do anything with error messages. Displaying them should be avoided for their sake. Errors should be handled and logged internally. A best practice is to provide directions that are meaningful to the user for the current situation, and to display those directions instead of the actual error, as part of the error handling process. An example might be, on database connection failure, to construct an error handler that emails the administrator the error, but informs the user that “The site is down for maintenance, Please return in two hours”.

From a security standpoint, they leak system information that can be used to attack the site. Detailed error messages obviously reveal details about the inner workings of the applications, and displaying them should avoided. Instead, create an automated logging and internal alert system.

Specific anti-patterns that are very common in this regard are the following statements.

Poor Procedural Error Practice

mysql_query("SELECT * FROM users WHERE id = 5") or die (mysql_error());

Poor Object-Oriented PDO Error Practice

try {
    $pdo->query("SELECT * 
                FROM users 
                WHERE id = 5");
} 
catch (PDOException $exception) {
    echo ($exception->getMessage());
}

Both of these statements handle errors, and both send raw API error messages to users. Each of these practices disregards both points about annoying users with useless information, and revealing too much system information. Internal error messages also look unprofessional and take away from the perception of quality. A better practice is to securely log error message, and display polite, useful instructions to users about what they should do next.

Better Object-Oriented PDO Error Practice

try {
    $pdo->query("SELECT * 
                FROM users 
                WHERE id = 5");
} 
catch (PDOException $exception) {
    log($exception->getMessage());
    echo “Please return later...”
}

Poor Cryptography Practices

Broken PHP Cryptography
Old and Busted

Each of these code snippets are common examples found in both books and web tutorials, and are poor examples of cryptography implementations that need to be discontinued. In particular, these hashes cannot provided protection against modern brute force computing power.

FUNCTIONS TO DISCONTINUE FOR ENCRYPTION PURPOSES

$poorHash = md5($password);
//double hashing
$poorHash = sha1( md5($password))  
$poorHash = md5( rand());
$poorHash = sha1(uniqid(rand(),true));

The problems here are out of date crypto ciphers, md5() and sha1(), and insufficient randomness from rand() and uniqid(). All of these are predictable modes of encryption, open to predetermined rainbow table attack, and provide a false sense of protection.

Double hashing is also known as a poor practice as it increases the chances of hash collision and generating the same hash two different times which is not result intended.

Correct Encryption Randomization Methods Are CSPRNG

//best functions
openssl_random_pseudo_bytes ()
mcrypt_create_iv()
MCRYPT_DEV_URANDOM flag
hash(‘sha256’)
hash(‘sha512’)

//best source
/dev/urand 

Modern PHP Cryptography
New Hotness

NOTE* uniqid() is still useful. The point here is that uniqid() is not good as random entropy for strong encryption. rand() should be replaced with mt_rand() for non-cryptographic random number generation, and neither should be used for encryption.

/dev/urand is also a non-blocking source of random bits which can help with performance.

Poor Cookie Expiration

Incorrectly expiring cookies leads to attack window exposure. Two common ways for this to happen are letting the cookie expire when the browser closes, or setting the expiration time to T-60 minutes or similar time window.

INCORRECT METHODS

1. setcookie("cookieID", "", 0);
2. setcookie("cookieID", "", 
                    time()-3600);

Example One expires the cookie when the browser closes. Example Two expires the cookie an hour ago. The problems are that the users time zone is unknown, and when the user might close the browser is unknown, therefore the window for attack remains open for an unknown period of time. Expire the cookie by setting it to one second past Unix epoch time, which will immediately expire the cookie.

CORRECT METHOD

setcookie("cookieID", "AppUser", 
            1, “/app”, 
            “www.test.com”, 
            true, true);

Explicitly set expiration time that is not a relative time. Explicitly turn SSL requirement on. Explicitly turn JavaScript access off.

Poor Session Management

This one involves never activating or setting any of the secure session settings available to the PHP developer. This includes things like,


Not using SSL landing page
Not logging in over SSL
Not validating session ID over SSL only
Not restricting key cookies to SSL only
Not regularly regenerating session IDs
Not explicitly destroying session IDs
Not marking valid session in $_SESSION
Not setting cookie paths
Not invoking tougher session ID hash
Not making use of expiration 

There are many excellent session management features available in PHP that contribute to and enforce better security if activated and implemented.

Overcoming Anti-Patterns: Patterns, Testing, Automation

Secure Development Mobile Apps Applications
Purchase on Amazon

The main goal of this book is to introduce and encourage techniques that will increase the speed and consistency of your software development process. Hopefully, these techniques will be applied transparently, from the beginning of development, as an integral part, and not retrofitted afterwards.

Test Driven Development will help ensure security measures are put in place right from the beginning. Bad habits are a major contributor to poor security implementations, and TDD can help create new habits.

Software Design patterns help create reusable code that is consistent from project to project. Build process automation tools help things not be forgotten and reduce the time spent setting things up. This leaves more time for planning and coding.