Agile Security with PHP

How to incorporate security from the beginning

Posted by JD Glaser on August 24, 2016

Tutorial: Use the Specs Derived from the User Story to Drive Agile Security

Fast, secure, agile development with PHP

精神錯亂以同樣的方式做同樣的事情,並期待不同的結果 -Insanity is doing the same thing in the same way and expecting a different outcome.

Agile Security

It is absolutely possible to incorporate security, quickly and seamlessly, as part of the Agile process, from the very beginning of the development process.

Start with a User Story:
  1. Derive the acceptance criteria and data specs from the User Story card
  2. Design a SQL table with columns matching those specs
  3. Write unit tests for those column specs
  4. Write the PHP input class to make the tests pass

Here is a common scenario: a forum user needs to register for email updates. This tutorial walks through securing the user input data as part of a Test Driven Development (TDD) / Agile process.

User Story Card

Register for Email Updates

As a new forum user
I want to register my name and unique email
So that I will receive email updates

Now that we have the scenario to code, let's decide two things: the acceptance criteria and the data specifications for the input. This is a very important step that drives the rest of the security process. It determines the data columns of the SQL table, the size of the data, the business rules to filter on, and what type of filtering and escaping needs to be done.

The acceptance criteria drive the unit tests, which drive the PHP implementation class.

一步一步是好走 - One step at a time is good walking.

Back of Card: Acceptance Criteria / Data Specs

Form must include:

Field        Size / Type      Rules
First name   20 UTF8 chars    A-Za-z only, no numbers, no HTML tags accepted
Last name    20 UTF8 chars    Support any language, no numbers, no HTML tags accepted
Password     64 UTF8 chars    Support any input, only hash is stored. Password NOT SAVED
Email        60 UTF8 chars    Support English alpha/numeric only, UNIQUE, no HTML tags accepted
Age          Tiny Int         2-3 digits
Date                          SQL auto-assigned
* Store data in registration table. Use UTF8 column type. Add password confirmation.

Acceptance Criteria:

All form fields must be completed/validated client side before form submit is activated
All form fields must be validated or rejected by PHP before record insertion
Verification email sent to confirm address
Message of completed registration process and email sent returned
Welcome email sent upon successful completion of confirmation
Message of completed confirmation process returned


We now have the basis for securely validating and storing the required data. Providing this additional information took 5 extra minutes and provided what we need to know for specifying:

  1. MySQL table column types, sizes, and keys (unique email, 64 byte hash, 2-3 digit age)
  2. PDO connection type
  3. PHP filters and escape functions needed (SHA-256, email)
  4. PHP Implementation class functions
  5. Unit tests needed to confirm all this

The security code needed is now very clear because we are not just accepting anything. We know specifically what we are accepting, and we are not in the dark about how to proceed.
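One way to keep these decisions in a single place is to write the data specs down as a PHP array that both the tests and the validation code can read. The sketch below is only an illustration: the array name `$fieldSpecs`, its keys, and the helper `maxLengthFor()` are hypothetical and are not part of the code that follows in this tutorial.

```php
<?php
// Hypothetical spec table copied from the back of the story card.
// 'max' is a size in characters (digits for age); 'rule' is a reminder only.
$fieldSpecs = [
    'first_name' => ['max' => 20, 'rule' => 'A-Za-z only, no HTML tags'],
    'last_name'  => ['max' => 20, 'rule' => 'any language, no HTML tags'],
    'password'   => ['max' => 64, 'rule' => 'hash only, never the password'],
    'email'      => ['max' => 60, 'rule' => 'ASCII, unique, no HTML tags'],
    'age'        => ['max' => 3,  'rule' => 'digits only'],
];

// Look up the size limit for a field, or null if the field is not in the spec.
function maxLengthFor(array $specs, $field)
{
    return isset($specs[$field]) ? $specs[$field]['max'] : null;
}

echo maxLengthFor($fieldSpecs, 'email'), "\n"; // prints 60
```

A field that is missing from the table returns null, which is a convenient signal to reject it rather than guess a size.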

UTF-8 MySQL Table Construction

Based on the story specs, here is our MySQL table:


CREATE TABLE registration(
  registration_id   INT(11) NOT NULL AUTO_INCREMENT,
  first_name        VARCHAR(20) NOT NULL,
  last_name         VARCHAR(20) NOT NULL,
  password_hash     VARCHAR(64) NOT NULL,
  age               TINYINT(2) NOT NULL,
  email             VARCHAR(60) NOT NULL,
  date              DATE NOT NULL,
  PRIMARY KEY (registration_id),
  UNIQUE KEY email(email)
)  ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=1;
                        
                    

Even before we write tests, we should have a notion of what we are going to do. The code snippets below are likely implementations for filtering data based on what we have previously decided. The unit tests will be built around the output we expect from these snippets. With the chain for taking and storing input in view, I will write a test to verify these plans, then code the implementation to make the tests pass.

One of the first concrete things we have is data size. Data size affects storage. Large tables slow down over time, and search speed can be affected. Size limits are important for filtering, storing, and searching. Don't waste processing time on any of them if it can be avoided. The first task is to trim incoming data: only filter and store what is needed. In this case, not trimming incoming data means that strings longer than the table column size are truncated and data is lost. Not what we want.

For security, size limits also determine what an attacker has to work with. Attackers do not always need longer strings to structure an attack, but don't give them any more than you have to. Size limits increase attack difficulty and raise the protection level, in many cases eliminating a class of attack based on longer attack strings. Nor do we want the possibility of an attacker sending a huge input string as part of a denial of service attack. Just cut it at size.
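As a minimal sketch of "cut it at size", the two helpers below cut a string to a column size and report whether anything had to be cut. The function names `cutToSize()` and `fitsColumn()` are hypothetical, and the sketch assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper: cut input to the column size,
// counting characters rather than bytes.
function cutToSize($input, $maxChars)
{
    return mb_substr($input, 0, $maxChars, 'UTF-8');
}

// Hypothetical helper: true only when nothing would be cut.
function fitsColumn($input, $maxChars)
{
    return mb_strlen($input, 'UTF-8') <= $maxChars;
}

$oversized = str_repeat('a', 5000); // stand-in for a huge input string

var_dump(mb_strlen(cutToSize($oversized, 20), 'UTF-8')); // int(20)
var_dump(fitsColumn($oversized, 20));                    // bool(false)
var_dump(fitsColumn('Jonathan', 20));                    // bool(true)
```

Cutting first means every later filter only ever processes 20 characters, no matter how much was sent.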

Pre-Testing Thoughts for UTF-8 Input

//test for valid UTF8 characters
//using true param for strict mode
//reject if not encoded as expected 
//means tampering
//for mb_ functions
//use correct UTF8 parameters
//use strict mode  'true'
$utf8Encoded = mb_detect_encoding(
                $_POST['first_name'], 
                'UTF-8', 
                true); 
//cut to fit
//NAME_SIZE is the column size (20)
$firstName = mb_substr(
            $_POST['first_name'], 
            0, NAME_SIZE, 
            "UTF-8");

                    

Pre-Testing Thoughts for Input Validity

//constants are declared once at top level
//const is not allowed inside an if block
const NAME_SIZE  = 20;
const PASS_SIZE  = 300;
const EMAIL_SIZE = 60;

if($allFieldsHaveData===true) 
{ 
        
    ///////////////////////////////////////////
    //perform first level sanitization
    //manually validate/sanitize each element
    //per user story, first name allows 
    //only A-Z, a-z 
    //with 20 max characters
    //use correct UTF8 params for mb_substr()
    if(ctype_alpha($_POST['first_name']))
    {
        //cut data size to table column size
        //using UTF8 function
        $firstName = mb_substr($_POST['first_name'],
                        0, 
                        NAME_SIZE,
                        "UTF-8");
        $lastName = mb_substr($_POST['last_name'], 
                        0, 
                        NAME_SIZE, 
                        "UTF-8");

        //not for security
        //business decision 
        //disallow HTML tags
        $firstName = strip_tags($firstName);
        $lastName = strip_tags($lastName);
    }

    //test for number only input
    //valid range 13 - 120
    if(ctype_digit($_POST['age']) 
        && $_POST['age'] >= 13 
        && $_POST['age'] <= 120)
    {
        $age = $_POST['age'];
    }

    //allow a password input of up to 300 
    //someone may want a passphrase
    $passOrig = mb_substr($_POST['pass_orig'], 
                                0, 
                                PASS_SIZE, 
                                "UTF-8");
    $passConfirm = mb_substr($_POST['pass_confirm'],
                                 0, 
                                 PASS_SIZE, 
                                 "UTF-8");

    //there is no need to sanitize password
    //allow anything for larger key space
    //hashing it makes it sanitized 
    //only a-f, 0-9 characters
    //hashed result is 64 chars
    //regardless of input length                               
    $passHash  = hash('sha256', 
                    $passOrig);
    $confirmHash  = hash('sha256', 
                    $passConfirm);
    //UNSET PASSWORD
    //WE DON'T WANT IT
    unset($_POST['pass_orig']);
    unset($_POST['pass_confirm']);

    //compare orig and confirm hashes
    //if same, store $passHash
    if($passHash === $confirmHash){
        //password is confirmed
        //store hash
        //when user logs in
        //hash the password
        //compare to stored hash
    }
    
    //cut email to correct size
    //max = 60 characters
    //remove invalid chars
    //set utf8 parameter
    $email = filter_var(mb_substr($_POST['email'], 
                    0, 
                    EMAIL_SIZE, 
                    "UTF-8"), FILTER_SANITIZE_EMAIL);  
}
                    

Make sure to set up your PHP functions, such as mb_detect_encoding, mb_substr, and filter_var, correctly for UTF-8. Where a function takes an encoding parameter, set UTF-8 explicitly.
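To see why the explicit UTF-8 parameter matters, the short sketch below compares byte-oriented and character-oriented calls on the same name. It assumes the mbstring extension is loaded and that the file itself is saved as UTF-8; the variable names are only for illustration.

```php
<?php
$name = "喬納森"; // three characters, nine bytes in UTF-8

// Byte-oriented functions count bytes.
var_dump(strlen($name));                   // int(9)

// mb_ functions with an explicit encoding count and cut characters.
var_dump(mb_strlen($name, 'UTF-8'));       // int(3)
var_dump(mb_substr($name, 0, 2, 'UTF-8')); // string(6) "喬納"

// Strict detection returns false for bytes that are not valid UTF-8.
$broken = "\xC3\x28";
var_dump(mb_detect_encoding($broken, 'UTF-8', true)); // bool(false)
var_dump(mb_detect_encoding($name, 'UTF-8', true));   // string(5) "UTF-8"
```

A 20 character limit enforced with byte functions would cut this kind of name after six or seven characters, possibly in the middle of one.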

Agile Unit Testing with PHP

說話的好處不好...只是做到了 - To talk goodness is not good... only to do it is.

Now we start to write unit tests to verify our implementation.

The first thing should be to set up a test harness that verifies our UTF-8 compliance.

PHP/PDO UTF-8 Configuration Test

<?php
require_once "PHPUnit/Autoload.php";
require_once "registration.php";

class testUTF8Test extends PHPUnit_Framework_TestCase
{
  public function testConnectionForUTF8()
  {
    //verify PHP setup internally to use UTF-8 
    $this->assertEquals("UTF-8", 
                        mb_internal_encoding());
        
    //make new registration object
    $register = new Registration();

    //test what is returned from PDO connection
    //Register reportConnectionType should return
    //array('Variable_name' => 'character_set_client',
    //       'Value' => 'utf8');
    $utf8Test = $register->reportConnectionType();
    $this->assertContains("utf8" , $utf8Test);

    //test for returning character set
    //should return
    //array('Variable_name' => 'character_set_results',
    //       'Value' => 'utf8');
    $utf8Test = $register->reportResultType();
    $this->assertContains("utf8" , $utf8Test);
  }
}

                    

Here, in the first test, we are testing that our PHP environment is correctly using UTF-8 internally for its function processing. Very important. Look to the PHP Anti-Patterns article for more information on matching character encoding to function encoding.

Second, in the first PDO test, we are testing that the PDO connection gets opened and correctly takes UTF-8 characters as input. If this does not happen, Unicode characters will get mangled and stored incorrectly, destroying data forever. The PDO connection and the table column character set both have to be set to UTF-8 in order to save and retrieve Unicode correctly.

The second PDO test verifies that UTF-8 is correctly output from storage.


PHP Class To Make testUTF8 Pass

<?php

class Registration
{
    public $conn = null;

    //default arguments are placeholders so the
    //tests can call new Registration()
    //replace them with your own test settings
    public function __construct($host = 'localhost', 
                                $db   = 'test', 
                                $user = 'test', 
                                $pass = '')
    {
        try{   
            //build the DSN without line breaks
            //or spaces inside the string
            $dsn = "mysql:host={$host};"
                 . "dbname={$db};"
                 . "charset=utf8";
            $this->conn = new PDO($dsn, $user, $pass);  

            $this->conn->setAttribute(PDO::ATTR_ERRMODE, 
                                PDO::ERRMODE_EXCEPTION);
            $this->conn->setAttribute(
                                PDO::ATTR_DEFAULT_FETCH_MODE, 
                                PDO::FETCH_ASSOC);
        }  
        catch(PDOException $e){
            //make sure you are logging to file
            //not returning internal errors
            //to users and attackers
            $this->logError("Connection Error!: " 
                        . $e->getMessage());
            exit();
        }  
    } 

    public function logError($errorMSG){
        //implement logging
    }

    public function reportConnectionType(){

        $stmnt = $this->conn->prepare(
            "SHOW VARIABLES LIKE 'character_set_client'");
        $stmnt->execute();

        //should return
        //array('Variable_name' => 'character_set_client', 
        //       'Value' => 'utf8');
        return $stmnt->fetch();

    }

    public function reportResultType(){

        $stmnt = $this->conn->prepare(
            "SHOW VARIABLES LIKE 'character_set_results'");
        $stmnt->execute();

        //should return
        //array('Variable_name' => 'character_set_results', 
        //       'Value' => 'utf8');
        return $stmnt->fetch();

    }
} //end class
                    

Possible unit tests for securing input:

  • Test for Unicode names in/out of Database
  • Test for Email correctness
  • Test for Email uniqueness
  • Test for name size limits
  • Test for HTML tags stripped out
  • Test age for numbers only
  • Test for number range only for age
  • Test for password hash
  • Test for password stored / returned
  • Test for auto-generated insertion date
  • Test for attack string removal from name/email/age
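The age range test in this list could drive a validator along the following lines. This is a sketch under assumptions: the function name `validAge()` is hypothetical, and the 13 to 120 range is taken from the comment in the pre-testing age snippet rather than from the story card.

```php
<?php
// Hypothetical validator: digits only first, then a numeric range check.
function validAge($age, $min = 13, $max = 120)
{
    // Form input arrives as a string, and ctype_digit() expects a string.
    if (!is_string($age) || !ctype_digit($age)) {
        return false;
    }
    $value = (int) $age;
    return $value >= $min && $value <= $max;
}

var_dump(validAge('45'));    // bool(true)
var_dump(validAge('Forty')); // bool(false)
var_dump(validAge('500'));   // bool(false)
var_dump(validAge('-5'));    // bool(false)
```

Checking for digits before casting keeps strings such as '45abc' from being silently converted to 45.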

Unit Test for UTF-8 Input and Output

class testUTF8Test extends PHPUnit_Framework_TestCase
{
    public function testForUTF8FirstNamePreservation()
    {
    //make new registration object
    $register = new Registration();

    $asciiName = "Jonathan";
    $unicodeName = "喬納森";

    //test that name is stored and retrieved correctly
    $this->assertEquals($asciiName, 
                    $register->storeRetrieveFirstName(
                        $asciiName));
    $this->assertEquals($unicodeName, 
                    $register->storeRetrieveFirstName(
                        $unicodeName));   
    }

}
                    

PHP Implementation of UTF-8 Storage/Retrieval

For the sake of brevity, I took a shortcut here in order to illustrate the testing technique. I've omitted the fact that there are other columns that cannot be null, and treated this like a single column table. The column is first tested to confirm that Unicode is fully supported. In a later test we limit the input, per the user story, to A-Z, a-z. The technique still applies, and you can see the pattern for testing the rest of the columns.

Tests should also run a query to delete test data as you go.

When using stubs and mocks, or running longer test suites where the actual data connection is not desired, it still needs to be verified that the entire storage and retrieval chain works as expected.

PDO Prepared Statement Implementation

public function storeRetrieveFirstName($firstName)
{ 
    try{
        $query  = "INSERT 
                    INTO registration (first_name) 
                    VALUES (:first_name)";
        $params = array(':first_name' => $firstName);

        $stmt = $this->conn->prepare($query); 
        $success = $stmt->execute($params);
        //successful insertion of name into db
        if($success) {
            //now check for retrieval
            $query = "SELECT first_name 
                        FROM registration 
                        WHERE first_name = :first_name"; 
 
            $params = array(':first_name' => $firstName); 

            $stmt = $this->conn->prepare($query); 
            $stmt->execute($params); 
            $result = $stmt->fetch();

            return $result['first_name'];
        }
    }
    catch(PDOException $e){
        //log internal errors, do not echo them
        $this->logError("Query Error!: " 
                    . $e->getMessage());
    }
    //insert or select failed
    return false;
}

                    

Storing Unicode Records
[Figure: Unicode Data Stored. A screenshot of the records correctly stored as Unicode inside the registration table.]

For the rest of the tests, we'll write Single Responsibility unit tests for each validation we want. After that, code can be grouped into the desired functionality to achieve the user story implementation.

Validation Unit Tests

require_once "registration.php";
class testRegistrationInput extends PHPUnit_Framework_TestCase
{
  public function testNameInputs()
  {
    //make new registration object
    $register = new Registration();

    $firstName = "Jonathan";
    //test that first name <= 20
    //alphabetic only
    $this->assertEquals(true, 
                $register->validateFirstName(
                        $firstName));

    //joined with . so no line break
    //or spaces end up inside the value
    $longName = "Jonathanisusing"
              . "muchtoolongofaname";
    //verify long names fail
    $this->assertEquals(false, 
                $register->validateFirstName(
                        $longName));

    $numberName = "Jonathan123"; 
    //verify numeric names fail
    $this->assertEquals(false, 
                $register->validateFirstName(
                        $numberName));

    $unicode = '被淹死的人不會受到雨的困擾';
    $uniTags = '被受<script></script>到雨';
    //test unicode last name
    $this->assertEquals('UTF-8', 
                $register->ensureUnicodeLastName(
                        $unicode));  

    //test true - no HTML tags
    //nothing altered/removed
    $this->assertEquals($unicode, 
                $register->removeHTMLTags(
                        $unicode));  

    //test true - with HTML tags
    //should return altered string
    $this->assertNotEquals($uniTags, 
                $register->removeHTMLTags(
                        $uniTags)); 

    //form input arrives as a string
    //ctype_digit() expects a string
    $age = '45';
    $textAge = 'Forty';
    //test age digit only
    //should be true for digit
    $this->assertEquals(true, 
                $register->ageDigitOnly(
                        $age)); 

    //should be false for text
    $this->assertEquals(false, 
                $register->ageDigitOnly(
                        $textAge)); 

    //security test
    //test for HTML encoding
    //ENT_QUOTES also encodes single quotes
    $attack = "guest<script>alert('attacked')"
            . "</script>";
    $encoded = "guest&lt;script&gt;"
             . "alert(&#039;attacked&#039;)"
             . "&lt;/script&gt;";
    //should be true for encoding
    $this->assertEquals($encoded, 
                $register->escapeHTML(
                        $attack)); 

    //security test
    //test for hashing
    //password is long
    //or
    //password is attack
    $pass = "guest<script>"
          . "alert('attacked')</script>"; 
    $hash = "80dda42407e16a92efaae6847"
          . "001b73ff3e19e4153dfb8deeb6e6caaf273e8cf";   
    //should be true 
    //safe hash has no control chars
    $this->assertEquals($hash, 
                $register->hashPassword(
                        $pass)); 

    //security test
    //test for SQL escaping
    $lastName = "O'Reilly";
    $this->assertEquals($lastName, 
                $register->escapeDB(
                        $lastName)); 
  }
}

                    

Here we've got unit tests checking for both correct conditions, and error conditions. You need to handle both.

Passwords Can Be Anything

Don't limit passwords. Passwords need as much help as they can get from a keyspace perspective. If a user wants to submit a passphrase, let them. Since we are neither keeping nor processing the passwords, even an attack string is acceptable. Hash it and delete the original password. Only store and compare hashes. The world will be a much safer place if you do.
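A minimal sketch of "only store and compare hashes", using the same SHA-256 call as the rest of this tutorial. The function name `hashPassphrase()` and the sample passphrase are hypothetical. PHP also ships password_hash() and password_verify(), which add a per-password salt; that is a different technique from the plain SHA-256 used here and is not shown.

```php
<?php
// Hash the input; the result is always 64 lowercase hex characters.
function hashPassphrase($pass)
{
    return hash('sha256', $pass);
}

// At registration: keep the hash, drop the original.
$stored = hashPassphrase("correct horse battery staple");

// At login: hash the submitted value and compare the two hashes.
// hash_equals() compares in constant time.
$attempt = hashPassphrase("correct horse battery staple");

var_dump(strlen($stored));                // int(64)
var_dump(ctype_xdigit($stored));          // bool(true)
var_dump(hash_equals($stored, $attempt)); // bool(true)
```

Whatever the user typed, including markup or quotes, only hex digits ever reach the database column.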

Validation Implementations

class Registration {

//class constant: column size for names
const NAME_SIZE = 20;
   
public function validateFirstName($firstName)
{
    //cut data size to table column size
    //using UTF8 function
    //use correct utf8 parameter
    $correctName = mb_substr($firstName,
                    0, 
                    self::NAME_SIZE,
                    "UTF-8");
    //per user story
    //only A-Z, a-z allowed 
    if(ctype_alpha($correctName))
    {
        //only alpha of correct size
        //and matches orig
        if($firstName === $correctName)
            return true;
        else
            //too long
            return false;
    }
    //not alpha
    return false;
}

public function ensureUnicodeLastName($name){
    //test for UTF8 encoding
    //set strict to true
    //returns either
    //'UTF-8'
    //or false
    //if false, reject input
    return mb_detect_encoding($name, 'UTF-8', true);
}

public function removeHTMLTags($name){
    //returns the string with tags stripped
    //caller compares result to the original
    //if they differ, tags were present
    return strip_tags($name);
}

public function ageDigitOnly($age){
    //true if digits
    //if not digits, reject
    //cast so an int argument is tested
    //as its digits, not as a char code
    return ctype_digit((string)$age);
}

public function escapeHTML($text){
    
    //set correct UTF8 params
    //safe for output into HTML
    return htmlentities($text, 
                    ENT_QUOTES, 
                    'UTF-8');
}

public function hashPassword($pass){
    //a-f, 0-9 only returned
    return hash('sha256', $pass);  
}

public function escapeDB($lastName){
    
    try{
        //safely insert text
        $query  = "INSERT 
                    INTO registration (last_name)
                    VALUES (:last_name)";
        $params = array(':last_name' => $lastName);

        $stmt = $this->conn->prepare($query); 
        $success = $stmt->execute($params);
        //successful insertion of name into db
        if($success) {
            //now check for retrieval
            $query = "SELECT last_name 
                        FROM registration 
                        WHERE last_name = :last_name"; 
 
            $params = array(':last_name' => $lastName); 

            $stmt = $this->conn->prepare($query); 
            $stmt->execute($params); 
            $result = $stmt->fetch();

            //return original text
            return $result['last_name'];
        }
    }
    catch(PDOException $e){
        //log internal errors, do not echo them
        $this->logError("Query Error!: " 
                    . $e->getMessage());
    }
    //insert or select failed
    return false;
}
} //end class

                    

It's important to understand that if input does not pass your tests, your code should reject the inputted data. It's been tampered with, and not worth the risk. Legitimate users using the app correctly will supply the expected input, in the expected manner, and as the expected type. Attackers won't. Reject it.

Make It DRY - Don't Repeat Yourself

Now that the initial unit tests pass, we see that storeRetrieveFirstName() and escapeDB() use almost identical code, so we can refactor that into a single DRY method and retest. Now add a test for email.
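One possible shape for that refactor is sketched below. The function name `storeRetrieveColumn()` is hypothetical, and so that the example runs without a MySQL server it assumes an in-memory SQLite table (via the pdo_sqlite extension) as a stand-in for the registration table.

```php
<?php
// Hypothetical refactor: one body shared by both earlier methods.
function storeRetrieveColumn(PDO $conn, $column, $value)
{
    // Column names cannot be bound as parameters,
    // so only names from a fixed list are allowed.
    $allowed = ['first_name', 'last_name'];
    if (!in_array($column, $allowed, true)) {
        throw new InvalidArgumentException('Unknown column');
    }

    $insert = $conn->prepare(
        "INSERT INTO registration ({$column}) VALUES (:value)");
    $insert->execute([':value' => $value]);

    $select = $conn->prepare(
        "SELECT {$column} FROM registration WHERE {$column} = :value");
    $select->execute([':value' => $value]);
    $row = $select->fetch(PDO::FETCH_ASSOC);

    return $row === false ? null : $row[$column];
}

// Assumption: in-memory SQLite stands in for the MySQL table.
$conn = new PDO('sqlite::memory:');
$conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$conn->exec('CREATE TABLE registration (first_name TEXT, last_name TEXT)');

echo storeRetrieveColumn($conn, 'last_name', "O'Reilly"), "\n"; // prints O'Reilly
```

The value is always bound as a parameter; only the column name is interpolated, and only after it has matched the fixed list.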

Unit Test for Email

We could test for either ASCII or Unicode email addresses. Our user story decided on ASCII, so we can make use of filter_var(). Because filter_var() does not work with Unicode characters and mangles such names, we'll cover both cases. If Unicode email support is required, then a RegEx is needed. Something like:


$unicodeEmail = "André.Svensön@ünicøde.örg";

//assign the result to a new variable
//to preserve original
$filtered = preg_replace( 
 '/[^\pL\d\!\#\$\%\&\*\+\-\/\=\?\^\_\{\|\}\~\@\.\[\]]/u',
 '', 
 $unicodeEmail);
                

The expression above, while long, is quite simple. There are no groupings, just removal of illegal email characters one after the other. Characters in the list are retained. Any character not in the list is replaced with nothing, as specified by the empty single quotes.

The key elements are:

‘\pL’, ‘\d’, and ‘/u’

These specify that Unicode characters are to be processed on the right byte boundaries: \pL matches any Unicode letter, \d matches digits, and the /u modifier makes the pattern and the subject be treated as UTF-8.

IMPORTANT: Note that single quotes and backticks do pass through filter_var(). The Unicode RegEx removes them. This might be more secure, but it may break compliance, so they may need to be allowed. This demonstrates a tricky situation: which characters to allow? Whatever the choice, always secure the result by escaping afterwards.

These filters do not ensure the email string is safe. They simply remove illegal email characters. The email string MUST be PDO quoted before insertion into the database, and MUST be HTML escaped before output into an HTML page.
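The sketch below illustrates the point with a single quote, which FILTER_SANITIZE_EMAIL leaves in place. The function name `escapeForHtml()` and the sample address are hypothetical; the htmlentities() call uses the same parameters as escapeHTML() in the validation class.

```php
<?php
// Escape at the point of output, whatever the earlier filters did.
function escapeForHtml($text)
{
    return htmlentities($text, ENT_QUOTES, 'UTF-8');
}

$email = "o'reilly@example.com";

// The sanitize filter keeps the single quote.
$filtered = filter_var($email, FILTER_SANITIZE_EMAIL);
echo $filtered, "\n";                // prints o'reilly@example.com

// HTML escaping encodes it for safe output into a page.
echo escapeForHtml($filtered), "\n"; // prints o&#039;reilly@example.com
```

For the database side, binding the value through a prepared statement, as storeRetrieveFirstName() does, handles the same quote without any manual escaping.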

Email is difficult. The email protocol allows control characters that, unless escaped for context, could be harmful to other parsers.

Email Validity Test

require_once "registration.php";
class testRegistrationInput extends PHPUnit_Framework_TestCase
{
  public function testEmail()
  {
    //make new registration object
    $register = new Registration();

    //test valid email
    $email = "Jonathan@email.com";

    //ascii only
    //should come back unchanged
    //should return
    //"Jonathan@email.com"
    $this->assertEquals($email, 
                $register->asciiEmail(
                        $email)); 

    //test bad email
    //joined with . so no line break
    //or spaces end up inside the value
    $bad  = "Jonathan<script>"
          . "alert('attacked')</script>"
          . "@email.com";

    //should not match after stripping
    //should return
    //"Jonathanscriptalert'
    //    attacked'script@email.com"
    $this->assertNotEquals($bad, 
                $register->asciiEmail(
                        $bad));                  

    //test valid unicode email
    $unicode = "André.Svensön@ünicøde.örg";
 
    //should not change
    //should return
    //"André.Svensön@ünicøde.örg"
    $this->assertEquals($unicode, 
                $register->unicodeEmail(
                        $unicode));                  

    //test bad unicode email
    $bad = "André.<script>"
         . "alert('attacked')</script>"
         . ".Svensön@ünicøde.örg";

    //test unicode 
    //should not match after stripping
    //quotes are removed, / is kept
    //should return
    //"André.scriptalertattacked
    //      /script.Svensön@ünicøde.örg"
    $this->assertNotEquals($bad, 
                $register->unicodeEmail(
                        $bad));                  
  }
}

                
Email Validity Implementation

//class constant: column size for email
//declared inside the Registration class
const EMAIL_SIZE = 60;

public function asciiEmail($email){
    //test for email validity
    //cut email to correct size
    //max = 60 characters
    //remove invalid chars
    //does strip out <
    //leaves ' and ` in place
    //set utf8 parameter
    $email = filter_var(mb_substr(
                    $email, 
                    0, 
                    self::EMAIL_SIZE, 
                    "UTF-8"), 
                    FILTER_SANITIZE_EMAIL); 

    return $email; 
}

public function unicodeEmail($email){
    //test unicode email validity
    //cut email to correct size
    //max = 60 characters
    //remove invalid chars
    //strips out <
    //strips out ' and `
    //add to list if needed
    //pattern must stay on one line
    $email = preg_replace(
        '/[^\pL\d\!\#\$\%\&\*\+\-\/\=\?\^\_\{\|\}\~\@\.\[\]]/u', 
        '',
        mb_substr($email, 
                  0, 
                  self::EMAIL_SIZE, 
                  "UTF-8"));

    return $email; 
}

                

If the returned email string does not match the incoming email string, then either illegal control characters or unacceptable Unicode characters were removed, so reject it. Also, do not fall into complacency that this is safe now. It just means invalid email control characters were removed. Other control characters might remain. Safety depends on proper escaping for context when stored and displayed.
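The compare-and-reject step might look like the sketch below. The wrapper name `acceptAsciiEmail()` is hypothetical; it repeats the same cut and filter steps as asciiEmail() and assumes the 60 character limit from the story card.

```php
<?php
// Hypothetical wrapper: return the email only if cutting
// and filtering left it exactly as it came in.
function acceptAsciiEmail($email, $maxChars = 60)
{
    $cut      = mb_substr($email, 0, $maxChars, 'UTF-8');
    $filtered = filter_var($cut, FILTER_SANITIZE_EMAIL);

    // Any difference means something was cut or stripped: reject.
    return ($filtered === $email) ? $filtered : false;
}

var_dump(acceptAsciiEmail("Jonathan@email.com"));
// string(18) "Jonathan@email.com"
var_dump(acceptAsciiEmail("Jonathan<script>@email.com"));
// bool(false)
```

An accepted value still has to be escaped for its context when it is stored or displayed.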

Again, email is difficult, there is no easy answer, and a secure solution depends on the email formats your company allows. This most likely means a Regular Expression customized to parse an allowable format.

If you are just storing someone's address, then the validity will be proved if they respond correctly to a mail sent to the given address.

Test. Implement. Refactor. Repeat.

In Conclusion

Based on our tests and initial implementations, we have a new handle on our data, the types and the formats. We now know what it is specifically, and how to treat it. The data is not 'vague'. We've also ensured a correct path for storing and retrieving these types, right from the beginning.

I hope this gives insight into how to incorporate secure development using TDD from the very beginning, without slowing the fast pace of an Agile process.

Many other PHP and JavaScript techniques for writing secure code, including the complete details for secure session management, client side validation, as well as securing Google Maps/Twitter/Facebook social apps, are available in my book, Secure Development for Mobile Apps.