PHP type safety

Why?

Type safety can be an invaluable feature for preserving data integrity with regard to data types. Although the DSL platform supports several type-safe, strictly typed languages, PHP usually outmatches most of them in sheer simplicity, wide acceptance, ease of setup and the supporting ecosystem. However, using PHP comes with some tradeoffs, one of them being an inconsistent type system. We'd like to improve upon PHP's type system by adding limited type safety while still retaining its simplicity.

Current state

Although there have been several proposals to integrate static type safety into upcoming PHP versions, such a change is unlikely because it would require a major overhaul of the underlying type system. To illustrate the scope of the changes needed, it suffices to say that PHP variables have no notion of type: internally they are represented as containers holding a value or a reference. Therefore, it's not reasonable to expect type safety to come to the language anytime soon.

With that out of the way, let's take a look at some of the existing ways to enforce type safety.

Existing tools

Static analysis

There are several tools providing static analysis, some of them built into IDEs and gaining wider support. Although they can be valuable, some PHP features make a foolproof analysis impossible. Consider the "variable variables" feature:

function set($var) {
    $a = 10;
    $b = 'b';
    is_string($$var);
}

Static analysis will fall short here, as the value contained in $var is known only at runtime.
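To make the point concrete, here is a small runnable variant of the snippet above. The function name typeOfLocal is ours, introduced only for illustration; it returns the type of whichever local variable the caller names.

```php
<?php
// Hypothetical helper: reports the runtime type of a local variable
// that is selected by name through a variable variable.
function typeOfLocal($var) {
    $a = 10;
    $b = 'b';
    // $$var resolves to $a or $b depending on the runtime value of $var
    return gettype($$var);
}

echo typeOfLocal('a'), "\n"; // integer
echo typeOfLocal('b'), "\n"; // string
```

The same line of code yields a different type on each call, so the type of $$var cannot be determined by inspecting the function alone.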

Type hinting

Type hinting is a valuable feature, albeit with serious limitations: in PHP 5 it is not possible to hint scalar types such as integer, string or boolean:

// fails, as "string" is treated as a class name rather than a scalar type
function setName(string $name) { ... }

Calling setName with a string argument will result in a somewhat obscure error:

Catchable fatal error:  Argument 1 passed to setName() must be an instance of string, string given...

The first "string" in the message refers to a class name, and the second "string" is the scalar type of the value that was passed.

Further, it is not possible to hint the types of elements in an array. The best a type hint can do is enforce the array type itself:

function addIntegers(array $integers) { ... }

Type hinting will do nothing to prevent non-integer values in the array:

addIntegers(array(true, -1, array()));
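As a quick check of this behaviour, the sketch below uses a hypothetical function, nonIntegerKeys, that carries the same array hint and simply reports which elements are not integers. The call is accepted without any error, which shows that the hint never inspects the elements.

```php
<?php
// Hypothetical function with the same "array" hint as addIntegers.
// The hint accepts any array; the elements are only checked by hand below.
function nonIntegerKeys(array $integers) {
    $bad = array();
    foreach ($integers as $key => $element) {
        if (!is_int($element)) {
            $bad[] = $key; // remember the position of the offending element
        }
    }
    return $bad;
}

// Accepted by the type hint even though only one element is an integer
$bad = nonIntegerKeys(array(true, -1, array()));
echo implode(', ', $bad), "\n"; // 0, 2
```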

Other methods

There are several other methods of providing some level of type safety, such as specifying types in annotations, hinting types in docblock comments or autoboxing scalar types. Most of them have serious disadvantages, or are lacking in some aspect.

Type safety solution

Currently, the only way to provide strict type safety is to handle it in code. This means manually writing type-checking and error-handling code. Let's assume we have a class with one property strictly defined as an array of integer values. Here's one way of writing a setter method that ensures only correct values are assigned to the property:

function setNumbers($value) {
    if (!is_array($value)) {
        throw new InvalidArgumentException('Value is not an array');
    }
    foreach ($value as $val) {
        if (is_int($val)) {
            // element is an integer
        } elseif (is_string($val) && filter_var($val, FILTER_VALIDATE_INT) !== false) {
            // element is a string that can be converted to integer
        } else {
            throw new InvalidArgumentException('Element cannot be converted to int');
        }
    }
}

This method handles type safety for just one property. With more classes or properties, the code base grows very fast, resulting in more maintenance and more bugs. Let's assume that the requirements change and setNumbers should accept only an array of float elements. Besides worrying about changing the data implementation, we now need to refactor the setNumbers method to allow float values.
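As an illustration of that refactoring, here is a sketch of what the float version of the check could look like. The name normalizeFloats and the decision to return the converted array are our assumptions; the point is only that every rule has to be touched by hand.

```php
<?php
// Hypothetical float variant of the integer check shown earlier.
// Returns the input converted to an array of floats, or throws.
function normalizeFloats($value) {
    if (!is_array($value)) {
        throw new InvalidArgumentException('Value is not an array');
    }
    $result = array();
    foreach ($value as $val) {
        if (is_float($val) || is_int($val)) {
            // numbers are accepted as-is and widened to float
            $result[] = (float) $val;
        } elseif (is_string($val) && filter_var($val, FILTER_VALIDATE_FLOAT) !== false) {
            // numeric strings are converted explicitly
            $result[] = filter_var($val, FILTER_VALIDATE_FLOAT);
        } else {
            throw new InvalidArgumentException('Element cannot be converted to float');
        }
    }
    return $result;
}

var_dump(normalizeFloats(array(1, 2.5, '3.5')));
```

Compared with the integer version, both the accepted types and the filter constant changed, and the same edit would have to be repeated in every setter that deals with this property type.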

Type-checking code is derived from a small set of rules, is highly repetitive and is bound to change as the underlying types change. That makes it a good candidate for code generation, and that's one of the features the DSL platform provides.

DSL platform approach

As it is not possible to implement type safety in the PHP language as a whole, we should make it work where it matters the most, and that is data integrity. Ideally, all data reaching our data storage should be type safe. To enforce that, the DSL platform generates PHP classes that represent the models and have type checking embedded. That means all the repetitive type-checking code is generated from rules defined in the DSL. When the DSL changes, the underlying PHP classes are regenerated with type-checking code that reflects those changes.

Defining types

We'll take a look at ways of defining types in the DSL platform, and how they are reflected in the type safety provided by the generated PHP classes.

Scalar types

This DSL defines an object named "Foo" containing three scalar properties:

module TypeSafe
{
    value Foo {
        string email;
        int score;
        bool isFinished;
    }
}

The platform will use the DSL to generate a TypeSafe\Foo class that we can use in PHP:

$val = new TypeSafe\Foo();
$val->email = 'some@example.com';
$val->score = 42;
$val->isFinished = false;

Assigning values to properties is done through the magic __set method, which calls the appropriate setter method (e.g. setEmail). Each setter holds specific type-checking logic derived from the rules set in the DSL. A setter method will throw an exception if you try to assign a value of an invalid type. Here are a few such invalid assignments:

// following assignments throw InvalidArgumentException
$val->email = array();
$val->email = false;
$val->isFinished = 2;
$val->score = new TypeSafe\Foo();
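To show the mechanism, here is a hand-written sketch of how a __set method can dispatch to typed setters. The class FooSketch is hypothetical and much simpler than anything the platform generates; the actual generated code is not shown in this post and may differ.

```php
<?php
// Hypothetical stand-in for a generated class, written by hand for illustration.
class FooSketch {
    private $email = '';
    private $score = 0;

    // Called by PHP whenever an inaccessible property is written from outside
    public function __set($name, $value) {
        $setter = 'set' . ucfirst($name);
        if (!method_exists($this, $setter)) {
            throw new InvalidArgumentException("Unknown property: $name");
        }
        $this->$setter($value);
    }

    // Called by PHP whenever an inaccessible property is read from outside
    public function __get($name) {
        if (!property_exists($this, $name)) {
            throw new InvalidArgumentException("Unknown property: $name");
        }
        return $this->$name;
    }

    public function setEmail($value) {
        if (!is_string($value)) {
            throw new InvalidArgumentException('email must be a string');
        }
        $this->email = $value;
    }

    public function setScore($value) {
        if (!is_int($value)) {
            throw new InvalidArgumentException('score must be an integer');
        }
        $this->score = $value;
    }
}

$foo = new FooSketch();
$foo->email = 'some@example.com'; // routed through setEmail
try {
    $foo->email = array();        // rejected by setEmail
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";  // email must be a string
}
```

Because the properties are private, every outside assignment goes through __set, so there is no way to bypass the checks with plain property syntax.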

Type-safe conversion

Some types allow arguments of a different type to be converted, provided the argument holds a reasonable value:

$val->score = "42";

This assignment will convert the string "42" to an integer value and assign it to the score property. It's important to note that these conversions don't rely on PHP's implicit conversions, since those are prone to errors. For instance, converting an empty string to an integer yields 0, which is usually not what was intended. In contrast, assigning an empty string to the score property will properly throw an exception.
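The difference between the two kinds of conversion can be sketched with PHP's filter_var function. The helper toIntStrict below is hypothetical and only illustrates the idea; we are not claiming the generated setters are implemented this way.

```php
<?php
// Hypothetical strict conversion: accepts integers and integer-like strings only.
function toIntStrict($value) {
    if (is_int($value)) {
        return $value;
    }
    if (is_string($value)) {
        $converted = filter_var($value, FILTER_VALIDATE_INT);
        if ($converted !== false) { // compare with false so that "0" is accepted
            return $converted;
        }
    }
    throw new InvalidArgumentException('Value cannot be converted to int');
}

var_dump((int) '');          // int(0): the implicit cast silently invents a value
var_dump(toIntStrict('42')); // int(42)

try {
    toIntStrict('');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // Value cannot be converted to int
}
```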

Default values

When creating a new object instance, unassigned properties will assume reasonable default values, e.g. an unassigned string property will default to an empty string. The following two statements are equivalent:

$val = new TypeSafe\Foo(array(
    'email' => '',
    'score' => 0,
    'isFinished' => false
));

$val = new TypeSafe\Foo();

Both statements construct objects with equal property values.
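One simple way to get this behaviour is to merge the constructor argument over a table of defaults. The class FooDefaultsSketch below is a hypothetical illustration of that idea, with type checks left out for brevity; it is not the platform's generated code.

```php
<?php
// Hypothetical stand-in for a generated class; type checks omitted for brevity.
class FooDefaultsSketch {
    public $email;
    public $score;
    public $isFinished;

    public function __construct(array $data = array()) {
        // Assumed defaults: empty string, zero and false
        $defaults = array('email' => '', 'score' => 0, 'isFinished' => false);
        // Values passed in override the defaults; missing keys keep them
        $data = array_merge($defaults, $data);
        $this->email = $data['email'];
        $this->score = $data['score'];
        $this->isFinished = $data['isFinished'];
    }
}

$implicit = new FooDefaultsSketch();
$explicit = new FooDefaultsSketch(array(
    'email' => '',
    'score' => 0,
    'isFinished' => false
));
var_dump($implicit == $explicit); // bool(true)
```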

Complex types

Setter methods for complex types, such as objects or arrays of objects, also contain type-checking code similar to that for scalar properties.

module Auth
{
    root User {
        string username;
        Grant grant;
    }
    value Grant {
        string action;
        int level;
    }
}

The generated PHP code will enforce that the value assigned to the "grant" property is an instance of Auth\Grant:

$grant = new Auth\Grant();
$grant->action = 'read';
$grant->level = 10;

$user = new Auth\User();
$user->grant = $grant;

Collection types

We can specify each type as a collection. Let's rewrite the previous example so that multiple grants can be assigned to each user:

module Auth
{
    root User {
        string username;
        Grant[] grants;
    }
    value Grant {
        string action;
        int level;
    }
}

Now we can assign an array of Auth\Grant objects to the User's "grants" property. Assigning an array containing anything other than Auth\Grant instances will throw an exception.

$readGrant = new Auth\Grant();
$readGrant->action = 'read';
$readGrant->level = 10;

$writeGrant = new Auth\Grant();
$writeGrant->action = 'write';
$writeGrant->level = 10;

$user = new Auth\User();
$user->grants = array($readGrant, $writeGrant);

// throws an InvalidArgumentException because the third element is a boolean
$user->grants = array($readGrant, $writeGrant, false);
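The check behind such a collection setter can be sketched in a few lines. GrantSketch and checkGrants are hypothetical names used for illustration only; they stand in for the generated Auth\Grant class and its setter logic.

```php
<?php
// Hypothetical stand-in for the generated Auth\Grant class
class GrantSketch {
    public $action = '';
    public $level = 0;
}

// Hypothetical collection check: every element must be a GrantSketch instance
function checkGrants($value) {
    if (!is_array($value)) {
        throw new InvalidArgumentException('Value is not an array');
    }
    foreach ($value as $index => $item) {
        if (!($item instanceof GrantSketch)) {
            throw new InvalidArgumentException("Element at index $index is not a grant");
        }
    }
    return $value;
}

$grants = checkGrants(array(new GrantSketch(), new GrantSketch()));
echo count($grants), "\n"; // 2

try {
    checkGrants(array(new GrantSketch(), false));
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // Element at index 1 is not a grant
}
```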

Defining nullable types

Sometimes we want a value to be optional. We can define optional (or nullable) properties by marking them with a question mark in the DSL. The resulting type-checking code will then allow null values.

Nullable types

Let's make the "grants" property from the previous example nullable:

root User {
    string username;
    Grant[]? grants;
}

Now it's ok to set the property to null.

$user = new Auth\User();
$user->grants = null;

If the property were not defined as nullable, this statement would throw an exception.

Nullable elements in collections

Elements in an array can be defined as nullable by placing a question mark after the element type:

root User
{
    Array<Grant?> grants; // written in the alternative collection syntax, as the short form is hard to read
}

This allows arrays with null values:

$user->grants = array($readGrant, null, $writeGrant);
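Both kinds of nullability can be captured in one sketch. The function checkCollection and its two flags are hypothetical; in generated code the decision would presumably be fixed by the DSL rather than passed in as arguments.

```php
<?php
// Hypothetical stand-in for the generated Auth\Grant class
class GrantStub {}

// Hypothetical check covering a nullable property and nullable elements
function checkCollection($value, $allowNull, $allowNullElements) {
    if ($value === null) {
        if ($allowNull) {
            return null; // the property itself may be null
        }
        throw new InvalidArgumentException('Property is not nullable');
    }
    if (!is_array($value)) {
        throw new InvalidArgumentException('Value is not an array');
    }
    foreach ($value as $index => $item) {
        if ($item === null && $allowNullElements) {
            continue; // null elements are explicitly allowed
        }
        if (!($item instanceof GrantStub)) {
            throw new InvalidArgumentException("Invalid element at index $index");
        }
    }
    return $value;
}

// Nullable elements: accepted
$mixed = checkCollection(array(new GrantStub(), null), false, true);
echo count($mixed), "\n"; // 2

// Same array without nullable elements: rejected
try {
    checkCollection(array(new GrantStub(), null), false, false);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // Invalid element at index 1
}
```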

Benefits

The approach we showcased allows writing native PHP with type-safe objects. To summarize, this offers the following benefits:

  • less code to write and manage
  • guaranteed type-safe data
  • type-safety code that updates along with changes in the model

It should be noted that the type safety embedded in generated classes is not a replacement for proper validation. Business rules should still be enforced by manually writing validation code in PHP. Type safety is a valuable tool that simplifies that code by handing the type-checking aspect over to the platform. Additionally, it is possible to define validation rules and logic inside the DSL, but those concepts will be covered in another post.