PHP advanced tutorial

In the first tutorial we saw some basic operations like creating objects, updating them and writing search requests using specifications. The second tutorial explained techniques like history management, indexing search and creating reports. This tutorial introduces even more advanced concepts in our DSL. We'll see how to write snowflakes and olap cubes.

Let's add a City aggregate root to our Football module:

root City
{
    string name;
}

and extend the Team and Player roots with a city field. We'll add

City *city;

to the Team root, and add a nullable city and a goals field to the Player root:

City? *city;
int goals;

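Now let's create the objects we'll work with. The script first deletes any existing players, teams and cities so that every run starts from the same data.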

use Football\Player;
use Football\Team;
use Football\TeamReport;
use Football\City;
use Football\PlayerGrid;

require_once 'platform/Bootstrap.php';

$players = Player::findAll();
$teams = Team::findAll();
$cities = City::findAll();

foreach($players as $p)
    $p->delete();

foreach($teams as $t)
    $t->delete();

foreach($cities as $c)
    $c->delete();

$city[] = new City(array('name' => 'Aldmarble'));
$city[] = new City(array('name' => 'Bellkeep'));
$city[] = new City(array('name' => 'Coldloch'));
$city[] = new City(array('name' => 'Dorwall'));
foreach($city as $c)
    $c->persist();

$team[] = new Team(array('name' => 'Old School', 'city' => $city[2]));
$team[] = new Team(array('name' => 'New School', 'city' => $city[0]));
foreach($team as $t)
    $t->persist();

$player[] = new Player(array('name' => 'Oliver Kahn', 'team' => $team[0], 'city' => $city[1], 'goals' => 2));
$player[] = new Player(array('name' => 'Davor Suker', 'nickname' => 'Suki', 'team' => $team[0], 'number' => 9, 'city' => $city[3], 'goals' => 270));
$player[] = new Player(array('name' => 'Zinedine Zidane', 'nickname' => 'zizou', 'team' => $team[0], 'number' => 10, 'city' => $city[1], 'goals' => 514));
$player[] = new Player(array('name' => 'Victor Valdes', 'team' => $team[1], 'goals' => 5, 'city' => $city[1]));
$player[] = new Player(array('name' => 'Cristiano Ronaldo', 'team' => $team[1], 'number' => 7, 'goals' => 400, 'city' => $city[2], 'nickname' => 'CR7'));
$player[] = new Player(array('name' => 'Lionel Messi', 'team' => $team[1], 'number' => 9, 'goals' => 350, 'city' => $city[3], 'nickname' => 'La Pulga'));
foreach($player as $p)
    $p->persist();

To use an olap cube we first need to create a snowflake. We'll create a snowflake called PlayerGrid on Player objects:

snowflake PlayerGrid from Player
{
    name; // if alias is not declared, name of the used property will be used as alias
    team.name as teamName; // we can navigate through reference properties
    team.city.name teamCity;
    city.name playerCity;
    goals;
}

This DSL creates an additional view on the Player root, so writing

echo Player::count() . '<br>';
$players = Player::findAll();
echo $players[0]->name;

will give the same result as writing

echo PlayerGrid::count() . '<br>';
$players = PlayerGrid::findAll();
echo $players[0]->name;

But trying to echo a player's nickname from PlayerGrid would give us an error, because we did not map that field. With the snowflake we have collected the Player fields that we will use in the olap cube.
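To make the flattening concrete, here is a minimal plain-PHP sketch of what the PlayerGrid mapping amounts to for a single player. It is not the code the platform generates: the nested arrays and the toPlayerGridRow function are hypothetical stand-ins for the persisted objects and the snowflake view.

```php
<?php
// Hypothetical in-memory stand-ins for persisted objects (not platform classes).
$coldloch = array('name' => 'Coldloch');
$bellkeep = array('name' => 'Bellkeep');
$oldSchool = array('name' => 'Old School', 'city' => $coldloch);

$player = array(
    'name' => 'Oliver Kahn',
    'nickname' => null,
    'team' => $oldSchool,
    'city' => $bellkeep,
    'goals' => 2,
);

// Flatten one player into the columns declared in the PlayerGrid snowflake.
function toPlayerGridRow(array $player)
{
    return array(
        'name' => $player['name'],
        'teamName' => $player['team']['name'],
        'teamCity' => $player['team']['city']['name'],
        // The player's city is nullable (City? *city), so guard against null.
        'playerCity' => $player['city'] === null ? null : $player['city']['name'],
        'goals' => $player['goals'],
    );
}

$row = toPlayerGridRow($player);
print_r($row);
```

The resulting row has no nickname key, which mirrors why reading nickname through PlayerGrid fails.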

cube PlayerCube from PlayerGrid
{
    dimension name;
    dimension teamName;
    dimension teamCity;
    dimension playerCity;
}

This olap cube is built from the PlayerGrid snowflake. The dimension keyword declares which dimensions we want to track, so this gives us a tesseract with four dimensions: name, teamName, teamCity and playerCity. Let's extend our cube by adding some facts. Facts are measures computed over our snowflake data using aggregates such as SUM, COUNT, AVERAGE or DISTINCT.

cube PlayerCube from PlayerGrid
{
    dimension name;
    dimension teamName;
    dimension teamCity;
    dimension playerCity;
    count name as count; // number of players
    average goals as averageGoals;
    sum goals as totalGoals;
}

Facts are calculated for every requested dimension. Let's calculate those facts on the teamName dimension:

$cube = new PlayerCube();
$dimensions = array($cube::teamName);
$facts = array($cube::count, $cube::averageGoals, $cube::totalGoals);
var_dump($cube->analyze($dimensions, $facts));

The PHP code above should output something like:

array(2)
  {
    [0] =>
      {
        ["teamName"] => "New School"
        ["count"] => 3
        ["averageGoals"] => 251.67
        ["totalGoals"] => 755
      }
    [1] =>
      {
        ["teamName"] => "Old School"
        ["count"] => 3
        ["averageGoals"] => 262
        ["totalGoals"] => 786
      }
  }
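The figures above can be checked by hand. The following plain-PHP sketch groups flattened rows by one dimension and computes the same three facts. It only illustrates the arithmetic and is not how the platform evaluates a cube: the $rows array and the analyzeRows function are assumptions made for this example, and the sort is there only to match the order of the sample output.

```php
<?php
// Flattened rows with the tutorial's data (hypothetical stand-in for PlayerGrid).
$rows = array(
    array('name' => 'Oliver Kahn', 'teamName' => 'Old School', 'playerCity' => 'Bellkeep', 'goals' => 2),
    array('name' => 'Davor Suker', 'teamName' => 'Old School', 'playerCity' => 'Dorwall', 'goals' => 270),
    array('name' => 'Zinedine Zidane', 'teamName' => 'Old School', 'playerCity' => 'Bellkeep', 'goals' => 514),
    array('name' => 'Victor Valdes', 'teamName' => 'New School', 'playerCity' => 'Bellkeep', 'goals' => 5),
    array('name' => 'Cristiano Ronaldo', 'teamName' => 'New School', 'playerCity' => 'Coldloch', 'goals' => 400),
    array('name' => 'Lionel Messi', 'teamName' => 'New School', 'playerCity' => 'Dorwall', 'goals' => 350),
);

// Group rows by one dimension and compute count, totalGoals and averageGoals.
function analyzeRows(array $rows, $dimension)
{
    $groups = array();
    foreach ($rows as $row) {
        $key = $row[$dimension];
        if (!isset($groups[$key])) {
            $groups[$key] = array($dimension => $key, 'count' => 0, 'totalGoals' => 0);
        }
        $groups[$key]['count']++;
        $groups[$key]['totalGoals'] += $row['goals'];
    }
    ksort($groups); // order by dimension value, only to match the sample output
    foreach ($groups as &$group) {
        $group['averageGoals'] = round($group['totalGoals'] / $group['count'], 2);
    }
    unset($group); // break the reference left over from the loop
    return array_values($groups);
}

$byTeam = analyzeRows($rows, 'teamName');
print_r($byTeam);
```

Calling analyzeRows($rows, 'playerCity') on the same rows performs the grouping by player city instead.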

If we used teamCity as the dimension instead of teamName, the facts would stay the same (only the grouping key would change), because in our example every team is in its own city. Grouping on player city

$cube = new PlayerCube();
$dimensions = array('playerCity');
$facts = array('count', 'averageGoals', 'totalGoals');
var_dump($cube->analyze($dimensions, $facts));

would give us more interesting results:

array(3)
  {
    [0]=>
      {
        ["playerCity"] => "Bellkeep"
        ["count"] => 3
        ["averageGoals"] => 173.67
        ["totalGoals"] => 521
      }
    [1]=>
      {
        ["playerCity"] => "Coldloch"
        ["count"] => 1
        ["averageGoals"] => 400
        ["totalGoals"] => 400
      }
    [2]=>
      {
        ["playerCity"] => "Dorwall"
        ["count"] => 2
        ["averageGoals"] => 310
        ["totalGoals"] => 620
      }
  }

We can see that players from Dorwall scored the most goals in total (620), while players from Coldloch scored the most goals on average (400). But what if we want to calculate facts for a query that is more complex than grouping on a dimension? We'll write a specification that filters the cube and then calculates facts for a specific team, team city or player city. Our new cube looks like this:

cube PlayerCube from PlayerGrid
{
    dimension name;
    dimension teamName;
    dimension teamCity;
    dimension playerCity;
    count name as count;
    average goals as averageGoals;
    sum goals as totalGoals;
    specification filter
        'it => (team == null || team.name == it.teamName)
        && (teamCity == null || teamCity.name == it.teamCity)
        && (string.IsNullOrEmpty(playerCity) || playerCity == it.playerCity)'
    {
        Team? team;
        City? teamCity;
        string? playerCity;
    }
}

The specification we wrote tolerates null arguments, so we can choose which arguments to filter by. Let's find out which cities the players training in Coldloch (the city of $city[2]) come from. Let's also count how many players come from each of those cities and calculate their average goals.

$cube = new PlayerCube();
$dimensions = array('playerCity');
$facts = array('count', 'averageGoals');
$specification = new filter(array('teamCity' => $city[2]));
var_dump($cube->analyze($dimensions, $facts, array(), $specification));

The output should be:

array(2)
  {
    [0] =>
      {
        ["playerCity"] => "Bellkeep"
        ["count"] => 2
        ["averageGoals"] => 258
      }
    [1]=>
      {
        ["playerCity"] => "Dorwall"
        ["count"] => 1
        ["averageGoals"] => 270
      }
  }

We can see that one player training in Coldloch lives in Dorwall, and two players training in Coldloch live in Bellkeep.
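The null-tolerant filter can be sketched in plain PHP as follows. This is an illustration under assumptions, not the generated specification class: matchesFilter is a hypothetical function that takes plain strings instead of Team and City objects, and the $rows array repeats the tutorial's data in flattened form. Only the count fact is computed here.

```php
<?php
// Flattened rows with the tutorial's data (hypothetical stand-in for PlayerGrid).
$rows = array(
    array('name' => 'Oliver Kahn', 'teamName' => 'Old School', 'teamCity' => 'Coldloch', 'playerCity' => 'Bellkeep'),
    array('name' => 'Davor Suker', 'teamName' => 'Old School', 'teamCity' => 'Coldloch', 'playerCity' => 'Dorwall'),
    array('name' => 'Zinedine Zidane', 'teamName' => 'Old School', 'teamCity' => 'Coldloch', 'playerCity' => 'Bellkeep'),
    array('name' => 'Victor Valdes', 'teamName' => 'New School', 'teamCity' => 'Aldmarble', 'playerCity' => 'Bellkeep'),
    array('name' => 'Cristiano Ronaldo', 'teamName' => 'New School', 'teamCity' => 'Aldmarble', 'playerCity' => 'Coldloch'),
    array('name' => 'Lionel Messi', 'teamName' => 'New School', 'teamCity' => 'Aldmarble', 'playerCity' => 'Dorwall'),
);

// A row passes when every supplied criterion matches; null criteria are ignored.
function matchesFilter(array $row, $teamName = null, $teamCity = null, $playerCity = null)
{
    return ($teamName === null || $teamName === $row['teamName'])
        && ($teamCity === null || $teamCity === $row['teamCity'])
        // An empty string is treated like null, as string.IsNullOrEmpty does.
        && ($playerCity === null || $playerCity === '' || $playerCity === $row['playerCity']);
}

// Keep only players whose team is based in Coldloch.
$filtered = array_values(array_filter($rows, function ($row) {
    return matchesFilter($row, null, 'Coldloch');
}));

// Count the filtered players per home city.
$countByCity = array_count_values(array_column($filtered, 'playerCity'));
ksort($countByCity);
print_r($countByCity);
```

Passing no criteria at all keeps every row, which is what makes the arguments optional.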

This was a simple example of how to use olap cubes in data analysis. Real business data is often composed of hundreds of dimensions, and analyzing it can be a hard task. Because olap cubes can be made up of more than three dimensions (a hypercube), they enable in-depth analysis and allow users to gain comprehensive and valuable business insights.
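As a small illustration of combining dimensions, the plain-PHP sketch below groups the tutorial's data on two dimensions at once. The groupByDimensions function and the $rows array are assumptions made for this example and do not reflect how the platform implements cubes. The composite key assumes that dimension values never contain the '|' character.

```php
<?php
// Flattened rows with the tutorial's data (hypothetical stand-in for PlayerGrid).
$rows = array(
    array('teamName' => 'Old School', 'playerCity' => 'Bellkeep', 'goals' => 2),
    array('teamName' => 'Old School', 'playerCity' => 'Dorwall', 'goals' => 270),
    array('teamName' => 'Old School', 'playerCity' => 'Bellkeep', 'goals' => 514),
    array('teamName' => 'New School', 'playerCity' => 'Bellkeep', 'goals' => 5),
    array('teamName' => 'New School', 'playerCity' => 'Coldloch', 'goals' => 400),
    array('teamName' => 'New School', 'playerCity' => 'Dorwall', 'goals' => 350),
);

// Group rows on any number of dimensions and compute count and totalGoals.
function groupByDimensions(array $rows, array $dimensions)
{
    $groups = array();
    foreach ($rows as $row) {
        $values = array();
        foreach ($dimensions as $dimension) {
            $values[$dimension] = $row[$dimension];
        }
        // Composite key; assumes dimension values contain no '|'.
        $key = implode('|', $values);
        if (!isset($groups[$key])) {
            $groups[$key] = $values + array('count' => 0, 'totalGoals' => 0);
        }
        $groups[$key]['count']++;
        $groups[$key]['totalGoals'] += $row['goals'];
    }
    ksort($groups); // stable, readable order for the example
    return array_values($groups);
}

$result = groupByDimensions($rows, array('teamName', 'playerCity'));
print_r($result);
```

Each extra dimension in the list splits the groups further, which is the idea behind analyzing a hypercube along several axes at once.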