Use ArangoDB in Symfony without an external bundle



The official documentation on ArangoDB usage in Symfony is [literally from 2013](https://www.arangodb.com/2013/03/getting-started-with-arangodb-and-symfony-part1/) and hence targets a pretty old Symfony version. This blog post will cover adding minimal ArangoDB support by hand. The actual database protocol and interaction will be handled by [ArangoDBClient](https://github.com/arangodb/arangodb-php).

## Plan

The goal of this post is not to draft a fully integrated bundle for Symfony but merely to use ArangoDB as an example to learn how to add support for a more exotic database system to the framework. Hence I might explain a few concepts that you are already well aware of if you have ever worked with other database abstraction layers. In order to keep things cozy, I'll model the directory structure after Doctrine:

* The Entity directory will contain classes representing different database objects: think of an entity as a single row in a table or a single document in a collection. Those entities should contain relatively little domain logic.
* The Repository directory will contain repository classes for the different collections. We will be using inheritance to share the Arango-specific logic between the different repositories. For this, we will be using an AbstractArangoRepository.
* Finally, we will be using a single ArangoDatabase service class. This gives us a central point to inject database credentials, a single point to interface with the ArangoDBClient, and the assurance that we will only open a single connection and not, say, one per repository.

So execute

```shell
composer require triagens/arangodb
```

and start coding.

## Entities

Not much to say about those: entities are mainly data transfer objects (DTOs) that represent documents from the database in a more strongly typed way. Here are two example entities:
```php
<?php declare(strict_types=1);

namespace App\Entity;

class Source
{
    private string $id;
    private string $name;

    public function __construct(string $id, string $name)
    {
        $this->id = $id;
        $this->name = $name;
    }

    public function getId(): string
    {
        return $this->id;
    }

    public function getName(): string
    {
        return $this->name;
    }
}
```
which will represent the source of a malware sample (Hatching, MalShare, ...), and the following, which will represent a concrete malware sample:
```php
<?php declare(strict_types=1);

namespace App\Entity;

class Sample
{
    private string $id;
    private int $crc32;
    private string $md5;
    private string $sha1;
    private string $sha256;
    private int $size;

    public function __construct(string $id, int $crc32, string $md5, string $sha1, string $sha256, int $size)
    {
        $this->id = $id;
        $this->crc32 = $crc32;
        $this->md5 = $md5;
        $this->sha1 = $sha1;
        $this->sha256 = $sha256;
        $this->size = $size;
    }

    public function getId(): string
    {
        return $this->id;
    }

    public function getCrc32(): int
    {
        return $this->crc32;
    }

    // more getters here
}
```
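As an aside, on newer PHP versions such a DTO can be written far more compactly. The following is a minimal sketch, assuming PHP 8.1 or later (for readonly promoted properties); the class name `SourceDto` and the example id are made up for illustration and are not used elsewhere in this post:

```php
<?php declare(strict_types=1);

// Sketch only: needs PHP 8.1+ for readonly promoted constructor properties.
// SourceDto is a hypothetical name chosen to avoid clashing with App\Entity\Source.
final class SourceDto
{
    public function __construct(
        public readonly string $id,
        public readonly string $name,
    ) {
    }
}

// ArangoDB document ids have the form "collection/key"; this one is invented.
$source = new SourceDto('sources/hatching', 'Hatching');
echo $source->name, PHP_EOL;
```

The trade-off is that the properties become part of the public API instead of being hidden behind getters, which may or may not suit your code base.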
## Service Class

In order to allow the database credentials to be specified through the environment, we need to adjust config/services.yaml:
```yaml
services:
    App\Arango\ArangoDatabase:
        arguments:
            $protocol: '%env(ARANGO_PROTOCOL)%'
            $hostname: '%env(ARANGO_HOSTNAME)%'
            # environment variables are strings; the int: processor casts the port
            $port: '%env(int:ARANGO_PORT)%'
            $databaseName: '%env(ARANGO_DATABASE)%'
            $username: '%env(ARANGO_USERNAME)%'
            $password: '%env(ARANGO_PASSWORD)%'
```
This will allow for an .env.local file like the following:
```
ARANGO_PROTOCOL=tcp
ARANGO_HOSTNAME=127.0.0.1
ARANGO_PORT=8529
ARANGO_DATABASE=dbname
ARANGO_USERNAME=root
ARANGO_PASSWORD=123456
```
The service class itself is relatively simple. Its main purpose is to expose the database connection to repositories through the getConnection function:
```php
<?php declare(strict_types=1);

namespace App\Arango;

use ArangoDBClient\Connection;
use ArangoDBClient\ConnectionOptions;

class ArangoDatabase
{
    private Connection $connection;

    public function __construct(
        string $hostname,
        int $port,
        string $databaseName,
        string $username,
        string $password,
        string $protocol = 'tcp',
    ) {
        $connectionOptions = [
            ConnectionOptions::OPTION_DATABASE => $databaseName,
            // e.g. tcp://127.0.0.1:8529
            ConnectionOptions::OPTION_ENDPOINT => sprintf('%s://%s:%d', $protocol, $hostname, $port),
            ConnectionOptions::OPTION_AUTH_TYPE => 'Basic',
            ConnectionOptions::OPTION_AUTH_USER => $username,
            ConnectionOptions::OPTION_AUTH_PASSWD => $password,
            ConnectionOptions::OPTION_CONNECTION => 'Close',
            ConnectionOptions::OPTION_RECONNECT => true,
        ];
        $this->connection = new Connection($connectionOptions);
    }

    public function getConnection(): Connection
    {
        return $this->connection;
    }
}
```
A tiny side note: I found the documentation on allowed protocols in the ConnectionOptions::OPTION_ENDPOINT field a bit confusing. But the tl;dr is: use http+ssl with an explicitly specified port of 443 if you want to go through https.

## Repositories

In line with the "Entities" section, we will specify two repositories. One for sources:
```php
<?php declare(strict_types=1);

namespace App\Repository;

use App\Arango\AbstractArangoRepository;
use App\Entity\Source;
use ArangoDBClient\Document;

/**
 * @template-extends AbstractArangoRepository<Source>
 */
class SourceRepository extends AbstractArangoRepository
{
    protected function getCollectionName(): string
    {
        return 'sources';
    }

    protected function constructEntity(Document $document): Source
    {
        return new Source($document->getId(), $document->get('name'));
    }
}
```
And one for samples:
```php
<?php declare(strict_types=1);

namespace App\Repository;

use App\Arango\AbstractArangoRepository;
use App\Entity\Sample;
use ArangoDBClient\Document;

/**
 * @template-extends AbstractArangoRepository<Sample>
 */
class SampleRepository extends AbstractArangoRepository
{
    protected function getCollectionName(): string
    {
        return 'samples';
    }

    protected function constructEntity(Document $document): Sample
    {
        return new Sample(
            $document->getId(),
            $document->get('crc32'),
            $document->get('md5'),
            $document->get('sha1'),
            $document->get('sha256'),
            $document->get('size'),
        );
    }
}
```
This repository class would also be the place to implement more specific query functions like, say, counting all samples for a given source. Anticipating the rawAql method from the next section, which allows execution of arbitrary AQL statements and returns the raw results, this function accesses the edge collection (between sources and samples) for counting:
```php
// inside SampleRepository; needs `use App\Entity\Source;` at the top of the file
public function countBySource(Source $source): int
{
    $ret = $this->rawAql(
        <<<AQL
FOR edge IN sample_from_source
FILTER edge._to == @source
COLLECT WITH COUNT INTO cnt
RETURN cnt
AQL,
        ['source' => $source->getId()],
    );

    return $ret[0];
}
```
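Note how countBySource keeps the source id out of the query string by passing it as the `@source` bind parameter. AQL has a second kind of bind parameter for collection names, written `@@name` in the query and supplied under the key `@name`. The following is a minimal sketch of how a lookup by ids could be assembled that way; `buildFindByIdsQuery` is a hypothetical helper, not part of ArangoDBClient, and it only builds the arrays you would hand to a statement:

```php
<?php declare(strict_types=1);

/**
 * Hypothetical helper: returns query text and bind variables for a lookup
 * by ids without interpolating anything into the AQL string.
 *
 * @param string[] $ids
 * @return array{query: string, bindVars: array<string, mixed>}
 */
function buildFindByIdsQuery(string $collection, array $ids): array
{
    return [
        // @@collection is a collection bind parameter, @ids a value bind parameter
        'query' => 'FOR row IN @@collection FILTER row._id IN @ids RETURN row',
        'bindVars' => [
            '@collection' => $collection,   // note the single leading @ in the key
            'ids' => array_values($ids),    // re-index so it is sent as a JSON array
        ],
    ];
}

// The ids below are invented for illustration.
$spec = buildFindByIdsQuery('samples', ['samples/1', 'samples/2']);
```

Whether this is worth it over a plain sprintf is a matter of taste as long as the collection name comes from your own code and never from user input.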
## Abstract Base Repository

The last missing piece is the abstract repository, which looks like the following:
```php
<?php

declare(strict_types=1);

namespace App\Arango;

use ArangoDBClient\Collection;
use ArangoDBClient\CollectionHandler;
use ArangoDBClient\Document;
use ArangoDBClient\DocumentHandler;
use ArangoDBClient\Statement;

/**
 * @template T
 */
abstract class AbstractArangoRepository
{
    private ArangoDatabase $arangoDatabase;
    private Collection $collectionId;
    private CollectionHandler $collectionHandler;
    private DocumentHandler $documentHandler;

    public function __construct(ArangoDatabase $arangoDatabase)
    {
        $this->arangoDatabase = $arangoDatabase;
        $this->collectionHandler = new CollectionHandler($arangoDatabase->getConnection());
        $this->collectionId = $this->collectionHandler->get(new Collection($this->getCollectionName()));
        $this->documentHandler = new DocumentHandler($arangoDatabase->getConnection());
    }

    abstract protected function getCollectionName(): string;

    /**
     * Maps a raw document to the entity type of the concrete repository.
     *
     * @return T
     */
    abstract protected function constructEntity(Document $document): object;

    /**
     * Note: the document handler throws if the document cannot be fetched.
     *
     * @return ?T
     */
    public function get(string $id): ?object
    {
        return $this->constructEntity($this->documentHandler->get($this->getCollectionName(), $id));
    }

    /**
     * @param string[] $ids
     * @return T[]
     */
    public function findByIds(array $ids): array
    {
        return $this->aql(
            sprintf('FOR row IN %s FILTER row._id IN @ids RETURN row', $this->getCollectionName()),
            ['ids' => $ids]
        );
    }

    /**
     * @return T[]
     */
    public function findAll(): array
    {
        return array_map(
            fn (Document $document) => $this->constructEntity($document),
            $this->collectionHandler->all($this->collectionId)->getAll()
        );
    }

    /**
     * Runs a query that returns documents and maps each one to an entity.
     *
     * @return T[]
     */
    public function aql(string $query, array $bindVars): array
    {
        return array_map(
            fn (Document $document) => $this->constructEntity($document),
            $this->rawAql($query, $bindVars)
        );
    }

    /**
     * Runs a query and returns the results unmapped (documents or scalars).
     */
    public function rawAql(string $query, array $bindVars): array
    {
        $statement = new Statement(
            $this->arangoDatabase->getConnection(),
            [
                "query" => $query,
                "count" => true,
                "batchSize" => 100,
                "sanitize" => true,
                "bindVars" => $bindVars,
            ]
        );

        $ret = [];
        foreach ($statement->execute() as $document) {
            $ret[] = $document;
        }

        return $ret;
    }

    public function countAll(): int
    {
        // The query returns a plain number, so it must go through rawAql:
        // aql() would try to turn the result into an entity.
        return (int) $this->rawAql(sprintf('RETURN LENGTH(%s)', $this->getCollectionName()), [])[0];
    }
}
```
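To see the split between the shared base class and the per-collection subclasses in isolation, here is a sketch of the same pattern with a plain array standing in for the database, so it runs without ArangoDB or Symfony. Every name in it (`AbstractInMemoryRepository`, `Tag`, `TagRepository`, the example rows) is hypothetical and exists only for this illustration:

```php
<?php declare(strict_types=1);

// Sketch only: an in-memory array replaces the database connection.
// Requires PHP 8.0+ for constructor property promotion.

/**
 * @template T of object
 */
abstract class AbstractInMemoryRepository
{
    /** @param array<string, array<string, mixed>> $rows raw rows keyed by document id */
    public function __construct(private array $rows)
    {
    }

    /** Each concrete repository decides how a raw row becomes an entity. */
    abstract protected function constructEntity(string $id, array $row): object;

    /** @return ?T */
    public function get(string $id): ?object
    {
        return isset($this->rows[$id])
            ? $this->constructEntity($id, $this->rows[$id])
            : null;
    }

    /** @return T[] */
    public function findAll(): array
    {
        $entities = [];
        foreach ($this->rows as $id => $row) {
            $entities[] = $this->constructEntity((string) $id, $row);
        }

        return $entities;
    }
}

final class Tag
{
    public function __construct(public string $id, public string $label)
    {
    }
}

/** @extends AbstractInMemoryRepository<Tag> */
final class TagRepository extends AbstractInMemoryRepository
{
    protected function constructEntity(string $id, array $row): Tag
    {
        return new Tag($id, $row['label']);
    }
}

$tags = new TagRepository([
    'tags/1' => ['label' => 'ransomware'],
    'tags/2' => ['label' => 'loader'],
]);
```

Calling `$tags->get('tags/1')` yields a `Tag`, while an unknown id yields `null`. The generic lookup logic lives once in the base class, and a new collection only needs a subclass with its own `constructEntity`.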
