The official documentation on using ArangoDB with Symfony is [literally from 2013](https://www.arangodb.com/2013/03/getting-started-with-arangodb-and-symfony-part1/) and hence targets a pretty old Symfony version. This blog post covers adding minimal ArangoDB support by hand. The actual database protocol and interaction are handled by [ArangoDBClient](https://github.com/arangodb/arangodb-php).
## Plan
The goal of this post is not to draft a fully integrated bundle for Symfony, but merely to use ArangoDB as an example to learn how to add support for a more exotic database system to the framework. Hence I might explain a few concepts that you are well aware of if you have ever worked with other database abstraction layers.
In order to keep things cozy, I'll model the directory structure after Doctrine:
* The `Entity` directory will contain classes representing different database objects: think of an entity as a single row in a table or a single document in a collection. Those entities should contain relatively little domain logic.
* The `Repository` directory will contain repository classes for the different collections. We will use inheritance to share the Arango-specific logic between the repositories, via an `AbstractArangoRepository`.
* Finally, we will use a single `ArangoDatabase` service class. This gives us a central point to inject database credentials, a single point to interface with the ArangoDBClient, and the assurance that we only open a single connection and not, say, one per repository.
So execute `composer require triagens/arangodb` and start coding.
## Entities
Not much to say about those: entities are mainly data transfer objects (DTOs) to represent documents from the database in a more strongly typed way. Here are two example entities:
```php
<?php declare(strict_types=1);

namespace App\Entity;

class Source
{
    private string $id;
    private string $name;

    public function __construct(string $id, string $name)
    {
        $this->id = $id;
        $this->name = $name;
    }

    public function getId(): string
    {
        return $this->id;
    }

    public function getName(): string
    {
        return $this->name;
    }
}
```
which will represent the source of a malware sample (Hatching, MalShare, ...), and the following, which will represent a concrete malware sample:
```php
<?php declare(strict_types=1);

namespace App\Entity;

class Sample
{
    private string $id;
    private int $crc32;
    private string $md5;
    private string $sha1;
    private string $sha256;
    private int $size;

    public function __construct(string $id, int $crc32, string $md5, string $sha1, string $sha256, int $size)
    {
        $this->id = $id;
        $this->crc32 = $crc32;
        $this->md5 = $md5;
        $this->sha1 = $sha1;
        $this->sha256 = $sha256;
        $this->size = $size;
    }

    public function getId(): string
    {
        return $this->id;
    }

    public function getCrc32(): int
    {
        return $this->crc32;
    }

    // more getters here
}
```
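If the project runs on PHP 8.0 or newer, constructor property promotion can shrink such DTOs considerably. The following is only a sketch of an equivalent `Source` class, not the code used in the rest of this post. The namespace is omitted so the snippet runs standalone, and the ID and name in the usage lines are made-up values.

```php
<?php declare(strict_types=1);

// Sketch only: same public API as the Source entity above,
// written with PHP 8.0 constructor property promotion.
class Source
{
    public function __construct(
        private string $id,
        private string $name,
    ) {
    }

    public function getId(): string
    {
        return $this->id;
    }

    public function getName(): string
    {
        return $this->name;
    }
}

// Hypothetical values, for illustration only.
$source = new Source('sources/1', 'MalShare');
echo $source->getName(), PHP_EOL;
```

The properties stay private, so the entity remains read-only from the outside, just like the longer version.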
## Service Class
In order to allow the database credentials to be specified through the environment, we need to adjust `config/services.yaml`:
```yaml
services:
    App\Arango\ArangoDatabase:
        arguments:
            $protocol: '%env(ARANGO_PROTOCOL)%'
            $hostname: '%env(ARANGO_HOSTNAME)%'
            # Environment variables are strings; the int: processor casts
            # the port for the int-typed constructor argument.
            $port: '%env(int:ARANGO_PORT)%'
            $databaseName: '%env(ARANGO_DATABASE)%'
            $username: '%env(ARANGO_USERNAME)%'
            $password: '%env(ARANGO_PASSWORD)%'
```
This will allow for an `.env.local` file like the following:
```
ARANGO_PROTOCOL=tcp
ARANGO_HOSTNAME=127.0.0.1
ARANGO_PORT=8529
ARANGO_DATABASE=dbname
ARANGO_USERNAME=root
ARANGO_PASSWORD=123456
```
The service class itself is relatively simple. Its main purpose is to expose the database connection to repositories through the `getConnection` function:
```php
<?php declare(strict_types=1);

namespace App\Arango;

use ArangoDBClient\Connection;
use ArangoDBClient\ConnectionOptions;

class ArangoDatabase
{
    private Connection $connection;

    public function __construct(
        string $hostname,
        int $port,
        string $databaseName,
        string $username,
        string $password,
        string $protocol = 'tcp',
    ) {
        $connectionOptions = [
            ConnectionOptions::OPTION_DATABASE => $databaseName,
            // The endpoint has the form protocol://hostname:port
            ConnectionOptions::OPTION_ENDPOINT => sprintf('%s://%s:%s', $protocol, $hostname, $port),
            ConnectionOptions::OPTION_AUTH_TYPE => 'Basic',
            ConnectionOptions::OPTION_AUTH_USER => $username,
            ConnectionOptions::OPTION_AUTH_PASSWD => $password,
            ConnectionOptions::OPTION_CONNECTION => 'Close',
            ConnectionOptions::OPTION_RECONNECT => true,
        ];

        $this->connection = new Connection($connectionOptions);
    }

    public function getConnection(): Connection
    {
        return $this->connection;
    }
}
```
A tiny sideline: I found the documentation on allowed protocols in the `ConnectionOptions::OPTION_ENDPOINT` field a bit confusing. But the tl;dr is: use `http+ssl` with an explicitly specified port of `443` if you want to go through `https`.
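The endpoint string is just protocol, host and port glued together, so the HTTPS advice above can be checked quickly without a database. A minimal sketch, assuming a hypothetical helper `buildEndpoint()` that mirrors the `sprintf()` call in `ArangoDatabase`; the hostname `arango.example.com` is made up:

```php
<?php declare(strict_types=1);

/**
 * Hypothetical helper (not part of ArangoDBClient): builds the endpoint
 * string in the protocol://hostname:port form used by ArangoDatabase.
 */
function buildEndpoint(string $protocol, string $hostname, int $port): string
{
    return sprintf('%s://%s:%d', $protocol, $hostname, $port);
}

// Local development setup from the .env.local example.
echo buildEndpoint('tcp', '127.0.0.1', 8529), PHP_EOL;

// HTTPS setup: the port has to be spelled out explicitly.
echo buildEndpoint('http+ssl', 'arango.example.com', 443), PHP_EOL;
```

Keeping this in one place makes it harder to forget the port when switching `ARANGO_PROTOCOL` between environments.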
## Repositories
In line with the "Entities" section, we will specify two repositories. One for sources:
```php
<?php declare(strict_types=1);

namespace App\Repository;

use App\Arango\AbstractArangoRepository;
use App\Entity\Source;
use ArangoDBClient\Document;

/**
 * @template-extends AbstractArangoRepository<Source>
 */
class SourceRepository extends AbstractArangoRepository
{
    protected function getCollectionName(): string
    {
        return 'sources';
    }

    protected function constructEntity(Document $document): Source
    {
        return new Source($document->getId(), $document->get('name'));
    }
}
```
And one for samples:
```php
<?php declare(strict_types=1);

namespace App\Repository;

use App\Arango\AbstractArangoRepository;
use App\Entity\Sample;
use ArangoDBClient\Document;

/**
 * @template-extends AbstractArangoRepository<Sample>
 */
class SampleRepository extends AbstractArangoRepository
{
    protected function getCollectionName(): string
    {
        return 'samples';
    }

    protected function constructEntity(Document $document): Sample
    {
        return new Sample(
            $document->getId(),
            $document->get('crc32'),
            $document->get('md5'),
            $document->get('sha1'),
            $document->get('sha256'),
            $document->get('size'),
        );
    }
}
```
This repository class would also be the place to implement more specific query functions like, say, counting all samples for a given source. Anticipating the `rawAql` method from the next section, which allows execution of arbitrary AQL statements, this function accesses the edge collection (between sources and samples) for counting:
```php
public function countBySource(Source $source): int
{
    // rawAql() returns the plain result rows; here that is a single count.
    $ret = $this->rawAql(
        <<<AQL
        FOR edge IN sample_from_source
            FILTER edge._to == @source
            COLLECT WITH COUNT INTO cnt
            RETURN cnt
        AQL,
        ['source' => $source->getId()],
    );

    return $ret[0];
}
```
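`countBySource` reads `$ret[0]` directly, which relies on the query returning at least one row and on that row being an integer. A small guard can make a malformed query fail loudly instead. This is only a sketch; `singleInt()` is a hypothetical helper and not part of ArangoDBClient:

```php
<?php declare(strict_types=1);

/**
 * Hypothetical helper: returns the single integer that a counting
 * query is expected to produce, or throws if the shape is different.
 *
 * @param array<int, mixed> $rows rows as returned by rawAql()
 */
function singleInt(array $rows): int
{
    // Exactly one row is expected; anything else is treated as an error.
    $value = count($rows) === 1 ? reset($rows) : null;

    if (!is_int($value)) {
        throw new UnexpectedValueException('Expected exactly one integer row.');
    }

    return $value;
}

echo singleInt([42]), PHP_EOL;
```

With such a helper, the last line of `countBySource` could become `return singleInt($ret);`.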
## Abstract Base Repository
The last missing piece is the abstract repository, which will look like the following:
```php
<?php
declare(strict_types=1);

namespace App\Arango;

use App\InputHelper;
use ArangoDBClient\Collection;
use ArangoDBClient\CollectionHandler;
use ArangoDBClient\Document;
use ArangoDBClient\DocumentHandler;
use ArangoDBClient\ServerException;
use ArangoDBClient\Statement;

/**
 * @template T
 */
abstract class AbstractArangoRepository
{
    private ArangoDatabase $arangoDatabase;
    private Collection $collectionId;
    private CollectionHandler $collectionHandler;
    private DocumentHandler $documentHandler;

    public function __construct(ArangoDatabase $arangoDatabase)
    {
        $this->arangoDatabase = $arangoDatabase;
        $this->collectionHandler = new CollectionHandler($arangoDatabase->getConnection());
        $this->collectionId = $this->collectionHandler->get(new Collection($this->getCollectionName()));
        $this->documentHandler = new DocumentHandler($arangoDatabase->getConnection());
    }

    abstract protected function getCollectionName(): string;

    /**
     * @return T
     */
    abstract protected function constructEntity(Document $document): object;

    /**
     * @return ?T
     */
    public function get(string $id): ?object
    {
        try {
            $document = $this->documentHandler->get($this->getCollectionName(), $id);
        } catch (ServerException $e) {
            // A 404 means the document does not exist; anything else is a real error.
            if ($e->getCode() === 404) {
                return null;
            }

            throw $e;
        }

        return $this->constructEntity($document);
    }

    /**
     * @param string[] $ids
     * @return T[]
     */
    public function findByIds(array $ids): array
    {
        return $this->aql(
            sprintf('FOR row IN %s FILTER row._id IN @ids RETURN row', $this->getCollectionName()),
            ['ids' => $ids]
        );
    }

    /**
     * @return T[]
     */
    public function findAll(): array
    {
        // InputHelper is an application helper that is not shown in this post.
        return array_map(
            fn (Document $document) => $this->constructEntity(InputHelper::type($document, Document::class)),
            $this->collectionHandler->all($this->collectionId)->getAll()
        );
    }

    /**
     * Runs a query whose rows are documents and maps them to entities.
     *
     * @return T[]
     */
    public function aql(string $query, array $bindVars): array
    {
        return array_map(
            fn (Document $document) => $this->constructEntity($document),
            $this->rawAql($query, $bindVars)
        );
    }

    /**
     * Runs a query and returns the result rows without mapping them.
     */
    public function rawAql(string $query, array $bindVars): array
    {
        $statement = new Statement(
            $this->arangoDatabase->getConnection(),
            [
                'query' => $query,
                'count' => true,
                'batchSize' => 100,
                'sanitize' => true,
                'bindVars' => $bindVars,
            ]
        );

        $ret = [];
        foreach ($statement->execute() as $document) {
            $ret[] = $document;
        }

        return $ret;
    }

    public function countAll(): int
    {
        // The result is a plain number, so it must not go through constructEntity().
        return $this->rawAql(sprintf('RETURN LENGTH(%s)', $this->getCollectionName()), [])[0];
    }
}
```
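The split of responsibilities in the abstract repository is a classic template method: the base class owns the lookup logic, and each subclass only says how to turn one raw document into an entity. To illustrate the idea without a running ArangoDB, here is a sketch backed by a plain array. All names in it (`InMemoryRepository`, `InMemorySourceRepository`, the trimmed-down `Source`) and the sample data are hypothetical and only loosely mirror the classes above.

```php
<?php declare(strict_types=1);

// Sketch: same template-method idea as AbstractArangoRepository,
// but reading from an array instead of a database.
abstract class InMemoryRepository
{
    /** @param array<string, array<string, mixed>> $documents raw documents keyed by ID */
    public function __construct(private array $documents)
    {
    }

    /** Maps one raw document to an entity; implemented per collection. */
    abstract protected function constructEntity(array $document): object;

    public function get(string $id): ?object
    {
        return isset($this->documents[$id])
            ? $this->constructEntity($this->documents[$id])
            : null;
    }

    /** @return object[] */
    public function findAll(): array
    {
        return array_map(
            fn (array $document): object => $this->constructEntity($document),
            array_values($this->documents)
        );
    }

    public function countAll(): int
    {
        return count($this->documents);
    }
}

// Trimmed-down stand-in for the Source entity.
class Source
{
    public function __construct(public string $id, public string $name)
    {
    }
}

class InMemorySourceRepository extends InMemoryRepository
{
    protected function constructEntity(array $document): Source
    {
        return new Source($document['_id'], $document['name']);
    }
}

// Made-up sample data.
$repository = new InMemorySourceRepository([
    'sources/1' => ['_id' => 'sources/1', 'name' => 'Hatching'],
    'sources/2' => ['_id' => 'sources/2', 'name' => 'MalShare'],
]);

echo $repository->get('sources/1')->name, PHP_EOL;
echo $repository->countAll(), PHP_EOL;
```

A stand-in like this can also be handy in unit tests for code that only needs `get`, `findAll` or `countAll`.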