Use ArangoDB in Symfony without an external bundle



The official documentation on ArangoDB usage in Symfony is [literally from 2013](https://www.arangodb.com/2013/03/getting-started-with-arangodb-and-symfony-part1/) and hence targets a pretty old Symfony version. This blog post covers adding minimal ArangoDB support by hand. The actual database protocol and interaction will be handled by [ArangoDBClient](https://github.com/arangodb/arangodb-php).

## Plan

The goal of this post is not to draft a fully integrated bundle for Symfony but merely to use ArangoDB as an example to learn how to add support for a more exotic database system to the framework. Hence I might explain a few concepts that you are already well aware of if you have ever worked with other database abstraction layers. In order to keep things cozy, I'll model the directory structure after Doctrine:

* The `Entity` directory will contain classes representing different database objects: think of an entity as a single row in a table or a single document in a collection. Those entities should contain relatively little domain logic.
* The `Repository` directory will contain repository classes for the different collections. We will use inheritance to share the Arango-specific logic between the different repositories, via an `AbstractArangoRepository`.
* Finally, we will use a single `ArangoDatabase` service class. This gives us a central point to inject database credentials, a single point to interface with the ArangoDBClient, and the assurance that we only open a single connection and not, say, one per repository.

So execute

```bash
composer require triagens/arangodb
```

and start coding.

## Entities

Not much to say about those: entities are mainly data transfer objects (DTOs) that represent documents from the database in a more strongly typed way.
Here are two example entities:

```php
<?php

declare(strict_types=1);

namespace App\Entity;

class Source
{
    private string $id;
    private string $name;

    public function __construct(string $id, string $name)
    {
        $this->id = $id;
        $this->name = $name;
    }

    public function getId(): string
    {
        return $this->id;
    }

    public function getName(): string
    {
        return $this->name;
    }
}
```

which will represent the source of a malware sample (Hatching, MalShare, ...), and the following, which will represent a concrete malware sample:

```php
<?php

declare(strict_types=1);

namespace App\Entity;

class Sample
{
    private string $id;
    private int $crc32;
    private string $md5;
    private string $sha1;
    private string $sha256;
    private int $size;

    public function __construct(string $id, int $crc32, string $md5, string $sha1, string $sha256, int $size)
    {
        $this->id = $id;
        $this->crc32 = $crc32;
        $this->md5 = $md5;
        $this->sha1 = $sha1;
        $this->sha256 = $sha256;
        $this->size = $size;
    }

    public function getId(): string
    {
        return $this->id;
    }

    public function getCrc32(): int
    {
        return $this->crc32;
    }

    // more getters here
}
```

## Service Class

In order to allow specifying the database credentials through the environment, we need to adjust `config/services.yaml`:

```yaml
services:
    App\Arango\ArangoDatabase:
        arguments:
            $protocol: '%env(ARANGO_PROTOCOL)%'
            $hostname: '%env(ARANGO_HOSTNAME)%'
            $port: '%env(ARANGO_PORT)%'
            $databaseName: '%env(ARANGO_DATABASE)%'
            $username: '%env(ARANGO_USERNAME)%'
            $password: '%env(ARANGO_PASSWORD)%'
```

This will allow for an `.env.local` file like the following:

```
ARANGO_PROTOCOL=tcp
ARANGO_HOSTNAME=127.0.0.1
ARANGO_PORT=8529
ARANGO_DATABASE=dbname
ARANGO_USERNAME=root
ARANGO_PASSWORD=123456
```

The service class itself is relatively simple. Its main purpose is to expose the database connection to repositories through the `getConnection` function:

```php
<?php

declare(strict_types=1);

namespace App\Arango;

use ArangoDBClient\Connection;
use ArangoDBClient\ConnectionOptions;

class ArangoDatabase
{
    private Connection $connection;

    public function __construct(
        string $hostname,
        int $port,
        string $databaseName,
        string $username,
        string $password,
        string $protocol = 'tcp',
    ) {
        $connectionOptions = [
            ConnectionOptions::OPTION_DATABASE => $databaseName,
            // The endpoint has the form "<protocol>://<hostname>:<port>".
            ConnectionOptions::OPTION_ENDPOINT => sprintf('%s://%s:%s', $protocol, $hostname, $port),
            ConnectionOptions::OPTION_AUTH_TYPE => 'Basic',
            ConnectionOptions::OPTION_AUTH_USER => $username,
            ConnectionOptions::OPTION_AUTH_PASSWD => $password,
            ConnectionOptions::OPTION_CONNECTION => 'Close',
            ConnectionOptions::OPTION_RECONNECT => true,
        ];

        // One Connection object, shared by every repository.
        $this->connection = new Connection($connectionOptions);
    }

    public function getConnection(): Connection
    {
        return $this->connection;
    }
}
```

A tiny aside: I found the documentation on allowed protocols in the `ConnectionOptions::OPTION_ENDPOINT` field a bit confusing. But the tl;dr is: use `http+ssl` with an explicitly specified port of `443` if you want to go through `https`.

## Repositories

In line with the "Entities" section, we will specify two repositories.
One for sources:

```php
<?php

declare(strict_types=1);

namespace App\Repository;

use App\Arango\AbstractArangoRepository;
use App\Entity\Source;
use ArangoDBClient\Document;

/**
 * @template-extends AbstractArangoRepository<Source>
 */
class SourceRepository extends AbstractArangoRepository
{
    protected function getCollectionName(): string
    {
        return 'sources';
    }

    protected function constructEntity(Document $document): Source
    {
        return new Source($document->getId(), $document->get('name'));
    }
}
```

And one for samples:

```php
<?php

declare(strict_types=1);

namespace App\Repository;

use App\Arango\AbstractArangoRepository;
use App\Entity\Sample;
use ArangoDBClient\Document;

/**
 * @template-extends AbstractArangoRepository<Sample>
 */
class SampleRepository extends AbstractArangoRepository
{
    protected function getCollectionName(): string
    {
        return 'samples';
    }

    protected function constructEntity(Document $document): Sample
    {
        return new Sample(
            $document->getId(),
            $document->get('crc32'),
            $document->get('md5'),
            $document->get('sha1'),
            $document->get('sha256'),
            $document->get('size'),
        );
    }
}
```

This repository class would also be the place to implement more specific query functions like, say, counting all samples for a given source.
Anticipating the `rawAql` method from the next section, which allows execution of arbitrary AQL statements and returns the results without turning them into entities, this function accesses the edge collection (between sources and samples) for counting:

```php
public function countBySource(Source $source): int
{
    $ret = $this->rawAql(
        <<<AQL
        FOR edge IN sample_from_source
            FILTER edge._to == @source
            COLLECT WITH COUNT INTO cnt
            RETURN cnt
        AQL,
        ['source' => $source->getId()],
    );

    return $ret[0];
}
```

## Abstract Base Repository

The last missing piece is the abstract repository, which looks like the following:

```php
<?php

declare(strict_types=1);

namespace App\Arango;

use ArangoDBClient\Collection;
use ArangoDBClient\CollectionHandler;
use ArangoDBClient\Document;
use ArangoDBClient\DocumentHandler;
use ArangoDBClient\ServerException;
use ArangoDBClient\Statement;

/**
 * @template T
 */
abstract class AbstractArangoRepository
{
    private ArangoDatabase $arangoDatabase;
    private Collection $collection;
    private CollectionHandler $collectionHandler;
    private DocumentHandler $documentHandler;

    public function __construct(ArangoDatabase $arangoDatabase)
    {
        $this->arangoDatabase = $arangoDatabase;
        $this->collectionHandler = new CollectionHandler($arangoDatabase->getConnection());
        $this->collection = $this->collectionHandler->get(new Collection($this->getCollectionName()));
        $this->documentHandler = new DocumentHandler($arangoDatabase->getConnection());
    }

    abstract protected function getCollectionName(): string;

    /**
     * @return T
     */
    abstract protected function constructEntity(Document $document): object;

    /**
     * @return ?T
     */
    public function get(string $id): ?object
    {
        try {
            $document = $this->documentHandler->get($this->getCollectionName(), $id);
        } catch (ServerException $e) {
            // Assumption: the client reports a missing document as a
            // ServerException carrying the HTTP status 404 as its code.
            if ((int) $e->getCode() === 404) {
                return null;
            }

            throw $e;
        }

        return $this->constructEntity($document);
    }

    /**
     * @param string[] $ids
     * @return T[]
     */
    public function findByIds(array $ids): array
    {
        return $this->aql(
            sprintf('FOR row IN %s FILTER row._id IN @ids RETURN row', $this->getCollectionName()),
            ['ids' => $ids]
        );
    }

    /**
     * @return T[]
     */
    public function findAll(): array
    {
        return array_map(
            fn (Document $document) => $this->constructEntity($document),
            $this->collectionHandler->all($this->collection)->getAll()
        );
    }

    /**
     * Runs an AQL query that returns documents and maps them to entities.
     *
     * @return T[]
     */
    public function aql(string $query, array $bindVars): array
    {
        return array_map(
            fn (Document $document) => $this->constructEntity($document),
            $this->rawAql($query, $bindVars)
        );
    }

    /**
     * Runs an arbitrary AQL query and returns the result rows untouched.
     */
    public function rawAql(string $query, array $bindVars): array
    {
        $statement = new Statement(
            $this->arangoDatabase->getConnection(),
            [
                'query' => $query,
                'count' => true,
                'batchSize' => 100,
                'sanitize' => true,
                'bindVars' => $bindVars,
            ]
        );

        $ret = [];
        foreach ($statement->execute() as $document) {
            $ret[] = $document;
        }

        return $ret;
    }

    public function countAll(): int
    {
        // LENGTH() yields a plain number, not a document, so this must go
        // through rawAql() rather than aql().
        return $this->rawAql(sprintf('RETURN LENGTH(%s)', $this->getCollectionName()), [])[0];
    }
}
```
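
To see how the pieces fit together without a running ArangoDB server, here is a minimal, self-contained sketch of the same idea: an abstract repository owns the lookup logic and each concrete repository only decides how a document becomes an entity. `FakeDocument`, `InMemoryRepository` and `InMemorySourceRepository` are hypothetical stand-ins made up for this illustration and are not part of ArangoDBClient. The `Source` class is repeated here without a namespace so that the file runs on its own (PHP 8.0 or newer).

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for ArangoDBClient\Document: an id plus attributes.
final class FakeDocument
{
    /** @param array<string, mixed> $attributes */
    public function __construct(private string $id, private array $attributes)
    {
    }

    public function getId(): string
    {
        return $this->id;
    }

    public function get(string $key): mixed
    {
        return $this->attributes[$key] ?? null;
    }
}

// Same shape as the Source entity from the post.
final class Source
{
    public function __construct(private string $id, private string $name)
    {
    }

    public function getId(): string
    {
        return $this->id;
    }

    public function getName(): string
    {
        return $this->name;
    }
}

// Hypothetical array-backed counterpart of AbstractArangoRepository.
abstract class InMemoryRepository
{
    /** @var array<string, FakeDocument> */
    private array $documents = [];

    public function insert(FakeDocument $document): void
    {
        $this->documents[$document->getId()] = $document;
    }

    // Each concrete repository decides how a document becomes an entity.
    abstract protected function constructEntity(FakeDocument $document): object;

    public function get(string $id): ?object
    {
        return isset($this->documents[$id])
            ? $this->constructEntity($this->documents[$id])
            : null;
    }

    /** @return object[] */
    public function findAll(): array
    {
        return array_map(
            fn (FakeDocument $document): object => $this->constructEntity($document),
            array_values($this->documents)
        );
    }

    public function countAll(): int
    {
        return count($this->documents);
    }
}

final class InMemorySourceRepository extends InMemoryRepository
{
    protected function constructEntity(FakeDocument $document): Source
    {
        return new Source($document->getId(), $document->get('name'));
    }
}

$repository = new InMemorySourceRepository();
$repository->insert(new FakeDocument('sources/1', ['name' => 'MalShare']));
$repository->insert(new FakeDocument('sources/2', ['name' => 'Hatching']));

echo $repository->get('sources/1')->getName(), PHP_EOL; // MalShare
echo $repository->countAll(), PHP_EOL;                  // 2
var_dump($repository->get('sources/404'));              // NULL
```

A stand-in like this can also be handy in unit tests for code that only needs "something repository-shaped", though it says nothing about how the real AQL queries behave.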
