ENTRADA can be deployed using Docker Compose. Download one of the example Docker Compose scripts, save it as `docker-compose.yml`, and edit it to configure the variables to fit your requirements. Then start the container using the `docker-compose` command:
docker-compose up
There is no web interface; a limited HTTP-based API is available, mostly to cleanly stop the ENTRADA process inside the container before stopping the container itself. When the container is started, ENTRADA monitors the input directory for new pcap files. When new files are detected, the data is converted to Parquet format and uploaded to HDFS/AWS or saved on a local disk.
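The configuration variables mentioned in the steps below are passed to the container as environment variables in `docker-compose.yml`. A minimal sketch of what such a fragment might look like is shown here; the service name, image name, and exact set of variables depend on the example script you downloaded (`ENTRADA_NAMESERVERS` and `AWS_BUCKET` are the variables used in the AWS example below, the other names are illustrative assumptions):

```yaml
# Illustrative fragment only; copy one of the example scripts for a complete file.
version: "3"
services:
  entrada:                           # service name is an assumption
    image: sidnlabs/entrada          # image name is an assumption
    environment:
      - ENTRADA_NAMESERVERS=test-ns
      - AWS_BUCKET=my-entrada-bucket # bucket name is an example
```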
Follow these steps to deploy ENTRADA using an H2 database and AWS Athena. The H2 database is only recommended for testing and evaluating ENTRADA.
1. Rename `docker-compose-h2-aws.yml` to `docker-compose.yml`
2. Open `docker-compose.yml` in an editor
3. Configure `ENTRADA_NAMESERVERS`. For this test use the value `test-ns`.
4. Configure `AWS_BUCKET`. This bucket will be created by ENTRADA.
5. Start the container using `docker-compose up` and watch the log file `$ENTRADA_HOME/log/*.log` for errors.
6. Check that ENTRADA has created the `entrada` database and tables.
7. Copy a pcap file to `test-ns` in `$ENTRADA_HOME/input`
8. Query the `entrada.dns` table. A single row should have been added to the table.

Follow these steps to deploy ENTRADA using an H2 database without a query engine (Impala or Athena). The generated Parquet data will be saved in the configured location on local disk. The H2 database is only recommended for testing and evaluating ENTRADA.
1. Rename `docker-compose-h2-local.yml` to `docker-compose.yml`
2. Open `docker-compose.yml` in an editor
3. Configure the `ENTRADA_NAMESERVERS` variable in the Docker Compose file. For this test use the value `test-ns`.
4. Start the container using `docker-compose up` and watch the log file `$ENTRADA_HOME/log/*.log` for errors.
5. Copy a pcap file to `test-ns` in `$ENTRADA_HOME/input`

ENTRADA expects the input directory to contain a sub-directory for each name server.
Each name server sub-directory should use the format `<ns>_<anycast_site>`. The `ns` and `anycast_site` parts are extracted; the `ns` part is used to partition the Parquet data by name server name, and the `anycast_site` part is saved in the `server_location` column of the `dns` table.
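The directory convention above can be sketched with plain shell commands. This is illustrative only: the paths and names are examples, and the assumption that the first underscore separates the two parts is mine, not taken from ENTRADA's own parsing code.

```shell
# Prepare an input sub-directory following the <ns>_<anycast_site> convention
# (paths are examples; ENTRADA_HOME would normally point at the directory
# mounted into the container).
ENTRADA_HOME="${ENTRADA_HOME:-/tmp/entrada}"
dir="ns1_ams"                  # example: name server "ns1", anycast site "ams"
mkdir -p "$ENTRADA_HOME/input/$dir"

# Illustrate how the two parts split (assumes the first underscore separates them)
ns="${dir%%_*}"     # part before the first underscore -> partitioning value
site="${dir#*_}"    # part after the first underscore  -> server_location column
echo "name server: $ns, anycast site: $site"
```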
There are also example scripts for the following combinations:
| Script | Database | Mode | Description |
|---|---|---|---|
| H2+AWS | H2 | AWS | Use only for testing and evaluation |
| H2+Local | H2 | Local | Use only for testing and evaluation |
| PostgreSQL+AWS | PostgreSQL | AWS | Can be used for production |
| PostgreSQL+Hadoop | PostgreSQL | Hadoop | Standard Hadoop, can be used for production |
| PostgreSQL+Secure Hadoop | PostgreSQL | Hadoop | Secure Hadoop (Kerberos), can be used for production |
| PostgreSQL+Local | PostgreSQL | Local | Local storage, can be used for production |
ENTRADA supports multiple modes of operation:
In local mode all input and output directories must be local and no SQL-engine is used.
Use this mode if you want to save the created Parquet files on the local system.
In AWS mode the output directory must be on S3, and Athena is used as a SQL-engine.
The input and archive directories may be on S3 but may also be on the local filesystem.
ENTRADA can create a bucket and configure the correct security settings (access control and encryption).
In Hadoop mode the output directory must be on HDFS, and Impala is used as a SQL-engine.
The input and archive directories may be on HDFS but may also be on the local filesystem.
ENTRADA requires a database to persist information about processed files and created database partitions. Both PostgreSQL and H2 are supported.
H2 is a fast database that is useful for testing scenarios; you should only use it when evaluating or testing ENTRADA functionality.
PostgreSQL should be used when ENTRADA is deployed in a production environment.