ENTRADA creates a database named entrada containing two data tables: the dns table holds the DNS data and the icmp table holds the ICMP data.
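
To confirm that both tables exist, a check along these lines can be run from an Impala session. This is only a sketch: it assumes the default database name entrada described above and that ENTRADA has already created the tables.

```sql
-- list the tables in the entrada database; dns and icmp should be present
show tables in entrada;
-- inspect the column names and types of the DNS table
describe entrada.dns;
```
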
The new 2.x database schema is not compatible with the Parquet files generated by the 0.x version of ENTRADA.
By default, Apache Impala resolves Parquet columns by their position (index) in the file, and this breaks with the new schema because ENTRADA 2.x added and removed columns.
The fix is to make Impala resolve columns by name, which can be enabled with the PARQUET_FALLBACK_SCHEMA_RESOLUTION query option:
set PARQUET_FALLBACK_SCHEMA_RESOLUTION=name;
select count(1) from entrada.dns;
The staging table has been removed; by default, all data is now inserted directly into entrada.dns.
| Removed column | Replacement |
|---|---|
| unixtime (secs) | time (millis) |
| len | req_len, res_len |
| dns_len | req_len, res_len |
| udp_sum | - |
| is_google | pub_resolver |
| is_open_dns | pub_resolver |
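
As a rough illustration of the mapping above, a query written against the 0.x columns might be rewritten as sketched below. The 0.x query in the comment is a made-up example, not taken from ENTRADA. The sketch assumes that `time` holds epoch milliseconds as an integer, as the table suggests; the backticks are only a precaution in case `time` is treated as a reserved word.

```sql
-- 0.x style (illustrative only, these columns no longer exist in 2.x):
--   select count(1), avg(len) from entrada.dns
--   where unixtime >= 1546300800;

-- 2.x sketch: len is split into req_len and res_len, and the timestamp
-- is ASSUMED to be epoch milliseconds (1546300800 s * 1000)
select count(1), avg(req_len), avg(res_len)
from entrada.dns
where `time` >= 1546300800000;

-- 2.x sketch: a single pub_resolver column replaces is_google and is_open_dns;
-- grouping avoids guessing the exact resolver names stored in the column
select pub_resolver, count(1)
from entrada.dns
group by pub_resolver;
```

The second query lists whatever values pub_resolver actually contains, which is a reasonable first step before filtering on a specific resolver name.
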

| New column | Description |
|---|---|
| pub_resolver | name of public resolver |
| req_len | length of DNS request |
| res_len | length of DNS response |
| tcp_hs_rtt | RTT (ms) of TCP handshake |
| tcp_pk_rtt | RTT (ms) of TCP server response |
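
The new RTT columns can be summarised with an ordinary aggregate query. The following is a minimal sketch using only the column names listed above; the result aliases are arbitrary names chosen for this example.

```sql
-- average and maximum TCP handshake RTT in milliseconds;
-- avg() and max() skip NULLs, so rows without a value do not skew the result
select avg(tcp_hs_rtt) as avg_hs_rtt_ms,
       max(tcp_hs_rtt) as max_hs_rtt_ms
from entrada.dns;
```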