Skip to main content

Manage audit logs within StarRocks via Audit Loader

Manage audit logs within StarRocks via Audit Loader

This topic describes how to manage StarRocks audit logs within a table via the plugin - Audit Loader.

StarRocks stores its audit logs in the local file fe/log/fe.audit.log rather than an internal database. The plugin Audit Loader allows you to manage audit logs directly within your cluster. Audit Loader reads logs from the file, and loads them into StarRocks via HTTP PUT.

Create a table to store audit logs

Create a database and a table in your StarRocks cluster to store its audit logs. See CREATE DATABASE and CREATE TABLE for detailed instructions.

Because the fields of audit logs vary among different StarRocks versions, you must choose among the following examples to create a table that is compatible with your StarRocks.

CAUTION

DO NOT change the table schema in the examples, or the log loading will fail.

  • StarRocks v2.4, v2.5, v3.0, v3.1, and later minor versions:
CREATE DATABASE starrocks_audit_db__;

CREATE TABLE starrocks_audit_db__.starrocks_audit_tbl__ (
`queryId` VARCHAR(48) COMMENT "Unique query ID",
`timestamp` DATETIME NOT NULL COMMENT "Query start time",
`queryType` VARCHAR(12) COMMENT "Query type (query, slow_query)",
`clientIp` VARCHAR(32) COMMENT "Client IP address",
`user` VARCHAR(64) COMMENT "User who initiates the query",
`authorizedUser` VARCHAR(64) COMMENT "user_identity",
`resourceGroup` VARCHAR(64) COMMENT "Resource group name",
`catalog` VARCHAR(32) COMMENT "Catalog name",
`db` VARCHAR(96) COMMENT "Database that the query scans",
`state` VARCHAR(8) COMMENT "Query state (EOF, ERR, OK)",
`errorCode` VARCHAR(96) COMMENT "Error code",
`queryTime` BIGINT COMMENT "Query latency in milliseconds",
`scanBytes` BIGINT COMMENT "Size of the scanned data in bytes",
`scanRows` BIGINT COMMENT "Row count of the scanned data",
`returnRows` BIGINT COMMENT "Row count of the result",
`cpuCostNs` BIGINT COMMENT "CPU resources consumption time for query in nanoseconds",
`memCostBytes` BIGINT COMMENT "Memory cost for query in bytes",
`stmtId` INT COMMENT "Incremental SQL statement ID",
`isQuery` TINYINT COMMENT "If the SQL is a query (0 and 1)",
`feIp` VARCHAR(32) COMMENT "IP address of FE that executes the SQL",
`stmt` STRING COMMENT "SQL statement",
`digest` VARCHAR(32) COMMENT "SQL fingerprint",
`planCpuCosts` DOUBLE COMMENT "CPU resources consumption time for planning in nanoseconds",
`planMemCosts` DOUBLE COMMENT "Memory cost for planning in bytes"
) ENGINE = OLAP
DUPLICATE KEY (`queryId`, `timestamp`, `queryType`)
COMMENT "Audit log table"
PARTITION BY RANGE (`timestamp`) ()
DISTRIBUTED BY HASH (`queryId`) BUCKETS 3
PROPERTIES (
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.start" = "-30",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.buckets" = "3",
"dynamic_partition.enable" = "true",
"replication_num" = "3"
);
  • StarRocks v2.3.0 and later minor versions:
CREATE DATABASE starrocks_audit_db__;
CREATE TABLE starrocks_audit_db__.starrocks_audit_tbl__ (
`queryId` VARCHAR(48) COMMENT "Unique query ID",
`timestamp` DATETIME NOT NULL COMMENT "Query start time",
`clientIp` VARCHAR(32) COMMENT "Client IP address",
`user` VARCHAR(64) COMMENT "User who initiates the query",
`resourceGroup` VARCHAR(64) COMMENT "Resource group name",
`db` VARCHAR(96) COMMENT "Database that the query scans",
`state` VARCHAR(8) COMMENT "Query state (EOF, ERR, OK)",
`errorCode` VARCHAR(96) COMMENT "Error code",
`queryTime` BIGINT COMMENT "Query latency in milliseconds",
`scanBytes` BIGINT COMMENT "Size of the scanned data in bytes",
`scanRows` BIGINT COMMENT "Row count of the scanned data",
`returnRows` BIGINT COMMENT "Row count of the result",
`cpuCostNs` BIGINT COMMENT "CPU resources consumption time for query in nanoseconds",
`memCostBytes` BIGINT COMMENT "Memory cost for query in bytes",
`stmtId` INT COMMENT "Incremental SQL statement ID",
`isQuery` TINYINT COMMENT "If the SQL is a query (0 and 1)",
`feIp` VARCHAR(32) COMMENT "IP address of FE that executes the SQL",
`stmt` STRING COMMENT "SQL statement",
`digest` VARCHAR(32) COMMENT "SQL fingerprint",
`planCpuCosts` DOUBLE COMMENT "CPU resources consumption time for planning in nanoseconds",
`planMemCosts` DOUBLE COMMENT "Memory cost for planning in bytes"
) ENGINE = OLAP
DUPLICATE KEY (`queryId`, `timestamp`, `clientIp`)
COMMENT "Audit log table"
PARTITION BY RANGE (`timestamp`) ()
DISTRIBUTED BY HASH (`queryId`) BUCKETS 3
PROPERTIES (
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.start" = "-30",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.enable" = "true",
"replication_num" = "3"
);
  • StarRocks v2.2.1 and later minor versions:
CREATE DATABASE starrocks_audit_db__;
CREATE TABLE starrocks_audit_db__.starrocks_audit_tbl__
(
query_id VARCHAR(48) COMMENT "Unique query ID",
time DATETIME NOT NULL COMMENT "Query start time",
client_ip VARCHAR(32) COMMENT "Client IP address",
user VARCHAR(64) COMMENT "User who initiates the query",
db VARCHAR(96) COMMENT "Database that the query scans",
state VARCHAR(8) COMMENT "Query state (EOF, ERR, OK)",
query_time BIGINT COMMENT "Query latency in milliseconds",
scan_bytes BIGINT COMMENT "Size of the scanned data in bytes",
scan_rows BIGINT COMMENT "Row count of the scanned data",
return_rows BIGINT COMMENT "Row count of the result",
cpu_cost_ns BIGINT COMMENT "CPU resources consumption time for query in nanoseconds",
mem_cost_bytes BIGINT COMMENT "Memory cost for query in bytes",
stmt_id INT COMMENT "Incremental SQL statement ID",
is_query TINYINT COMMENT "If the SQL is a query (0 and 1)",
frontend_ip VARCHAR(32) COMMENT "IP address of FE that executes the SQL",
stmt STRING COMMENT "SQL statement",
digest VARCHAR(32) COMMENT "SQL fingerprint"
) engine=OLAP
duplicate key(query_id, time, client_ip)
partition by range(time) ()
distributed by hash(query_id) BUCKETS 3
properties(
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.start" = "-30",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.enable" = "true",
"replication_num" = "3"
);
  • StarRocks v2.2.0, v2.1.0 and later minor versions:
CREATE DATABASE starrocks_audit_db__;
CREATE TABLE starrocks_audit_db__.starrocks_audit_tbl__
(
query_id VARCHAR(48) COMMENT "Unique query ID",
time DATETIME NOT NULL COMMENT "Query start time",
client_ip VARCHAR(32) COMMENT "Client IP address",
user VARCHAR(64) COMMENT "User who initiates the query",
db VARCHAR(96) COMMENT "Database that the query scans",
state VARCHAR(8) COMMENT "Query state (EOFE, RR, OK)",
query_time BIGINT COMMENT "Query latency in milliseconds",
scan_bytes BIGINT COMMENT "Size of the scanned data in bytes",
scan_rows BIGINT COMMENT "Row count of the scanned data",
return_rows BIGINT COMMENT "Row count of the result",
stmt_id INT COMMENT "Incremental SQL statement ID",
is_query TINYINT COMMENT "If the SQL is a query (0 and 1)",
frontend_ip VARCHAR(32) COMMENT "IP address of FE that executes the SQL",
stmt STRING COMMENT "SQL statement",
digest VARCHAR(32) COMMENT "SQL fingerprint"
) engine=OLAP
DUPLICATE KEY(query_id, time, client_ip)
PARTITION BY RANGE(time) ()
DISTRIBUTED BY HASH(query_id) BUCKETS 3
PROPERTIES(
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.start" = "-30",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.enable" = "true",
"replication_num" = "3"
);
  • StarRocks v2.0.0 and later minor versions, StarRocks v1.19.0 and later minor versions:
CREATE DATABASE starrocks_audit_db__;
CREATE TABLE starrocks_audit_db__.starrocks_audit_tbl__
(
query_id VARCHAR(48) COMMENT "Unique query ID",
time DATETIME NOT NULL COMMENT "Query start time",
client_ip VARCHAR(32) COMMENT "Client IP address",
user VARCHAR(64) COMMENT "User who initiates the query",
db VARCHAR(96) COMMENT "Database that the query scans",
state VARCHAR(8) COMMENT "Query state (EOF, ERR, OK)",
query_time BIGINT COMMENT "Query latency in milliseconds",
scan_bytes BIGINT COMMENT "Size of the scanned data in bytes",
scan_rows BIGINT COMMENT "Row count of the scanned data",
return_rows BIGINT COMMENT "Row count of the result",
stmt_id INT COMMENT "Incremental SQL statement ID",
is_query TINYINT COMMENT "If the SQL is a query (0 and 1)",
frontend_ip VARCHAR(32) COMMENT "IP address of FE that executes the SQL",
stmt STRING COMMENT "SQL statement"
) engine=OLAP
DUPLICATE KEY(query_id, time, client_ip)
PARTITION BY RANGE(time) ()
DISTRIBUTED BY HASH(query_id) BUCKETS 3
PROPERTIES(
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.start" = "-30",
"dynamic_partition.end" = "3",
"dynamic_partition.prefix" = "p",
"dynamic_partition.enable" = "true",
"replication_num" = "3"
);

starrocks_audit_tbl__ is created with dynamic partitions. By default, the first dynamic partition is created 10 minutes after the table is created. Audit logs can then be loaded into the table. You can check the partitions in the table using the following statement:

SHOW PARTITIONS FROM starrocks_audit_db__.starrocks_audit_tbl__;

After a partition is created, you can move on to the next step.

Download and configure Audit Loader

  1. Download the Audit Loader installation package. The package contains multiple directories for different StarRocks versions. You must navigate to the corresponding directory and install the package that is compatible with your StarRocks.

    • 2.4: StarRocks v2.4.0 and later minor versions
    • 2.3: StarRocks v2.3.0 and later minor versions
    • 2.2.1+: StarRocks v2.2.1 and later minor versions
    • 2.1-2.2.0: StarRocks v2.2.0, StarRocks v2.1.0 and later minor versions
    • 1.18.2-2.0: StarRocks v2.0.0 and later minor versions, StarRocks v1.19.0 and later minor versions
  2. Unzip the installation package.

    unzip auditloader.zip

    The following files are inflated:

    • auditloader.jar: the JAR file of Audit Loader.
    • plugin.properties: the properties file of Audit Loader.
    • plugin.conf: the configuration file of Audit Loader.
  3. Modify plugin.conf to configure Audit Loader. You must configure the following items to make sure Audit Loader can work properly:

    • frontend_host_port: FE IP address and HTTP port, in the format <fe_ip>:<fe_http_port>. The default value is 127.0.0.1:8030.
    • database: name of the database you created to host audit logs.
    • table: name of the table you created to host audit logs.
    • user: your cluster username. You MUST have the privilege to load data (LOAD_PRIV) into the table.
    • password: your user password.
  4. Zip the files back into a package.

    zip -q -m -r auditloader.zip auditloader.jar plugin.conf plugin.properties
  5. Dispatch the package to all machines that host FE nodes. Make sure all packages are stored in an identical path. Otherwise, the installation fails. Remember to copy the absolute path to the package after you dispatched the package.

Install Audit Loader

Execute the following statement along with the path you copied to install Audit Loader as a plugin in StarRocks:

INSTALL PLUGIN FROM "<absolute_path_to_package>";

See INSTALL PLUGIN for detailed instructions.

Verify the installation and query audit logs

  1. You can check if the installation is successful via SHOW PLUGINS.In the following example, the Status of the plugin AuditLoader is INSTALLED, meaning installation is successful.

    mysql> SHOW PLUGINS\G
    *************************** 1. row ***************************
    Name: __builtin_AuditLogBuilder
    Type: AUDIT
    Description: builtin audit logger
    Version: 0.12.0
    JavaVersion: 1.8.31
    ClassName: com.starrocks.qe.AuditLogBuilder
    SoName: NULL
    Sources: Builtin
    Status: INSTALLED
    Properties: {}
    *************************** 2. row ***************************
    Name: AuditLoader
    Type: AUDIT
    Description: load audit log to olap load, and user can view the statistic of queries
    Version: 1.0.1
    JavaVersion: 1.8.0
    ClassName: com.starrocks.plugin.audit.AuditLoaderPlugin
    SoName: NULL
    Sources: /x/xx/xxx/xxxxx/auditloader.zip
    Status: INSTALLED
    Properties: {}
    2 rows in set (0.01 sec)
  2. Execute some random SQLs to generate audit logs, and wait for 60 seconds (or the time you have specified in the item max_batch_interval_sec when you configure Audit Loader) to allow Audit Loader to load audit logs into StarRocks.

  3. Check the audit logs by querying the table.

    SELECT * FROM starrocks_audit_db__.starrocks_audit_tbl__;

    The following example shows when audit logs are loaded into the table successfully:

    mysql> SELECT * FROM starrocks_audit_db__.starrocks_audit_tbl__\G
    *************************** 1. row ***************************
    queryId: 082ddf02-6492-11ed-a071-6ae6b1db20eb
    timestamp: 2022-11-15 11:03:08
    clientIp: xxx.xx.xxx.xx:33544
    user: root
    resourceGroup: default_wg
    db:
    state: EOF
    errorCode:
    queryTime: 8
    scanBytes: 0
    scanRows: 0
    returnRows: 0
    cpuCostNs: 62380
    memCostBytes: 14504
    stmtId: 33
    isQuery: 1
    feIp: xxx.xx.xxx.xx
    stmt: SELECT * FROM starrocks_audit_db__.starrocks_audit_tbl__
    digest:
    planCpuCosts: 21
    planMemCosts: 0
    1 row in set (0.01 sec)

Troubleshooting

If no audit logs are loaded to the table after the dynamic partition is created and the plugin is installed, you can check whether plugin.conf is configured properly or not. To modify it, you must first uninstall the plugin:

UNINSTALL PLUGIN AuditLoader;

After all configurations are set correctly, you can follow the above steps to install Audit Loader again.