diff --git a/docs/src/user-docs/guides-using-object-storage/clp-usage.md b/docs/src/user-docs/guides-using-object-storage/clp-usage.md
index e4eec83c76..db978e663c 100644
--- a/docs/src/user-docs/guides-using-object-storage/clp-usage.md
+++ b/docs/src/user-docs/guides-using-object-storage/clp-usage.md
@@ -5,35 +5,97 @@ should be able to use CLP as described in the [clp-json quick-start guide](../qu
 
 ## Compressing logs from S3
 
-To compress logs from S3, use the `sbin/compress.sh` script as follows, replacing the fields in
-angle brackets (`<>`) with the appropriate values:
+To compress logs from S3, use the `sbin/compress-from-s3.sh` script. The script supports two modes
+of operation:
+
+* [**s3-object** mode](#s3-object-compression-mode): Compress S3 objects specified by their full
+  S3 URLs.
+* [**s3-key-prefix** mode](#s3-key-prefix-compression-mode): Compress all S3 objects under a given
+  S3 key prefix.
+
+### `s3-object` compression mode
+
+The `s3-object` mode allows you to specify individual S3 objects to compress by using their full
+URLs. To use this mode, call the `sbin/compress-from-s3.sh` script as follows, and replace the
+fields in angle brackets (`<>`) with the appropriate values:
 
 ```bash
-sbin/compress.sh \
+sbin/compress-from-s3.sh \
   --timestamp-key \
-
+  --dataset \
+  s3-object \
+  [ ...]
 ```
 
-* `` is a URL identifying the logs to compress. It can have one of two formats:
-  * `https://.s3..amazonaws.com/`
-  * `https://s3..amazonaws.com//`
-* The fields in `` are as follows:
+* `` is a URL identifying the S3 object to compress. It can be written in either of two
+  formats:
+  * `https://.s3..amazonaws.com/`
+  * `https://s3..amazonaws.com//`
+* The fields in `` are as follows:
   * `` is the name of the S3 bucket containing your logs.
   * `` is the AWS region [code][aws-region-codes] for the S3 bucket containing your
     logs.
-  * `` is the prefix of all logs you wish to compress and must begin with the
-    `` value from the [compression IAM policy][compression-iam-policy].
+  * `` is the [object key][aws-s3-object-key] of the log file object you wish to
+    compress.
+
+    :::{warning}
+    There must be no duplicate object keys across all `` arguments.
+    :::
+
+
+* For a description of other fields, see the [clp-json quick-start
+  guide](../quick-start/clp-json.md#compressing-json-logs).
+
+Instead of specifying input object URLs explicitly in the command, you may specify them in a text
+file and then pass the file into the command using the `--inputs-from` flag, like so:
+
+```bash
+sbin/compress-from-s3.sh \
+  --timestamp-key \
+  --dataset \
+  s3-object \
+  --inputs-from
+```
+
+* `` is a path to a text file containing one S3 object URL **per line**. The URLs must
+  follow the same format as described above for ``.
 
 :::{note}
-Compressing from S3 only supports a single URL but will compress any logs that have the given
-prefix.
+The `s3-object` mode requires the input object keys to share a non-empty common prefix. If the input
+object keys do not share a common prefix, they will be rejected and no compression job will be
+created. This limitation will be addressed in a future release.
+:::
+
+### `s3-key-prefix` compression mode
+
+The `s3-key-prefix` mode allows you to compress all objects under a given S3 key prefix. To use this
+mode, call the `sbin/compress-from-s3.sh` script as follows, and replace the fields in angle
+brackets (`<>`) with the appropriate values:
+
+```bash
+sbin/compress-from-s3.sh \
+  --timestamp-key \
+  --dataset \
+  s3-key-prefix \
+
+```
 
-If you wish to compress a single log file, specify the entire path to the log file. However, if
-that log file's path is a prefix of another log file's path, then both log files will be compressed
-(e.g., with two files "logs/syslog" and "logs/syslog.1", a prefix like "logs/syslog" will cause
-both logs to be compressed).
 This limitation will be addressed in a future release.
+* `` is a URL identifying the S3 key prefix to compress. It can be written in either
+  of two formats:
+  * `https://.s3..amazonaws.com/`
+  * `https://s3..amazonaws.com//`
+* The fields in `` are as follows:
+  * `` is the name of the S3 bucket containing your logs.
+  * `` is the AWS region [code][aws-region-codes] for the S3 bucket containing your
+    logs.
+  * `` is the prefix of all logs you wish to compress and must begin with the
+    `` value from the [compression IAM policy][compression-iam-policy].
+
+:::{note}
+`s3-key-prefix` mode only accepts a single `` argument. This limitation will be
+addressed in a future release.
 :::
 
-[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console
 [aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability
+[aws-s3-object-key]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html
 [compression-iam-policy]: ./object-storage-config.md#configuration-for-compression
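
The `s3-object` notes in this diff say that the input object keys must share a non-empty common prefix and must not contain duplicates. A minimal pre-flight sketch of those two checks follows. It assumes "common prefix" means the longest common leading string of the keys, which the diff does not spell out. `common_prefix` and `duplicate_keys` are hypothetical helper names, not part of CLP, and the keys are illustrative values.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; neither function is part of CLP.

# Print the longest common leading string of all arguments (empty if none).
common_prefix() {
    local prefix="$1" key
    shift
    for key in "$@"; do
        # Trim the candidate from the right until this key starts with it.
        while [[ -n "$prefix" && "${key#"$prefix"}" == "$key" ]]; do
            prefix="${prefix%?}"
        done
    done
    printf '%s\n' "$prefix"
}

# Print each argument that appears more than once.
duplicate_keys() {
    printf '%s\n' "$@" | sort | uniq -d
}

# Example object keys (illustrative values only).
keys=("logs/app/a.log" "logs/app/b.log")
if [[ -z "$(common_prefix "${keys[@]}")" ]]; then
    echo "keys share no common prefix" >&2
elif [[ -n "$(duplicate_keys "${keys[@]}")" ]]; then
    echo "duplicate keys found" >&2
else
    echo "keys look acceptable"
fi
```

Running a check like this before building the `--inputs-from` file could catch inputs that the script would otherwise reject. CLP's own validation may differ from this sketch.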