@@ -5,35 +5,97 @@ should be able to use CLP as described in the [clp-json quick-start guide](../qu
55
66## Compressing logs from S3
77
8- To compress logs from S3, use the ` sbin/compress.sh ` script as follows, replacing the fields in
9- angle brackets (` <> ` ) with the appropriate values:
8+ To compress logs from S3, use the ` sbin/compress-from-s3.sh ` script. The script supports two modes
9+ of operation:
10+
11+ * [**s3-object** mode](#s3-object-compression-mode): Compress S3 objects specified by their full
12+ S3 URLs.
13+ * [**s3-key-prefix** mode](#s3-key-prefix-compression-mode): Compress all S3 objects under a given
14+ S3 key prefix.
15+
16+ ### ` s3-object ` compression mode
17+
18+ The ` s3-object ` mode allows you to specify individual S3 objects to compress by using their full
19+ URLs. To use this mode, call the ` sbin/compress-from-s3.sh ` script as follows, and replace the
20+ fields in angle brackets (` <> ` ) with the appropriate values:
1021
1122```bash
12- sbin/compress.sh \
23+ sbin/compress-from-s3.sh \
1324 --timestamp-key <timestamp-key> \
14- <url>
25+ --dataset <dataset-name> \
26+ s3-object \
27+ <object-url> [<object-url> ...]
1528```
1629
17- * ` <url> ` is a URL identifying the logs to compress. It can have one of two formats:
18- * ` https://<bucket-name>.s3.<region-code>.amazonaws.com/<prefix> `
19- * ` https://s3.<region-code>.amazonaws.com/<bucket-name>/<prefix> `
20- * The fields in ` <url> ` are as follows:
30+ * ` <object-url> ` is a URL identifying the S3 object to compress. It can be written in either of two
31+ formats:
32+ * ` https://<bucket-name>.s3.<region-code>.amazonaws.com/<object-key> `
33+ * ` https://s3.<region-code>.amazonaws.com/<bucket-name>/<object-key> `
34+ * The fields in ` <object-url> ` are as follows:
2135 * ` <bucket-name> ` is the name of the S3 bucket containing your logs.
2236 * `<region-code>` is the AWS region [code][aws-region-codes] for the S3 bucket containing your
2337 logs.
24- * `<prefix>` is the prefix of all logs you wish to compress and must begin with the
25- `<all-logs-prefix>` value from the [compression IAM policy][compression-iam-policy].
38+ * `<object-key>` is the [object key][aws-s3-object-key] of the log file object you wish to
39+ compress.
40+
41+ :::{warning}
42+ There must be no duplicate object keys across all ` <object-url> ` arguments.
43+ :::
44+
45+
46+ * For a description of other fields, see the [clp-json quick-start
47+ guide](../quick-start/clp-json.md#compressing-json-logs).
48+
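As an illustration of the two `<object-url>` formats described above, the following sketch assembles both from their fields. The helper names and the sample bucket, region, and key are hypothetical and not part of CLP; the format names in the comments are the ones AWS uses for these URL styles.

```bash
#!/usr/bin/env bash

# Hypothetical helpers (not part of CLP) that assemble the two documented URL formats.

# Virtual-hosted-style: https://<bucket-name>.s3.<region-code>.amazonaws.com/<object-key>
virtual_hosted_url() {
    local bucket="$1" region="$2" key="$3"
    printf 'https://%s.s3.%s.amazonaws.com/%s\n' "$bucket" "$region" "$key"
}

# Path-style: https://s3.<region-code>.amazonaws.com/<bucket-name>/<object-key>
path_style_url() {
    local bucket="$1" region="$2" key="$3"
    printf 'https://s3.%s.amazonaws.com/%s/%s\n' "$region" "$bucket" "$key"
}

# Sample values are placeholders; substitute your own bucket, region, and object key.
virtual_hosted_url "my-bucket" "us-east-1" "logs/app/2024-01-01.jsonl"
path_style_url "my-bucket" "us-east-1" "logs/app/2024-01-01.jsonl"
```

Either printed URL could then be passed as an `<object-url>` argument.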
49+ Instead of specifying input object URLs explicitly in the command, you may specify them in a text
50+ file and then pass the file into the command using the ` --inputs-from ` flag, like so:
51+
52+ ```bash
53+ sbin/compress-from-s3.sh \
54+ --timestamp-key <timestamp-key> \
55+ --dataset <dataset-name> \
56+ s3-object \
57+ --inputs-from <input-file>
58+ ```
59+
60+ * `<input-file>` is a path to a text file containing one S3 object URL **per line**. The URLs must
61+ follow the same format as described above for `<object-url>`.
2662
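A minimal sketch of preparing such an input file is shown below. The temporary file and the object URLs are hypothetical placeholders; the script only writes the file and prints the command that would consume it, so it can be tried without a CLP installation.

```bash
#!/usr/bin/env bash

# Hypothetical input file; any path readable by the script should work.
input_file="$(mktemp)"

# One S3 object URL per line (placeholder bucket, region, and keys).
cat > "$input_file" <<'EOF'
https://my-bucket.s3.us-east-1.amazonaws.com/logs/app/2024-01-01.jsonl
https://my-bucket.s3.us-east-1.amazonaws.com/logs/app/2024-01-02.jsonl
EOF

# Print (rather than run) the command; fill in the angle-bracket fields before using it.
echo "sbin/compress-from-s3.sh --timestamp-key <timestamp-key> --dataset <dataset-name> s3-object --inputs-from $input_file"
```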
2763:::{note}
28- Compressing from S3 only supports a single URL but will compress any logs that have the given
29- prefix.
64+ The ` s3-object ` mode requires the input object keys to share a non-empty common prefix. If the input
65+ object keys do not share a common prefix, they will be rejected and no compression job will be
66+ created. This limitation will be addressed in a future release.
67+ :::
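Before submitting a job, it may help to check a list of object keys locally. The sketch below computes the longest common string prefix of a set of keys and lists any duplicates. Both helpers are hypothetical and not part of CLP, and treating the requirement as a plain string-prefix check is an assumption; CLP's own validation may differ.

```bash
#!/usr/bin/env bash

# Print the longest common prefix of the lines read from stdin (empty if there is none).
common_prefix() {
    local prefix="" first=1 line
    while IFS= read -r line; do
        if (( first )); then
            prefix="$line"
            first=0
            continue
        fi
        # Trim the candidate prefix one character at a time until the line starts with it.
        while [[ -n "$prefix" && "${line#"$prefix"}" == "$line" ]]; do
            prefix="${prefix%?}"
        done
    done
    printf '%s\n' "$prefix"
}

# Print every key that appears more than once on stdin.
duplicate_keys() {
    sort | uniq -d
}

# Hypothetical object keys, for illustration only.
keys=$'logs/app/a.jsonl\nlogs/db/b.jsonl'
prefix="$(common_prefix <<< "$keys")"
if [[ -z "$prefix" ]]; then
    echo "no common prefix"
else
    echo "common prefix: $prefix"   # prints: common prefix: logs/
fi
```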
68+
69+ ### ` s3-key-prefix ` compression mode
70+
71+ The ` s3-key-prefix ` mode allows you to compress all objects under a given S3 key prefix. To use this
72+ mode, call the ` sbin/compress-from-s3.sh ` script as follows, and replace the fields in angle
73+ brackets (` <> ` ) with the appropriate values:
74+
75+ ```bash
76+ sbin/compress-from-s3.sh \
77+ --timestamp-key <timestamp-key> \
78+ --dataset <dataset-name> \
79+ s3-key-prefix \
80+ <key-prefix-url>
81+ ```
3082
31- If you wish to compress a single log file, specify the entire path to the log file. However, if
32- that log file's path is a prefix of another log file's path, then both log files will be compressed
33- (e.g., with two files "logs/syslog" and "logs/syslog.1", a prefix like "logs/syslog" will cause
34- both logs to be compressed). This limitation will be addressed in a future release.
83+ * ` <key-prefix-url> ` is a URL identifying the S3 key prefix to compress. It can be written in either
84+ of two formats:
85+ * ` https://<bucket-name>.s3.<region-code>.amazonaws.com/<key-prefix> `
86+ * ` https://s3.<region-code>.amazonaws.com/<bucket-name>/<key-prefix> `
87+ * The fields in ` <key-prefix-url> ` are as follows:
88+ * ` <bucket-name> ` is the name of the S3 bucket containing your logs.
89+ * `<region-code>` is the AWS region [code][aws-region-codes] for the S3 bucket containing your
90+ logs.
91+ * `<key-prefix>` is the prefix of all logs you wish to compress and must begin with the
92+ `<all-logs-prefix>` value from the [compression IAM policy][compression-iam-policy].
93+
94+ :::{note}
95+ ` s3-key-prefix ` mode only accepts a single ` <key-prefix-url> ` argument. This limitation will be
96+ addressed in a future release.
3597:::
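S3 key prefixes are plain string prefixes rather than directory paths, so a prefix can match more objects than intended. The sketch below filters a hypothetical list of keys the same way; the helper is for illustration only and is not part of CLP.

```bash
#!/usr/bin/env bash

# Print the keys read from stdin that start with the given prefix (literal string match).
keys_under_prefix() {
    local prefix="$1" key
    while IFS= read -r key; do
        if [[ "$key" == "$prefix"* ]]; then
            printf '%s\n' "$key"
        fi
    done
    return 0
}

# Hypothetical keys: the prefix "logs/syslog" matches both "logs/syslog" and "logs/syslog.1".
printf 'logs/syslog\nlogs/syslog.1\nlogs/auth.log\n' | keys_under_prefix "logs/syslog"
```

To target a single object rather than everything sharing its prefix, the `s3-object` mode is the closer fit.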
3698
37- [add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console
3899[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability
100+ [aws-s3-object-key]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html
39101[compression-iam-policy]: ./object-storage-config.md#configuration-for-compression