Commit 4281c1a

LinZhihao-723, quinntaylormitchell, and kirkrodrigues authored
docs(clp-package): Rewrite S3 log compression guide to reflect new API and script features. (#1510)
Co-authored-by: Quinn Taylor Mitchell <[email protected]>
Co-authored-by: kirkrodrigues <[email protected]>
1 parent 5c0a903 commit 4281c1a

1 file changed

+79
-17
lines changed

docs/src/user-docs/guides-using-object-storage/clp-usage.md

Lines changed: 79 additions & 17 deletions
Original file line number | Diff line number | Diff line change
@@ -5,35 +5,97 @@ should be able to use CLP as described in the [clp-json quick-start guide](../qu
55

66
## Compressing logs from S3
77

8-
To compress logs from S3, use the `sbin/compress.sh` script as follows, replacing the fields in
9-
angle brackets (`<>`) with the appropriate values:
8+
To compress logs from S3, use the `sbin/compress-from-s3.sh` script. The script supports two modes
9+
of operation:
10+
11+
* [**s3-object** mode](#s3-object-compression-mode): Compress S3 objects specified by their full
12+
S3 URLs.
13+
* [**s3-key-prefix** mode](#s3-key-prefix-compression-mode): Compress all S3 objects under a given
14+
S3 key prefix.
15+
16+
### `s3-object` compression mode
17+
18+
The `s3-object` mode allows you to specify individual S3 objects to compress by using their full
19+
URLs. To use this mode, call the `sbin/compress-from-s3.sh` script as follows, and replace the
20+
fields in angle brackets (`<>`) with the appropriate values:
1021

1122
```bash
12-
sbin/compress.sh \
23+
sbin/compress-from-s3.sh \
1324
--timestamp-key <timestamp-key> \
14-
<url>
25+
--dataset <dataset-name> \
26+
s3-object \
27+
<object-url> [<object-url> ...]
1528
```
1629

17-
* `<url>` is a URL identifying the logs to compress. It can have one of two formats:
18-
* `https://<bucket-name>.s3.<region-code>.amazonaws.com/<prefix>`
19-
* `https://s3.<region-code>.amazonaws.com/<bucket-name>/<prefix>`
20-
* The fields in `<url>` are as follows:
30+
* `<object-url>` is a URL identifying the S3 object to compress. It can be written in either of two
31+
formats:
32+
* `https://<bucket-name>.s3.<region-code>.amazonaws.com/<object-key>`
33+
* `https://s3.<region-code>.amazonaws.com/<bucket-name>/<object-key>`
34+
* The fields in `<object-url>` are as follows:
2135
* `<bucket-name>` is the name of the S3 bucket containing your logs.
2236
* `<region-code>` is the AWS region [code][aws-region-codes] for the S3 bucket containing your
2337
logs.
24-
* `<prefix>` is the prefix of all logs you wish to compress and must begin with the
25-
`<all-logs-prefix>` value from the [compression IAM policy][compression-iam-policy].
38+
* `<object-key>` is the [object key][aws-s3-object-key] of the log file object you wish to
39+
compress.
40+
41+
:::{warning}
42+
There must be no duplicate object keys across all `<object-url>` arguments.
43+
:::
44+
45+
46+
* For a description of other fields, see the [clp-json quick-start
47+
guide](../quick-start/clp-json.md#compressing-json-logs).
48+
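As a minimal sketch of how the two `<object-url>` formats relate, the snippet below assembles both from the same three fields. The bucket name, region code, and object key are hypothetical values chosen for illustration; "virtual-hosted-style" and "path-style" are the names AWS uses for the two layouts.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical values for illustration only; substitute your own.
bucket_name="my-log-bucket"
region_code="us-east-1"
object_key="logs/app/2024-01-01.jsonl"

# Virtual-hosted-style URL: the bucket name is part of the hostname.
virtual_hosted_url="https://${bucket_name}.s3.${region_code}.amazonaws.com/${object_key}"

# Path-style URL: the bucket name is the first path component.
path_style_url="https://s3.${region_code}.amazonaws.com/${bucket_name}/${object_key}"

echo "${virtual_hosted_url}"
echo "${path_style_url}"
```

Either printed URL could then be passed as an `<object-url>` argument.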
49+
Instead of specifying input object URLs explicitly in the command, you may specify them in a text
50+
file and then pass the file into the command using the `--inputs-from` flag, like so:
51+
52+
```bash
53+
sbin/compress-from-s3.sh \
54+
--timestamp-key <timestamp-key> \
55+
--dataset <dataset-name> \
56+
s3-object \
57+
--inputs-from <input-file>
58+
```
59+
60+
* `<input-file>` is a path to a text file containing one S3 object URL **per line**. The URLs must
61+
follow the same format as described above for `<object-url>`.
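Since duplicate object keys are not allowed, it may help to check an input file before passing it to `--inputs-from`. The following is only a sketch: the file contents and URLs are hypothetical, and it detects only lines that are exactly identical, so it would not catch the same object written once in each of the two URL formats.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical input file and URLs, for illustration only.
input_file="$(mktemp)"
cat > "${input_file}" <<'EOF'
https://my-log-bucket.s3.us-east-1.amazonaws.com/logs/app/a.jsonl
https://my-log-bucket.s3.us-east-1.amazonaws.com/logs/app/b.jsonl
https://my-log-bucket.s3.us-east-1.amazonaws.com/logs/app/a.jsonl
EOF

# `sort | uniq -d` prints one copy of each line that occurs more than once.
duplicates="$(sort "${input_file}" | uniq -d)"
rm -f "${input_file}"

if [ -n "${duplicates}" ]; then
    echo "Duplicate URLs found:"
    echo "${duplicates}"
else
    echo "No duplicate URLs."
fi
```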
2662

2763
:::{note}
28-
Compressing from S3 only supports a single URL but will compress any logs that have the given
29-
prefix.
64+
The `s3-object` mode requires the input object keys to share a non-empty common prefix. If the input
65+
object keys do not share a common prefix, they will be rejected and no compression job will be
66+
created. This limitation will be addressed in a future release.
67+
:::
68+
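To see ahead of time whether a set of object keys shares a non-empty common prefix, a check along the following lines could be used. This is a sketch under assumptions: the keys are hypothetical, and the prefix is computed character by character, which may differ from how CLP itself determines the common prefix.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical object keys, for illustration only.
keys=(
    "logs/app/a.jsonl"
    "logs/app/b.jsonl"
    "logs/db/c.jsonl"
)

# Start with the first key and drop trailing characters until every key
# starts with the candidate. An empty candidate matches every key, so the
# inner loop always terminates.
common_prefix="${keys[0]}"
for key in "${keys[@]}"; do
    while [[ "${key}" != "${common_prefix}"* ]]; do
        common_prefix="${common_prefix%?}"
    done
done

if [ -z "${common_prefix}" ]; then
    echo "No common prefix."
else
    echo "Common prefix: ${common_prefix}"
fi
```

For the keys above, the common prefix is `logs/`.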
69+
### `s3-key-prefix` compression mode
70+
71+
The `s3-key-prefix` mode allows you to compress all objects under a given S3 key prefix. To use this
72+
mode, call the `sbin/compress-from-s3.sh` script as follows, and replace the fields in angle
73+
brackets (`<>`) with the appropriate values:
74+
75+
```bash
76+
sbin/compress-from-s3.sh \
77+
--timestamp-key <timestamp-key> \
78+
--dataset <dataset-name> \
79+
s3-key-prefix \
80+
<key-prefix-url>
81+
```
3082

31-
If you wish to compress a single log file, specify the entire path to the log file. However, if
32-
that log file's path is a prefix of another log file's path, then both log files will be compressed
33-
(e.g., with two files "logs/syslog" and "logs/syslog.1", a prefix like "logs/syslog" will cause
34-
both logs to be compressed). This limitation will be addressed in a future release.
83+
* `<key-prefix-url>` is a URL identifying the S3 key prefix to compress. It can be written in either
84+
of two formats:
85+
* `https://<bucket-name>.s3.<region-code>.amazonaws.com/<key-prefix>`
86+
* `https://s3.<region-code>.amazonaws.com/<bucket-name>/<key-prefix>`
87+
* The fields in `<key-prefix-url>` are as follows:
88+
* `<bucket-name>` is the name of the S3 bucket containing your logs.
89+
* `<region-code>` is the AWS region [code][aws-region-codes] for the S3 bucket containing your
90+
logs.
91+
* `<key-prefix>` is the prefix of all logs you wish to compress and must begin with the
92+
`<all-logs-prefix>` value from the [compression IAM policy][compression-iam-policy].
93+
94+
:::{note}
95+
`s3-key-prefix` mode only accepts a single `<key-prefix-url>` argument. This limitation will be
96+
addressed in a future release.
3597
:::
3698
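S3 key prefixes are plain string prefixes rather than directory paths, so a prefix can match more keys than expected. The sketch below uses hypothetical keys to show which ones begin with a given `<key-prefix>`; it assumes `s3-key-prefix` mode selects objects by this kind of string match, as the removed text above describes for the previous script.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical key prefix and object keys, for illustration only.
key_prefix="logs/syslog"
keys=(
    "logs/syslog"
    "logs/syslog.1"
    "logs/auth.log"
)

matched=()
for key in "${keys[@]}"; do
    # A key matches when it starts with the prefix, character for character.
    if [[ "${key}" == "${key_prefix}"* ]]; then
        matched+=("${key}")
    fi
done

printf '%s\n' "${matched[@]}"
```

Here both `logs/syslog` and `logs/syslog.1` match, while `logs/auth.log` does not.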

37-
[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console
3899
[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability
100+
[aws-s3-object-key]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html
39101
[compression-iam-policy]: ./object-storage-config.md#configuration-for-compression
