Notice
This document is for a development version of Ceph.
Bucket Logging
New in version T.
Bucket logging provides a mechanism for logging all access to a bucket. The log data can be used to monitor bucket activity, detect unauthorized access, gain insight into bucket usage, and serve as a journal of bucket changes. The log records are stored in objects in a separate bucket and can be analyzed later. Logging is configured at the bucket level and can be enabled or disabled at any time. A single log bucket can accumulate logs from multiple buckets. It is recommended to configure a different “prefix” for each source bucket, so that the logs of different buckets are stored in different objects in the log bucket.
Note
The log bucket must be created before enabling logging on a bucket.
The log bucket cannot be the same as the bucket being logged.
The log bucket cannot have logging enabled on it.
The log bucket cannot have any encryption set on it (including SSE-S3 with AES-256).
The log bucket cannot have any compression set on it.
The log bucket must not have RequestPayer enabled.
Source and log buckets must be in the same zonegroup.
Source and log buckets may belong to different accounts (with proper bucket policy set).
The log bucket may have object lock enabled with default retention period.
The 16-byte unique ID part of the log object name is a lexicographically ordered random string that consists of a 10-byte counter and a 6-byte random alphanumeric string (or a random alphanumeric string if the counter is not available).
Logging Reliability
For performance reasons, even though the log records are written to persistent
storage, the log object appears in the log bucket only after a configurable
amount of time has passed (or when the maximum object size of 128 MB is
reached). This time (in seconds) can be set per source bucket via a Ceph
extension to the REST API, or globally via the
rgw_bucket_logging_obj_roll_time configuration option. If not set, the
default time is 5 minutes. Adding a log object to the log bucket is done
“lazily”, meaning that if no more records are written to the object, it may
remain outside of the log bucket even after the configured time has passed. To
counter that, you can flush all logging objects of a given source bucket. A
flush adds them to the log bucket regardless of whether enough time has passed
or whether any more records are written to the object. Flushing happens
automatically when logging is disabled on a bucket, when its logging
configuration is changed, or when the bucket is deleted.
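The two roll triggers described above (elapsed time and maximum size) can be sketched as a small predicate. This is only an illustration of the documented behavior, not the RGW implementation: the function and constant names are hypothetical, and “128 MB” is assumed to mean 128 * 1024 * 1024 bytes.

```python
DEFAULT_ROLL_TIME_SECONDS = 5 * 60          # documented default: 5 minutes
MAX_LOG_OBJECT_BYTES = 128 * 1024 * 1024    # assumption: "128 MB" read as 128 MiB


def should_roll(age_seconds, size_bytes,
                roll_time_seconds=DEFAULT_ROLL_TIME_SECONDS):
    """Return True if a pending log object should be added to the log bucket.

    Hypothetical helper modelling the two documented triggers. It is meant
    to be evaluated when a new record is written, so an idle log object is
    never rolled by this check alone; that case needs an explicit flush.
    """
    return (age_seconds >= roll_time_seconds
            or size_bytes >= MAX_LOG_OBJECT_BYTES)


print(should_roll(age_seconds=60, size_bytes=4096))
print(should_roll(age_seconds=301, size_bytes=4096))
print(should_roll(age_seconds=10, size_bytes=MAX_LOG_OBJECT_BYTES))
```

Because the check only runs when a record arrives, a quiet bucket can keep its last log object pending indefinitely, which is the case the flush operation addresses.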
Standard
If the logging type is set to “Standard” (the default) the log records are written to the log bucket after the bucket operation is completed. This means that the logging operation may fail with no indication to the client.
Journal
If the logging type is set to “Journal”, the records are written to the log bucket before the bucket operation is completed. This means that if the logging action fails, the operation is not executed and an error is returned to the client. Some exceptions to that rule exist: in the table below, a “No” in the “Fails Operation” column marks an operation that does not fail even if logging fails. Journal mode supports filtering out records based on matches of the prefixes and suffixes of the logged object keys. Regular expression matching on the keys can also be used to create filters. Note that because the log records are written before the operation completes, the records may be written successfully even though the bucket operation itself then fails.
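As a rough illustration of key-based filtering, the sketch below combines prefix, suffix, and regular expression rules. The class and its semantics are assumptions made for the example, not the RGW implementation: it assumes that a key passes only when every configured rule matches, and that the regular expression must match the whole key.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyFilter:
    """Hypothetical stand-in for a journal-mode object key filter."""
    prefix: Optional[str] = None
    suffix: Optional[str] = None
    regex: Optional[str] = None

    def matches(self, key: str) -> bool:
        # Assumption: every configured rule must match for the key to pass.
        if self.prefix is not None and not key.startswith(self.prefix):
            return False
        if self.suffix is not None and not key.endswith(self.suffix):
            return False
        # Assumption: the regular expression must match the entire key.
        if self.regex is not None and re.fullmatch(self.regex, key) is None:
            return False
        return True


images = KeyFilter(prefix="images/", suffix=".jpg")
print(images.matches("images/cat.jpg"))
print(images.matches("docs/readme.txt"))
```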
The following operations are supported in journal mode:
| Operation | Operation Name | Fails Operation |
|---|---|---|
| | | Yes |
| | | No |
| | | No |
| | | Yes |
| | | Yes |
| | | Yes |
| | | Yes |
| | | Yes |
| | | Yes |
| | | No |
Multisite
In a multi-zone deployment, each zone uses its own log object before the log object is added to the log bucket. After the log object is added to the log bucket (that is, after being flushed) it is replicated to other zones. This means that for a given time period there can be more than one log object holding relevant log records.
Bucket Logging Policy
Only the owner of the source bucket is allowed to enable or disable bucket
logging. For a bucket to be used as a log bucket, it must have a bucket policy
that allows that (even if the source bucket and the log bucket are owned by the
same user or account). The bucket policy must allow the s3:PutObject action
for the log bucket, to be performed by the logging.s3.amazonaws.com service
principal. The bucket policy should also specify the source bucket and account
that are expected to write logs to it. For example:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowLoggingFromSourceBucket",
      "Effect": "Allow",
      "Principal": {
        "Service": "logging.s3.amazonaws.com"
      },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::log-bucket-name/prefix*",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "source-account-id"
        },
        "ArnLike": {
          "aws:SourceArn": "arn:aws:s3:::source-bucket-name"
        }
      }
    }
  ]
}
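When the same policy is needed for several source buckets, it can be convenient to generate it. The following is a minimal sketch that builds the policy shown above from its four variable parts. The function name and parameter names are hypothetical; only the policy structure comes from the example.

```python
import json


def log_bucket_policy(log_bucket, prefix, source_bucket, source_account):
    """Build the example log bucket policy for the given names.

    Hypothetical helper: it only fills in the placeholders of the policy
    shown above and does not contact any endpoint.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowLoggingFromSourceBucket",
                "Effect": "Allow",
                "Principal": {"Service": "logging.s3.amazonaws.com"},
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{log_bucket}/{prefix}*",
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": source_account},
                    "ArnLike": {
                        "aws:SourceArn": f"arn:aws:s3:::{source_bucket}"
                    },
                },
            }
        ],
    }


policy = log_bucket_policy(
    "log-bucket-name", "prefix", "source-bucket-name", "source-account-id"
)
print(json.dumps(policy, indent=2))
```

The JSON string produced by json.dumps can then be supplied as the policy document of an S3 PutBucketPolicy request on the log bucket, for example through the Policy argument of boto3's put_bucket_policy.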
Bucket Logging Quota
Bucket and user quotas are applied to the log bucket. The quota is checked every time a log record is written, and is updated when the log object is added to the log bucket. In “Journal” mode, if the quota is exceeded, the logging operation fails and, as a result, the bucket operation also fails. In “Standard” mode, the logging operation is skipped and the bucket operation continues.
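The difference between the two modes when logging fails (for example, because the quota is exceeded) can be modelled with a toy function. This is a simplified sketch and not how RGW is structured: all names are hypothetical, and it ignores the journal-mode operations that are documented as not failing when logging fails.

```python
class LoggingFailed(Exception):
    """Hypothetical error raised when a log record cannot be written."""


def execute(mode, write_log_record, run_operation):
    """Toy model of how a logging failure affects the bucket operation."""
    if mode == "Journal":
        # The record is written first; an exception here propagates to the
        # caller and the bucket operation is never executed.
        write_log_record()
        return run_operation()
    # Standard: the operation runs first and a logging failure is swallowed,
    # so the client gets no indication that the record was lost.
    result = run_operation()
    try:
        write_log_record()
    except LoggingFailed:
        pass
    return result


def over_quota():
    raise LoggingFailed("log bucket quota exceeded")


print(execute("Standard", over_quota, lambda: "operation done"))
```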
Bucket Logging REST API
Detailed under: Bucket Operations.
Log Objects Key Format
Simple
The “Simple” log objects key has the following format:
<prefix><year-month-day-hour-minute-second>-<16 bytes unique-id>
For example:
fish/2024-08-06-09-40-09-0000000002AGQ6W1
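A key in this format can be split back into its parts with a regular expression anchored at the end of the key. The sketch below is an illustration under a few assumptions: the function name is hypothetical, the unique ID is assumed to be 16 ASCII alphanumeric characters, and the split into a 10-character counter and a 6-character random part is only meaningful when the counter was available.

```python
import re
from datetime import datetime

SIMPLE_KEY = re.compile(
    r"^(?P<prefix>.*)"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})"
    r"-(?P<unique_id>[0-9A-Za-z]{16})$"
)


def parse_simple_key(key):
    """Split a "Simple" log object key into prefix, timestamp and unique ID."""
    m = SIMPLE_KEY.match(key)
    if m is None:
        raise ValueError(f"not a Simple log object key: {key!r}")
    unique_id = m["unique_id"]
    return {
        "prefix": m["prefix"],
        # The key carries no timezone, so the datetime is left naive.
        "timestamp": datetime.strptime(m["timestamp"], "%Y-%m-%d-%H-%M-%S"),
        "unique_id": unique_id,
        # Assumption: first 10 characters are the counter, last 6 are random.
        "counter_part": unique_id[:10],
        "random_part": unique_id[10:],
    }


parsed = parse_simple_key("fish/2024-08-06-09-40-09-0000000002AGQ6W1")
print(parsed)
```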
Partitioned
The “Partitioned” log objects key has the following format:
<prefix><source bucket owner>/<zone group>/[tenant:]<source bucket name>/<year>/<month>/<day>/<year-month-day-hour-minute-second>-<16 bytes unique-id>
For example:
fish/testid/default/fish-bucket/2024/08/06/2024-08-06-10-11-18-0000000011D1FGPA
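Because the prefix and the source bucket owner are concatenated without a separator, a partitioned key can only be split reliably when the configured prefix is known. The following sketch takes the prefix as an argument. The function name is hypothetical, and it assumes that neither the owner, the zone group, nor the bucket name contains a “/”.

```python
from datetime import date


def parse_partitioned_key(key, prefix):
    """Split a "Partitioned" log object key, given the configured prefix."""
    if not key.startswith(prefix):
        raise ValueError("key does not start with the configured prefix")
    parts = key[len(prefix):].split("/")
    if len(parts) != 7:
        raise ValueError(f"unexpected number of path components: {len(parts)}")
    owner, zonegroup, bucket, year, month, day, object_name = parts
    # The bucket component is "[tenant:]<source bucket name>".
    tenant, _, bucket_name = bucket.rpartition(":")
    return {
        "owner": owner,
        "zonegroup": zonegroup,
        "tenant": tenant or None,
        "bucket": bucket_name,
        "date": date(int(year), int(month), int(day)),
        "object_name": object_name,
    }


parsed = parse_partitioned_key(
    "fish/testid/default/fish-bucket/2024/08/06/"
    "2024-08-06-10-11-18-0000000011D1FGPA",
    prefix="fish/",
)
print(parsed)
```

Grouping log objects by the owner, zonegroup, bucket and date fields is what makes this layout convenient for tools that read one partition at a time.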
Log Records Format
The log records are space-separated string columns and have the following possible formats:
Journal
The “Journal” record format uses the minimum amount of data needed for journaling bucket changes (this is a Ceph extension). Each record contains the following fields, in this order:
bucket owner (or dash if empty)
bucket name (or dash if empty), in the format: [tenant:]<bucket name>
time, in the format: [day/month/year:hour:minute:second timezone]
operation, in the format: WEBSITE/REST.<HTTP method>.<resource>
object key (or dash if empty)
object size (or dash if empty)
version id (or dash if empty)
eTag (or dash if empty)
For example:
testid fish [06/Aug/2024:09:40:09 +0000] REST.PUT.OBJECT myfile - 512 4cfdfc1f58e762d3e116787cb92fac60
testid fish [06/Aug/2024:09:40:28 +0000] REST.DELETE.OBJECT myfile - - 4cfdfc1f58e762d3e116787cb92fac60
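A journal record can be parsed with a single regular expression, since only the bracketed time field contains a space. The sketch below is illustrative: the function and field names are hypothetical, the field order is taken from the list above, and object keys are assumed not to contain literal spaces.

```python
import re
from datetime import datetime

JOURNAL_RECORD = re.compile(
    r"^(?P<bucket_owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] "
    r"(?P<operation>\S+) (?P<key>\S+) (?P<size>\S+) "
    r"(?P<version_id>\S+) (?P<etag>\S+)$"
)


def parse_journal_record(line):
    """Parse one "Journal" record; a dash becomes None."""
    m = JOURNAL_RECORD.match(line.strip())
    if m is None:
        raise ValueError(f"not a Journal record: {line!r}")
    record = {
        name: (None if value == "-" else value)
        for name, value in m.groupdict().items()
    }
    # Example time: 06/Aug/2024:09:40:28 +0000
    record["time"] = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record


record = parse_journal_record(
    "testid fish [06/Aug/2024:09:40:28 +0000] REST.DELETE.OBJECT myfile - - "
    "4cfdfc1f58e762d3e116787cb92fac60"
)
print(record["operation"], record["key"], record["time"].isoformat())
```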
Standard
The “Standard” record format is based on AWS Logging Record Format.
bucket owner (or dash if empty)
bucket name (or dash if empty), in the format: [tenant:]<bucket name>
time, in the format: [day/month/year:hour:minute:second timezone], where “timezone” is the UTC offset
client IP address (or dash if empty)
user or account (or dash if empty)
request ID
operation, in the format: WEBSITE/REST.<HTTP method>.<resource>
object key (or dash if empty)
request URI, in the format: "<HTTP method> <URI> <HTTP version>"
HTTP status (or dash if zero). Note that in most cases the log record is written before the status is known
error code (or dash if empty)
bytes sent (or dash if zero)
object size (or dash if zero)
total time (not supported, always a dash)
turnaround time in milliseconds
referer (or dash if empty)
user agent (or dash if empty) inside double quotes
version id (or dash if empty)
host id taken from x-amz-id-2 (or dash if empty)
signature version (or dash if empty)
cipher suite (or dash if empty)
authentication type (AuthHeader for regular auth, QueryString for presigned URL, or dash if unauthenticated)
host header (or dash if empty)
TLS version (or dash if empty)
access point ARN (not supported, always a dash)
ACL flag (Yes if an ACL was required for authorization, otherwise dash)
For example:
testid fish [06/Aug/2024:09:30:25 +0000] - testid 9e369a15-5f43-4f07-b638-de920b22f91b.4179.15085270386962380710 REST.PUT.OBJECT myfile "PUT /fish/myfile HTTP/1.1" 200 - 512 512 - - - - - - - - - localhost - -
testid fish [06/Aug/2024:09:30:51 +0000] - testid 9e369a15-5f43-4f07-b638-de920b22f91b.4179.7046073853138417766 REST.GET.OBJECT myfile "GET /fish/myfile HTTP/1.1" 200 - - 512 - - - - - - - - - localhost - -
testid fish [06/Aug/2024:09:30:56 +0000] - testid 9e369a15-5f43-4f07-b638-de920b22f91b.4179.10723158448701085570 REST.DELETE.OBJECT myfile "DELETE /fish/myfile1 HTTP/1.1" 200 - - 512 - - - - - - - - - localhost - -
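Standard records cannot be split on spaces alone, because the time is wrapped in square brackets and some fields are wrapped in double quotes. A minimal parsing sketch follows. The function and field names are hypothetical labels for the fields listed above, and the tokenizer assumes that bracketed and quoted fields do not contain their own closing delimiter. Records with fewer trailing fields than the list are still accepted; the missing fields are simply absent from the result.

```python
import re

STANDARD_FIELDS = [
    "bucket_owner", "bucket", "time", "client_ip", "requester", "request_id",
    "operation", "key", "request_uri", "http_status", "error_code",
    "bytes_sent", "object_size", "total_time", "turnaround_time", "referer",
    "user_agent", "version_id", "host_id", "signature_version",
    "cipher_suite", "auth_type", "host_header", "tls_version",
    "access_point_arn", "acl_required",
]

# A token is a [bracketed] field, a "quoted" field, or a run of non-spaces.
TOKEN = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')


def parse_standard_record(line):
    """Parse one "Standard" record into a dict; a dash becomes None."""
    record = {}
    for name, token in zip(STANDARD_FIELDS, TOKEN.findall(line)):
        wrapped = (token[0], token[-1]) in (("[", "]"), ('"', '"'))
        if wrapped and len(token) >= 2:
            token = token[1:-1]  # drop the surrounding [] or ""
        record[name] = None if token == "-" else token
    return record


sample = (
    'testid fish [06/Aug/2024:09:30:51 +0000] - testid '
    '9e369a15-5f43-4f07-b638-de920b22f91b.4179.7046073853138417766 '
    'REST.GET.OBJECT myfile "GET /fish/myfile HTTP/1.1" 200 - - 512 '
    '- - - - - - - - - localhost - -'
)
record = parse_standard_record(sample)
print(record["operation"], record["request_uri"], record["http_status"])
```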