Cloudflare’s Enterprise customers have access to the Enterprise Log Share service, a RESTful API that allows our customers to consume request logs over HTTP. The new "logs/received" REST API route exposes data by time received (the time the event was written to disk in our log aggregation system). This is in contrast to our “logs/requests” API documented here, which orders logs by request time. Ordering by log aggregation time instead of log generation time results in lower (faster) log share pipeline latency and deterministic/idempotent log pulls.
Early Access Program
Access to this endpoint is currently available in Early Access. Please mind the following known limitations to usage:
- The API is not finalized, and is subject to change.
- The endpoint may be unavailable for periods of time in order to make functional changes.
- Performance may at times be worse than the current API.
- No completeness or accuracy of data is guaranteed.
This also means we are accepting feedback. Please let us know what you think so we can improve our product!
Because logs are ordered by the time they were received, you can obtain logs for a given time range by issuing one call for each consecutive minute (or other interval). Log lines are batched by time received before they are made available, so there is no “late arriving data”: a response for a given minute will never change, and you do not have to repeatedly poll a given time range to receive logs as they converge on our aggregation system.
Recommended Access Pattern
Pick a time interval that works for your request volume. We suggest starting with one minute for most request volumes; try a few intervals, and if an interval returns too much data to handle easily, scale down. Then make requests with that fixed interval (without setting 'count') across the whole time range in which you are interested. Once you have read an entire 200 response for a request, you do not need to request that URL again.
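As a rough illustration of this access pattern, the sketch below splits a time range into consecutive fixed-size windows. The helper names `fixed_windows` and `rfc3339` are made up for this example and are not part of the API. Because `start` is inclusive and `end` is exclusive, each window's end can be reused as the next window's start without fetching any record twice.

```python
from datetime import datetime, timedelta, timezone


def fixed_windows(start, end, step=timedelta(minutes=1)):
    """Yield (window_start, window_end) pairs that tile [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        # The last window is clipped so it never runs past `end`.
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


def rfc3339(ts):
    """Format an aware datetime as an RFC 3339 timestamp in UTC."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


range_start = datetime(2017, 7, 18, 22, 0, tzinfo=timezone.utc)
range_end = datetime(2017, 7, 18, 22, 3, tzinfo=timezone.utc)

# One request per window; the query strings are only printed here.
for w_start, w_end in fixed_windows(range_start, range_end):
    print(f"start={rfc3339(w_start)}&end={rfc3339(w_end)}")
```

If one-minute windows return too much data, passing a smaller `step` (for example `timedelta(seconds=30)`) scales the interval down.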
cURL example from burritobot.com
- start: inclusive; either a Unix timestamp (which by definition is UTC) or a time in RFC 3339 format (which specifies the time zone); must be at least 5 minutes in the past
- end: exclusive; otherwise the same as start
- count: return up to that many records; negative means no limit
- fields: comma separated list of fields to return; when empty uses default list
curl -s -H'X-Auth-Email: MONKEY' -H'X-Auth-Key: BANANA' 'https://api.cloudflare.com/client/v4/zones/4e6d50a41172bca54f222576aec3fc2b/logs/received?start=2017-07-18T22:00:00Z&end=2017-07-18T22:01:00Z&count=1&fields=Timestamp,ClientIP'
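For readers who prefer a script to a shell one-liner, here is a minimal Python sketch of the same request using only the standard library. The function name `build_received_request` is invented for this example; the zone ID and the MONKEY and BANANA credentials are the placeholders from the cURL command above. The sketch only constructs the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_received_request(zone_id, email, api_key, start, end,
                           fields=None, count=None):
    """Build (but do not send) a logs/received request."""
    params = {"start": start, "end": end}
    if count is not None:
        params["count"] = count
    if fields:
        # Fields are passed as a single comma-separated value.
        params["fields"] = ",".join(fields)
    # Keep ':' and ',' readable instead of percent-encoding them.
    query = urlencode(params, safe=":,")
    url = f"{API_BASE}/zones/{zone_id}/logs/received?{query}"
    return Request(url, headers={"X-Auth-Email": email, "X-Auth-Key": api_key})


req = build_received_request(
    "4e6d50a41172bca54f222576aec3fc2b", "MONKEY", "BANANA",
    start="2017-07-18T22:00:00Z", end="2017-07-18T22:01:00Z",
    fields=["Timestamp", "ClientIP"], count=1,
)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Leaving out `count` matches the recommended access pattern.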
Expectations
Once a 200 response is received and read completely for a given zone and time range, the following will be true for all subsequent requests:
- The number and content of returned records will be the same
- The order of returned records may be (and is likely to be) different
- Requests will fail when:
  - There are more than 5 requests to the log share endpoint currently in progress for a given zone
  - There are more than 1 million messages in the response. Because responses are streamed, there is no way to tell the number of messages in a response ahead of time, so after the millionth message the response fails with a terminated connection. Based on the volume of your data, you can tune your queries to a time range that will not result in failures. We consider this preferable to hard-limiting requests to an arbitrary time range (such as 1 minute).
- Response fields:
  - When set explicitly in the request URL, the response fields will never change
  - When not set explicitly, default fields will be used; default fields may change at any time
Data is timestamped by the time it was written to disk at our log aggregation point.
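One possible way to tune queries around the message limit is to split a time range in half whenever a pull fails and retry each half. The sketch below assumes a hypothetical `fetch(start, end)` callable, which you would supply, that returns the records of a completely read 200 response and raises `ConnectionError` when the connection is terminated. Timestamps are Unix seconds. It runs sequentially, so it stays well under the limit of 5 in-progress requests per zone.

```python
def pull_range(fetch, start, end, min_span=1):
    """Return all records for [start, end), halving the range on failure."""
    try:
        return fetch(start, end)
    except ConnectionError:
        if end - start <= min_span:
            raise  # the range cannot be split any further
        mid = start + (end - start) // 2
        # `end` is exclusive, so the two halves never overlap.
        return (pull_range(fetch, start, mid, min_span)
                + pull_range(fetch, mid, end, min_span))


def fake_fetch(start, end):
    """Stand-in for a real request: fails for ranges longer than 15 seconds."""
    if end - start > 15:
        raise ConnectionError("too many messages in response")
    return list(range(start, end))  # one fake record per second


records = pull_range(fake_fetch, 0, 60)
print(len(records))
```

Splitting is safe because a response for a given range does not change once it is available, so the halves together contain the same records as the original range.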
Other API Parameters
When count is provided, the response will contain up to count results. Since results are not sorted, you are likely to get different data for repeated requests.
If "fields" is not specified, the default field set is returned. This default field set may change at any time. The list of all fields is available at:
Fields are passed to the request as a comma-separated list. For example, to request "ClientIP" and "RayID", use fields=ClientIP,RayID.
The order in which fields are specified doesn't matter, and the order of fields in the response is not specified. If you need fields added to the list of available fields, file a DATA ticket and tag it with "els:fields". See DS-4481 for an example.
Response data is returned as JSON, one JSON object (one log message) per line. Sample log message with the default field set:
"ClientRequestUserAgent": "Mozilla/5.0 (Windows NT 10.0; rv:54.0) Gecko/20100101 Firefox/54.0",
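Since each response line is a standalone JSON object, a response can be decoded line by line. The sketch below is one way to do that, with a helper for comparing two pulls of the same range regardless of record order. The function names and the sample values are invented for illustration; only the field names ClientIP and RayID come from this document.

```python
import json


def parse_log_lines(lines):
    """Decode one JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def same_records(first, second):
    """Compare two pulls while ignoring record order and key order."""
    def canonical(record):
        return json.dumps(record, sort_keys=True)
    return sorted(map(canonical, first)) == sorted(map(canonical, second))


# Invented sample lines; real responses contain the fields you requested.
sample = [
    '{"ClientIP": "203.0.113.7", "RayID": "ray-a"}',
    '',
    '{"RayID": "ray-b", "ClientIP": "198.51.100.9"}',
]

records = list(parse_log_lines(sample))
print(len(records))
print(same_records(records, list(reversed(records))))
```

Comparing order-insensitively matters because a repeated request for the same range returns the same records, but not necessarily in the same order.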