Cloudflare’s Enterprise customers have access to the Enterprise Log Share service, a RESTful API that allows our customers to consume request logs over HTTP. The new "logs/received" REST API route exposes data by time received (the time the event was written to disk in our log aggregation system). This is in contrast to our “logs/requests” API documented here, which orders logs by request time. Ordering by log aggregation time instead of log generation time results in lower (faster) log share pipeline latency and deterministic/idempotent log pulls.
Functionally, using "logs/received" is similar to tailing a log file, or reading from rsyslog (albeit in chunks).
This means that if you want to obtain logs for a given time range, you can do so by issuing one call for each consecutive minute (or other time range). Because log lines are batched by time received and made available, there is no “late arriving data”. A response for a given minute will never change. You do not have to repeatedly poll a given time range to receive logs as they converge on our aggregation system.
Recommended Access Pattern
Pick a time interval that works for your request volume. We suggest starting with one minute for most request volumes; try a few intervals, and if an interval returns too much data to handle easily, scale it down. Then make requests with that fixed interval (without setting 'count') for the whole time range in which you are interested. Once you have read an entire 200 response for a request, you do not need to hit that URL again.
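The access pattern above can be sketched as a small generator that splits a time range into consecutive fixed-size windows, one per API call. This is a minimal illustration, not an official client: the function name fixed_windows and its parameters are hypothetical, and it assumes timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone

RFC3339_UTC = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamp in UTC, second precision


def fixed_windows(start, end, step=timedelta(minutes=1)):
    """Yield (window_start, window_end) strings covering [start, end).

    Each window is start-inclusive and end-exclusive, so consecutive
    windows neither overlap nor leave gaps. Hypothetical helper.
    """
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start.astimezone(timezone.utc)
    end = end.astimezone(timezone.utc)
    while cursor < end:
        nxt = min(cursor + step, end)  # the last window may be shorter
        yield cursor.strftime(RFC3339_UTC), nxt.strftime(RFC3339_UTC)
        cursor = nxt


if __name__ == "__main__":
    t0 = datetime(2017, 7, 18, 22, 0, tzinfo=timezone.utc)
    for w_start, w_end in fixed_windows(t0, t0 + timedelta(minutes=3)):
        print(w_start, w_end)
```

Each yielded pair maps to one request; because a fully read 200 response for a window never changes, a window that has been read completely can be recorded as done and skipped on restart.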
cURL example from burritobot.com
- start: inclusive, either unix timestamp (which by definition is UTC) or time in RFC 3339 format (which specifies the time zone); must be at least 5 minutes in the past.
- end: exclusive, rest same as start.
- count: return up to that many records; negative means no limit.
- sample: float between 0.0 and 1.0 specifying what fraction of requests should be returned (0.1 means 10%); default is 1.0.
- fields: comma separated list of fields to return; when empty uses default list.
curl -s -H'X-Auth-Email: MONKEY' -H'X-Auth-Key: BANANA' 'https://api.cloudflare.com/client/v4/zones/4e6d50a41172bca54f222576aec3fc2b/logs/received?start=2017-07-18T22:00:00Z&end=2017-07-18T22:01:00Z&count=1&fields=Timestamp,ClientIP'
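For scripted pulls, the same request URL can be assembled programmatically. The sketch below only builds the URL and the two authentication headers shown in the curl command; it does not send anything. The function name build_received_url and the AUTH_HEADERS dictionary are hypothetical, and the credentials are the same placeholders as above.

```python
from urllib.parse import urlencode

API_BASE = "https://api.cloudflare.com/client/v4"

# Placeholder credentials, copied from the curl example above.
AUTH_HEADERS = {"X-Auth-Email": "MONKEY", "X-Auth-Key": "BANANA"}


def build_received_url(zone_id, start, end, count=None, sample=None, fields=None):
    """Build a logs/received URL from the documented query parameters.

    Hypothetical helper: optional parameters are omitted when not given,
    so the API defaults apply.
    """
    params = {"start": start, "end": end}
    if count is not None:
        params["count"] = count
    if sample is not None:
        params["sample"] = sample
    if fields:
        params["fields"] = ",".join(fields)
    # Keep ':' and ',' unescaped so the URL reads like the curl example.
    query = urlencode(params, safe=":,")
    return f"{API_BASE}/zones/{zone_id}/logs/received?{query}"


if __name__ == "__main__":
    print(build_received_url(
        "4e6d50a41172bca54f222576aec3fc2b",
        "2017-07-18T22:00:00Z",
        "2017-07-18T22:01:00Z",
        count=1,
        fields=["Timestamp", "ClientIP"],
    ))
```

Called with the values from the curl example, this produces the same URL as the command above.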
Expectations
Once a 200 response is received and read completely for a given zone and time range, the following will be true for all subsequent requests:
- The number and content of returned records will be the same.
- The order of returned records may be (and is likely to be) different.
- Requests will fail when:
  - There are more than 5 requests to the log share endpoint for a given zone currently in progress.
  - There is more than 1 GB of uncompressed data in the response. Because responses are streamed, there is no way to determine response size ahead of time. After 1 GB of data is streamed, the response will fail with a terminated connection. This allows users to tune their queries to a range that will not result in failures based on their request volume. We believe this is a better solution than hard-limiting requests to an arbitrary time range (e.g. 1 minute at a time).
- Response fields:
  - When set explicitly in the request URL, the response fields will never change.
  - When not set explicitly, default fields will be used; default fields may change at any time.
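Because the record order may differ between pulls, and the order of fields within a record is not specified either, two responses for the same window are best compared as unordered collections of parsed records. A minimal sketch, assuming each response body is available as a string with one JSON object per line; the helper names and the sample values are hypothetical.

```python
import json
from collections import Counter


def canonical_records(body):
    """Return a multiset of records, ignoring line order and field order."""
    return Counter(
        json.dumps(json.loads(line), sort_keys=True)  # normalise field order
        for line in body.splitlines()
        if line.strip()  # skip blank lines
    )


def same_records(first_body, second_body):
    """True when both bodies hold the same records the same number of times."""
    return canonical_records(first_body) == canonical_records(second_body)


if __name__ == "__main__":
    # Made-up records for illustration only.
    pull_1 = '{"ClientIP": "192.0.2.1", "RayID": "aaa"}\n{"ClientIP": "192.0.2.2", "RayID": "bbb"}\n'
    pull_2 = '{"RayID": "bbb", "ClientIP": "192.0.2.2"}\n{"RayID": "aaa", "ClientIP": "192.0.2.1"}\n'
    print(same_records(pull_1, pull_2))
```

A Counter is used rather than a set so that genuinely duplicated records are still counted.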
API Parameters In Depth
When count is provided, the response will contain up to count results. Since results are not sorted, you are likely to get different data for repeated requests.
When ?sample= is provided, a sample of matching records is returned. If sample=0.1, 10% of records will be returned. Sampling is random: repeated calls will not only return different records, but likely will also vary slightly in number of returned records.
When ?count= is also specified, count caps the number of records returned after sampling; it does not limit the pool of records that is sampled. So, with sample=0.05 and count=7, when there is a total of 100 records available, approximately 5 will be returned. When there are 1000 records, 7 will be returned. When there are 10,000 records, 7 will be returned.
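The arithmetic in the examples above can be written down as a small function. It is only an illustration of the described behaviour, not the service's implementation: sampling is random, so the real number of returned records varies around this estimate. The function name expected_returned is hypothetical.

```python
def expected_returned(total, sample=1.0, count=-1):
    """Estimate how many records come back for a given sample and count.

    Sampling is applied first; count then caps the result.
    A negative count means no limit.
    """
    sampled = round(total * sample)  # expected size of the random sample
    if count < 0:
        return sampled
    return min(sampled, count)


if __name__ == "__main__":
    for total in (100, 1000, 10_000):
        print(total, expected_returned(total, sample=0.05, count=7))
```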
If "fields" is not specified, the default field set is returned. This default field set may change at any time. The list of all fields is available at:
Fields are passed to the request as a comma-separated list. So, to have "ClientIP" and "RayID", use: ?fields=ClientIP,RayID
The order in which fields are specified doesn't matter, and the order of fields in the response is not specified. If you need fields added to the list of available fields, file a DATA ticket and tag it with "els:fields". See DS-4481 for an example.
Data is timestamped by the time it was written to disk at our log aggregation point.
Response data is returned as JSON, one JSON object (one log message) per line. Sample log message with default field set:
"ClientRequestUserAgent": "Mozilla/5.0 (Windows NT 10.0; rv:54.0) Gecko/20100101 Firefox/54.0",