Log File Analysis

With ContentKing’s Log File Analysis feature, you can easily understand how search engines traverse the content on your website: discover when Google or Bing last visited each page, and compare visit frequencies between pages.

What is Log File Analysis

A log file is a text file containing records of all the requests a server has received, from both humans and crawlers, and its responses to the requesters.

Through log file analysis, SEOs aim to get a better understanding of what search engines are actually doing on their websites, in order to improve their SEO performance.
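To make the idea of a log record concrete, here is a minimal sketch that parses one line into its parts. It assumes the widely used "combined" access log format; the sample line, the `LINE_RE` pattern and the `parse_line` helper are illustrative assumptions, and your log source may emit a different format.

```python
import re
from datetime import datetime

# Hypothetical record in the "combined" access log format (an assumption;
# each log source has its own format and field names).
SAMPLE = (
    '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /products/shoes HTTP/1.1" 200 5123 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

# One named group per field we care about.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict describing one request, or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record


record = parse_line(SAMPLE)
print(record["path"], record["status"], record["user_agent"])
```

Each parsed record carries who made the request (IP address and user agent), what was requested (path), and how the server responded (status), which is the raw material for the analysis described here.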

Log Sources

Currently, you can feed your logs to ContentKing using a Cloudflare Worker, Cloudflare Logpush, Akamai DataStream, CloudFront Standard logging, or Fastly Real-Time Log Streaming. In the near future, we will be adding support for custom log sources.

Traffic that ContentKing detects and tracks

ContentKing detects and tracks visits of the following search engines to your websites:

  • Google Desktop
  • Google Mobile
  • Bing Desktop
  • Bing Mobile

Setting up the Log File Analysis

The process of configuring Log File Analysis differs depending on the log source used to feed logs to ContentKing. Refer to the relevant support article for your website's log source:

  1. Setting up the Cloudflare Worker integration
  2. Setting up the Cloudflare Logpush integration
  3. Setting up the Akamai integration
  4. Setting up CloudFront Standard logging
  5. Setting up Fastly Real-Time Log Streaming

Working with the Log File Analysis data in ContentKing

Using the Log File Analysis data in Pages

Each search engine has two columns on the Pages screen, Last Visited and Visit Frequency, which you can use for filtering:

Screenshot of the pages screen in ContentKing showing search engine visit columns

By filtering on these columns, you can easily create segments, which are very useful for spot-checking your least visited pages.
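As a rough illustration of what such columns summarize, the sketch below derives a last-visited timestamp and an average visits-per-day figure for each page and search engine from a list of visits. This is not ContentKing's implementation: the `summarize` helper, the sample data and the definition of frequency as visits per day over a fixed window are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

UTC = timezone.utc

# Hypothetical visits: (URL, search engine, time of visit).
visits = [
    ("/", "Google Desktop", datetime(2023, 10, 1, 8, 0, tzinfo=UTC)),
    ("/", "Google Desktop", datetime(2023, 10, 9, 8, 0, tzinfo=UTC)),
    ("/", "Google Desktop", datetime(2023, 10, 10, 8, 0, tzinfo=UTC)),
    ("/old-page", "Google Desktop", datetime(2023, 9, 1, 8, 0, tzinfo=UTC)),
]


def summarize(visits, now, window_days=10):
    """Map (url, engine) to its last visit and average visits per day in the window."""
    window_start = now - timedelta(days=window_days)
    last_visited = {}
    counts = defaultdict(int)
    for url, engine, seen_at in visits:
        key = (url, engine)
        if key not in last_visited or seen_at > last_visited[key]:
            last_visited[key] = seen_at
        if window_start <= seen_at <= now:
            counts[key] += 1
    return {
        key: {
            "last_visited": last_visited[key],
            "visits_per_day": counts[key] / window_days,
        }
        for key in last_visited
    }


summary = summarize(visits, now=datetime(2023, 10, 11, tzinfo=UTC))

# The "least visited" segment is simply the low end of the frequency ordering.
least_visited = min(summary, key=lambda key: summary[key]["visits_per_day"])
print(least_visited)
```

Sorting or filtering on either value gives the kind of segment described above, for example pages that a search engine has not visited within the window at all.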

On the Page Detail screen, you can zoom in on how often Google and Bing are visiting a specific page. A graph helps you identify trends in the search engine traffic on that page:

Screenshot of the page detail screen in ContentKing showing search engine activity for a page

And in the Tracked Changes tab of the Page Detail screen, you will be able to see whether search engines saw the latest important change that you’ve made.

In combination with our Real-Time IndexNow feature you can see if Bing visited the page that was submitted to IndexNow:

Screenshot of the tracked changes screen in ContentKing showing search engine activity tracked historically

Log File Analysis data in the Platform section

If you have made changes to the robots.txt file or the XML sitemaps and want to check whether search engines have visited them since, we’ve got you covered!

Just head to the Platform section in ContentKing and easily check if the search engines visited the robots.txt:

Screenshot of the robots.txt in ContentKing showing search engine activity tracked historically

Or the XML sitemaps:

Screenshot of the XML sitemaps in ContentKing showing search engine activity tracked historically

Disabling Log File Analysis

To learn how to disable Log File Analysis, please refer to the relevant support article for your website’s log source:

1. How do we ensure that log ingestion is anonymous and that no sensitive info or PII is being passed?

Cloudflare Worker

At the point of log generation in Cloudflare Workers, no PII is generated, so the logs are free of PII from their inception. We achieve this by generating the logs automatically through a script that tracks only search engine visits.

Once the Cloudflare Worker is installed, the code is also observable/auditable in real-time in Cloudflare Workers by your team, although minified.

Akamai DataStream, CloudFront Standard Logging, Fastly Real-Time Log Streaming, Cloudflare Logpush

Akamai DataStream, CloudFront Standard logging, Fastly Real-Time Log Streaming, and Cloudflare Logpush send logs to an AWS S3 bucket that ContentKing provisions when you enable the Log File Analysis feature. To ensure that only search engine traffic is sent to ContentKing, two levels of filtering are in place.

  • The first level filters out all non-search engine traffic from the AWS S3 bucket, so only non-PII (non-personally identifiable information) data is sent to ContentKing servers. This is done by matching the traffic against the search engines’ IP addresses.

  • To err on the side of caution, a second level of filtering is implemented on ContentKing servers, where ContentKing again checks, based on IP addresses, that only search engine traffic is being processed. Any other data is filtered out.
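The two filtering levels can be pictured with the following minimal sketch, which applies the same IP-based check twice: once where the logs land and once more before processing. The network ranges are reserved documentation ranges used as placeholders, and the function names are hypothetical; real search engine ranges are published by the search engines themselves and change over time.

```python
import ipaddress

# Placeholder networks (reserved documentation ranges), standing in for the
# published IP ranges of the supported search engines. These are assumptions.
SEARCH_ENGINE_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_search_engine(ip):
    """Return True only if the address falls inside a known crawler network."""
    try:
        address = ipaddress.ip_address(ip)
    except ValueError:
        return False  # malformed addresses are never treated as crawlers
    return any(address in network for network in SEARCH_ENGINE_NETWORKS)


def keep_search_engine_traffic(records):
    """Drop every record whose client IP is not a search engine address."""
    return [record for record in records if is_search_engine(record["ip"])]


bucket_records = [
    {"ip": "192.0.2.10", "path": "/"},        # crawler (placeholder range)
    {"ip": "203.0.113.5", "path": "/cart"},   # human visitor, must be dropped
    {"ip": "not-an-ip", "path": "/broken"},   # malformed, must be dropped
]

# Level 1: filter at the bucket, before anything is forwarded.
forwarded = keep_search_engine_traffic(bucket_records)

# Level 2: the receiving side repeats the check rather than trusting level 1.
processed = keep_search_engine_traffic(forwarded)
print(processed)
```

Repeating the check on the receiving side means a misconfiguration at the first level still cannot let visitor traffic through to processing.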

Furthermore, in your Akamai and Fastly accounts, it is possible to configure Akamai DataStream and Fastly Real-Time Log Streaming to send only search engine traffic to the AWS S3 bucket. We require doing so; you can learn how to filter out non-search engine traffic here:

2. How is the Cloudflare Worker invoked?

The Cloudflare Worker acts as middleware between the visitor (search engine) and your website. It intercepts only requests made by selected search engine bots and sends a message to the ContentKing backend. This is done in a non-blocking way and cannot affect the response the visitor receives.

To connect ContentKing to Cloudflare, you will need to create an API token in your Cloudflare account. Using this token, ContentKing will automatically install its Cloudflare Worker for this website without any additional configuration required from your side.

Afterwards, the Cloudflare Worker will continuously detect and track visits of search engines to this website. The Cloudflare API token is deleted from our systems immediately after the Cloudflare Worker is installed.

3. At what execution interval does the Cloudflare Worker run?

It runs on every HTTP request made to your website, but it opts out of doing anything further as soon as it determines that the visitor is not one of the supported search engines.

ContentKing never tracks visits of actual visitors to your websites, and this data never gets to our systems.
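The early opt-out described above can be sketched as follows. This is only an illustration of the decision logic, written in Python rather than as actual Worker code, and it is not the ContentKing Worker: detecting crawlers by user-agent substring, the `classify_visitor` and `on_request` names, and the in-memory `outbox` queue standing in for the message sent to the backend are all assumptions.

```python
import queue

# Substrings assumed to identify the supported crawlers in the User-Agent header.
SUPPORTED_BOT_TOKENS = {"googlebot": "Google", "bingbot": "Bing"}

# Stands in for the message sent to the backend; enqueueing does not wait on anything.
outbox = queue.Queue()


def classify_visitor(user_agent):
    """Return e.g. 'Google Mobile' for a supported crawler, or None otherwise."""
    ua = user_agent.lower()
    for token, engine in SUPPORTED_BOT_TOKENS.items():
        if token in ua:
            device = "Mobile" if "mobile" in ua else "Desktop"
            return f"{engine} {device}"
    return None


def on_request(user_agent, url):
    """Runs for every request; records nothing unless the visitor is a supported crawler."""
    engine = classify_visitor(user_agent)
    if engine is None:
        return  # opt out immediately: nothing about this visitor is recorded
    outbox.put_nowait({"engine": engine, "url": url})


GOOGLEBOT_DESKTOP = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BINGBOT_MOBILE = (
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/100.0.0.0 Mobile Safari/537.36 "
    "(compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
)
HUMAN_BROWSER = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0.0.0 Safari/537.36"

on_request(GOOGLEBOT_DESKTOP, "/pricing")
on_request(HUMAN_BROWSER, "/checkout")
on_request(BINGBOT_MOBILE, "/blog")
print(outbox.qsize())
```

Only the two crawler requests produce a message; the browser request returns before anything is recorded, and in no case does the handler alter the response the site returns.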

Need help?

In case you have any questions regarding the Log File Analysis that are not covered by our documentation, don’t hesitate to contact us!