As you’re probably aware, there is a wealth of information and benefit to be had from paying attention to and scrutinising your log files.
That said, it’s important that the information they provide is accurate, which means you must watch for the subtle errors and mistakes that are easy to overlook.
Ensure you have the correct port…
Making sure that you have the right port is integral to getting the information you need.
It might already be present in your logs, but there is a chance that the only entries available come from an internal layer of your infrastructure, such as a cache server or load balancer.
If you can, check that the recorded port matches the one actually used by visitors to your website.
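As a quick sanity check, you can verify the port field directly. Below is a minimal Python sketch, assuming your log format appends the local port as the last field of each line; the field position and sample lines are illustrative, so adjust them to your own format:

```python
# Assumption: the access log format writes the local port as the final
# space-separated field (e.g. an Apache format ending in "... %>s %b %p").
PUBLIC_PORTS = {"80", "443"}

def extract_port(log_line: str) -> str:
    """Return the port field, assumed to be the last field of the line."""
    return log_line.rsplit(" ", 1)[-1]

def is_public_facing(log_line: str) -> bool:
    """True if the logged port is one a real visitor would actually use."""
    return extract_port(log_line) in PUBLIC_PORTS

# Illustrative sample lines: a public-facing hit and an internal-layer hit.
line = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /a.html HTTP/1.1" 200 512 443'
internal = '10.0.0.5 - - [10/Oct/2023:13:55:36 +0000] "GET /a.html HTTP/1.1" 200 512 8080'
```

If `is_public_facing` comes back false for most of your traffic, your logs are probably being written by an internal layer rather than the edge your visitors reach.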
…And the right IP address
Much in the same way, the IP address in your logs could also be incorrect.
Using a layered infrastructure as an example, the returned IP address could be from your cache server, and not from a user making a request.
That said, the IP address is the only element that can distinguish a genuine Googlebot crawl from that of a competitor, or of another tool that is browsing your site while falsifying its identity.
This is another reason that you need to ensure that the information that you are reviewing is accurate.
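One way to put an accurate IP address to work is the verification method Google itself documents: a reverse DNS lookup on the logged IP, followed by a forward lookup to confirm the hostname resolves back to the same address. A Python sketch (the sample hostnames in the usage below are illustrative):

```python
import socket

# Hostnames Google uses for its crawlers, per its verification documentation.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def hostname_is_google(hostname: str) -> bool:
    """Pure check: does a reverse-DNS hostname belong to Google's domains?"""
    return hostname.rstrip(".").endswith(GOOGLE_SUFFIXES)

def verify_googlebot(ip: str) -> bool:
    """Reverse DNS on the IP, then forward-confirm the hostname.
    Requires network access, so call it sparingly and cache the results."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except socket.herror:
        return False
    if not hostname_is_google(hostname):
        return False
    try:
        return socket.gethostbyname(hostname) == ip
    except socket.gaierror:
        return False
```

The forward confirmation matters: a spoofer can fake a user agent, and can even control the reverse DNS of their own IP range, but they cannot make Google's DNS point back at their address.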
Inspect the host/vhost
Sometimes a single server hosts several websites, and each one typically writes its own log files.
You should be able to locate the ones you need quite easily by looking for the site name, or that of the directory which stores them.
Sometimes, however, a server configuration writes the logs from several sites into a single file, which is not ideal.
If that happens, add a field so that you can identify which website each log line belongs to.
Another alternative is to log the absolute URL in the path field instead of the relative URL (which is what you usually find there).
This tactic also gives you a way to identify the protocol, besides using the port or any other dedicated field.
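If you do add a host field, splitting the shared file back out per site is straightforward. Here is a minimal Python sketch, assuming the virtual host name was written as the first field of every line (as Apache’s `%v` token would do); the sample lines are illustrative:

```python
from collections import defaultdict

def split_by_vhost(log_lines):
    """Group log lines by site, assuming the vhost name is the first
    space-separated field of every line."""
    per_site = defaultdict(list)
    for line in log_lines:
        vhost, _, rest = line.partition(" ")
        per_site[vhost].append(rest)
    return dict(per_site)

# Illustrative sample: two sites sharing one log file.
lines = [
    'www.example.com 203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512',
    'shop.example.com 198.51.100.2 - - [10/Oct/2023:13:55:37 +0000] "GET /cart HTTP/1.1" 200 256',
    'www.example.com 203.0.113.8 - - [10/Oct/2023:13:55:38 +0000] "GET /a.html HTTP/1.1" 404 128',
]
```

Each site’s lines can then be analysed separately, as if they had been written to their own file all along.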
Check your server time
This almost goes without saying, but like many things that go without saying, checking that your server is set to the correct time zone is something that can be easily overlooked.
As you’re probably aware, every log line includes time and date information, which allows for the recording of events.
If the server time is out by even an hour, logged events will not match incident times, which makes errors hard to trace.
Check the server time configuration with your host.
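Since every log line carries a timestamp with a UTC offset, you can also check the offset programmatically. A small Python sketch, assuming the common log timestamp format (the sample timestamps are illustrative):

```python
from datetime import datetime

def parse_log_time(stamp: str) -> datetime:
    """Parse a common-log-format timestamp such as '10/Oct/2023:13:55:36 +0200'."""
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")

def utc_offset_hours(stamp: str) -> float:
    """Return the UTC offset of a log timestamp, in hours."""
    return parse_log_time(stamp).utcoffset().total_seconds() / 3600
```

Compare the returned offset against the time zone you expect for your server; a mismatch here is exactly the kind of one-hour drift that makes incidents hard to reconstruct.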
When migrating from HTTP to HTTPS
Log files are an important area to consider if you’re migrating to HTTPS, which is something that every site should do, as the protocol is now considered a cornerstone of security for many, including Google and other search engines.
When moving across to HTTPS, checking through log files is a great way to ensure that a migration takes place without hiccup or hindrance.
Remember to set up a robust log file implementation, which you can combine with a segmentation tool that can crawl your website.
By doing this, you’ll be able to monitor the inclusion of redirections and the progressive transfer of crawl budgets.
It’s worth remembering, however, that the original log format does not differentiate HTTP from HTTPS.
This is because it lacks an element that identifies the targeted protocol.
Put simply, such an element could be the port (80 for HTTP, 443 for HTTPS), the scheme, or the SSL/TLS protocol version (e.g. TLSv1.2).
This means that you can see two status codes for a single URL.
When Googlebot visits an HTTP page that is properly 301-redirected to its HTTPS equivalent, there will be two entries for /a.html: the first with a 301, and the second with the final status code.
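To monitor that pattern at scale, you can tally status codes per port. A minimal Python sketch, assuming you have already extracted (port, path, status) tuples from your log lines; the extraction step and sample data are illustrative:

```python
from collections import Counter

def migration_progress(entries):
    """Tally (port, status) pairs from pre-parsed log entries, so you can
    watch 301s on port 80 give way to 200s on port 443."""
    return Counter((port, status) for port, _path, status in entries)

# Illustrative sample: Googlebot hits each HTTP page, then follows the
# 301 redirect to its HTTPS equivalent.
entries = [
    ("80", "/a.html", 301),
    ("443", "/a.html", 200),
    ("80", "/b.html", 301),
    ("443", "/b.html", 200),
]
```

As the migration progresses, the count of 301s on port 80 should shrink while 200s on port 443 grow, reflecting the progressive transfer of crawl budget to the HTTPS site.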
Before any migration to HTTPS, ensure that your log files have all the correct information you need so that you can clearly monitor the process.
Like many aspects of running a website, your logs will rarely be ready and waiting for you without a good bit of preparation and setup; you need to keep an eye on things and ensure that everything is ordered and in the right place.
If you modify any writing rules within your logs, however, it’s important to remember that the modification will not be retroactive.
This means that you need to optimise file formats as soon as possible so that you will have an efficient and useful log analysis which you can use as part of an intelligent technical SEO strategy.
If you have any questions about log file analysis, or other issues that you have encountered with your log files, feel free to check out my contact page, or alternatively, email me at [email protected]