An XML Sitemap can be useful for optimising your site with Google, particularly if you make use of their Webmaster Tools. Jekyll doesn’t come with one out-of-the-box, but it is easy to add one. There’s probably a plugin out there which will automate things, but I just used a normal Jekyll-generated file for mine, based on code found on Robert Birnie’s site.
The only modification I made was to exclude feed.xml
from the sitemap. Because this is auto-generated by a plugin I couldn’t add any front-matter to a file to exclude it in the same way as other files.
Create a file called sitemap.xml in the root of your site, and paste the following code into it:
---
layout: null
sitemap:
exclude: 'yes'
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{% for post in site.posts %}
{% unless post.published == false %}
<url>
<loc>{{ site.url }}{{ post.url }}</loc>
{% if post.sitemap.lastmod %}
<lastmod>{{ post.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
{% elsif post.date %}
<lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
{% else %}
<lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
{% endif %}
{% if post.sitemap.changefreq %}
<changefreq>{{ post.sitemap.changefreq }}</changefreq>
{% else %}
<changefreq>monthly</changefreq>
{% endif %}
{% if post.sitemap.priority %}
<priority>{{ post.sitemap.priority }}</priority>
{% else %}
<priority>0.5</priority>
{% endif %}
</url>
{% endunless %}
{% endfor %}
{% for page in site.pages %}
{% unless page.sitemap.exclude == "yes" or page.url=="/feed.xml" %}
<url>
<loc>{{ site.url }}{{ page.url | remove: "index.html" }}</loc>
{% if page.sitemap.lastmod %}
<lastmod>{{ page.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
{% elsif page.date %}
<lastmod>{{ page.date | date_to_xmlschema }}</lastmod>
{% else %}
<lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
{% endif %}
{% if page.sitemap.changefreq %}
<changefreq>{{ page.sitemap.changefreq }}</changefreq>
{% else %}
<changefreq>monthly</changefreq>
{% endif %}
{% if page.sitemap.priority %}
<priority>{{ page.sitemap.priority }}</priority>
{% else %}
<priority>0.3</priority>
{% endif %}
</url>
{% endunless %}
{% endfor %}
</urlset>
If you want fine-control over what appears in the sitemap, you can use any of the following front-matter variables.
sitemap:
lastmod: 2014-01-23
priority: 0.7
changefreq: 'monthly'
exclude: 'yes'
As an example, I use this in my feed.json
template to exclude the generated file from the sitemap:
sitemap:
exclude: 'yes'
And this in my index/archive pages for a daily change frequency:
sitemap:
changefreq: 'daily'
Chris McLeod mentioned this article on mrkapowski.com.