How To Generate A Dynamic Sitemap For SEO Using Python Flask
- By Bryan Bailey
A proper sitemap is a key part of your website's SEO strategy, and SEO is vital in bringing traffic (and thus revenue) to the website. A sitemap is a text file that Google (among other search engine providers) uses to crawl and index the website so that its content can be added and ranked within the results. There are two main types of sitemaps: XML and HTML. XML sitemaps are unique in that they are never seen by end users; they exist only to tell a search engine which pages are on a website, along with other information such as how often the content is updated and how important the pages are relative to each other. We're going to build an XML sitemap.
When using the Flask framework, outside of a poorly documented module, there is no easy way of creating a sitemap. The only documentation on the matter is a snippet from 2013 that uses Python 2.x. While the code is greatly outdated, the one thing it got right was that we can simply use a Flask route to generate a sitemap of not only all of the static pages but the dynamic pages as well.
First you'll need to import datetime and timedelta from the datetime module. We will use these to get our modified times. We will also need to import current_app, make_response, render_template and url_for from the flask module (url_for is used later to build the blog post URLs). If you have any tables you want to generate dynamic URLs for, import them from your application's models file. In this case that would be from app.models.
from datetime import datetime, timedelta
from flask import current_app, make_response, render_template, url_for
from app.models import Post
Next, we need to set up the route and create a sitemap function. This will create the route /sitemap.xml and only accept the GET request method.
@app.route('/sitemap.xml', methods=['GET'])
def sitemap():
In our new route we'll define two variables right away: an empty list we'll call pages, and a variable that holds the date from 10 days before the moment the route is called by Google's crawlers; let's name it ten_days_ago.
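    pages = []
    # datetime.now must be called; the result is kept as an ISO date string for the sitemap
    ten_days_ago = (datetime.now() - timedelta(days=10)).date().isoformat()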
Now let's grab our static routes: We want to grab each rule from the current_app's url_map using its iter_rules function. Then we'll check all of those rules for the following parameters:
- Does the methods list contain a GET request?
- Is the length of arguments of the rule 0?
- Does the rule avoid any route prefix you don't want to expose to Google (such as an admin area)?
If the rule meets all of the requested criteria, we can concatenate the string of the rule to the end of https://yourdomain.com into a url and add it with the ten_days_ago variable to a list that we'll append to pages. Voila! All static pages.
    for rule in current_app.url_map.iter_rules():
        if 'GET' in rule.methods and len(rule.arguments) == 0 and not rule.rule.startswith('/admin'):
            pages.append(['https://yourdomain.com' + rule.rule, ten_days_ago])
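To see which rules survive the three checks, here is a minimal, self-contained sketch. The routes, the static_urls helper and its parameters are made up for illustration and are not part of the original code; only url_map.iter_rules(), rule.methods, rule.arguments and rule.rule come from Flask/Werkzeug.

```python
from flask import Flask

app = Flask(__name__)

# Hypothetical routes, only here to exercise the filter.
@app.route('/')
def index():
    return 'home'

@app.route('/about')
def about():
    return 'about'

@app.route('/blog/<slug>')
def blog_post(slug):
    return slug

@app.route('/admin/dashboard')
def admin_dashboard():
    return 'admin'

@app.route('/contact', methods=['POST'])
def contact():
    return 'sent'

def static_urls(flask_app, base_url='https://yourdomain.com', excluded_prefix='/admin'):
    """Return sorted absolute URLs for GET routes that take no arguments."""
    urls = []
    for rule in flask_app.url_map.iter_rules():
        if ('GET' in rule.methods
                and len(rule.arguments) == 0
                and not rule.rule.startswith(excluded_prefix)):
            urls.append(base_url + rule.rule)
    return sorted(urls)

urls = static_urls(app)
print(urls)
```

In this sketch the blog route and Flask's built-in static file route are skipped because they take arguments, the contact route because it only accepts POST, and the dashboard because of its /admin prefix.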
So how, then, do we get dynamic pages like blog content? The setup is similar to grabbing the static rules, only this time we iterate over the table that holds the content behind your pages. In this case we start by creating a variable called posts that holds every row of the Post table, ordered by the post timestamp.
    posts = Post.query.order_by(Post.timestamp).all()
Then, for each post in posts, we create a url variable by concatenating your domain name with the result of url_for for your blog route, passing in the variable the route needs (here, the slug). Create a variable called modified_time to grab the post timestamp, and make sure the date is in ISO format by calling isoformat. A poorly formatted timestamp will immediately make the site crawl fail in Google Search Console. Now we put the two newly created variables in a list that we append to pages. We should now have all of the necessary pages from our site.
    for post in posts:
        url = "https://yourdomain.com" + url_for('blog_post', slug=post.slug)
        modified_time = post.timestamp.date().isoformat()
        pages.append([url, modified_time])
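The loop can be tried without a database. The sketch below swaps the SQLAlchemy model for a hypothetical Post dataclass with two made-up posts, and assumes a blog_post route that takes a slug; it only illustrates what ends up in pages.

```python
from dataclasses import dataclass
from datetime import datetime

from flask import Flask, url_for

app = Flask(__name__)

@app.route('/blog/<slug>')
def blog_post(slug):
    return slug

@dataclass
class Post:
    """Hypothetical stand-in for the database model."""
    slug: str
    timestamp: datetime

# Made-up sample data, already ordered by timestamp.
posts = [
    Post('hello-world', datetime(2020, 1, 15, 9, 30)),
    Post('second-post', datetime(2020, 2, 1, 18, 0)),
]

pages = []
# url_for needs an application or request context. Inside the real
# sitemap route one already exists; here a test context supplies it.
with app.test_request_context():
    for post in posts:
        url = 'https://yourdomain.com' + url_for('blog_post', slug=post.slug)
        modified_time = post.timestamp.date().isoformat()  # e.g. '2020-01-15'
        pages.append([url, modified_time])

print(pages)
```

Calling date() before isoformat() drops the time of day, so each entry carries a plain YYYY-MM-DD date.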
Now we need to figure out how to actually create the file dynamically.
To start we'll need a template to use; we'll create a variable called sitemap_template and tell it to render the sitemap_template.xml file using the pages list we built earlier from Flask's URL map and the Post table.
Now we can turn that rendered template into a response variable and set its Content-Type header to application/xml. Return the response and you've got your sitemap! You can test it by going to Google Search Console or a free validation tool such as https://www.xml-sitemaps.com.
See the code here: flask-sitemapxml
    sitemap_template = render_template('sitemap_template.xml', pages=pages)
    response = make_response(sitemap_template)
    response.headers["Content-Type"] = "application/xml"
    return response
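The contents of sitemap_template.xml are not shown here, so the following is only a guess at what a minimal version could look like. To keep the sketch in one file it inlines an assumed template and renders it with render_template_string instead of render_template; the template body and the sample pages are assumptions, while the urlset namespace comes from the sitemaps.org protocol.

```python
from flask import Flask, make_response, render_template_string

# Assumed template body; in a real project this would live in
# templates/sitemap_template.xml and be rendered with render_template.
SITEMAP_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{%- for page in pages %}
  <url>
    <loc>{{ page[0] }}</loc>
    <lastmod>{{ page[1] }}</lastmod>
  </url>
{%- endfor %}
</urlset>
"""

app = Flask(__name__)

@app.route('/sitemap.xml', methods=['GET'])
def sitemap():
    # Made-up entries standing in for the list built from the URL map and posts.
    pages = [
        ['https://yourdomain.com/', '2020-01-05'],
        ['https://yourdomain.com/blog/hello-world', '2020-01-15'],
    ]
    sitemap_xml = render_template_string(SITEMAP_TEMPLATE, pages=pages)
    response = make_response(sitemap_xml)
    response.headers['Content-Type'] = 'application/xml'
    return response

# Flask's test client lets us request the route without starting a server.
with app.test_client() as client:
    result = client.get('/sitemap.xml')
    body = result.get_data(as_text=True)

print(body)
```

Each inner list becomes one url element, with the first item as loc and the second as lastmod, which is why the dates were converted to ISO strings before being appended.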