I am Hack Sparrow
Captain of the Internets.

XML Sitemap Generator in Node.js – sitemap.xml.gz

Generating XML sitemaps for a website can be done manually, through web apps, or by the website itself. If you are the technical kind, you probably want to use the last option and take control of sitemap generation on your site yourself. In fact, if your website has a hundred thousand pages and you want the sitemap generated dynamically, coding the generator yourself is the only practical option.

In reality your URLs might come from a database, but for this example we'll assume they come from an array. The following function generates a dynamic XML sitemap in Node.js.

function generate_xml_sitemap() {
    // The source of the URLs on your site. We use a simple array here; in practice it could come from a database.
    var urls = ['about.html', 'javascript.html', 'css.html', 'html5.html'];
    // the root of your website - the protocol and the domain name with a trailing slash
    var root_path = 'http://www.example.com/';
    // XML sitemap generation starts here
    var priority = 0.5;
    var freq = 'monthly';
    var xml = '<?xml version="1.0" encoding="UTF-8"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
    // Add one <url> entry per page
    for (var i = 0; i < urls.length; i++) {
        xml += '<url>';
        xml += '<loc>' + root_path + urls[i] + '</loc>';
        xml += '<changefreq>' + freq + '</changefreq>';
        xml += '<priority>' + priority + '</priority>';
        xml += '</url>';
    }
    xml += '</urlset>';
    return xml;
}
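
One thing the function above does not handle: the sitemap protocol requires URL values to be entity-escaped, so a URL with a query string containing & would produce invalid XML. Here is a minimal sketch of how that could be handled. Note that escape_xml is a hypothetical helper name and the search.html URL is made up for illustration; neither is part of the original function.

```javascript
// Hypothetical helper (an assumption, not part of the original function):
// escapes the five characters the sitemap protocol requires to be entity-escaped.
function escape_xml(value) {
    return String(value)
        .replace(/&/g, '&amp;')   // must run first so later entities are not double-escaped
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&apos;');
}

// Example URL invented for illustration
var loc = '<loc>' + escape_xml('http://www.example.com/search.html?q=node&page=2') + '</loc>';
console.log(loc);
// <loc>http://www.example.com/search.html?q=node&amp;page=2</loc>
```

In generate_xml_sitemap() you would wrap root_path + urls[i] in the helper before adding it to the loc element.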

Now that we have the XML sitemap generator function, let's see how we might serve it. Assuming we want the sitemap available at http://www.example.com/sitemap.xml, here is how you would set up the route (app is assumed to be an Express-style application object).

app.get('/sitemap.xml', function(req, res) {
    var sitemap = generate_xml_sitemap(); // get the dynamically generated XML sitemap
    res.header('Content-Type', 'text/xml');
    res.send(sitemap);
});

Alright, it's cool that we are able to generate an XML sitemap using Node.js, but is there a way to send it as sitemap.xml.gz?

Heck yes, there is a way! We will use the zlib module that ships with Node.js to send a compressed XML sitemap. Here is how to do it:

var zlib = require('zlib');
app.get('/sitemap.xml.gz', function(req, res) {
    // get the dynamically generated XML sitemap
    var sitemap = generate_xml_sitemap();
    // encode the XML as UTF-8 bytes, then compress it
    zlib.gzip(Buffer.from(sitemap, 'utf8'), function(error, data) {
        if (error) {
            res.status(500).send('Could not compress the sitemap');
            return;
        }
        // The body is already a gzip file, so describe it as one and offer it as a download.
        // Content-Encoding: gzip is left out on purpose: it tells clients to decompress
        // the body in transit, which would typically leave plain XML under a .gz name.
        res.header('Content-Type', 'application/x-gzip');
        res.header('Content-Disposition', 'attachment; filename="sitemap.xml.gz"');
        res.send(data);
    });
});
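
If you want to convince yourself that the compressed output is a valid gzip file, a quick round trip with zlib's synchronous functions works. This is only a sketch for checking things by hand; the empty urlset string below is a stand-in for the real output of generate_xml_sitemap().

```javascript
const zlib = require('zlib');

// Stand-in for the generated sitemap (an assumption for this check)
const xml = '<?xml version="1.0" encoding="UTF-8"?>' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></urlset>';

// Compress, then confirm the gzip magic bytes 0x1f 0x8b at the start
const compressed = zlib.gzipSync(xml);
console.log(compressed[0] === 0x1f && compressed[1] === 0x8b);

// Decompress and compare with the original string
const restored = zlib.gunzipSync(compressed).toString('utf8');
console.log(restored === xml);
```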

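Depending on the size and nature of your sitemap, you could further optimize the gzipped XML sitemap generation process using Node's fs and stream modules, for example by streaming the output instead of building it in memory, or by caching the compressed file on disk.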
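
As a rough idea of what the streaming approach might look like, here is a sketch that produces the sitemap piece by piece and pipes it through gzip. The names sitemapChunks and streamGzippedSitemap are made up for this example, and it assumes a Node.js version that has stream.pipeline and Readable.from. It leaves out changefreq and priority to stay short.

```javascript
const zlib = require('zlib');
const { Readable, pipeline } = require('stream');

// Hypothetical generator: yields the sitemap one piece at a time,
// so the whole document never has to sit in memory as a single string.
function* sitemapChunks(urls, rootPath) {
    yield '<?xml version="1.0" encoding="UTF-8"?>';
    yield '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
    for (const url of urls) {
        yield '<url><loc>' + rootPath + url + '</loc></url>';
    }
    yield '</urlset>';
}

// Hypothetical helper: pipes the chunks through gzip into any writable stream,
// such as an HTTP response or fs.createWriteStream('sitemap.xml.gz').
function streamGzippedSitemap(urls, rootPath, destination, done) {
    pipeline(
        Readable.from(sitemapChunks(urls, rootPath)),
        zlib.createGzip(),
        destination,
        done || function () {} // pipeline expects a callback as its last argument
    );
}
```

In a route you would set the headers first and then call streamGzippedSitemap(urls, root_path, res), since the response object is itself a writable stream.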

5 Responses to “XML Sitemap Generator in Node.js – sitemap.xml.gz”

  1. amit says:

    very nice article :)
    In the code I noticed
    var sitemap = generate_xml_sitemap();
    res.header('Content-Type', 'text/xml');
    res.send(sitemap);

    I think this is blocking... can't we pass a callback function to generate_xml_sitemap() that will handle the response?

  2. Captain says:

    Hi there amit,

    You can do it this way:


    // route
    app.get('/sitemap.xml', function(req, res) {
        generate_xml_sitemap(res);
    });

    function generate_xml_sitemap(res) {
        // do the XML string generation
        ...
        res.header('Content-Type', 'text/xml');
        res.send(xml);
    }

    Hope that helps.

    – Captain

  3. amit says:

    @captain: thanks :)

  4. TheDigitalNinja says:

    Great post!

  5. Sandeep says:

    Too good dude. :D
