Is there a way to track unique subscribers to my RSS feed?
RSS (Really Simple Syndication) is a content delivery mechanism that’s
growing in popularity. Internet publishers have been moving to supply content in
RSS feeds, thereby avoiding many delivery issues associated with spam.
Unfortunately, unlike with a mailing list, it’s not immediately obvious how many RSS
readers you have. And publishers care deeply about how many readers they have.
An RSS feed is nothing more than a file on your web server that your
readers’ RSS aggregators check periodically for updates. No one tells
you that they’re looking or how often they’re checking – they just check. The
problem is how to track the number of people doing so.
Since it’s a file on your web server the first thing that comes to mind is
to use the server logs. There you’ll see that your RSS feed’s file is being
accessed a certain number of times each day. Unfortunately this doesn’t really
help much – readers can instruct their aggregators to check for updates at
arbitrary intervals. Some may check once an hour and others once a day. So 24
entries in your log could mean that one person is checking once an hour – or 24
people are checking once a day or some other combination in between.
It is possible to use the logged IP address each time someone checks to
narrow down the possibilities. If you count accesses from each unique IP
address only once per day, you get a better “order of magnitude” estimate of unique
readers per day. Unfortunately there are still problems. The prevalence of
NAT routers and firewalls means that some number of machines will appear as the
same IP address. Conversely, machines behind a bank of proxy servers, such as is
common at AOL or at large corporations, can each appear as a different IP
address each time they access the internet across a different server.
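As a rough illustration of the once-per-IP-per-day count described above, here is a minimal Python sketch. It assumes the server writes Common Log Format lines and that the feed lives at a hypothetical path, /feed.xml; neither detail comes from the article, so adjust both to match your own server.

```python
import re
from collections import defaultdict

# Matches the start of a Common Log Format line: client IP, the day part of
# the bracketed timestamp, and the requested path. (Assumed log format.)
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "(?:GET|HEAD) (?P<path>\S+)'
)

def unique_ips_per_day(lines, feed_path="/feed.xml"):
    """Return {day: number of distinct IPs that requested feed_path}."""
    seen = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        # Ignore any query string when comparing against the feed path.
        if match and match.group("path").split("?")[0] == feed_path:
            seen[match.group("day")].add(match.group("ip"))
    return {day: len(ips) for day, ips in seen.items()}

# Made-up sample log lines using documentation-only IP addresses.
sample = [
    '192.0.2.1 - - [12/Mar/2005:01:00:00 -0800] "GET /feed.xml HTTP/1.1" 200 5120',
    '192.0.2.1 - - [12/Mar/2005:02:00:00 -0800] "GET /feed.xml HTTP/1.1" 304 0',
    '192.0.2.2 - - [12/Mar/2005:09:00:00 -0800] "GET /feed.xml HTTP/1.1" 200 5120',
    '192.0.2.1 - - [13/Mar/2005:01:00:00 -0800] "GET /feed.xml HTTP/1.1" 200 5120',
    '192.0.2.3 - - [13/Mar/2005:04:00:00 -0800] "GET /index.html HTTP/1.1" 200 900',
]

counts = unique_ips_per_day(sample)
print(counts)
```

The result is still only an estimate: readers sharing a NAT router collapse into one address, and a reader behind rotating proxies is counted more than once.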
What we really need is some way to uniquely identify each reader of an
RSS feed. In some ways what we’d really like is something akin to a cookie that
we could associate with each consumer. Unfortunately cookies themselves are
typically ignored by the aggregators that people use to read RSS feeds.
An approach I’m considering but have yet to try is to generate a
unique URL to the RSS feed for every visitor to the site. This requires
that the HTML containing the link to your feed (say, the XML button
and the link underneath it) be programmatically generated. More
on that in a second. The link would be of the form:
The “randomnumber” here isn’t really random. It’s algorithmically
generated but it is unique. By that I mean it will never be the
same value twice. Refresh the page and it will change.
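A minimal sketch of generating such a link on each page view, written in Python for illustration. The feed address https://example.com/feed.php and the helper names are hypothetical. The identifier here comes from uuid4, which is random rather than strictly algorithmic; that is a substitution made for brevity, relying on collisions being vanishingly unlikely. A database sequence or counter would match the "never the same value twice" description more literally.

```python
import uuid
from urllib.parse import urlencode

FEED_URL = "https://example.com/feed.php"  # hypothetical feed script location

def new_feed_link(base_url=FEED_URL):
    """Build a feed URL carrying a fresh identifier in the ID parameter."""
    subscriber_id = uuid.uuid4().hex  # 32 hex characters, new on every call
    return f"{base_url}?{urlencode({'ID': subscriber_id})}"

def feed_link_html(base_url=FEED_URL):
    """Render the anchor tag that the page would emit for this visitor."""
    return f'<a href="{new_feed_link(base_url)}">Subscribe via RSS</a>'

# Each call, like each page refresh, yields a different link.
print(feed_link_html())
print(feed_link_html())
```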
When a potential reader comes along and places that URI into his or
her RSS aggregator that “randomnumber” goes with it. Since no two
people will ever get the same number it now serves as a unique
identifier for this one reader. The aggregator will duly pass on the
full URI on each feed fetch including the ID=number which will then
show up in the web server logs or could even be processed by the
script that returns the feed.
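Once the ID travels with every fetch, counting subscribers reduces to counting distinct IDs in the log. The following Python sketch shows one way that analysis might look. It assumes the request line appears in quotes as in Common Log Format, that the feed is served from a hypothetical /feed.php, and that the parameter is named ID as in the text.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Pulls the request target out of a quoted request line such as
# "GET /feed.php?ID=abc HTTP/1.1". (Assumed log format.)
REQUEST = re.compile(r'"(?:GET|HEAD) (?P<target>\S+)')

def unique_subscriber_ids(lines, feed_path="/feed.php"):
    """Return the set of distinct ID values seen on requests for feed_path."""
    ids = set()
    for line in lines:
        match = REQUEST.search(line)
        if not match:
            continue
        target = urlsplit(match.group("target"))
        if target.path != feed_path:
            continue
        # Requests without an ID (for example, a hand-typed URL) are skipped.
        ids.update(parse_qs(target.query).get("ID", []))
    return ids

# Made-up sample log lines: two subscribers, one of whom checks twice.
sample = [
    '192.0.2.1 - - [12/Mar/2005:01:00:00 -0800] "GET /feed.php?ID=a1 HTTP/1.1" 200 5120',
    '198.51.100.7 - - [12/Mar/2005:02:00:00 -0800] "GET /feed.php?ID=a1 HTTP/1.1" 304 0',
    '192.0.2.1 - - [12/Mar/2005:03:00:00 -0800] "GET /feed.php?ID=b2 HTTP/1.1" 200 5120',
    '192.0.2.9 - - [12/Mar/2005:04:00:00 -0800] "GET /feed.php HTTP/1.1" 200 5120',
    '192.0.2.9 - - [12/Mar/2005:05:00:00 -0800] "GET /index.php?ID=zz HTTP/1.1" 200 900',
]

print(len(unique_subscriber_ids(sample)))
```

Note that in the sample the subscriber with ID a1 is counted once even though the two fetches come from different IP addresses, which is exactly the case that IP-based counting gets wrong.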
There are, of course, caveats.
First, the link to the feed must be programmatically generated. In
other words there’s script involved. If the page itself is ASP, PHP, or
similar, then there’s not too much of a problem. An alternative is to
Second, the link to the feed itself must allow the ?ID=number
parameter to be passed. In my example above I used a .php script to
generate the feed which could also easily be any type of CGI that either
generates or simply copies the feed back to the requestor. In
many cases it may also be benign to pass the “parameter” directly on
the .rdf or .xml file request and not use any server scripting at all.
In most cases the parameter can be completely ignored by the server
as long as it appears in the web server log for later analysis.
There are other tracking techniques out there, most notably ETags and
301 redirects, but both require some amount of cooperation
from the client’s aggregator that may or may not be there. I believe
the approach I’ve outlined here places the burden on the feed provider
and should work with just about any aggregator.
More Info: You can read a good summary of the counting
problem, including several possible solutions (including this one),
courtesy of Tim Bray here. Derek Scruggs comments on
that, and adds his own ideas
here. And you can read tons more about RSS