Personalized Feeds

As an investor in a number of companies that do stuff with RSS (NewsGator, FeedBurner, Technorati, and Judy’s Book) and a fan and active user of others (e.g., FeedBlitz, SixApart), I’ve been seeing a lot of “this sucks, that’s great, that sucks, this is great” blog posts lately, but rarely do I see anyone decompose what’s actually bad or great and explain why. Occasionally there’s some stuff from an end-user perspective (especially whenever Google rolls something out), but I’ve been surprised by the general lack of technical depth and public debate. Ok – maybe I’m reading the wrong feeds – but I’m trying.

While I’m a nerd, I’m on the investor side of the equation instead of the engineer side of the equation. As a result, I’m always looking for the analog of the thing I’m experiencing – “what – in the past – was like this thing that is now happening that can provide insight into what the future is going to be like?” I spend a lot of time thinking about this with regard to RSS (and blogs, user-generated content, online advertising, content organization, search, tools, platforms, trendy buzzwords to try to describe everything, and a preponderance of VC investors diving into an area just to get bets on the table.) I don’t pretend I necessarily have a clue technically (ok – I pretend, but I don’t have a clue) – but I know enough to be able to play around with things, look at what I think is going on under the hood, and make (at the minimum) provocative suggestions (often wrong, but at least provocative) about what I think is happening.

Personalized RSS feeds are one of the issues that hit me in the face recently. In the past few weeks, I’ve subscribed to a few RSS feeds that were personalized just for me. Specifically, when I subscribed, the URL that ended up in FeedDemon / NGOS (the aggregator that I use) had a unique identifier at the end. If I subscribed a second time (pretending I was a different person), I got a different unique identifier and ended up with two feeds. This is distinct from a feed that I’ve customized, such as a delicious tag feed, which is still a generic feed that presumably multiple people will subscribe to if they use the same parameters that I do.

Now – I believe that RSS feeds that are personalized for a particular subscriber’s preferences will become an important tool in the content syndication world, just as static HTML gave way to CGI, cookies appeared, or broadcast opt-in email (Dear Sir:) evolved into narrowcast (Dear Brad:). However, I think the early attempts at brute-force personalization, which assign unique feed URLs as a means of tracking subscribers, can cause several problems.

  1. Web-based aggregators aren’t going to put up with having 10,000 feeds in their database that are essentially the same feed. This places an undue burden of polling and synching on the aggregator, it’s inefficient, and of course, many of the aggregators will ultimately collapse these into a “single” feed. It’s fundamentally inefficient for the publisher for exactly the same reasons. A year ago, there was a lot of noise about “overpolling of RSS” (e.g., aggregators that polled every minute). Most aggregators have addressed this issue, but the personalized feed phenomenon could start this issue back up.
  2. This approach breaks OPML reading lists. If I’ve got a unique URL feed in my OPML, then when somebody imports my curated collection of feeds, they end up subscribing to a personalized feed, and now you’ve got multiple people subscribed to a personal feed. The stats are no longer accurate for the publisher and my OPML friend is now getting “Dear Brad:” stuff.
  3. Once anybody subscribes to the feed in a web-based aggregator like NewsGator Online or My Yahoo, other people searching for that topic will find one or more personal feeds and subscribe to them. Now you have N people subscribed to a personal feed, the publisher thinks all the subscribers are that one person, and you’ve lost an accurate count of the number of subscribers. In addition, the new subscribers get the original personalized feed, which may not be configured the way they want (or thought it was). Finally, in some cases, the search will turn up numerous feeds that cover the same topic, making it hard to determine which one to subscribe to.
  4. If you are the publisher and you eventually want to change the way you distribute feeds, it’s no longer a matter of redirecting one URL; you now have to go herd the countless subscribers to countless URLs out there in the wild.
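
One way an aggregator might contain the polling burden in item 1 is a lazy-polling heuristic, which also comes up in the comments: a feed with only a handful of subscribers is fetched only while one of them has an active session, and a feed that crosses some subscriber threshold is prefetched on the normal schedule. The sketch below is illustrative Python, not any aggregator's actual code; the `Feed` class, the `should_poll` function, the example URLs, and the threshold of 10 are all assumptions.

```python
from dataclasses import dataclass, field

PREFETCH_THRESHOLD = 10  # assumed tipping point; a real aggregator would tune this


@dataclass
class Feed:
    url: str
    subscribers: set = field(default_factory=set)


def should_poll(feed, active_users, threshold=PREFETCH_THRESHOLD):
    """Decide whether to fetch `feed` on this polling cycle."""
    if len(feed.subscribers) >= threshold:
        # Widely shared feed: prefetch on the regular schedule.
        return True
    # Lightly subscribed (probably personalized) feed:
    # poll only while at least one subscriber has a session open.
    return bool(feed.subscribers & active_users)


# Hypothetical feeds: one personalized URL, one generic URL.
personal = Feed("https://example.com/feed?id=abc123", {"brad"})
shared = Feed("https://example.com/feed", {f"user{i}" for i in range(25)})

print(should_poll(personal, active_users=set()))     # False
print(should_poll(personal, active_users={"brad"}))  # True
print(should_poll(shared, active_users=set()))       # True
```

This only reduces polling cost; it does nothing for the OPML, search, and redirect problems in items 2 through 4.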

Fundamentally, the approach that I’m starting to see appear produces a false sense of an accurate subscriber count (presumably one of the goals of personalization is to get an accurate subscriber count), doesn’t scale for the aggregators, lets the subscriber count quickly diverge from reality as people search for or share feeds, and makes it hard to redirect your subscribers correctly if you decide to do something different later.

There’s got to be a better approach.
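
One candidate, raised in the comments below, is to stop encoding the subscriber in the URL and instead rely on the subscriber counts that the big online aggregators report in their User-Agent header. A minimal sketch in Python, assuming the count appears as "N subscribers" somewhere in the header; the sample header strings, the `client_key` idea, and the function names are illustrative assumptions, not a documented format.

```python
import re

# Assumed convention: the header contains text like "120 subscribers".
SUBSCRIBERS_RE = re.compile(r"(\d+)\s+subscribers?", re.IGNORECASE)


def reported_subscribers(user_agent):
    """Return the count an aggregator reports, or 1 for a plain desktop client."""
    match = SUBSCRIBERS_RE.search(user_agent)
    return int(match.group(1)) if match else 1


def estimate_subscribers(requests):
    """Sum the most recent report from each client.

    `requests` is an iterable of (client_key, user_agent) pairs in
    chronological order. `client_key` is a hypothetical identifier:
    the aggregator name for online services, an IP address for desktop readers.
    """
    latest = {}
    for client_key, user_agent in requests:
        latest[client_key] = reported_subscribers(user_agent)
    return sum(latest.values())


# Made-up access log entries for a single, shared feed URL.
log = [
    ("bloglines", "Bloglines/3.1 (http://www.bloglines.com; 120 subscribers)"),
    ("newsgator", "NewsGatorOnline/2.0 (http://www.newsgator.com; 45 subscribers)"),
    ("203.0.113.7", "FeedDemon/2.0"),
    ("bloglines", "Bloglines/3.1 (http://www.bloglines.com; 122 subscribers)"),
]

print(estimate_subscribers(log))  # 168
```

The count is only as good as what each aggregator chooses to report, but the publisher keeps a single URL to redirect later.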

  • At least as far as counting subscribers goes, most of the big online aggregators support subscription counts in the User-Agent header. I think it’s unlikely that people will wholesale expose all of their aggregator feeds via an OPML Reading List, especially as RSS becomes used more inside the enterprise and, as you say, for personalized content. You could imagine having public / private feed lists in your aggregator, for example. Server-based aggregators need to be smart about when they combine feeds together – i.e., only when the URL is the same and no authentication is required. The GMail ATOM Feeds are a good example of something that can’t be averaged together (and of something that I would never read in a server-based aggregator…)

  • Brad, as an employee of a company that cares about security, I immediately saw yet another problem with personalized feeds – you guessed it – security.

    Looking through your list of subscribed blogs, I found that I am able to read your private feed. Your key is in plain sight!

    Well, I cannot know for sure (maybe you intentionally made it public), but it is a very likely scenario that people will expose their private feeds using blogroll tools without proper care.

    It is a classic security failure: none of the parts involved is directly at fault, but their combination is vulnerable.

    delicious gave you a key, assuming you will keep it private. Newsgator did not recognize this as a private feed. Ultimately, the problem is the lack of an authentication mechanism in RSS. Not that it is technically complex (after all, RSS piggybacks on HTTP, and there are authentication mechanisms for HTTP); it is just that there is not much demand for feed security, for a number of reasons.

    It is also easy to imagine a scenario where some smart service builds a combined prioritized feed based on your blogroll and your actual reading preferences. Now, for my paranoid self a plain blogroll is already a big privacy breach; leaking one of those personalized feed URLs would be a total disaster for me!

    I am not sure if you want to let this comment through. I certainly would not – as a security researcher I should not believe in “security by obscurity” but my gut feeling is all for it.

    On the other hand, you are a blogger with a big audience and an investor in an RSS company, so maybe my humble comment could help raise awareness about security issues in feed aggregation.

  • Brad Feld

    Artem – you are right on the money with this issue. delicious isn’t using any real security on my for/bfeld feed – it’s simply putting private= and then the key. NewsGator actually handles a bunch of HTTP-based security – this is a case where the reader can’t anticipate what the service is going to do if the service is using non-standard security (how does an aggregator know that private= is the key? – how about using something standard.) I’m not terribly concerned about security on my for/bfeld feed – I also know that delicious knows they need to tighten up the secure stuff so they’ll get it at some point.

  • Brad,

    I think your concern is partially valid, but only partially. Take the case of a personalized feed of (basically an RSS feed of a personalized home page). Any aggregator can implement a heuristic to the effect that if there is only one subscriber to a feed, the aggregator does lazy polling on the feed; that is, the aggregator only polls when the user logs in or starts a new session, and then only keeps polling while the user maintains a session.

    If 10,000 RSS subscribers all have their personalized URL in Bloglines, Bloglines uses its lazy loading heuristic. If people start to share their personalized feeds, Bloglines determines a lazy loading vs. prefetch tipping point (say 10 subscribers to a given unique feed).

    In terms of the security concern: here’s my personalized My EarthLink feed. It has my local weather, some stock tickers I follow, my horoscope, and some other random stuff. Fairly boring to anyone but me. Not enough personally identifying information to do much damage, and no way to back out from that URL to a username and password for the actual personalized portal site.

    As for your point about messing with tracking — I totally agree. Providers shouldn’t use a unique feed ID to assist in tracking unique subscribers. There are much better ways for that.


  • The Spanning Salesforce feeds are personalized based on the user-ID supplied by the client in response to an HTTP Basic Authentication challenge. This moves the unique identifier out of the URL, addressing Brad’s issues. Combine this with HTTPS encryption and you have a secure feed.

    IMHO, this is the Right Way to do personalization (since aggregators don’t support cookies). The UI on the client end could use some work, but between NGOnline, NGOutlook, NGES, FeedDemon, and NNW, a single company could get a lot done on that front.
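
A rough sketch of the server side of that idea, in Python with only the standard library: one shared feed URL, with the content chosen from the HTTP Basic Authentication credentials rather than from a token in the URL. The `USERS` table, the `respond` function, and the feed contents are hypothetical, and plaintext passwords are used only to keep the example short; this is not how Spanning or any other real service is known to implement it.

```python
import base64
from http.server import BaseHTTPRequestHandler

# Hypothetical user table; a real service would store password hashes.
USERS = {
    "brad": {"password": "s3cret", "topics": ["rss", "venture"]},
    "artem": {"password": "hunter2", "topics": ["security"]},
}


def parse_basic_auth(header):
    """Return (user, password) from an Authorization header, or None."""
    if not header or not header.startswith("Basic "):
        return None
    try:
        raw = base64.b64decode(header[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None  # not valid base64 or not valid UTF-8
    user, sep, password = decoded.partition(":")
    return (user, password) if sep else None


def render_feed(user):
    """Build a tiny RSS document from the user's stored topics."""
    items = "".join(
        f"<item><title>{topic} for {user}</title></item>"
        for topic in USERS[user]["topics"]
    )
    return (
        '<?xml version="1.0"?><rss version="2.0"><channel>'
        f"<title>Feed for {user}</title>{items}</channel></rss>"
    )


def respond(auth_header):
    """Return (status, headers, body) for a GET of the single shared feed URL."""
    creds = parse_basic_auth(auth_header)
    if creds is None or creds[0] not in USERS or USERS[creds[0]]["password"] != creds[1]:
        # Challenge the client; every subscriber uses the same URL.
        return 401, {"WWW-Authenticate": 'Basic realm="feeds"'}, ""
    return 200, {"Content-Type": "application/rss+xml"}, render_feed(creds[0])


class FeedHandler(BaseHTTPRequestHandler):
    """Thin adapter so `respond` can be served by http.server.HTTPServer."""

    def do_GET(self):
        status, headers, body = respond(self.headers.get("Authorization"))
        payload = body.encode("utf-8")
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

Because Basic credentials are only base64-encoded, this is only reasonable over HTTPS, as the comment above notes. The URL an OPML file or blogroll would expose is the same for everyone, so sharing it leaks nothing personal.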
