Michaël Rigart

Parsing RSS feeds in Ruby

One of the great benefits of open source is that, for many of the problems you encounter, someone has probably already solved them for you.

For a project I needed a simple way to parse an RSS feed. If you look around, you’ll find plenty of libraries that are designed to tackle this problem.

But a closer look revealed that most of these libraries pull in a lot of external dependencies, and most of the features they offer weren’t needed. Heck, I just want to parse a feed.

Ruby standard library

What might come as a surprise is that Ruby comes with its own RSS library.

I haven’t looked at all the features, but it seems pretty complete, offering parsing for RSS, Atom and even iTunes channel XML.

You can parse and iterate a feed like this:

require 'rss'

# The second argument (false) turns off feed validation.
rss = RSS::Parser.parse('http://www.michaelrigart.be/en/blog.rss', false)
rss.items.each do |item|
  puts "#{item.pubDate} - #{item.title}"
end
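
The parser also accepts the XML itself as a string, which is handy for trying things out without a network connection. The sketch below is only an illustration: the sample feed and the variable names are made up, not taken from a real site. Note that on Ruby 3.0 and later `rss` ships as a bundled gem, so a project managed with Bundler may need `gem 'rss'` in its Gemfile.

```ruby
require 'rss'

# A tiny hand-written RSS 2.0 document (hypothetical content, for illustration only).
sample_rss = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example blog</title>
      <link>http://example.com/</link>
      <description>Sample feed</description>
      <item>
        <title>First post</title>
        <link>http://example.com/first</link>
        <pubDate>Mon, 06 Jan 2014 10:00:00 +0000</pubDate>
      </item>
      <item>
        <title>Second post</title>
        <link>http://example.com/second</link>
        <pubDate>Tue, 07 Jan 2014 10:00:00 +0000</pubDate>
      </item>
    </channel>
  </rss>
XML

# Passing a string that contains XML parses it directly; false skips validation.
feed = RSS::Parser.parse(sample_rss, false)

# Channel-level data lives on feed.channel, entries on feed.items.
channel_title = feed.channel.title
item_titles   = feed.items.map(&:title)

puts channel_title
feed.items.each do |item|
  # pubDate is parsed into a Time object, so it can be formatted freely.
  puts "#{item.pubDate.strftime('%Y-%m-%d')} - #{item.title}"
end
```

Working from a string like this also makes it easy to cover your feed handling in a unit test.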

Keep in mind that the available data can differ between an RSS and an Atom feed. If you don’t know which type you are parsing, check feed_type:

require 'rss'

feed = RSS::Parser.parse('http://www.someatomfeed.org/blog.atom', false)

# RSS titles are plain strings; Atom titles are objects that expose #content.
case feed.feed_type
when 'rss'
  feed.items.each { |item| puts item.title }
when 'atom'
  feed.items.each { |item| puts item.title.content }
end
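
If you need the same data from both feed types in several places, one option is to hide the difference behind a small helper. This is a minimal sketch, assuming only titles are needed; `entry_titles` and both sample documents are hypothetical names and data invented for the example.

```ruby
require 'rss'

# Hypothetical helper: returns the entry titles as plain strings,
# whichever of the two feed types was parsed.
def entry_titles(feed)
  case feed.feed_type
  when 'rss'
    feed.items.map(&:title)
  when 'atom'
    feed.items.map { |entry| entry.title.content }
  else
    raise ArgumentError, "unsupported feed type: #{feed.feed_type}"
  end
end

# Made-up sample documents, one per feed type.
rss_xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example RSS feed</title>
      <link>http://example.com/</link>
      <description>Sample feed</description>
      <item>
        <title>RSS item</title>
        <link>http://example.com/rss-item</link>
      </item>
    </channel>
  </rss>
XML

atom_xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <feed xmlns="http://www.w3.org/2005/Atom">
    <title>Example Atom feed</title>
    <id>urn:example:feed</id>
    <updated>2014-01-06T10:00:00Z</updated>
    <author><name>Jane Doe</name></author>
    <entry>
      <title>Atom entry</title>
      <id>urn:example:entry-1</id>
      <updated>2014-01-06T10:00:00Z</updated>
    </entry>
  </feed>
XML

rss_titles  = entry_titles(RSS::Parser.parse(rss_xml, false))
atom_titles = entry_titles(RSS::Parser.parse(atom_xml, false))

puts rss_titles
puts atom_titles
```

Callers then work with plain strings and no longer need to know which format a feed uses.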

The above is just a very basic example, but you get the idea. So why use a third-party library when you can do the same with functionality that Ruby already provides?