Tagged: rss

Degooglifying (Part II): Feed Reader

This post is part of a series in which I am detailing my move away from centralized, proprietary network services. Previous posts in this series: email.

Next to email, replacing Google Reader as my feed reader was relatively easy, though I’ve chosen to use the move as an opportunity to clean out my feed subscriptions, rather than doing a straight export/import. I’ve replaced Google Reader with two free software feed readers: Liferea (desktop) and Tiny Tiny RSS (web).

A reading list can be very personal, and it can also be very misleading out of context. For example, my reading list suggests all sorts of things about my religious and political views, about the communities to which I may be connected, etc. It would take some analysis, though, to figure out why I subscribe to any particular feed. Is the author’s view one I espouse and whole-heartedly hold as my own? One I find interesting, challenging, or thought-provoking? Or one I utterly disagree with yet want to learn more about?

There is something private about a complete reading list, much like the books you might check out from the library or the videos you might rent from a store. As we get more of this content through the internet, it’s easy for these lists (and even more behavioural data about how we interact with them) to be compiled in large, centralized, proprietary databases, alongside all sorts of other personal information that would not be available to a traditional Blockbuster or public library. Besides the software freedom issues, this is another revealing personal dataset that I can claim more control over by exercising software freedom, rather than dumping it into a big centralized, proprietary database. Both software freedom and privacy issues are at play here.

Desktop Client: Liferea

Liferea is a desktop feed reader for GNU/Linux. Google Reader was my first feed reader, so a desktop feed reader was a bit of an adjustment, but there are a few things I really like about it:

  • Native application: It integrates well with my desktop, with something like Ubuntu’s Messaging Menu, and it’s a client that feels somewhat familiar in GNOME.
  • Control over update frequency: One of the things that bugged me about Google Reader is that it constantly checks for new content, whether or not you want it to. Sometimes, I don’t want to see anything new until tomorrow. It’s nice to be able to click update, read, and then let it be until I choose to update again. (The downside, though, is missing material if you don’t update often enough.)
  • Integration with Google Reader / Tiny Tiny RSS: This is a killer feature. You can use Liferea to read feeds through the Google Reader API, and recent versions have added support for a tt-rss backend as well. This helped with my transition because I could use Liferea as a front-end for Google Reader before I was prepared to migrate my feeds, to test it out, to ease the transition, etc. And, I will be able to use Liferea and tt-rss together to have both desktop and web-based clients.
  • Embedded Web Browser: This is also a killer feature. Websites that don’t have full-text feeds and only offer a content snippet are annoying in Google Reader, because you have to leave Reader to see the full content. But, in Liferea, you can tell it to automatically load content for a feed using the embedded web browser instead of just viewing the snippet, or hit enter on any feed entry to load the URL using the embedded browser. It even has basic tabbed browsing support, so you don’t have to flip back and forth between your web browser and your feed reader. This makes reading content from non-full-text feeds easy without leaving Liferea.
  • Integrated Comments: Liferea can detect comment feeds on many blogs, and it shows a handful of comments underneath entries. Combine this with a quick enter key to visit the web page with the embedded browser, and you no longer have to leave the feed reader to participate in the comments. This is a nice step up from the usual isolation of a feed reader from comment threads.
  • Authentication support for protected feeds: This is a useful feature for subscribing to protected content, such as an updates feed on an internal wiki.

I tested Liferea as a Google Reader front end, then migrated subscriptions group by group (giving me a chance to re-organize, though I could have just used an OPML export/import), and once I upgrade to Liferea 1.8, I’ll connect it to tt-rss.
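For anyone who does go the OPML route, it can help to look at what is in the export before importing it anywhere. The following is only a minimal sketch, assuming a typical OPML file where folders are `outline` elements that contain feed `outline` elements carrying an `xmlUrl` attribute. The sample document, the feed URLs, and the `feeds_by_group` helper are all made up for illustration; this is not how Liferea or Google Reader handle OPML internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample export; real exports will have many more entries.
SAMPLE_OPML = """<opml version="1.0">
  <head><title>Example subscriptions</title></head>
  <body>
    <outline text="Must read" title="Must read">
      <outline type="rss" text="Example Blog" xmlUrl="http://example.com/feed.xml"/>
    </outline>
    <outline type="rss" text="Loose Feed" xmlUrl="http://example.org/rss"/>
  </body>
</opml>
"""


def feeds_by_group(opml_text):
    """Return {group_name: [(title, feed_url), ...]} from an OPML document."""
    root = ET.fromstring(opml_text)
    groups = {}

    def walk(node, group):
        for outline in node.findall("outline"):
            url = outline.get("xmlUrl")
            if url:
                # An outline with xmlUrl is treated as a feed subscription.
                title = outline.get("title") or outline.get("text") or url
                groups.setdefault(group, []).append((title, url))
            else:
                # Anything else is treated as a folder; recurse into it.
                name = outline.get("title") or outline.get("text") or group
                walk(outline, name)

    walk(root.find("body"), "Ungrouped")
    return groups


if __name__ == "__main__":
    for group, feeds in feeds_by_group(SAMPLE_OPML).items():
        print(group)
        for title, url in feeds:
            print("   ", title, url)
```

Listing feeds by group like this makes it easier to decide which groups to carry over and which to drop, which is roughly what a group-by-group migration does by hand.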

Other Desktop Clients: RSSOwl is a free software, cross-platform (Windows, Mac OS X, GNU/Linux) feed reader, which also has Google Reader integration. I have only tried this briefly, so that I could recommend it to Windows users.

Web Client: Tiny Tiny RSS

Tiny Tiny RSS is a web-based feed reader, similar to Google Reader, but free software that you can run on your own web server. There are some feeds I read all the time, and others I’ll skim or catch up on when I have a chance. For the must-read feeds, it makes a huge difference to be able to read them from my mobile computer. With Google Reader, I used grr, and there is a mobile web interface. I migrated my must-read feeds to tt-rss instead of Liferea so that I’d have easy access to them while away from my laptop, while still having the ability to use Liferea when on my laptop with its tt-rss integration. I’m moving more and more feeds into tt-rss, though I plan to leave some less frequently updated, less important feeds or feeds that are difficult to read from my mobile in Liferea only.

Some cool features:

  • Publish articles to shared feed: Google Reader had a shared articles RSS feed, and I’d piped that into blaise.ca. tt-rss has a similar RSS feed, which I’ve also been able to include on my website.
  • Mobile web interface: tt-rss has a mobile web interface for webkit browsers powered by iUI. With Macuco on my N900 or the Android web browser, it works quite well — though, only for full-text feeds.
  • Filters: With tt-rss, you can create filters on feeds. So, for example, I am automatically publishing articles from the Techdirt feed that I’ve written, or I can auto-delete posts for a particular series or author that I’m not interested in to custom tailor a feed to my interests. It’s very useful for automating certain actions or reducing noise on a high-traffic feed.
  • Custom CSS: I suppose you could customize Google Reader’s styles with a GreaseMonkey script or something, but tt-rss offers custom CSS overrides and multiple themes out of the box, which is great for setting some more readable default colours.
  • API: tt-rss has an API, which allows for Liferea integration, an Android client, etc.
  • Authentication support for protected feeds: Like Liferea, tt-rss provides support for feeds requiring authentication.
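To make the filter idea above a bit more concrete, here is a rough sketch of how a "match a field, then publish or delete" rule could work. This is not tt-rss's actual implementation or data model; the `Article` and `FilterRule` classes, the field names, and the two actions are assumptions chosen only to illustrate the concept.

```python
import re
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    author: str
    published: bool = False


@dataclass
class FilterRule:
    field_name: str  # which Article attribute to test: "title" or "author"
    pattern: str     # regular expression, matched case-insensitively
    action: str      # hypothetical actions: "publish" or "delete"


def apply_filters(articles, rules):
    """Return the articles that survive the rules, marking published ones."""
    kept = []
    for article in articles:
        deleted = False
        for rule in rules:
            value = getattr(article, rule.field_name)
            if re.search(rule.pattern, value, re.IGNORECASE):
                if rule.action == "delete":
                    deleted = True
                    break  # no point evaluating further rules
                if rule.action == "publish":
                    article.published = True
        if not deleted:
            kept.append(article)
    return kept


if __name__ == "__main__":
    rules = [
        FilterRule("author", r"^Blaise", "publish"),
        FilterRule("title", r"^Weekly Roundup", "delete"),
    ]
    articles = [
        Article("On Feed Readers", "Blaise"),
        Article("Weekly Roundup #12", "Someone Else"),
        Article("Other News", "Someone Else"),
    ]
    for a in apply_filters(articles, rules):
        print(a.title, "(published)" if a.published else "")
```

The same shape covers both uses mentioned above: auto-publishing posts by one author, and trimming a series out of a high-traffic feed.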

As with Liferea, tt-rss gives me control over how frequently updates run, since I schedule the update job. But that control also comes without the downside of missing content if I’m away from my feed reader for a while; unlike a desktop client that needs to be open to retrieve new content, tt-rss does so in the background from the server, so it can still track new entries while I’m away. It has the benefits of Google Reader’s persistent background updates, while still giving me control over frequency and scheduling. I have the update job set to run a few specific times through the day, and tt-rss gives you the option to set an even longer update interval for any given feed.
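The scheduling logic described above boils down to a simple check each time the update job runs: has enough time passed since this feed was last fetched? A minimal sketch of that check follows, assuming a global default interval that individual feeds can override with a longer one. The six-hour default, the dictionary keys, and the `feeds_due` helper are hypothetical and are not taken from tt-rss itself.

```python
from datetime import datetime, timedelta

# Assumed default; the real value is whatever the server is configured with.
DEFAULT_INTERVAL = timedelta(hours=6)


def feeds_due(feeds, now, default_interval=DEFAULT_INTERVAL):
    """Return the URLs of feeds whose update interval has elapsed.

    Each feed is a dict with 'url', 'last_updated' (datetime or None),
    and an optional per-feed 'interval' (timedelta).
    """
    due = []
    for feed in feeds:
        interval = feed.get("interval") or default_interval
        last = feed.get("last_updated")
        # Never-fetched feeds are always due.
        if last is None or now - last >= interval:
            due.append(feed["url"])
    return due


if __name__ == "__main__":
    now = datetime(2011, 1, 1, 12, 0)
    feeds = [
        {"url": "http://example.com/a", "last_updated": datetime(2011, 1, 1, 5, 0)},
        {"url": "http://example.com/b", "last_updated": datetime(2011, 1, 1, 5, 0),
         "interval": timedelta(hours=24)},
        {"url": "http://example.com/c", "last_updated": None},
    ]
    print(feeds_due(feeds, now))
```

Because the check runs on the server, nothing is missed while the reader is closed; the interval only controls how often each feed is polled.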

Although I initially set out to migrate from Google Reader to Liferea, Tiny Tiny RSS is quickly becoming my primary feed reader, and Liferea will become my primary desktop client for tt-rss and the home for less frequent, less important, or non-full-text feeds.

Other Web Clients: NewsBlur is another web-based, free software feed reader, which is based on a more modern web stack and seems to have some neat features. I have yet to try it, and I’m not sure of the state of its mobile or API/desktop integration, which are two things I really like in tt-rss. It’s worth taking a look at though for sure. NewsBlur.com has a hosted service, if you aren’t able to run your own web server or don’t have a friend who’s running one.

Conclusion

My migration away from Google Reader is essentially complete. I have fewer than a dozen feeds remaining there, mostly old or broken ones. I no longer log into Google Reader to read anything, though I’ve got one more round of cleaning to do to empty my account. I’m currently split between Liferea and tt-rss, but with Liferea 1.8, I’ll be able to integrate the two. I also have other libre options to explore with NewsBlur and RSSOwl.

There is nothing that I miss about Google Reader. If anything, with an embedded browser, native desktop options, integrated comments, control over update scheduling, feed filters, and authentication support for protected feeds, I have a lot of useful features now that I didn’t have with Google’s proprietary service, never mind more software freedom and less surveillance.

Creative Commons Attribution-ShareAlike 4.0 International

Cleaning up HTML entities in MySpace blog RSS feeds (or how to eliminate squidginess)

I recently set up a Facebook musician page for Robyn Dell’Unto. We ran into one really annoying problem importing her blog posts from her MySpace blog. As Robyn described it,

my only issue with the notes is that they go all squidgy when there’s punctuation in the title. which, frankly, embarrasses me! I’m really embarrassed by squidgy punctuation!

By “squidgy,” she meant that the HTML entities were not displaying properly. Titles from imported posts displayed the raw entity code in place of the punctuation, something like this: “I&#8217;m doing stuff I swear.”

Ugh.

First, I thought it was a problem with Facebook Notes, but upon inspecting the MySpace RSS feed, I found that (aside from being woefully invalid — iTunes?) MySpace seems to have no freaking clue how to handle HTML entities properly. It’s no secret that I’m not a fan of MySpace. Why would I expect a valid feed? *sigh*

There were two really annoying things that MySpace was doing (aside from the whole iTunes thing):

  1. They double encode entities. Sure, it’s necessary that they turn each & into &amp; in links, but not in text that they’ve already encoded!! This leads to the entity-code “squidgies” in the titles (something like &#8217; showing up where a ’ belongs).
  2. There are a bunch of unicode characters that they don’t encode. For all the double encoding, other characters which ought to be encoded are missed entirely.

On top of that, I discovered that Facebook won’t display any of the unicode characters (I think?) even when they are represented by the proper HTML entities. They just display the entity code, causing the “squidgies” (something like &#8217; in place of ’).

Now, I’m no expert on character encoding and HTML entities, but I can do better than that. I’ve hacked together some PHP code to clean up the feed a bit before importing to Facebook, which has solved all of our problems so far. I realize I’m only addressing a limited subset of unicode character entities, but it’s working for our purposes for now.

View the code.

It’s nowhere near perfect, but it’s a definite improvement and it works so far. Hopefully this can be of assistance to someone else. Suggestions welcome!
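The PHP itself is not reproduced here, so as a rough illustration of the general approach, here is a minimal Python sketch. It is not a port of the linked code. It assumes three steps: undo the extra layer of encoding, decode the entities, then swap a small subset of typographic characters for ASCII stand-ins, since Facebook reportedly showed the entity codes instead of the characters. The `clean_title` helper and the `ASCII_FALLBACKS` table are hypothetical names, and the table is deliberately incomplete.

```python
import html
import re

# An ampersand that was escaped a second time in front of an entity,
# e.g. "&amp;#8217;", "&amp;#x2019;" or "&amp;rsquo;".
DOUBLE_ENCODED = re.compile(r"&amp;(#\d+|#x[0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")

# Assumed subset of typographic characters mapped to plain ASCII.
ASCII_FALLBACKS = {
    "\u2018": "'", "\u2019": "'",   # single curly quotes
    "\u201c": '"', "\u201d": '"',   # double curly quotes
    "\u2013": "-", "\u2014": "--",  # en and em dashes
    "\u2026": "...",                # ellipsis
}


def clean_title(text):
    """Repair a double-encoded title and fall back to ASCII punctuation."""
    # Step 1: "&amp;#8217;" -> "&#8217;"
    once = DOUBLE_ENCODED.sub(r"&\1;", text)
    # Step 2: "&#8217;" -> the actual character
    decoded = html.unescape(once)
    # Step 3: replace characters the target may not display
    return decoded.translate(str.maketrans(ASCII_FALLBACKS))


if __name__ == "__main__":
    print(clean_title("I&amp;#8217;m doing stuff I swear."))
```

One trade-off in this sketch: a title that intentionally contains a literal entity written out as text would also be decoded, so it only suits feeds known to double encode.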
