Blog - Unity Behind Diversity

Searching for beauty in the dissonance

Tagged: network services

Degooglifying (Part III): Web Search

This post is part of a series in which I am detailing my move away from centralized, proprietary network services. Previous posts in this series: email, feed reader.

Of all Google services, you’d think the hardest to replace would be search. Yet, although search is critical for navigating the web, the switching costs are low — no data portability issues, easy to use more than one search engine, etc. Unfortunately, there isn’t a straightforward libre web search solution ready yet, but switching away from Google to something that’s at least more privacy-friendly is easy to do now.

Quick Alternative: DuckDuckGo

In one sense, degooglifying search is easy: use DuckDuckGo. DuckDuckGo has a strong no-tracking approach to privacy. The !bang syntax is awesome (hello !wikipedia), the search results are decent (though I still often !g for more technical, targeted or convoluted searches), and it doesn’t have any search-plus-your-world nonsense or whatever walled garden stuff Google has been experimenting with lately. After just a few days, DuckDuckGo replaced Google as my default search engine, and my wife has since switched over as well.
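For anyone wiring this into a script or a browser keyword, here is a minimal sketch of how a !bang query can be turned into a search URL. It assumes the usual `https://duckduckgo.com/?q=...` query form; the function name `bang_search_url` is made up for illustration and is not part of any DuckDuckGo API.

```python
from urllib.parse import urlencode

# Assumption: DuckDuckGo accepts the search text in the "q" query parameter.
DDG_SEARCH_URL = "https://duckduckgo.com/"


def bang_search_url(query: str, bang: str = "") -> str:
    """Build a search URL, optionally prefixing the query with a !bang.

    `bang` may be given with or without the leading "!" (e.g. "g" or "!g").
    """
    bang = bang.strip().lstrip("!")
    full_query = f"!{bang} {query}" if bang else query
    # urlencode percent-encodes "!" and turns spaces into "+".
    return DDG_SEARCH_URL + "?" + urlencode({"q": full_query})
```

Calling `bang_search_url("peer-to-peer search", "wikipedia")` builds a URL whose query is `!wikipedia peer-to-peer search`, which is the same thing as typing that text into the search box.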

The switch from Google Search to DuckDuckGo is incredibly easy and well worth it. If you’re still using Google Search, give DuckDuckGo a try — you’ve got nothing to lose.

But… DuckDuckGo isn’t a final destination. Remember: the point of this exercise isn’t for me to “leave Google,” but to leave Google’s proprietary, centralized, walled gardens for free and autonomous alternatives. DuckDuckGo is a step towards autonomy, as web search sans tracking, but it is still centralized and proprietary.

Web Search Freedom

A libre search solution calls for a much bigger change — from proprietary to free, from centralized to distributed, from a giant database to a peer-to-peer network — not just a change in search engines, but a revolution in web search.

YaCy

Last summer, I ran a search engine out of my living room for a few months: YaCy — a cross-platform, free software, decentralized, peer-to-peer search engine. Rather than relying on a single centralized search provider, YaCy users can install the software on their own computers and connect to a network of other YaCy users to perform web searches. It’s a libre, non-tracking, censorship-resistant web search network. The problem was that it wasn’t stable or mature enough last summer to power my daily web searches. I intend to install it again soon, because as a peer-to-peer effort it needs users and usage in order to improve, but an intermediate step like DuckDuckGo is necessary in the meantime.

Although YaCy is designed to be installed on your own computer, there is a public web search portal available as a demo.

Seeks

Seeks is another interesting project that takes a different approach to web search freedom. Seeks is “an open, decentralized platform for collaborative search, filtering and content curation.” As far as I understand, Seeks doesn’t replace existing search engines, but it adds a distributed network layer on top of them, giving users more control over search queries and results. That is, Seeks is a P2P collaborative filter for web search rather than a P2P indexer like YaCy. Rather than replacing web indexing, Seeks is focused on the privacy, control, and trust surrounding search queries and results, even if it sits on top of proprietary search engines.

Seeks also has a public web search portal (and DuckDuckGo supports !seeks). As you can tell, its results are much better than YaCy’s, but Seeks is tackling a smaller problem and still relying on existing search engines to index the web.

Conclusion

DuckDuckGo, though proprietary and centralized, provides some major privacy advantages over Google and is ready to be used today — especially with Google just a !g away.

But web search freedom requires a revolution like that envisioned by YaCy or Seeks. Seeks seems like more of a practical, incremental and realistic solution, but it still depends on proprietary search. YaCy is more of a complete solution, but it’s not clear whether its vision is technically feasible.

I intend to experiment with both of these projects — p2p services need users to improve — and continue to watch this space for new developments.

Creative Commons Attribution-ShareAlike 4.0 International

Degooglifying (Part I): Email

I’ve begun to write about free (libre) network services, and the hazards of being a tenant on the web instead of a property owner. I began slowly moving away from Google in 2009, but I’ve accelerated that process since the launch of Google+. I thought I’d begin to share my process of degooglification.

To be clear, I still generally trust and respect Google, and I do believe they’re generally less evil than most, but…

  1. Despite great support for open source software, they remain a proprietary software company at their core. Google is a friend to open source infrastructure, but not to free (libre) network services. Specifically, it’s the proprietary network services I’m degoogling from.
  2. The sheer amount of data (email, contacts, documents, calendar, RSS feeds, social graph, phone calls, photos, GPS location, never mind web searches…) aggregated into one single account with a proprietary service provider is an obviously bad idea. Even if Google never intends to do anything bad with it, they can make mistakes. Even if Google never does anything bad itself, it’s a single vector for attack from an outsider. And it’s not your account.

Email is one of the easiest services from which to degooglify. It’s also a good example of a multi-step transition.

Changing the front-end

The first thing I did was to stop using the Gmail web interface. I configured my Gmail account in Thunderbird, which I was already using for other email accounts. Google’s commitment to data portability often makes it easy to switch your front-end software before switching the back-end, which can make a transition much smoother. Rather than cutting over cold turkey, you can ease into a new interface. My Gmail account is still active, but it rarely sees any important email anymore. I’ve transitioned 99% of my email to other accounts on domains I control (like this one).

Changing the back-end

Gradually, I started using my blaise.ca email addresses instead of my Gmail account, until eventually I wasn’t getting much email through Gmail anymore. With my Gmail account configured in Thunderbird, it was easy to archive the contents on my computer. You can access Gmail labels as IMAP folders and just copy email from one account to another, and Thunderbird will even offer to synchronize a local copy of your Gmail account. I never used Gmail contacts, but an export and import to Thunderbird would get your data out (more on contacts another time). Lastly, I’m still monitoring my Gmail account via Thunderbird, but I could set an auto-reply and/or forwarder if I really wanted to force that last 1% over. I will probably do that eventually.
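Thunderbird does the copying through its interface, but the same idea can be scripted. The following is only a rough sketch using Python's standard `imaplib`, not what I actually ran: `copy_folder` is a hypothetical helper, and it assumes both connections are already logged in and that the folder name needs no special quoting.

```python
import imaplib  # provides IMAP4_SSL, whose methods copy_folder() relies on


def copy_folder(src, dst, folder: str) -> int:
    """Copy every message in `folder` from one IMAP connection to another.

    `src` and `dst` are expected to behave like logged-in
    imaplib.IMAP4_SSL objects. Returns the number of messages copied.
    """
    status, _ = src.select(folder, readonly=True)  # never modify the source
    if status != "OK":
        raise RuntimeError(f"cannot open folder {folder!r} on the source")

    dst.create(folder)  # a "NO" reply just means the folder already exists

    status, data = src.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("search failed on the source")

    copied = 0
    for num in data[0].split():
        status, parts = src.fetch(num, "(RFC822)")
        if status != "OK":
            continue  # skip messages that cannot be fetched
        raw_message = parts[0][1]  # the full message as bytes
        dst.append(folder, None, None, raw_message)
        copied += 1
    return copied


# Hypothetical usage (hosts and credentials are placeholders):
#   src = imaplib.IMAP4_SSL("imap.gmail.com"); src.login(user, password)
#   dst = imaplib.IMAP4_SSL("mail.example.org"); dst.login(user2, password2)
#   copy_folder(src, dst, "INBOX")
```

This sketch drops flags and the original delivery dates, and it will happily create duplicates if run twice, so treat it as an illustration of the idea rather than a migration tool.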

Other Considerations

There are a few other perks of a Gmail account that are pretty easy to get from libre alternatives:

  • Hosted: Not everyone is going to run their own mail server, or have a friend or family member who does. But there are hosted, libre services, like riseup.net.
  • Storage space: In 2004, 1 GB of email was a huge game changer. Today, it’s not very hard to get that kind of storage space on a server for cheap.
  • Chat: Google uses the open standard XMPP for its chat service. I run my own XMPP server, and there are public Jabber services like jabber.org. I’ve simply added my Gmail contacts to my blaise@blaise.ca XMPP account. More on chat another time.
  • Conversations: The Conversations add-on provides Gmail-style conversations inside Thunderbird.
  • Spam filtering: Gmail has a good track record on spam filtering, but SpamAssassin, ClamAV and a greylisting policy can produce great results on your own server nowadays. I don’t get any more spam to my blaise.ca inbox than I do to my Gmail inbox.
  • Webmail: I love Thunderbird, but not everyone wants to use a desktop client, and you’re not always on your own computer. Roundcube is already a great free software webmail client, and it hasn’t even hit 1.0 yet. Many hosting providers already offer Roundcube to their customers.
  • Mobile: With IMAP, my email is easily accessible from and synchronized between Thunderbird, Roundcube, and my mobile computer’s IMAP client.

Email is probably the easiest thing to degooglify. It can be a smooth, gradual transition, and there are lots of good alternatives, as well as benefits from leaving Gmail. Over the next while, I’ll share my ongoing efforts to degooglify other aspects of my online life.

Explaining Distributed Social Networking Services

Via the FreedomBox Foundation, J David Eisenberg has created a great comic introduction to distributed social network services. Distributed systems are an important part of free network services.

Four Criteria for Free Network Services

I’m increasingly critical of network services — software that you use on someone else’s server to do your own computing. We rely on computers more and more for our work, social lives, civic engagement, health, education and leisure, and more and more that means relying on networking services rather than our own personal computers. There are serious trade-offs to living as a tenant online, rather than a property owner. I’ve been reconsidering the network services I use and rely on, especially in the shift to mobile computing.

The work of Autonomo.us has heavily influenced my thinking. Also of note is Stallman’s essay on software as a service (though he does more to identify the problems than recommend solutions). I essentially agree with the Franklin Street Statement from Autonomo.us. As a user of network services, I’ve narrowed it down to four major criteria to look for when deciding whether to trust a service on freedom and autonomy.

  1. Free (libre) software
  2. Control over data
  3. Privacy / Encryption
  4. Distributed Systems

Note: This is more of a working list than an attempt at a formal definition. For example, I’m not sure that #3 and #4 should be required, even though I believe they are important. Feedback is welcome.

1. Free (libre) software

Free (libre) or open source software licenses designed for network services, like the GNU AGPL, help guarantee the software will respect users’ freedoms. The arguments for software freedom have been addressed at length elsewhere, but the freedom to run the software yourself is particularly relevant here since, unlike desktop software, you often have the choice of letting someone else run the software for you. Even if you don’t run the software on your own server, having the freedom to do so ensures that you can still run the service in the event that the service provider shuts down, a frequent concern with proprietary web startups after acquisition or failure. And, even if you can’t run the software yourself, with all four freedoms, chances are someone else will.

Network services should respect users’ freedoms. LibreProjects.net has a good list of free web services and alternatives.

2. Control over data

If users want to leave a service provider, can they take their data with them? Open standards are important. Open standards allow other software to read and understand your data. Open standards also allow you to mix the software you use on the client and server or across multiple devices more easily. Not only does this make migration more realistic, but it makes transitions smoother.

Google’s network services aren’t often free (libre) software, but Google does have a strong commitment to open standards and making your data easily available. I’ve used many Google services from non-Google clients: Gmail from Thunderbird, Evolution and Modest; Google Calendar from Lightning, Evolution, and my N900; Google Reader from Liferea and grr; Google Talk from Empathy, Pidgin, and my N900, etc. I’ve been able to switch my client-side software before changing the back-end. This makes it possible to transition to new services gradually, in smaller steps, with less disruption.

Facebook has a download feature, but it’s slow, and it just chucks all of your data into a giant zip file rather than putting it into formats that other software or services could understand. Facebook has also actively blocked services that export your data to other providers. Your data is available for download, but not in a very useful way.

Migrations are not always planned. On your own server, you have the master key. With a service provider, if you lose access to your account because it’s cracked or cancelled suddenly, will you also lose access to your data? Or will you have an up-to-date copy locally? Open standards often help make it possible to keep a local copy up-to-date, but this isn’t always the default way we use these services. A synchronization service will typically maintain a complete local copy of your data, but services intended to be accessed through the web often require additional client-side setup on the user’s part to make this happen (e.g. using Thunderbird or OfflineIMAP to keep a local copy of your Gmail email, or using Google Sync to keep a local copy of your calendar and contacts). Or, the services may only offer data dumps as backup. Does a service let you keep a complete local copy of your data easily in your everyday usage? Even if you primarily use the web interface, setting up a desktop client for regular use can help maintain a local copy of your data without having to consciously download backups.

Lastly, public data that is intended to be shared should be available under a free and open licence. Identi.ca uses CC BY for public user data. Libre.fm focuses on freely licensed music. This gives control over public content to the community, rather than just the service provider.

Network services should let users control their data, using open standards to give users control of their personal data and free licences to give the community control over public data. Despite having a very mixed record on other criteria, Google is a good example of open standards done right. Free (libre) and open source tools are also usually good with open standards. Identi.ca is a good example of licensing public data freely.

3. Privacy / Encryption

My concern with privacy isn’t so much what a service provider’s policies are, but who has access to the data in the first place.

With the launch of Google+, I’ve been quite relieved that I’ve moved a lot of my important data out of Google over the past few years. It’s one thing for Google to have my email or my social graph or my documents, but the volume of data that would be in one place using all of Google’s services is astounding. Google is generally a well-meaning company, but I wouldn’t want any single organization to have everything that Google might have: my email (love letters, job applications…), address book (contacts and their private information), documents (budget, resume, business plans), calendar (activities, habits, regular whereabouts), RSS feeds (passions, interests, and political, intellectual, religious leanings), instant messaging (chat logs with friends, lovers, co-workers), my social graph (strong ties, relationships), my phone calls (the ability to recognize my voice from Google Talk or Google Voice), my photos (facial recognition and identification of my family, friends, colleagues), never mind all of the revealing personal information contained in web searches! There are lots of questions regarding each type of data and whether or not you’d want to trust it with someone else, but the aggregation of all of it into a single account is a much more obviously bad idea. It’s a recipe for disaster in the event of a privacy leak or breach, oppressive government actions, a subpoena, the loss or revocation of your account, etc.

Furthermore, some things I simply don’t want on someone else’s computer ever. I’ve felt comfortable trusting service providers like Google with my email in the past, but I’ve never been comfortable trusting them with my entire address book — that’s not just my data, but other people’s private information too. Similarly, I would never want my personal journal on someone else’s computer — that’s just too private.

However, Mozilla does a fantastic job of handling private data. With Mozilla Weave (i.e. Firefox Sync), not only is it free (libre) software that you can run on your own server, but your data is encrypted on the server. A user has two passwords — one to authenticate with the server, another to encrypt the data locally. Since encryption happens locally, the server only sees the encrypted data and never sees your second password. Mozilla doesn’t even ask for the information to decrypt your Firefox Sync data. You can use their server to sync your data across computers, but it’s only ever decrypted on your computers, not the server. If you use Mozilla’s server instead of your own, Mozilla still won’t have access to your data.
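To make the two-password idea concrete, here is a toy sketch of encrypting data locally before it ever reaches a server. This is not Mozilla's actual scheme or code: the function names are invented, and the cipher is a deliberately simple SHA-256 keystream built from the standard library so that the example runs anywhere. A real client would use a vetted cipher from a proper cryptography library.

```python
import hashlib
import hmac
import os


def _derive_keys(passphrase: str, salt: bytes) -> tuple:
    """Stretch the local passphrase into an encryption key and a MAC key."""
    material = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, 100_000, dklen=64
    )
    return material[:32], material[32:]


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_locally(passphrase: str, plaintext: bytes) -> dict:
    """Return the record a client would upload; the passphrase never leaves."""
    salt, nonce = os.urandom(16), os.urandom(16)
    enc_key, mac_key = _derive_keys(passphrase, salt)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return {"salt": salt, "nonce": nonce, "ciphertext": ciphertext, "tag": tag}


def decrypt_locally(passphrase: str, record: dict) -> bytes:
    """Verify and decrypt a record downloaded from the server."""
    enc_key, mac_key = _derive_keys(passphrase, record["salt"])
    expected = hmac.new(
        mac_key, record["nonce"] + record["ciphertext"], hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, record["tag"]):
        raise ValueError("wrong passphrase or tampered data")
    stream = _keystream(enc_key, record["nonce"], len(record["ciphertext"]))
    return bytes(a ^ b for a, b in zip(record["ciphertext"], stream))
```

The point of the sketch is what the server gets to see: only the record returned by `encrypt_locally`. The account password used to log in to the server would be a separate secret, and the local passphrase is never transmitted.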

I wish more service providers would do this. I understand it doesn’t work for services that are meant to be accessed directly on the server through the web, but at least for synchronization services it seems like a privacy no-brainer. Funambol, for example, is a great libre software data synchronization server for mobile devices, but I don’t think their gratis service at my.funambol.com encrypts your data. I suppose they have a web interface on their server, but I’d rather run my own Funambol server in the absence of Weave-style encryption, whereas I don’t mind using Mozilla’s Firefox Sync service at all.

Encryption of data in transit is another concern. Does a network service or web application offer encrypted methods of communication? Or is your private data being transmitted out in the open? Gmail now offers HTTPS by default. Facebook and Twitter offer an “Always use HTTPS” setting. The EFF has developed a Firefox add-on that uses HTTPS wherever possible. I’ve started using basic StartSSL Class 1 certificates, which are available at no cost to individuals, in order to encrypt traffic on my home servers.

A good network service should take privacy seriously, and offer encryption wherever possible. I’m not sure that this should be a requirement for a free network service, but it’s an important consideration before using a service hosted by somebody else. However, a service that may fail to adequately protect your privacy as a hosted service could still provide an acceptable self-hosted solution.

4. Distributed Systems

Email is a common example of a distributed set of protocols. If Bob uses Hotmail and Sally uses Gmail, they can still communicate with each other. Telephony provides another example; Bell customers can phone Rogers customers, and vice versa. This is the ideal: choosing a service provider independently from the people with whom you want to communicate. Distributed systems strengthen the Internet, creating fewer points of failure or censorship, more opportunities for expression and innovation, more freedom and autonomy for users. This matters less for network tools or synchronization services aimed at individuals or small groups than it does for social network services and communication tools.

Most online social networking services are walled gardens. Facebook users can only talk to other Facebook users, MySpace users can only talk to other MySpace users, etc. In this environment, social pressure has negative effects on freedom and autonomy. You might not feel comfortable using Facebook, but if that’s where your social circles are active, you’re faced with the choice of being left out or using a service provider with which you’re uncomfortable.

Google Talk makes it clear that it doesn’t have to be this way. Rather than developing their own proprietary walled garden instant messaging service, Google used the open standard XMPP (aka Jabber) for its chat service. With XMPP, you can chat with people on other servers. I have a Jabber account on my own server (and there are dozens of public Jabber servers), and I can still talk with (or call) people on Gmail Chat. I’ve left Google Talk, but I’m not cut off from Google Talk users. Compare that to Skype, which has so far relied on a proprietary VoIP protocol that only lets Skype users call other Skype users (short of bridging to traditional telephony).
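The mechanics behind this are simple enough to sketch. In a federated system, an address like `user@domain` tells the network which server to deliver to, so no single provider has to host everyone. The toy model below is only an illustration: the `Server` and `Network` classes are invented, and a plain dictionary stands in for the DNS lookups and server-to-server connections that real XMPP or email would use.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """One independently run provider, holding inboxes for its own users."""
    domain: str
    inboxes: dict = field(default_factory=dict)


class Network:
    """Stand-in for the open network: maps a domain to whoever runs it."""

    def __init__(self):
        self.servers = {}

    def add_server(self, domain: str) -> Server:
        server = Server(domain)
        self.servers[domain] = server
        return server

    def send(self, sender: str, recipient: str, body: str) -> None:
        """Deliver a message based only on the domain in the address."""
        user, _, domain = recipient.rpartition("@")
        server = self.servers.get(domain)
        if server is None:
            raise LookupError(f"no server found for domain {domain!r}")
        server.inboxes.setdefault(user, []).append((sender, body))
```

With two servers registered, say `example.org` and `example.net`, a user on one can message a user on the other without either of them having an account on the other's server. A walled garden is the degenerate case where the network only ever contains one server.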

In the social networking space, there are efforts like GNU Social/StatusNet and Diaspora to develop distributed solutions. StatusNet has already had some success implementing an open standard for distributed status updates. I’m curious whether Google+ might advance the cause of distributed social networking services (even slightly), given Google’s commitment to distributed systems and open standards elsewhere, and their development of new standards like OpenSocial.

Social network services should be distributed, allowing users to communicate across service providers. Email, traditional telephony, XMPP/Google Talk and GNU Social/Diaspora are all good examples of this. I’m not sure that this should be a strict requirement for a free network service, but the freedom to run the software on your own server is pretty useless for some social applications if you can’t communicate with people on other servers.

Conclusion

Identi.ca, the flagship StatusNet site, is a perfect example of a free network service. It’s free software (AGPL), it implements open standards and documented APIs for accessing your data, it pioneered an open standard for distributed networking, and its public updates are licensed freely. I’m happy to use Identi.ca.

Mozilla’s Firefox Sync is a good example of a free network synchronization service. Data is encrypted, it’s free software that can be run on another server, and bookmarks are stored locally in a format that other applications can read. I’m comfortable using Mozilla’s service for Firefox Sync.

AGPL network sync services like Funambol and Snowy are also libre services (free software, open standards or documented formats), but in the absence of Mozilla-style encryption, I’d prefer to run them on my own server. The FreedomBox Foundation has been working on an easy way to run libre services from a home server, and make them available to others. I currently use a combination of always-on GNU/Linux home computers available remotely and some dedicated servers that I manage. Even without your own server, you can use free (or more freedom-friendly) hosted services like riseup.net for email, jabber.org or others for instant messaging, my.funambol.com for mobile sync, Mozilla Firefox Sync for bookmarks and browser data, Identi.ca over Twitter, Voip.ms (SIP) over Skype, Libre.fm over Last.fm, etc. If you’re looking to try out some of the self-hosted services, I do have Snowy, Funambol, and Tiny Tiny RSS running on my home server — contact me if you’d like an account to try them out.

The process of disentangling from proprietary network services can take some time, but it’s well worth it for the sake of freedom and autonomy, even when it may be challenging in the short run. If you can’t leave a proprietary service right away, recognizing where it fails to meet these criteria can help you take some important steps in the meantime.
