27. December 2011 · Categories: Computers, Internet, Linux

YouHaveDownloaded.com claims to track the IP addresses of people who use BitTorrent, but I’ve been torrenting Linux ISOs for months, pushing almost 200 GB of data, and the IP address of that box is not in their database. I don’t know what their methodology is, but they’re obviously not interested in people sharing legal files.

So I finally got a smart phone, and let me tell you, going from a phone that only does voice and text to the Galaxy Nexus is like being shotgunned into the next century. We call the Galaxy Nexus a mobile phone for historical reasons only. It’s really a mobile computer that happens to do voice and text messaging, among other things. I think there are enough reviews of the Galaxy Nexus that I don’t have to give another rundown of its features here, but I do have a few thoughts on the smart phone phenomenon.

First, my “phone” has 1 GB of RAM and 30 GB of storage. My laptop from five years ago had 512 MB of RAM and a 40 GB hard drive. My desktop computer from ten years ago had 128 MB of RAM and a PIII 133 MHz processor. I don’t know how that compares to the ARM Cortex-A9 on standard benchmarks, but it’s not just hardware. Browsers today can run JavaScript 10 times faster than the same browsers just three years ago, and probably 50 times faster than Internet Explorer 6, on the same hardware.

So what will our “phones” be like in another five or ten years? In five years, they will probably have the computing power of today’s commodity desktops and laptops, and in ten years they will far surpass them. On top of that, protocol and software improvements like SPDY, Dart, NaCl, etc. (well, maybe), will push performance much farther than hardware improvements alone. I believe the future looks bright, at least from a purely technological perspective.

Of course, the changes we see are not just technological. My phone has GPS that can geolocate me to “within 30 meters” of my actual location. When I turned it on at home, the address that it gave me was my next door neighbor’s house, which is close enough. That’s really convenient when I want directions to the closest Chinese restaurant, but I’m also keenly aware that Google will never delete that data. Ever.

This always happens with technology — there’s always some catch, some unintended (or intended) side effect to our technological marvels. The combustion engine created the industrial revolution and allowed us to build cities (because it made rail and transport affordable), but it also pumped megatons of toxins and greenhouse gases into the atmosphere. For all the problems we solve, we create many new ones, but we keep going because usually the marginal benefits outweigh the costs, all things considered.

It’s just another thing to keep in mind. Email allows you to communicate easily with other people, but that doesn’t mean you should use it for every conversation. Some conversations should be reserved for face to face communication, but that fact doesn’t mean we should abandon email either. Likewise, we don’t have to abandon smart phones because we have (valid) privacy concerns. We just have to remember sometimes to turn off the GPS.

16. December 2011 · Categories: Internet, Politics

According to the rules on Beijing’s microblog management, which went into effect Friday, web users need to give their real names to website administrators before being allowed to put up microblog posts.

Bloggers, however, are free to choose their screen names, said a spokesman with the Beijing Internet Information Office (BIIO), the city’s web content management authority.

“The new rules are aimed at protecting web users’ interests and improving credibility on the web,” he said, speaking on condition of anonymity.

I always thought that personal privacy coupled with institutional transparency was the most effective form of government and society. At least in practice, government secrecy coupled with massive surveillance tends to be despotic. But maybe China will be the outlier…

Source.

14. December 2011 · Categories: Computers, Internet, Linux

I manage a Debian server that hosts the Evolving Scientist Podcast, among other things. I have no background in computer science or Linux administration. I’m just a Linux fan who loves learning, and administering a “production” server can be simultaneously entertaining, educational, perplexing, and infuriating.

A few weeks ago, I noticed htop reporting a load over 1.0, yet the CPU use was always close to 0%. I didn’t think much of it at the time, but as the load persisted, I got more worried. Was this harming my box in some way? Turns out that on Linux the load average also counts processes stuck waiting on I/O, so if it’s not CPU use, it’s probably I/O (continuous writing to the hard drive, say), and that’s not good for the integrity of your data.
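One way to test the I/O theory is to look for processes in uninterruptible sleep (state D in ps), since Linux counts those toward the load average alongside the runnable ones. The following is only a minimal sketch, not part of the original troubleshooting: blocked_procs is a made-up helper name, and it assumes a ps that accepts -eo stat,pid,comm.

```shell
# Hypothetical helper: reads "STAT PID COMMAND" lines on stdin and
# prints the PID and command of every process whose state starts with D
# (uninterruptible sleep, usually waiting on I/O).
blocked_procs() {
    awk '$1 ~ /^D/ { print $2, $3 }'
}

# Typical use on a live box:
#   cat /proc/loadavg
#   ps -eo stat,pid,comm | blocked_procs
```

If the list is empty every time you look while the load stays high, I/O wait is probably not the whole story.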

If something was constantly reading from or writing to the hard drive, what was it? Where was it? I just happened to do this:

$ tail -n30 /var/log/syslog
 
kernel: [   76.392218] hub 1-0:1.0: unable to enumerate USB device on port 5
kernel: [   76.580230] hub 1-0:1.0: unable to enumerate USB device on port 5
kernel: [   76.768221] hub 1-0:1.0: unable to enumerate USB device on port 5
kernel: [   76.952222] hub 1-0:1.0: unable to enumerate USB device on port 5
# repeated 30 times

What the hell? The kernel was writing error messages at a rate of about 5 times per second (the timestamps are roughly 0.19 seconds apart). That might be the problem. But what does “unable to enumerate USB device” mean? Why didn’t I see it before?
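The message rate can be read straight off the bracketed kernel timestamps. Here is a rough sketch, with kernel_msg_rate as a hypothetical helper that assumes every line of interest carries a [seconds.fraction] stamp like the ones in the excerpt:

```shell
# Hypothetical helper: reads log lines on stdin, pulls out the bracketed
# kernel timestamp from each, and prints the average number of messages
# per second between the first and last stamped line.
kernel_msg_rate() {
    awk '
        {
            # Matches e.g. "[   76.392218]"
            if (match($0, /\[ *[0-9]+\.[0-9]+\]/)) {
                ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
                if (n == 0) first = ts
                last = ts
                n++
            }
        }
        END {
            # At least two stamped lines are needed to measure an interval.
            if (n < 2 || last <= first) { print "n/a"; exit }
            printf "%.1f\n", (n - 1) / (last - first)
        }'
}

# Typical use:
#   tail -n30 /var/log/syslog | kernel_msg_rate
```

Fed the four lines quoted above, it prints 5.4.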

A Google search turned up this bug, along with numerous forum posts, speculating and pontificating on the matter. I tried upgrading the kernel. I tried rebooting the server (losing 131 days of uptime), to no avail. I was ready to move everything to a new server. Finally I stumbled across this:

cd /sys/bus/pci/drivers/ehci_hcd/
sudo sh -c 'find ./ -name "0000:00:*" -print | sed "s/\.\///" > unbind'

Immediately the load dropped, and 15 minutes later, it sits comfortably near zero. It was just that simple, but as always, you don’t know what you don’t know.

This is yet another reminder that there are no hard problems — only problems that are hard to a certain level of intelligence and knowledge.

I don’t know why it worked; I was just desperate and wanted it to stop, so I pasted some code from (yet another) tutorial on the net. Now that I have some peace of mind, I can dig deeper.
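As for what the pasted one-liner does: every PCI device bound to the EHCI (USB 2.0) host controller driver shows up as a 0000:00:* entry in that sysfs directory, and writing an entry's name to the driver's unbind file detaches the driver from that device. With the controller detached, the hub presumably stops retrying the bad port. Below is a more readable sketch of the same idea. list_bound_devices is a hypothetical helper, and it takes the driver directory as an argument only so that it can be tried against a scratch directory first.

```shell
# Hypothetical helper: prints the PCI IDs (e.g. 0000:00:1d.7) of the
# devices currently bound to the driver whose sysfs directory is given.
list_bound_devices() {
    drv_dir=$1
    for dev in "$drv_dir"/0000:00:*; do
        # If the glob matched nothing it stays literal; skip that case.
        [ -e "$dev" ] || continue
        basename "$dev"
    done
}

# Example (needs root; the device IDs differ from machine to machine):
#   drv=/sys/bus/pci/drivers/ehci_hcd
#   list_bound_devices "$drv" | while read -r id; do
#       echo "$id" | sudo tee "$drv/unbind" > /dev/null
#   done
```

The change does not survive a reboot, and writing the same ID to the driver's bind file attaches it again. Keep in mind that anything plugged into those USB 2.0 controllers loses its high-speed connection while they are unbound.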

23. October 2011 · Categories: Internet, Philosophy

Two years ago today, Philhellenes published “Why Didn’t Anybody Tell Me?”

02. October 2011 · Categories: Internet, Linux, Research

A while ago I posted a tutorial on setting up your own webdav server to sync Zotero. Although you may not consider your bibliography to be sensitive data, authenticating to the server by sending a plaintext password is a bad idea. Here I’ll show you how to sync over an encrypted connection.

First, create a self-signed SSL certificate:

mkdir -p /etc/apache2/ssl
openssl req -new -x509 -days 365 -nodes -out /etc/apache2/ssl/ssl.pem \
        -keyout /etc/apache2/ssl/ssl.key

You will be asked a series of questions, but you don’t have to fill out most of those details. Just put your server’s domain name as the common name.
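If you would rather skip the prompts, openssl req accepts the subject on the command line through -subj. This is a sketch under a few assumptions: make_selfsigned is a hypothetical helper name, and the explicit 2048-bit RSA key is a choice made here (the command above leaves the key size to OpenSSL's defaults).

```shell
# Hypothetical helper: creates a self-signed certificate and key for the
# given common name in the given directory, without interactive prompts.
make_selfsigned() {
    cn=$1
    dir=$2
    mkdir -p "$dir" || return 1
    openssl req -new -x509 -days 365 -nodes \
        -newkey rsa:2048 \
        -subj "/CN=$cn" \
        -out "$dir/ssl.pem" \
        -keyout "$dir/ssl.key" || return 1
    # The private key should be readable by its owner only.
    chmod 600 "$dir/ssl.key"
}

# Example, followed by a look at what was generated:
#   make_selfsigned domain1.net /etc/apache2/ssl
#   openssl x509 -in /etc/apache2/ssl/ssl.pem -noout -subject -dates
```

The -dates output is worth a glance, because with -days 365 the certificate will need to be regenerated after a year.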

Then edit /etc/apache2/ports.conf and add your server’s IP address:

NameVirtualHost 12.34.56.78:443

Next, edit the configuration file for your webdav server. For example, if your Zotero data is being synced from domain1.net/webdav, then you should append the following to the vhost file for that domain:

<VirtualHost 12.34.56.78:443>
    SSLEngine On
    SSLCertificateFile /etc/apache2/ssl/ssl.pem
    SSLCertificateKeyFile /etc/apache2/ssl/ssl.key

    ServerAdmin webmaster@domain1.net
    ServerName domain1.net
    ServerAlias www.domain1.net
    DocumentRoot /path/to/domain1.net/
    ErrorLog /path/to/logs/error.log
    CustomLog /path/to/logs/access.log combined

    # The DAV and Auth directives need a directory context; the path
    # here assumes the webdav folder sits under the DocumentRoot.
    <Directory /path/to/domain1.net/webdav>
        DAV On
        AuthType Basic
        AuthName "webdav"
        AuthUserFile /path/to/domain1.net/webdav/passwd.dav
        Require valid-user
    </Directory>
</VirtualHost>

Now we need to enable SSL for the web server:

a2enmod ssl
service apache2 restart
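A wrong certificate path is an easy way to end up with a web server that refuses to start, so it can be worth checking the vhost file before the restart. The helper below is a hypothetical sketch (ssl_files_ok is not an Apache tool); it assumes one directive per line and paths without spaces.

```shell
# Hypothetical helper: checks that every file named by an
# SSLCertificateFile or SSLCertificateKeyFile directive in the given
# config file exists and is readable. Returns non-zero otherwise.
ssl_files_ok() {
    conf=$1
    status=0
    found=0
    for path in $(awk '$1 == "SSLCertificateFile" || $1 == "SSLCertificateKeyFile" { print $2 }' "$conf"); do
        found=$((found + 1))
        if [ ! -r "$path" ]; then
            echo "missing or unreadable: $path"
            status=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no SSL certificate directives found in $conf"
        status=1
    fi
    return $status
}

# Example (the vhost file name is a placeholder):
#   ssl_files_ok /etc/apache2/sites-available/domain1.net && apache2ctl configtest
```

Run it as root, since the key file is normally not readable by other users. apache2ctl configtest then checks the syntax of the whole configuration.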

Lastly, point Firefox at https://domain1.net and make an exception for your self-signed certificate, then change the sync option in Zotero from HTTP to HTTPS.

15. August 2011 · Categories: Computers, Internet, Linux

All right, I finally solved that problem of running multiple vhosts on Apache with WSGI. So now evolvingpodcast.net brings back a different result from the IP address itself. As always, the solution was simple. You just don’t know what you don’t know.

30. July 2011 · Categories: Internet, Personal

This blog jumps around a lot. I just got a VM from AlienLayer: 190 MB RAM, 19 GB storage, 190 GB bandwidth, for (you guessed it) $19 a year. The server is in New York. The ping response is pretty good from my house (about 40 ms), which means there’s no lag over ssh. The node seems really fast, too. It’s comparable to my ChicagoVPS box, where The Evolving Scientist Podcast is hosted.

The Evolving Scientist Podcast runs on MediaCore, which is the Python app mentioned in the previous post. Since MediaCore talks to Apache through WSGI, and I can’t seem to get vhosts working properly with WSGI, for now my blog is a nomad. I was hosting it out of my lab for the last week, but figured I needed something more reliable than my personal computer.

Oh yeah, for the geeks out there, it’s an LNMP stack: Nginx running on Debian 6, with PHP-FPM pulled from Testing. It currently consumes ~130 of the 190 MB RAM.

29. July 2011 · Categories: Internet, Personal

I cannot for the life of me get vhosts to work properly while running WSGI. I’ve searched for a solution, but nothing works. Right now, requests for all domains land on the Python app or I get an error that the request is not redirecting properly (or will never complete). I don’t know if anybody actually reads this blog (it gets a couple hundred hits a day, but that could all be spam bots), but if you do, and you know how to properly configure vhosts on Apache with WSGI, please let me know.

24. July 2011 · Categories: Internet, Technology

The recent controversy over deleted Google+ accounts got me thinking again about the ephemeral nature of our digital lives. My mantra is that life is full of trade-offs, and this seems to be one of them. Digital content is much easier to distribute than physical content. The trade-off is that your entire digital life can be wiped out in an instant. What do you do then?

Over the last six months I have been systematically scanning old pictures that were, up until now, stored away at my parents’ house. Many of these pictures are from the 80s, when I was a kid, but some are from the 50s and 60s, when my parents were kids. Although the pictures are 25-50 years old, they are in surprisingly good condition. The pages of the albums have yellowed and the glue decomposed over time, but the pictures are as vivid as the day they were made. With continued care, they will outlast me.

That’s a guarantee that we don’t get with our digital content. The first problem is the practical fact that hard drives crash. One of the main reasons we sign up for “the cloud” is automation of backups. When you upload your images to Picasa, Google makes redundant copies across geographically separated data centers. However, the second and bigger problem is the one we’re witnessing now: Terms of Service and Acceptable Use Policy violations.

Of course, some people (like Eben Moglen and Richard Stallman) have been warning us for years that turning our digital lives over to the whims of capricious third parties is a bad idea. Most people don’t care, because online services are so easy to use. I think the first time a significant number of people became aware of this problem was last year when WikiLeaks got booted off Amazon’s servers and bounced around several hosts. Amazon claimed that WikiLeaks had violated their AUP, but it’s not entirely clear that they had. It’s hard to argue that they were engaged in criminal activity when they still haven’t been charged with a crime.

The real problem here is that the new gatekeepers of our digital lives can do whatever the hell they want. They can inconsistently apply their AUPs when a Congressman calls them up, just as G+ is inconsistently applying its policy on real names now. If all your media are digital, then it’s just your lifetime of memories that’s at stake.

I have 750 pictures in my Picasa Web account, most of which were scanned during my recent project. If Google deleted my account without explanation (as is usually the case), what would I do? Well, ironically, I have the physical pictures, which are the ultimate backup. Short of that, the best solution is simply to make as many backups as you can, in as many different places as you can: multiple hard drives, different hosts, etc. If you know a little scripting, this can be automated, but it’s a nontrivial solution for most people.
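To give an idea of what “a little scripting” might look like, here is a minimal sketch that copies one folder to several destinations in a single call. backup_to_all is a hypothetical name, the destinations are assumed to be locally mounted paths, and files that have been deleted from the source are not removed from the copies.

```shell
# Hypothetical helper: copies the contents of a source directory to each
# destination directory given after it, preserving timestamps and modes.
backup_to_all() {
    src=$1
    shift
    for dest in "$@"; do
        mkdir -p "$dest" || return 1
        # "src/." copies the directory's contents rather than nesting it.
        cp -Rp "$src/." "$dest/" || return 1
        echo "copied to $dest"
    done
}

# Example (all paths are placeholders):
#   backup_to_all ~/Pictures/scans /mnt/disk1/scans /mnt/disk2/scans
# For another host, rsync over ssh does the same job:
#   rsync -a ~/Pictures/scans/ user@backuphost:scans/
```

Put in a cron job, something like this runs without any further thought, which is most of the point.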

The other thing that we must impress on our new gatekeepers is that, if they expect us to turn our digital lives over to them, they need to start taking their responsibilities seriously.

As a first step, be responsive to your users. We hear over and over again about accounts being deleted, and the universal problem (at Facebook, Google and elsewhere) is that you can’t reach anyone. They’re asking us to put all our eggs in their basket. We’re “betting on them”, as Vic Gundotra said to one of the recent victims of a random deletion. That power comes with responsibility. Take it seriously, and create a mechanism where problems are reported and addressed quickly.

Their apparent insouciance on this point is creating a lot of doubt. Maybe Eben Moglen was right, and a plug server in every home is the safest mechanism for storing your data. First, we would own the bare metal, so there wouldn’t be AUPs to worry about. Second, there would be none of the spying that all cloud services currently engage in. Only a court order or warrant would give third parties access to your data. Third, distributed, encrypted backups (with version control, even) would ensure the integrity of your data. People are already working on this solution. You can read more here:

http://freedomboxfoundation.org/
http://wiki.debian.org/FreedomBox