Who gains from blocked content on YouTube?

When I want to hear a particular rap song from 1992 on YouTube, the video service shows me this:

Yeah yeah, copyright or whatever, but what is the point? Who gains what exactly? By the way, if you can hear the song in the country you're in, then fuck you :-)

Who's involved, let's see. Me (the user/customer), Google (the owner of YouTube), the EU (makers of regional copyright laws), Sony (the copyright holder), and CMW (the artist).

Does Google, the owner of YouTube, win?

No. Google loses straight away, because I can hear the song on GrooveShark just fine (albeit without the video):

Does the EU win?

No. The EU might gain a little bit, because CMW is an American band, so chances are that I'll listen to an EU artist like Dizzee Rascal instead:

But that's not going to happen, because I wanted to listen to CMW, and I've already found the song on another service, GrooveShark.

Does Sony Music Entertainment win?

No. I already bought the song on iTunes a couple of days ago. If I hadn't bought it, I would have downloaded it with a torrent. Whether the YouTube video was there or not did not factor into my decision to buy the song. I bought the song because it was insanely easy to do on my iPhone. Period. In fact, I might choose not to buy a song in the future if it's owned by SME.

Does the artist win?

Hardly. In fact, they lose. I'm sure they appreciate that I bought the song, though I'm sure Sony Music Entertainment appreciates it a hell of a lot more, if I know anything about royalty splits! And the song being blocked on YouTube did not make me buy the song, as I've already said. I was about to make CMW more famous by linking their video on my blog, but couldn't. Sorry, CMW.

Do concerned mothers win?

Does the fictional organization "concerned mothers against gangster rap" gain anything from a blocked gangster rap tune on the internet? Sure, but that is mere coincidence. It could just as well have been a song about flowers or teddy bears, or a song in praise of "concerned mothers against gangster rap".

By the way, you may check out the song CMW sampled on "N 2 Deep". It's by Lyn Collins, and features the distinct sound of the JB's. Apparently the copyright holder (Polydor) is not insane:


I find that this blog post and video on innovation from Edinburgh by Ed Parsons is somehow related to this issue.

By the way, Ed, if you're watching the pingback: sorry that I stole your look for WordPress. I kinda liked it, and I do listen to gangster rap occasionally, so my morals are questionable.

Opening and closing ports on EC2 instances

Assuming that the EC2 tools have been installed as described in a previous post, opening and closing ports is done with the ec2-authorize and ec2-revoke commands respectively. These commands work on security groups rather than on individual instances. Recall that a set of instances belongs to a security group.

Open port 80 on EC2 instances in the 'default' security group:

ec2-authorize default -p 80

Close port 80 on EC2 instances in the 'default' security group:

ec2-revoke default -p 80
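Both commands also take an explicit protocol (-P) and source range (-s), which is handy when a port like 22 should not be open to the whole internet. Here is a minimal sketch of a wrapper around them. The function name ec2_port and the DRY_RUN variable are my own inventions, not part of the EC2 tools, and 203.0.113.0/24 is just a placeholder address range. As far as I know, tcp and 0.0.0.0/0 are what the tools assume when the options are left out.

```shell
# Hypothetical helper (not part of the EC2 tools): open or close a TCP port
# on a security group, limited to one source range.
# Usage: ec2_port open|close GROUP PORT [CIDR]
ec2_port() {
    action=${1:-} group=${2:-} port=${3:-} cidr=${4:-0.0.0.0/0}
    case "$action" in
        open)  cmd=ec2-authorize ;;
        close) cmd=ec2-revoke ;;
        *)     cmd= ;;
    esac
    if [ -z "$cmd" ] || [ -z "$group" ] || [ -z "$port" ]; then
        echo "usage: ec2_port open|close GROUP PORT [CIDR]" >&2
        return 2
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        # Only print the command, so it can be checked before touching AWS.
        echo "$cmd $group -P tcp -p $port -s $cidr"
    else
        "$cmd" "$group" -P tcp -p "$port" -s "$cidr"
    fi
}
```

With DRY_RUN=1 set, calling ec2_port open default 22 203.0.113.0/24 only prints the ec2-authorize command it would have run.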

See also the Amazon command reference for the EC2 API.

Hints for managing Amazon Linux on EC2

I'm using Mac OS X and running instances in the EU West Region. My instances are of the Amazon Linux AMI.

Installing the EC2 command line tools

The command-line tools are a supplement to the AWS management console found online. I found a good tutorial about how to get started with the tools for EC2 on Mac OS X.

After downloading the tools from the Amazon download site, the tutorial describes how to set environment variables, how to create X.509 certificates, etc.

The only detail missing was that I'm running my instances in the EU West region. I found a hint in another tutorial on setting an additional environment variable. My resulting .profile file looks like this:

# Setup Amazon EC2 Command-Line Tools
export EC2_HOME=~/.ec2
export PATH=$PATH:$EC2_HOME/bin
export EC2_PRIVATE_KEY=`ls $EC2_HOME/pk-*.pem`
export EC2_CERT=`ls $EC2_HOME/cert-*.pem`
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Home/
# This line is from second tutorial, for use with EU West Region:
export EC2_URL=https://eu-west-1.ec2.amazonaws.com
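A typo in any of these variables tends to show up later as an obscure error from the tools, so a quick check can save some head-scratching. This is a minimal sketch that assumes the variable names from the .profile above. The function name check_ec2_env is hypothetical and not something the tutorials or the EC2 tools provide.

```shell
# Hypothetical helper: verify that the variables set in .profile are usable.
check_ec2_env() {
    status=0
    for name in EC2_HOME EC2_PRIVATE_KEY EC2_CERT EC2_URL JAVA_HOME; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            status=1
        fi
    done
    # The key and certificate variables must each point at one existing file.
    for file in "${EC2_PRIVATE_KEY:-}" "${EC2_CERT:-}"; do
        if [ -n "$file" ] && [ ! -f "$file" ]; then
            echo "not a file: $file" >&2
            status=1
        fi
    done
    return $status
}
```

Run check_ec2_env in a fresh terminal after editing .profile. It stays silent and returns 0 when everything looks fine. If the ls pattern matched more than one .pem file, the variable holds several names and the file check complains.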

The first tutorial shows many examples of using the command-line tools to start instances, open ports, etc.

Package manager for Amazon Linux AMI

Maybe the tools can be used to install packages on the Amazon Linux AMI instance, but you could also use a package manager.

The Amazon Linux AMI comes with the yum package manager installed. A tutorial specifically aimed at installing PHP on an Amazon Linux AMI instance also gives a quick tour of yum. Basically you do it like this:

$ sudo yum install <PACKAGE_NAME>

Installing Apache Web Server

Installing the Apache Web Server is an example of using the EC2 tools and the yum package manager together. The command ec2-describe-instances lists your instances in the region given in the environment variable EC2_URL.

$ ec2-describe-instances
RESERVATION	r-xxxxxxxx	xxxxxxxxxxxxx	default
INSTANCE	i-xxxxxxxx	ami-xxxxxxx	ec2-xx-xxx-xx-xx.eu-west-1.compute.amazonaws.com

default is the name of the security group for the instance. You may have used a different security group name. Security groups are used to make it easier to apply a set of permissions to a range of instances. The command ec2-authorize applies a permission to a security group, like opening up port 80 for httpd.

# open up port 80 on instances belonging to security group 'default'
$ ec2-authorize default -p 80
PERMISSION  default  ALLOWS  tcp  80 80  FROM  CIDR

Log into the instance with ssh and then use the package manager to install httpd:

# use the key pair that you used when launching your instance
$ ssh -i ~/.ec2/ec2-keypair ec2-user@ec2-xx-xxx-xx-xx.eu-west-1.compute.amazonaws.com
# install httpd - starts an install process
$ sudo yum install httpd
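
The host name given to ssh is the fourth column of the INSTANCE line printed by ec2-describe-instances. Here is a minimal sketch for pulling it out instead of copy-pasting, assuming the tab-separated output format shown earlier. The function name instance_hosts is hypothetical, not an EC2 tool.

```shell
# Hypothetical helper: print the public DNS name of every instance found in
# ec2-describe-instances output read from standard input.
instance_hosts() {
    # Fields are tab-separated; INSTANCE lines carry the DNS name in column 4.
    # Instances without a public DNS name (e.g. stopped ones) are skipped.
    awk -F '\t' '$1 == "INSTANCE" && $4 != "" { print $4 }'
}
```

Use it as ec2-describe-instances | instance_hosts. With a single running instance, the one line it prints can go straight into the ssh command.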

Image search by sketching – continued

It's a simple question

Can you search for images by sketching a similar image?

I went looking online for a search engine that had implemented this feature, which I'll call image-search-by-sketching.

Update: Since I wrote this piece, GaZoPa no longer exists. In the meantime Google has implemented image-search-by-image. You can't sketch, but you can use an existing image.

Google's implementation of image-search-by-image did both a good and a bad job when I last tried it (December 2011). When I tried it with my test image (dog-shape below), I got this blog post, which is good. But the related images are way off: the number one related image is a picture of a shoe?

I can see the similarity to my dog-shape in the results that Google suggested, but I didn't get a dog. No doubt it is a hard problem, and what I wish for is highly semantic, in the sense that I want the search engine to recognize that I'm looking for a dog. In my test below, GaZoPa could have gotten it right for a number of reasons. Maybe they simply had many fewer items in their database to match the dog against, and the best match happened to be... a dog? I guess I'll never know. R.I.P. GaZoPa.

And so I went looking for such a search engine...

The first thing I did was ask this question on Stack Overflow, and I got a reply which pointed me to a couple of cool websites.

These are all cool websites, but at first glance not exactly what I was looking for. After trying GazoPa I realized that the website is almost exactly what I was looking for (a service that allows you to sketch up an image query).

Trying GazoPa

GazoPa allows you (among other things) to upload an image, and performs a search for similar images. I'm not quite sure which images are in its index, but I proceeded with the following experiment. I drew up a rather crude dog in Dia, and uploaded this image to GazoPa. Here is the dog:

It actually gave some pretty decent results, with this one being the first hit:

It is not hard to imagine a site that combines the sketching I did in Dia with the GazoPa service.

Update: Unfortunately GazoPa no longer exists. I guess you could combine Google image search with a drawing program, but it would be more fun to do it with an indie search engine.

Image search by sketching in 2007

This is a post in my technology archaeology series.

What is search by sketching?

The idea is to search for images by drawing a sketch that roughly resembles what you are looking for. The sketch is your query. This idea was mentioned in 2007, in 2010, and sometime in the late 90's (according to my friend Rasmus).

The idea is not new. A friend told me about an art search engine (I forget the name) where you could search for works of art by splashing colors on a crude web canvas, e.g. drawing some purple at the top, some yellow in the corner, and voila: "Is this the painting you were looking for?"

That is, based on your quick sketch, the algorithm finds matches in an art image database.

Applications of the technology

Here are some ideas for applications off the top of my head:

  • Search for vector data in a spatial datasource. The user draws a sketch on top of a map (to get the scale correct), and relevant vectors are returned. My colleague and I talked about how Denmark looks like the word Foo.

    So we naturally thought about something geographical that looks like the word Bar. This could be a chain of islands or a series of lakes. In essence you'd draw the word "Bar" and ask for vector data that looks similar.

Online mentions of search by sketching

There is a blogpost that also talks about the idea and mentions concrete technology:

This guy has something that looks like a product and even a YouTube video.

Also Microsoft in Asia apparently has been working on this

But where is it? Why doesn't Google support this on their image search?

I've asked on Stack Overflow.


BitTorrent for geodata was big in 2005

Big in 2005...

Today I'm trying to find out whether BitTorrent + geodata is a "thing". I have found out that it WAS a thing... in 2005! Just like Coldplay, Gorillaz, Eminem, 50 Cent, James Blunt, Green Day... but it never really took off.

  • In 2006 Chris Holmes had a blog post titled Distribution of Geodata, where he said stuff like «What is needed is a standard, so that clients tile up the earth in the same way and make the same requests.» and «instead of asking the server to return a set of tiles that represents an area, it could ask a p2p network»
  • In 2005 Ed Parsons had a blog post titled Peer to Peer Geodata anyone ?, where he said stuff like «The idea of distributing large geodata datasets as small chucks is quite appealing and I have no doubt that when open geodata becomes more mature – this will be the obvious mechanism of supply» and «peer to peer means piracy in many minds, an unfortunate perception».
  • He and others mention GeoTorrent.org, a site offering geographical datasets via bittorrent.
  • In 2008 people ask: What happened to geotorrent.org?
  • In 2011, I'm asking the same thing: What is going on with P2P and geodata? Either I'm hopelessly old school, or a good idea simply went missing without a trace...

Ok, so people are still talking about P2P+geodata in 2006, 2007 and 2008, but the fact is that it has not seen a wide breakthrough in 2011. Or am I missing something?

GeoTorrent.org no longer answers HTTP requests, but the domain is still registered. GeoTorrent.org was run by ERMapper, which was bought by Leica Geosystems, which merged with Erdas, according to some person in 2008. It was a site devoted to offering geodata via BitTorrent. Richard Orchard was one of the people behind GeoTorrent.org. Maybe he knows what happened to it?

Using the keywords "P2P" and "geodata" I went looking on scholar.google.com. I did not find that much, and nothing that has been generally adopted (see some of the hits in the Links section below).

What am I looking for in 2011?

What I'm looking for is something like a plugin for GeoServer, or a web-gis framework that fetches tiles via P2P, or something like GeoNode with a P2P twist. Actually GeoNode could be it... is GeoNode it?

Conclusion: Some pros and cons of P2P geodata

  • In 2009 a guy on a mailing list said: «Pure P2P solutions are great for exchanging large files, but typically have too much latency to be practical»
  • In 2010 some Chinese guys said: «P2P technology offered a novel solution to spatial information online service and a good platform for sharing mass spatial data, it can avoid "single point of failure"and "hot spots bottleneck" problem»
  • In 2007 some Austrians said: «As disaster management inherently happens in highly dynamic environments, these applications suffer from deficiencies with respect to maintaining connections to the server representing their sole source of information. We propose to exploit peer-to-peer networks to interconnect field workers.»
  • They also said: «P2P oriented raster geo-data online services have been widely applied, whereas vector geo-data online services still have many issues that can′t be handled, such as vector geo-data organization pattern, segmentation, lossless reconstruction etc»
  • In 2006 Chris Holmes said: «The damn brilliant thing about using an architecture of participation for geospatial data information is that as a layer gets more popular it scales perfectly, since more people downloading and checking out the layer means that more people are serving it up to others.»

If «P2P oriented raster geo-data online services have been widely applied», then where has it gone now? I'd like to find out...