No one in ad tech needs to know your name

I work in the ad tech industry, which means that I track people online for a living. Mainly, I do it because the industry has interesting computer science problems and because the job pays well.
I will not defend ad tech. Mainly because ad tech is not important enough to humanity to defend. However, I do believe that ad tech’s algorithms are important to humanity because they can be applied to important areas, such as your health, personal finance and education. However, I have a different point today.

I have a subtle point about privacy. I have noticed that at no point does the ad tech industry need to know who you really are. Ad tech does not need to know what your real name is, what your parents real names are, your actual street address or any other piece of information that identifies you as you to another human being. It is a little bit hard to explain, but I will try. Ad tech is powered by algorithms and these algorithms operate in an abstract space where your true identity is not important. Most ad tech knows you by a random number that was assigned to you. All your interests are also represented by random numbers. The place you live yet another. Ad tech algorithms only care about the relationships between these numbers, not what the numbers actually represent in the real world.

Here is how it works. You get assigned a random number, e.g. 123, to represent you. Then, ad tech will attempt to link your number, 123, with the numbers of boxes that represent products or services that you might be interested in. For example, a box A could be people who need a vacation and box B could be people who could be tempted to buy a new BMW. Ideally, if you really need a vacation and someone really wants to sell you that vacation, then a connection between 123 and A should be made. From ad tech’s perspective, the number 123 is linked to the box A. The algorithm does not need to use labels like “Alice Anderson” or “Bob Biermann”, because the numbers 1 and 2 will get the job done just fine — from a mathematical point of view.

At some point your true identity becomes interesting, long after ad tech has left the scene. At some point, somebody (e.g. a human being or a robot) might need to print your real name and street address on a card box box, put the product you ordered inside and ship it via DHL. Up until that exact point, your name, street address or any other personally identifiable information is utterly unimportant to anybody. Nobody cares and no advertisement algorithm needs to know. I think this is an important point.
Ad-tech algorithms, if not ad tech itself, can have a massive and positive impact on areas of life that you probably care about. For example, algorithms can help you with your health, personal finance, insurances, education, whether you should buy Bitcoin or Ether today, or whether you should attend job interview A instead of job interview B today, or your kids attend school X or Y. In these areas, relatively un-altered algorithms from ad tech can help. It is important to keep in mind, that again no algorithm needs to know your name in order to work. Not even if that algorithm is looking through your medical record and correlating your stats with the stats of million of other patient records.
Of course it is true that your real identity can be learned from seemingly anonymised data. It might even be fairly trivial to do so, using good old detective skills. Differential privacy has some fairly hard results in that area. However, the main point is that someone has to make a conscious decision to look into the data on a mission to find you and possibly design a new algorithm for that purpose.

Now I get to my main point. Yes, ad tech CAN know who you are with some detective work. However, ad tech does not NEED to know who you are in order to work. This is so important because it means that we can potentially harness the power of algorithms in areas of life that matter — without compromising the privacy of anybody. It is not going to be easy to obtain the granular and self-controlled privacy that is needed, but it is worthwhile. And that is why I joined ad tech in the first place, because the computer science problems are interesting and important — and well, interesting and important things tend to pay well.

Neural networks on GPUs: cost of DIY vs. Amazon

I like to dabble with machine learning and specifically neural networks. However, I don’t like to wait for exorbitant amounts of time. Since my laptop does not have a graphics card that is supported by the neural network frameworks I use, I have to wait for a long time while my models git fitted. This is a problem.

The solution to the problem is to get access to a computer with a supported Nvidia GPU. Two approaches are to either get my own rig or rent one from Amazon. Which is cheaper?

Cost analysis

I will assume that I will train models on my machine (whether local or at Amazon) for two hours every day.

The Amazon p2 range of EC2 machines come with Nvidia K80 cards, which costs about 50.000 DKK. Already this analysis is going to be difficult; I will not buy a computer that costs 50.000 DKK just to train NN models. So, in this analysis I will be comparing apples to oranges, but that is how it is.

Cost of Amazon

The p2.xlarge EC2 instance has a single K80 GPU, which is at least as good as any rig I would consider buying.

The on-demand prie is $0.9/hour; the spot price about five times cheaper. Usage for two hours every day for a whole year costs 4.500 DKK for on-demand and 900 DKK for spot instances. However, the p2 instances is sometimes unavailable in the European spot markets.

Cost of DIY

What is the best GPU to get for a DIY machine learning rig? In 2016, Quora answers suggested that the Nvidia cards Titan X and GTX980TI would be best. Let’s go with that.

This is quite a bit more than 4.500 DKK and that is only for the graphics card. The finished rig would probably cost around 15.000 DKK (Titan) and 10.000 DKK (GTX).

The electricity also has to be factored in, plus that the cards are basically slower than the K80.

Best choice for increased usage

With increased usage the DIY approach will become cheaper than Amazon, albeit still a slower option. With usage of 5 or 7 hours/day the DIY approaches break even after a year.

Further reading

Build a deep learning rig for $800.

How AI, robotics and advanced manufacturing could impact everybody’s life on Earth

What if everybody could live a wealthy, healthy, job-less and creative life in a post-scarcity Universe? Are we currently on a trajectory to this new reality and what are the obstacles we may face on the way? What are the important game-changing technologies?

TODO: create and agenda (very tentative):

1) contrast current life circumstances with a potential future
2) identify the key problems that we could solve with technology
3) review the players in society that will take part in this change
3) contrast views on the opportunities and threats of these technologies
4) …

Our future life conditions here on Earth might soon be impacted by game-changing advancements in artifical intelligence, robotics, manufacturing and genetics; at least if you ask people like Elon Mush, Andrew Ng and Ray Kurzweil. What are the most important technologies and what is the impact they might have? What are the dangers? Opinions differ so the intention here is to review and contrast what leading fiction writers, scientists, visionaries and entrepreneurs think about the question: how will AI, robots, and advanced manufacturing impact everybody’s life circumstances here on Earth?

Fiction to be reviewed

Post-scarcity:
– The Culture series

AI:
– Asimov

The Human-Computer Cortex:
– That Swedish guy who wrote sci-fi computer implants in the 70’s

Non-fiction to be reviewed

AI:
– Douglas Hofstadter: GEB

Videos to be reviewed

AI:


The Human-Computer Cortex:

News articles to be reviewed

AI:
– https://aifuture2016.stanford.edu/
– http://fortune.com/2016/06/15/future-of-work-2/
– http://www.businessinsider.com/researchers-predictions-future-artificial-intelligence-2015-10?r=US&IR=T&IR=T

3D printing:
– https://hbr.org/2013/03/3-d-printing-will-change-the-world

When to be most careful about catching the flu?

Continuing on my blogification of Peter Norvigs excellent talk, the question is, when to watch out for the flu, e.g. if you live in Denmark?

1) Go to www.google.com/trends/
2) Type in the word “influenza”
3) Select your geographical region (Denmark in my case)
4) See data up to year 2008, to avoid the graph being squished by the outbreak of A(H1N1) (which leads to unusually many people talking about the flu)

Turns out the answer is: watch out in October and February.

How long is a year?

How to find out how long a year is on Earth by only analyzing text? This approach is lifted straight from an excellent and very inspiring talk by Peter Norvig.

1) Go to www.google.com/trends/
2) Type in the word “Icecream”
3) Measure the distance between the peaks (turns out that the average is exactly the length of a year)