PyBrain quickstart and beyond

After pip install pybrain, the PyBrain quick start essentially goes as follows:

from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure import TanhLayer
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer
 
# Create a neural network with two inputs, three hidden, and one output
net = buildNetwork(2, 3, 1, bias=True, hiddenclass=TanhLayer)
 
# Create a dataset that matches NN input/output sizes:
xor = SupervisedDataSet(2, 1)
 
# Add input and target values to dataset
# Values correspond to XOR truth table
xor.addSample((0, 0), (0,))
xor.addSample((0, 1), (1,))
xor.addSample((1, 0), (1,))
xor.addSample((1, 1), (0,))
 
trainer = BackpropTrainer(net, xor)
#trainer.trainUntilConvergence()
for epoch in range(1000):
    trainer.train()

However, it does not work reliably, which can be seen by running the following test:

testdata = xor
trainer.testOnData(testdata, verbose = True)  # Works if you are lucky!

Kristina Striegnitz has written and published an XOR example that works more reliably. The code is effectively reproduced below, in case the original should disappear:

# ... continued from above
 
# Create a recurrent neural network with four hidden nodes (default is SigmoidLayer) 
net = buildNetwork(2, 4, 1, recurrent = True)
 
# Train the network using arguments for learningrate and momentum
trainer = BackpropTrainer(net, xor, learningrate = 0.01, momentum = 0.99, verbose = True)
for epoch in range(1000):
    trainer.train()
 
# This should work every time...
trainer.testOnData(testdata, verbose = True)

Log devices on your network

Fing logger (finglogger.sh):

#!/bin/sh
 
FING_LOG_FILE=/path/to/fing.log
 
# append current public ip
echo `date +"%Y/%m/%d %T"`";publicip;"`curl -s ipecho.net/plain`";;;" >> $FING_LOG_FILE
 
# append current fing output
/usr/bin/fing -r1 -o log,csv,$FING_LOG_FILE,1000 --silent

Add to cron (run every hour):

0 * * * * /path/to/finglogger.sh
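
Once the log has been collecting for a while, the public IP lines written by the echo command above can be picked out again. The following is a minimal Python 3 sketch, assuming only the line format produced by that echo (timestamp;publicip;address;;;). The function name public_ip_changes and the sample addresses are made up for illustration, and all other lines (the fing output itself) are simply skipped.

```python
from datetime import datetime


def public_ip_changes(lines):
    """Yield (timestamp, ip) each time the logged public IP differs from the previous entry."""
    last_ip = None
    for line in lines:
        fields = line.strip().split(";")
        # Only the lines written by the echo command have "publicip" as second field.
        if len(fields) < 3 or fields[1] != "publicip":
            continue
        ip = fields[2]
        if ip != last_ip:
            yield datetime.strptime(fields[0], "%Y/%m/%d %H:%M:%S"), ip
            last_ip = ip


# Hypothetical sample lines in the format written by finglogger.sh
sample = [
    "2014/05/01 10:00:00;publicip;203.0.113.5;;;",
    "2014/05/01 11:00:00;publicip;203.0.113.5;;;",
    "2014/05/01 12:00:00;publicip;198.51.100.7;;;",
]
changes = list(public_ip_changes(sample))
for timestamp, ip in changes:
    print(timestamp, ip)
```

To run it on the real log, pass an open file instead of the sample list, e.g. public_ip_changes(open("/path/to/fing.log")).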

How to merge two disjoint random samples?

The problem: Given two random samples, s1 and s2, of size k over two disjoint populations, p1 and p2, how to combine the two k-sized random samples into one k-sized random sample over p1 ∪ p2?

The solution: k times, draw an element from s1 ∪ s2 as follows: with probability d1 = |p1| / |p1 ∪ p2|, draw the next element from s1; with probability d2 = 1 – d1, draw the next element from s2.

(The solution was found on Stack Overflow.)

In python:

import random
import numpy
 
# sizes
e1 = 1000
e2 = 1000000
 
# populations
p1 = xrange(e1)
p2 = xrange(e1, e2)
 
# sample size
k = 500
 
# random samples
s1 = random.sample(p1, k)
s2 = random.sample(p2, k)
 
# merge samples
merge = []
for i in range(k):
  if s1 and s2:
    merge.append(s1.pop() if random.random() < len(p1) / float(len(p1)+len(p2)) else s2.pop())
  elif s1:
    merge.append(s1.pop())
  else:
    merge.append(s2.pop())
 
# Validate
hist = numpy.histogram(merge, bins=[0,500000,1000000])
# The two bins should be roughly equal, i.e. the error should be small.
print abs(hist[0][0] - hist[0][1]) / float(k)
 
# alternatively, use filter to count values below 500K
print abs(len(filter(lambda x: x<500000, merge)) - 250) / 500.0
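
For reuse, the same procedure can be wrapped in a function. The following is a minimal sketch for Python 3, assuming the merge rule exactly as described above (a fixed probability d1 for every draw). The name merge_samples and its parameters are my own, not from the original answer.

```python
import random


def merge_samples(s1, s2, n1, n2, k, rng=random):
    """Merge two k-sized samples into one k-sized sample.

    n1 and n2 are the sizes of the populations that s1 and s2 were drawn from.
    """
    s1, s2 = list(s1), list(s2)  # copy, so the caller's samples are not consumed
    d1 = n1 / float(n1 + n2)     # probability of taking the next element from s1
    merged = []
    for _ in range(k):
        if s1 and s2:
            source = s1 if rng.random() < d1 else s2
        else:
            source = s1 or s2    # one sample is exhausted, use the other
        merged.append(source.pop())
    return merged


rng = random.Random(42)          # seeded only to make the example repeatable
p1 = range(1000)
p2 = range(1000, 1000000)
k = 500
s1 = rng.sample(p1, k)
s2 = rng.sample(p2, k)
merged = merge_samples(s1, s2, len(p1), len(p2), k, rng)
print(len(merged))
```

Passing the population sizes instead of the populations themselves means the function only needs the two samples and two numbers, which is handy when the populations are too large to keep around.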

How to compute Fibonacci sequence in SQL

Inspired and simplified from a set of slides on using RDBMS for storing, managing, and querying graphs:

WITH recursive fib(i,j) AS (
    SELECT 0,1
    UNION ALL
    SELECT j, i+j FROM fib WHERE j<1000
)
SELECT i FROM fib
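
If you do not have a database server at hand, the query can be tried out from Python. The following sketch assumes an SQLite version with support for recursive common table expressions (3.8.3 or later); the variable names are mine.

```python
import sqlite3

QUERY = """
WITH RECURSIVE fib(i, j) AS (
    SELECT 0, 1
    UNION ALL
    SELECT j, i + j FROM fib WHERE j < 1000
)
SELECT i FROM fib
"""

# An in-memory database is enough, since the query reads no tables
conn = sqlite3.connect(":memory:")
fib_numbers = [row[0] for row in conn.execute(QUERY)]
conn.close()
print(fib_numbers)
```

The recursion stops once j reaches 1000 or more, so the result holds the Fibonacci numbers from 0 up to 987.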

Which files does a *nix process have open?

To list all files that are opened by a *nix process with a given pid, say 42, use the lsof command:

(sudo) lsof -p 42

Of course, a process may have many files open. To list only files that have a name containing “log”, use the grep command:

(sudo) lsof -p 42 | grep log

This of course assumes you know the process id (pid) of the process. To find the pid of processes with a given name, e.g. httpd, use the ps command together with the grep command:

# notice that this prints the grep process as well
ps aux | grep httpd
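
If lsof is not installed, similar information can be read directly from the /proc filesystem. Note that this is a different technique from lsof and, as far as I know, only works on Linux. The following Python sketch is written under that assumption; the function name open_files is my own.

```python
import os


def open_files(pid):
    """Return the targets of the open file descriptors of a process (Linux only)."""
    fd_dir = "/proc/%d/fd" % pid
    if not os.path.isdir(fd_dir):
        # No such process, or no /proc filesystem on this system
        return []
    paths = []
    for fd in os.listdir(fd_dir):
        try:
            paths.append(os.readlink(os.path.join(fd_dir, fd)))
        except OSError:
            # The descriptor was closed between listdir() and readlink()
            continue
    return paths


# Example: files opened by the current process that have "log" in their name
for path in open_files(os.getpid()):
    if "log" in path:
        print(path)
```

As with lsof, inspecting a process owned by another user typically requires root privileges.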

How long is the Doom Loop cycle currently?

Take a look at this Chomsky presentation, around the 46:30 mark. It seems that the most rational prediction would be that we are heading for another financial crisis, since financial systems are running what he calls a “Doom Loop”: make huge gambles, then either make huge gains or fail. In the case of failure, get bailed out. This pattern of behaviour is rational, seen from the point of view of the financial sector, given the current environment. So, the good question is, what would the rational course of action be for us, the citizens, given that the financial sector is apparently acting, fully rationally, inside a Doom Loop?

The rational question would be, when is the next financial crisis coming? Given a good prediction of this point in time, how should we rationally act, e.g. in the real-estate market? If we should aspire to make rational decisions, we should not hope that another financial crisis will be avoided. We should expect it, and make rational decisions based upon it. For our own gain, if we so desire. Now, how do you do that? That is another question. It seems obvious that decisions in many areas should be influenced by this apparent fact, e.g. decisions in real-estate, entrepreneurship, family planning. If there is money to be made, somehow, in betting on the next financial crisis, maybe that would be the rational thing to do.

Must remember to salt my hashes

While a SHA-256 hash may seem unbreakable, for many input strings it takes seconds to crack. If you don’t believe me, try the following or simply read this webpage:

$ python
>>> import hashlib
>>> print hashlib.sha256('megabrain').hexdigest()
f53c51616f4c7943a4117afa1d0ba193f9af901c6ce175a2207a594e71c98ef5

Go to crackstation.net, paste the hash into the text area, click “crack hashes” and see the admittedly super lame password cracked in a second. The basic concept behind the cracking is to precompute hashes for a lot of passwords and then do reverse lookups, from hashcode to password. This way, it really does not make any difference how “good” your hashing algorithm is. This is not an attack against hashing algorithms, but an attack against common hashcodes. In the wild, you are more likely to encounter the hash of “megabrain” than the hash of “2f2f0a446f828f”. While you should encourage everybody you meet to choose strong passwords, it is perhaps more sustainable to strengthen the security around weak passwords.
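
The reverse lookup idea can be illustrated in a few lines of Python 3. This is only a toy sketch: the four-word list is made up, whereas a real cracking service would precompute hashes for a very large number of candidate passwords.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 hex digest of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical (tiny) list of common passwords
wordlist = ["password", "123456", "letmein", "megabrain"]

# Precompute once: hashcode -> password
lookup = {sha256_hex(word): word for word in wordlist}

# "Cracking" is now a dictionary lookup
cracked = lookup.get(sha256_hex("megabrain"))
not_cracked = lookup.get(sha256_hex("2f2f0a446f828f"))
print(cracked)      # megabrain
print(not_cracked)  # None, the input is not in the word list
```

The expensive part (hashing the word list) is done once, and every lookup afterwards is cheap, no matter how strong the hash function is.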

Salting

This is where salts come in, as weak passwords can be made stronger by salting. A salt is just a sequence of bytes, e.g. “c039b8f8a8…” that you concatenate with a password before computing a password hash. It is ineffective to use the same salt for all passwords, so by all means read this page to get the inside scoop on how to do this correctly.

$ python
>>> import os
>>> import hashlib
>>> password = 'megabrain'
>>> salt = os.urandom(32)
>>> stored_hash = hashlib.sha256(salt + password).hexdigest()

If you try to crack the stored_hash on crackstation.net, you will see that it is not successful. So the moral of the story is, a bad password + a good salt = a good password. Users only have to remember their (bad) password, while you store the good salt together with the hash.

To authenticate a user with a salted password, you again combine the salt and the password before comparing to the stored hash:

>>> password_to_authenticate = 'megabrain'
>>> if hashlib.sha256(salt + password_to_authenticate).hexdigest() == stored_hash:
...     print "User has been authenticated!"
... else:
...     print "Wrong password!"
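
The two steps can be collected in a pair of helper functions. The following is a minimal sketch for Python 3, where the password must be encoded to bytes before it is concatenated with the salt. The function names hash_password and verify_password are my own, and the use of hmac.compare_digest for the comparison is an addition that is not part of the session above.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Return (salt, hex digest); a fresh random 32-byte salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(32)
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest


def verify_password(password, salt, stored_hash):
    """Recompute the salted hash and compare it to the stored one."""
    _, candidate = hash_password(password, salt)
    # compare_digest takes the same time whether or not the strings match
    return hmac.compare_digest(candidate, stored_hash)


# Store both the salt and the hash for each user
salt, stored_hash = hash_password("megabrain")
print(verify_password("megabrain", salt, stored_hash))  # True
print(verify_password("megabrian", salt, stored_hash))  # False
```

Because hash_password generates a new salt on every call, two users with the same password end up with different stored hashes.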

Docker on Ubuntu VM running on Mac using Vagrant

Docker allows you to develop, ship and run any application, anywhere. The metaphor is that of the standard shipping container that fits on any ship, can be handled by any crane, and loaded onto any train or truck.

In a previous post, I covered how to run Ubuntu on Mac using Vagrant. In this post, I will show how to run Docker on the Ubuntu box we got running with Vagrant.

I will cover how to provision Docker on “vagrant up”, check the installation, and run a first container.

Provisioning Docker on “vagrant up”

First, create a Vagrant setup like previously described. Then, edit the install.sh script, and enter some Docker installation commands:

install.sh:

#!/bin/sh
curl -sSL https://get.docker.io/ | sh

Now, let’s test that docker was installed as intended:

vagrant up
vagrant ssh

(Fix) chown the docker socket:

# Now on vagrant machine
sudo chown vagrant /var/run/docker.sock  # TODO: need to address this issue in a different way

Check docker version:

docker version

Run a hello world:

docker pull ubuntu
docker run ubuntu echo "Hello, world"

Basic Docker usage

Get your applications into Docker containers

TODO

Shipping containers to team members

TODO

Deploying applications to production

TODO

Aside: Deploying containers on AWS

HOW TO

Summary

This post shows how to get up and running with Vagrant and Docker using the install scripts provided at get.docker.io. In the next post I will show how to use the “new” way to use Docker with Vagrant (thanks to Jens Roland for pointing me in the right direction).