kostas

Using the Python debugger

A few days ago I found out that using the Python debugger is so easy that I can’t believe I hadn’t used it before. Import the module: import pdb. Then set a breakpoint somewhere in your code: def some_function(self, x, y, z):…
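As a rough illustration of the two steps mentioned in the excerpt (import pdb, then set a breakpoint), here is a minimal sketch. The body of some_function is hypothetical, since the excerpt cuts off after the signature, and the debug flag is an assumption added so the example also runs non-interactively.

```python
import pdb


def some_function(x, y, z, debug=False):
    # Hypothetical body: the original post's function body is not shown.
    total = x + y
    if debug:
        # Execution pauses here and drops into the interactive (Pdb) prompt.
        # Useful commands: n (next line), s (step into), p total (print), c (continue).
        pdb.set_trace()
    return total * z


if __name__ == "__main__":
    # Pass debug=True to stop at the breakpoint.
    print(some_function(1, 2, 3))
```

Calling some_function(1, 2, 3, debug=True) from a terminal should stop at the breakpoint, where the local variables x, y, z and total can be inspected.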

Clustering in Python

In a project I’m going to use clustering algorithms implemented in Python, such as k-means. scipy.cluster has been reported to have some problems, so for now I’ll use PyCluster (following advice given on Stack Overflow). Install PyCluster: pip install…
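To make the k-means idea concrete, here is a small pure-Python sketch of the standard assign-and-update loop. It is not the PyCluster or scipy.cluster API; the function names, the fixed iteration count, and the seeded random initialisation are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length tuples.
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    # Illustrative k-means: pick k starting centroids, then alternate
    # between assigning points and recomputing centroids.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centroids = kmeans(data, k=2)
    print(labels, centroids)
```

A library implementation would normally add several random restarts and a convergence check; this sketch only shows the core loop.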

Metrics cheat sheet

Question: When can a distance function d(x,y) be called a metric, pseudo-metric, quasi-metric or semi-metric?

Constraint | Metric | Pseudo | Quasi | Semi
Non-negativity: d(x,y) ≥ 0 | x | x | x | x
Identity of indiscernibles: d(x,y)=0 ⇒ x=y | x | | x | x
Symmetry: d(x,y) = d(y,x)…
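The constraints in the cheat sheet can be tested numerically on a finite sample of points. The sketch below is an assumption-laden illustration: the check_axioms helper and its tolerance parameter are hypothetical, and the triangle inequality is included as the standard fourth axiom even though the excerpt is cut off before reaching it. Passing on a sample does not prove a function is a metric; failing does show which axiom breaks.

```python
from itertools import product


def check_axioms(d, points, tol=1e-12):
    # Evaluate each axiom over all pairs (or triples) of the sample points.
    pts = list(points)
    pairs = list(product(pts, repeat=2))
    return {
        "non_negativity": all(d(x, y) >= -tol for x, y in pairs),
        # d(x, y) = 0 must imply x = y.
        "identity_of_indiscernibles": all(
            x == y for x, y in pairs if abs(d(x, y)) <= tol
        ),
        "symmetry": all(abs(d(x, y) - d(y, x)) <= tol for x, y in pairs),
        "triangle_inequality": all(
            d(x, z) <= d(x, y) + d(y, z) + tol
            for x, y, z in product(pts, repeat=3)
        ),
    }


if __name__ == "__main__":
    sample = [-2, -1, 0, 1, 2, 3]
    # Absolute difference: satisfies every axiom on this sample.
    print(check_axioms(lambda x, y: abs(x - y), sample))
    # Difference of squares: d(1, -1) = 0 although 1 != -1, so it is
    # at best a pseudo-metric.
    print(check_axioms(lambda x, y: abs(x * x - y * y), sample))
```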

Having a look at Spacebase

Spacebase is a spatial datastore that began life as military-grade software, which at least sounds kind of cool. It’s an in-memory database, really, so switch off the cluster and the data is gone. Apparently the same thing was (unknown to…

Having a look at vbuckets

A distribution algorithm is used to map keys to servers in a distributed key-value store. There are several different ones, implemented in different systems, and with different properties. In this blog post I’ll briefly cover the best-known key hashing schemes,…
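As a rough sketch of what mapping keys to servers can look like, the code below contrasts plain modulo hashing with a vbucket-style indirection, where a key hashes to a fixed virtual bucket and a separate table maps buckets to servers. The server names, the bucket count, the round-robin table, and the use of CRC32 are all assumptions for illustration, not details taken from the post or from any particular system.

```python
import zlib

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical server names
NUM_VBUCKETS = 8  # kept tiny for readability; an assumed value


def key_hash(key):
    # CRC32 gives a deterministic, non-negative hash across runs.
    return zlib.crc32(key.encode("utf-8"))


def modulo_server(key, servers):
    # Plain modulo hashing: the key maps straight to a server index,
    # so changing the server count remaps most keys.
    return servers[key_hash(key) % len(servers)]


def build_vbucket_map(num_vbuckets, servers):
    # Assign vbuckets to servers round-robin (one possible initial layout).
    return [servers[i % len(servers)] for i in range(num_vbuckets)]


def vbucket_for(key, num_vbuckets):
    # The key-to-vbucket mapping depends only on the fixed vbucket count.
    return key_hash(key) % num_vbuckets


def vbucket_server(key, vbucket_map):
    # Two-step lookup: key -> vbucket -> server.
    return vbucket_map[vbucket_for(key, len(vbucket_map))]


if __name__ == "__main__":
    vmap = build_vbucket_map(NUM_VBUCKETS, SERVERS)
    key = "user:42"
    print(modulo_server(key, SERVERS), vbucket_for(key, NUM_VBUCKETS),
          vbucket_server(key, vmap))
```

With the indirection, moving data to a new server means reassigning entries in the vbucket table, while the key-to-vbucket step stays unchanged.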