Category: Systems

  • Twitter HyperLogLog monoids in Spark

    Want to count unique elements in a stream without blowing up memory? More specifically, do you want to use a HyperLogLog counter in Spark? Until today, I’d never heard the word “monoid”. However, Twitter Algebird is a project that contains a collection of monoids, including a HyperLogLog monoid, which can be used […]

  • Easiest way to install a PostgreSQL/PostGIS database on Mac

    Installing Postgres+PostGIS has never been easier on Mac. In fact, it is now an app! You download the app file, place it in your Applications folder, and you’re done. Really. If you think that was over too fast, there is one more thing you can do. […]

  • Linked Data: First Blood

    Knowing a lot about something makes me more prone to appreciating its value. Unfortunately, I know very little about linked data. For this reason, I’ve had a very biased and shamefully low opinion of the concept of linked data. I’ve decided to change this. A repository of linked data that I’ve recently taken an interest […]

  • Geocoding Python function for PostgreSQL

    Gratefully making use of what others have provided, i.e. geopy, Google and plpythonu. Type to hold result of geocoding: CREATE TYPE geocoding AS ( place text, latitude double precision, longitude double precision ); Function that does the actual geocoding (to be extended with more vendors. Hint: look at geopy wiki). Takes an (arbitrary) input string […]

  • Things related to Docker

    Docker is a cool idea and an open-source product that seems to be taking the tech community by storm. Wired will tell you why it is cool in a story titled The Man Who Built a Computer the Size of the Internet. The short version goes: Docker is a way to deploy and move applications with […]

  • Watched the RAMCloud video

    Today I watched a video on RAMCloud. I have made an index of the various sections of the video, with direct links. You’ll find this index at the bottom of this post. “The RAMCloud project is creating a new class of storage, based entirely in DRAM, that is 2-3 orders of magnitude faster than existing […]

  • How many requests per second can I get out of Redis?

    Warning: This is not a very interesting post. I’m toying around with the Redis benchmarking tool. What would be significantly more interesting would be to toy around with the Lua API in Redis, which I’ll do in a subsequent post. In this post, I’ll try to squeeze as many get/set requests out of Redis as […]

  • A stop watch for Postgres

    To time the execution of various stages of a long transaction, I’m using the following function: CREATE OR REPLACE FUNCTION CVL_TimerLap() RETURNS double precision AS $$ import time now = time.time() if not SD.has_key('t_last'): SD['t_last'] = now elapsed = now - SD['t_last'] SD['t_last'] = now return elapsed $$ LANGUAGE plpythonu; The “lap times” are returned […]

  • Running LP-solver in Postgres

    Having reinstalled PostgreSQL with support for Python and pointing at my non-system Python, it is time to test whether I can use the convex optimizer library I’ve installed in my Python 2.7 (pip install cvxopt). Install PL/Python if not already installed (doesn’t hurt): create extension plpythonu; Create a function that […]

  • Some good slides for using PostgreSQL with Python

    Peter Eisentraut has written some good slides on coding PostgreSQL clients in Python and on using Python as a stored procedure language in PostgreSQL. The first half deals with using Python as a Postgres client; the second half deals with coding stored procedures in Python.
