This video was mentioned on highscalability.com, so I thought I’d have a look. Knowing this stuff is useful when you’re in the business of delivering large amounts of geographical data to a large number of clients.
C10M = 10 million concurrent connections.
A few days ago I found out that using the Python debugger is so easy, I can’t believe I haven’t used it before.
Import the module:
import pdb
Set a breakpoint somewhere in your code:
def some_function(self, x, y, z):
    pdb.set_trace()
Run your program. Now every time ‘some_function’ is called, the Python interpreter will pause at the breakpoint. At this point you could:
- type ‘p x’ to inspect the argument x passed to the function
- type ‘n’ to execute the current line and move to the next one
- type ‘c’ to resume the program
- type ‘h’ to get help
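The session described above can also be scripted, which is handy for trying out the commands without typing them. This is only a minimal sketch: the function body, the argument values and the helper name run_scripted_session are made up for illustration, and the commands are fed to the debugger from a string instead of the keyboard.

```python
import io
import pdb


def some_function(x, y, z):
    # Hypothetical body, just so there are lines to step through.
    total = x + y + z
    return total


def run_scripted_session():
    # Commands that would normally be typed at the (Pdb) prompt.
    commands = io.StringIO("p x\nn\nc\n")
    transcript = io.StringIO()
    # Passing stdin/stdout makes pdb read from and write to these objects
    # instead of the terminal; readrc=False skips any local .pdbrc file.
    debugger = pdb.Pdb(stdin=commands, stdout=transcript, readrc=False)
    # runcall() runs the function under the debugger and stops at its first line.
    result = debugger.runcall(some_function, 41, 1, 0)
    return result, transcript.getvalue()


result, transcript = run_scripted_session()
print(transcript)
print("result:", result)
```

In normal use you would not script anything: put pdb.set_trace() in the function and type the same commands at the prompt.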
OK, calling it a benchmark is a bit of an overstatement. It’s taking two different database libraries for a quick spin, and seeing how fast they can write a bunch of integers to disk. A second benchmark checks how fast we can read them.
In this mini-test, I’m running leveldb against a new embedded database library, let’s call it system_x. The purpose is really just so that I can remember some rough numbers regarding these useful database libraries.
I used the time command to gather results, which shows real, user and sys time spent.
So you have a list of tuples, created with the zip built-in function in Python. Like this:
z = [(1, 'a'), (2, 'b'), (3, 'c')]
And you want to reverse zip, to get these two lists:
x = [1, 2, 3]
y = ['a', 'b', 'c']
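One way to do this, as far as I know the usual idiom, is to pass the unpacked list back into zip. A small sketch using the same z as above; zip(*z) yields tuples, so they are converted to lists here to match the x and y shown:

```python
z = [(1, 'a'), (2, 'b'), (3, 'c')]

# zip(*z) is the same as zip((1, 'a'), (2, 'b'), (3, 'c')),
# which groups the first items together and the second items together.
x, y = (list(column) for column in zip(*z))

print(x)
print(y)
```

Note that the unpacking into x and y assumes every tuple in z has exactly two items.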
Given a numpy matrix (mixed_float_matrix) with a variety of float values, how do you convert it into a matrix of zeros and ones (ones wherever the original matrix has a non-zero value)?
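A possible approach, sketched here with made-up sample values: compare the matrix with zero to get a boolean array, then cast it to integers. The name binary_matrix is just an illustration.

```python
import numpy as np

# Sample data, invented for this example.
mixed_float_matrix = np.array([[0.0, 2.5, -1.0],
                               [3.2, 0.0, 0.0]])

# The comparison gives True for every non-zero entry;
# astype(int) turns True/False into 1/0.
binary_matrix = (mixed_float_matrix != 0).astype(int)

print(binary_matrix)
```

Negative values count as non-zero too, so they also become ones.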
I’m learning about matplotlib, and actually just bought the book Matplotlib for Python Developers.
Browsing stackoverflow, the matplotlib homepage, and other resources, I eventually came across this stackoverflow post, which mentions Basemap. Since the data that I’m plotting is inherently geographical, it makes sense to show the data on a map.
There are several nice examples on the Basemap GitHub page.
Often I want to create heatmaps of the data, using matplotlib.
On stackoverflow there are several posts on this topic:
There are different colormaps available for matplotlib, if you want to try different color schemes.
I like to install third-party Python libraries using pip. Pip and easy_install can automatically download and install Python code from PyPI (also known as The Cheese Shop). This is how to publish your own Python code on PyPI, so people can do this:
pip install yourawesomeproject
This is easy (using pip):
This tutorial uses an in-memory SQLite database, which is cool in itself.
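To give an idea of what an in-memory SQLite database looks like, here is a small sketch using only the sqlite3 module from the standard library. It is not taken from the tutorial; the table name points and its columns are invented for the example.

```python
import sqlite3

# ":memory:" creates a database that lives only in RAM
# and disappears when the connection is closed.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE points (id INTEGER PRIMARY KEY, name TEXT)")
connection.executemany(
    "INSERT INTO points (name) VALUES (?)",
    [("a",), ("b",), ("c",)],
)

rows = connection.execute("SELECT id, name FROM points ORDER BY id").fetchall()
print(rows)

connection.close()
```

Nothing is written to disk, which makes this convenient for tutorials and tests.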
Why did I install gfortran? Well, not to write Fortran programs. I tried installing SciPy using pip install scipy, and I got a message that a Fortran compiler was needed.
This is how I installed gfortran on my Mac:
Visit hpc.sourceforge.net, and select a binary distribution for your version of Mac OS X, e.g. gfortran-snwleo-intel-bin.tar.gz for Snow Leopard.