How to split a file in two on even/odd line numbers

I have a (CSV) file that I want to split in two. I want all the even lines to go to one file, and all the odd lines to go to another file.

$ ls -1
FILE
[MAGIC]
$ ls -1
FILE
ALL_THE_EVEN_LINES
ALL_THE_ODD_LINES

This is how to do it in two passes using awk:

awk 'NR%2==0' FILE > ALL_THE_EVEN_LINES
awk 'NR%2==1' FILE > ALL_THE_ODD_LINES
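
The two commands above read FILE twice. If that matters, awk can also write to both output files in a single pass. A minimal sketch, assuming the same file names as above; the temporary directory and the five-line sample file are made up so the example is self-contained:

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample input: five lines, "a" to "e".
printf 'a\nb\nc\nd\ne\n' > FILE

# Single pass: even-numbered lines go to one file, all remaining (odd) lines to the other.
awk 'NR % 2 == 0 { print > "ALL_THE_EVEN_LINES"; next }
     { print > "ALL_THE_ODD_LINES" }' FILE
```

Inside awk, the redirection opens each output file once and keeps appending to it until awk exits, so the files are not truncated on every line.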

How to install gfortran on Mac OS X

Why did I install gfortran? Well, not to write Fortran programs. I tried installing SciPy using pip install scipy, and I got a message that a Fortran compiler was needed.

Install

This is how I installed gfortran on my Mac:

Visit hpc.sourceforge.net, and select a binary distribution for your version of Mac OS X, e.g. gfortran-snwleo-intel-bin.tar.gz for Snow Leopard.

How to remove extended attributes from a file edited with TextMate

If you run ‘ls -l’, a file sometimes shows an ‘@’ after its permissions. This means that the file has extended attributes. TextMate may add extended attributes to a file. Consider a fictional Python file called xxx.py edited with TextMate:

View extended attributes:

$ xattr xxx.py
com.macromates.caret

Remove extended attribute (com.macromates.caret):

$ xattr -d com.macromates.caret xxx.py

Never forget your password again, version 1

The recommended practice is to have different passwords on different websites. But how do you remember all those passwords without storing them somewhere? The trick is, you don’t. You remember a single strong password, and use a mechanism to generate other passwords from that.

This is not for securing government secrets, but should work for your Twitter account.

Create a single very strong password

There are many ways to do this: http://xkcd.com/936/

Benchmark: Reading uncompressed and compressed files from disc

In this post I’ll compare the running time of reading uncompressed and compressed files from disc.

I’ll run a test using two files, data.txt (858M) and data.txt.gz (83M), that have the same content.

About cat and zcat

The well-known command cat prints the contents of a file. The lesser-known zcat prints the contents of a GZIP’ed file.
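
To make the relationship between the two commands concrete, here is a small sketch showing that cat on a plain file and zcat on its gzipped copy print the same bytes. The tiny sample file and the temporary directory are made up for illustration; it assumes a GNU-style zcat (on some systems zcat expects a .Z suffix, and gzip -dc is the equivalent):

```shell
# Work in a scratch directory with a tiny stand-in for data.txt.
cd "$(mktemp -d)"
printf 'first line\nsecond line\n' > data.txt

# Create data.txt.gz while keeping the uncompressed original.
gzip -c data.txt > data.txt.gz

# Both commands write the same content to standard output.
cat data.txt
zcat data.txt.gz
```

For a rough timing comparison, each command can be prefixed with time and its output sent to /dev/null.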

Importing data from a CSV file into a Postgres table

Simple CSV file import

You have a CSV file called “data.csv”. It has a header line, and is delimited using “;”. You want to import it into a Postgres table called “your_table”:

Create the database table. Set the column types so that the string fields in the CSV file can be cast to values in the columns.

CREATE TABLE your_table
(
  -- Your columns
);
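
Before filling in the columns, it can help to list the fields in the header line of the CSV file, so that each one gets a matching column. A minimal sketch, assuming the “;” delimiter described above and a header without quoted fields; the sample rows and column names are hypothetical:

```shell
# Work in a scratch directory with a hypothetical stand-in for data.csv.
cd "$(mktemp -d)"
printf 'id;name;created\n1;Alice;2012-01-31\n' > data.csv

# Print one header field per line, numbered, as a checklist for the table columns.
head -n 1 data.csv | tr ';' '\n' | nl
```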

How to load an ESRI Shapefile into a PostGIS DB

Assume a shapefile called myshapefile.shp that should be loaded into a table mytable in schema xyz, in a PostGIS-enabled database called mydb on localhost. The table is owned by user dbuser, who has the password “secret”.

Using shp2pgsql

shp2pgsql -I myshapefile xyz.mytable > statements.sql
psql -d mydb -h localhost -U dbuser -f statements.sql

This tip and many more can be read in Making Maps Fast.

Using ogr2ogr

This is even easier with ogr2ogr:

ogr2ogr -f "PostgreSQL" PG:"host=localhost user=dbuser dbname=mydb password=secret" -lco SCHEMA=xyz -nln mytable myshapefile.shp