Category: Command line Fu

  • How to randomly sample k lines from a file in *nix

    You can use the shell to extract a random sample of lines from a file in *nix. The two commands you need are “shuf” and “head” (plus “tail” for CSV files with a header). The shuf command randomly shuffles all the lines of its input. The head command cuts off the input after the first k lines. Examples for both general files and CSV files are given below.

    General pattern

    To randomly sample 100 lines from any file in *nix:

    shuf INPUT_FILE | head -n 100 > SAMPLE_FILE
    

    Pattern for CSV

    If your file is a CSV file, you probably want to keep the header and sample only the body. You can use the head and tail commands, respectively, to extract the header and to sample the contents of the CSV file.

    Extract the header of the CSV file:

    head -n 1 INPUT_FILE.csv > SAMPLE_FILE.csv
    

    Sample 100 lines from the body of the CSV file and append to sample file (notice “>” above versus “>>” below):

    tail -n +2 INPUT_FILE.csv | shuf | head -n 100 >> SAMPLE_FILE.csv
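
    The two CSV steps can also be wrapped in one small function. This is only a sketch: the name sample_csv is made up for illustration, and it assumes GNU shuf, whose -n option limits the output to the given number of lines (so no separate head is needed for the body).

```shell
# sample_csv INPUT K: print the header line, then K randomly chosen body lines.
# (sample_csv is a hypothetical helper name, not a standard command.)
sample_csv() {
    head -n 1 "$1"                    # the header, kept as the first line
    tail -n +2 "$1" | shuf -n "$2"    # everything after the header, K random lines
}

# usage: sample_csv INPUT_FILE.csv 100 > SAMPLE_FILE.csv
```

    Because the header and the body are written by the same function call, a single “>” redirect is enough.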
    

    Install dependencies on Mac

    On Mac, the shuf command is not shipped with the OS. You can get it by installing GNU coreutils via brew; it will be named “gshuf”:

    brew install coreutils
    

    So, on Mac you should replace shuf with gshuf in the examples above.
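
    If the same script has to run on both Linux and Mac, one option is to look up which of the two names is installed. The helper below is a sketch; pick_shuf is a made-up name, and it only checks for the two command names mentioned in this post.

```shell
# pick_shuf: print the name of the shuffle command that is installed.
# (pick_shuf is a hypothetical helper name.)
pick_shuf() {
    if command -v shuf >/dev/null 2>&1; then
        echo shuf
    elif command -v gshuf >/dev/null 2>&1; then
        echo gshuf
    else
        echo "neither shuf nor gshuf found; install coreutils" >&2
        return 1
    fi
}

# usage: "$(pick_shuf)" INPUT_FILE | head -n 100 > SAMPLE_FILE
```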

  • Log devices on your network

    Fing logger (finglogger.sh):

    #!/bin/sh
    
    FING_LOG_FILE=/path/to/fing.log
    
    # append current public ip
    echo "$(date +"%Y/%m/%d %T");publicip;$(curl -s ipecho.net/plain);;;" >> "$FING_LOG_FILE"
    
    # append current fing output
    /usr/bin/fing -r1 -o log,csv,$FING_LOG_FILE,1000 --silent
    

    Add to cron (run every hour):

    0 * * * * /path/to/finglogger.sh
    
  • Which files does a *nix process have open?

    To list all files that are opened by a *nix process with a given pid, say 42, use the lsof command:

    (sudo) lsof -p 42
    

    Of course, a process may have many files open. To list only files that have a name containing “log”, use the grep command:

    (sudo) lsof -p 42 | grep log
    

    This of course assumes you know the process id (pid) of the process. To find the pid of processes with a given name, e.g. httpd, use the ps command together with the grep command:

    # notice that this prints the grep process as well
    ps aux | grep httpd
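
    A common way to keep the grep process out of the listing is to put the first letter of the name in brackets: the pattern [h]ttpd still matches “httpd”, but it no longer matches grep’s own command line, which contains the literal text “[h]ttpd”. The sketch below builds that pattern for any name; bracket_pattern and psgrep are made-up helper names.

```shell
# bracket_pattern NAME: turn "httpd" into "[h]ttpd".
# (bracket_pattern and psgrep are hypothetical helper names.)
bracket_pattern() {
    rest=${1#?}            # the name without its first character
    first=${1%"$rest"}     # the first character on its own
    printf '[%s]%s\n' "$first" "$rest"
}

# psgrep NAME: list processes matching NAME, without the grep process itself.
psgrep() {
    ps aux | grep "$(bracket_pattern "$1")"
}

# usage: psgrep httpd
```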
    
  • Docker on Ubuntu VM running on Mac using Vagrant

    Docker allows you to develop, ship and run any application, anywhere. The metaphor is that of the standard shipping container that fits on any ship, can be handled by any crane, and loaded onto any train or truck.

    In a previous post, I covered how to run Ubuntu on Mac using Vagrant. In this post, I will show how to run Docker on the Ubuntu box we got running with Vagrant.

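    I will cover how to provision Docker on “vagrant up” and run a first container; the sections on basic Docker usage further down are still marked TODO.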

    Provisioning Docker on “vagrant up”

    First, create a Vagrant setup like previously described. Then, edit the install.sh script, and enter some Docker installation commands:

    install.sh:

    #!/bin/sh
    curl -sSL https://get.docker.io/ | sh
    

    Now, let’s test that docker was installed as intended:

    vagrant up
    vagrant ssh
    

    (Fix) chown the docker socket:

    # Now on vagrant machine
    sudo chown vagrant /var/run/docker.sock  # TODO: need to address this issue in a different way
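
    One possible alternative to the chown, sketched here under assumptions: if the Docker install created a “docker” group that owns the socket, the vagrant user can be added to that group from install.sh (which Vagrant runs as root by default). The guard is an addition for illustration so the lines do nothing on a machine without that user or group.

```shell
# Assumption: the Docker install created a "docker" group that owns the socket.
# Add the vagrant user to it, but only if both the user and the group exist.
if id vagrant >/dev/null 2>&1 && getent group docker >/dev/null 2>&1; then
    usermod -aG docker vagrant    # -a appends, so existing groups are kept
fi
```

    The new group membership only applies to new login sessions, so log out and run “vagrant ssh” again before trying docker without sudo.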
    

    Check docker version:

    docker version
    

    Run a hello world:

    docker pull ubuntu
    docker run ubuntu echo "Hello, world"
    

    Basic Docker usage

    Get your applications into Docker containers

    TODO

    Shipping containers to team members

    TODO

    Deploying applications to production

    TODO

    Aside: Deploying containers on AWS

    HOW TO

    Summary

    This post shows how to get up and running with Vagrant and Docker using the install scripts provided at get.docker.io. In the next post I will show how to use the “new” way to use Docker with Vagrant (thanks to Jens Roland for pointing me in the right direction).

  • Running Ubuntu on Mac with Vagrant

    Vagrant is cool:

    Vagrant provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.

    Furthermore:

    Vagrant stands on the shoulders of giants. Machines are provisioned on top of VirtualBox, VMware, AWS, or any other provider. Then, industry-standard provisioning tools such as shell scripts, Chef, or Puppet, can be used to automatically install and configure software on the machine.

    In this post I’ll show you how to get started with Vagrant using a virtual Ubuntu Linux box. Moreover, I will cover how to use a simple provisioning technique (shell provisioning) for installing custom stuff into your virtual box on boot-up.

    Vagrant basics

    To follow along, you must first install Vagrant and VirtualBox. When you are done, cd to some folder (e.g. cd ~/Documents/trying-vagrant) and let’s get started:

    Initialize Vagrant (“vagrant init”) using an Ubuntu Trusty Tahr image (https://vagrantcloud.com/ubuntu/trusty64/version/1/provider/virtualbox.box):

    vagrant init ubuntu/trusty64
    

    This creates a new file, Vagrantfile:

    $ ls
    Vagrantfile
    

    Now, using the Vagrantfile that was created, boot the box (“vagrant up”). If this is the first time, vagrant will first download the image from the cloud (could take a while):

    vagrant up
    

    When done with the booting up, SSH into the machine (“vagrant ssh”):

    vagrant ssh
    # do some stuff, like ls and what not
    ^D  # to quit
    

    Bring down the box (“vagrant destroy”). Oh I love this, can’t help myself but sound it out “deSTROY” in a super villain voice:

    vagrant destroy
    

    If this worked, let’s move on to installing some custom stuff on boot-up.

    Installing stuff on “vagrant up”

    There are many ways to install stuff on vagrant up, e.g. using shell scripts, Chef, or Puppet. Here, I will use a shell script because it is simple and clean.

    Shell provisioning is a simple way to install stuff on “vagrant up”. First, let us create a shell script (“install.sh”) that we will later reference from the Vagrantfile. Furthermore, let’s live a little and install BrainFuck along with a hello world program.

    install.sh:

    #!/bin/sh
    sudo apt-get install -y bf  # -y: provisioning is non-interactive, so never wait for a prompt
    echo '++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.' > helloworld.b
    

    (Remember to “chmod 744” the install.sh script). Now, add a few lines of code to your Vagrantfile and you’re golden. After the edit, the file should look like this.

    Vagrantfile:

    # -*- mode: ruby -*-
    # vi: set ft=ruby :
    
    # Vagrantfile API/syntax version. Don't touch unless you know what you're doing!
    VAGRANTFILE_API_VERSION = "2"
    
    Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
      config.vm.box = "ubuntu/trusty64"
      config.vm.provision "shell", path: "install.sh"
    end
    

    Now, let’s test that it worked:

    vagrant up
    vagrant ssh
    # now on virtual machine:
    $ bf helloworld.b
    Hello World!
    

    Summary

    In this post, I showed you how to get started with Vagrant, and how to provision stuff on “vagrant up” using a shell script.

  • Starting a web server and other PHP tricks

    Start PHP webserver (in current directory):

    php -S localhost:8080  # starts http server on port 8080
    

    Start PHP prompt (with illustrating example):

    php -a
    php >  echo base64_decode('QWxhZGRpbjpvcGVuIHNlc2FtZQ==');
    Aladdin:open sesame
    
  • Easiest way to install a PostgreSQL/PostGIS database on Mac

    Installing Postgres+PostGIS has never been easier on Mac. In fact, it is now an app! You download the app-file from postgresapp.com, place it in your Applications folder, and you’re done. Really.

    If you think that was over too fast

    If you think that was over too fast, there is one more thing you can do: add the Postgres.app “bin” directory to your PATH.

    vi ~/.bash_profile
    
    # add line: export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/9.3/bin
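
    If you would rather not open an editor, the line can also be appended from the shell. The sketch below only appends when the exact line is not already in the file, so running it twice does no harm; add_line_once is a made-up helper name.

```shell
# add_line_once FILE LINE: append LINE to FILE unless that exact line is already there.
# (add_line_once is a hypothetical helper name.)
add_line_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# usage (single quotes keep $PATH unexpanded until the profile is read):
# add_line_once ~/.bash_profile 'export PATH=$PATH:/Applications/Postgres.app/Contents/Versions/9.3/bin'
```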
    

    Next time you open terminal you will be able to execute all of the following commands:

    PostgreSQL:

    clusterdb createdb createlang createuser dropdb droplang
    dropuser ecpg initdb oid2name pg_archivecleanup 
    pg_basebackup pg_config pg_controldata pg_ctl pg_dump 
    pg_dumpall pg_receivexlog pg_resetxlog pg_restore 
    pg_standby pg_test_fsync pg_test_timing pg_upgrade 
    pgbench postgres postmaster psql reindexdb vacuumdb 
    vacuumlo
    

    PROJ.4:

    cs2cs geod invgeod invproj nad2bin proj
    

    GDAL:

    gdal_contour gdal_grid gdal_rasterize gdal_translate 
    gdaladdo gdalbuildvrt gdaldem gdalenhance gdalinfo 
    gdallocationinfo gdalmanage gdalserver gdalsrsinfo 
    gdaltindex gdaltransform gdalwarp nearblack ogr2ogr 
    ogrinfo ogrtindex testepsg
    

    PostGIS:

    pgsql2shp raster2pgsql shp2pgsql
    

    That is pretty f’ing awesome!!

  • Poor man’s wget

    The wget command is useful, but unfortunately it doesn’t come preinstalled on Mac. Yes, you can install it, but if you build it from source there are a few steps needed to satisfy all the dependencies: start by running ./configure and make on the wget source, then work your way backwards through the missing dependencies until ./configure runs without hiccups.

    Here is how to get a poor man’s wget instead: simply realize that you can use curl -O, unless you’re getting content via https.

    alias wget="curl -O"
    
  • How to set environment variables for single process

    In bash, you simply prefix a command with one or more XXX="YYY" pairs, e.g.

    $ A="B" X="Y" python print_env.py
    ...
    A=B
    X=Y
    

    The code for print_env.py:

    import os
    
    for ev in os.environ:
      print("{0}={1}".format(ev, os.environ[ev]))  # print() call works in Python 3 as well
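
    The same effect can be checked without Python. The sketch below uses a child sh process in place of the script and shows that the prefixed variable is visible to that one process only; the variable names child and parent are just for illustration.

```shell
unset A                                    # make sure A is not already set in this shell
child=$(A="B" sh -c 'printf "%s" "$A"')    # A exists only for this one sh process
parent=${A-unset}                          # expands to "unset" because A was never set here
echo "child sees: $child, parent sees: $parent"
# prints: child sees: B, parent sees: unset
```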
    
  • Playing with GraphViz and MathGenealogy data

    The Mathematics Genealogy Project is a great project (donate online). Sven Köhler from Potsdam, Germany has written a Python script for visualizing the database, which I’m going to try.

    First step is to clone the git repo:

    $ git clone git@github.com:tzwenn/MathGenealogy.git
    

    His instructions are quite simple:

    $ ./genealogy.py --search 38586  # 30 seconds
    $ ./genealogy.py --display 38586 > euler.dot  # 0.1 seconds
    

    The next step is to install a tool such as GraphViz, which is needed to visualize the dot file as a graph. Go to the download page for GraphViz and follow the instructions for your OS.

    This should also install the command-line tool. Now you can visualize Leonhard Euler’s supervisor family tree (direct descendants) like this:

    $ dot euler.dot -Tpng -o euler.png
    

    Looking at the database is easy. Every invocation of ./genealogy.py --search writes to a sqlite3 database file (genealogy.db).

    $ sqlite3 genealogy.db
    

    This opens up a prompt. Have a look at the schema of the database like this:

    sqlite> .schema
    

    And see what is inside the thesis table like this:

    sqlite> select * from thesis;