How to display a Choropleth map in Jupyter Notebook

Here is the code:

%matplotlib inline
import geopandas as gpd
import matplotlib as mpl  # make rcParams available (optional)
mpl.rcParams['figure.dpi']= 144  # increase dpi (optional)
world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
world = world[ != 'Antarctica']  # remove Antarctica (optional)
world['gdp_per_person'] = world.gdp_md_est / world.pop_est
g = world.plot(column='gdp_per_person', cmap='OrRd', scheme='quantiles')
g.set_facecolor('#A8C5DD')  # make the ocean blue (optional)

Here is what the map looks like:


pip install matplotlib
pip install geopandas
pip install pysal  # for scheme option

(Integer) Linear Programming in Python

Step one:

brew install glpk
pip install pulp

Step two:

from pulp import * 
prob = LpProblem("test1", LpMinimize) 
# Variables 
x = LpVariable("x", 0, 4, cat="Integer") 
y = LpVariable("y", -1, 1, cat="Integer") 
z = LpVariable("z", 0, cat="Integer") 
# Objective 
prob += x + 4*y + 9*z 
# Constraints 
prob += x+y <= 5 
prob += x+z >= 10 
prob += -y+z == 7 
# Solution 
for v in prob.variables():
    print, "=", v.varValue 
print "objective=", value(prob.objective)

In the documentation there are further examples, e.g. one to minimise the cost of producing cat food.

How to select top-k items from each group in SQL

Here is an analytical query that you (and I) will often need to do if you work in e-commerce, marketing or similar domain. It answers the question, within each group of items (e.g. partitioned by territory, age groups or something else) what are the top-k items for some utility function over the items (e.g. the number of units sold)?

Assume we have a table, aggregated_sales(id INT, item TEXT, country TEXT, total_sold INT). Here is a sample:

1   Basketball        DK       125
2   Basketball        US       432943
3   Leather football  FO       64773 
4   Leather football  DK       56230
5   Boxing gloves     CU       9812

In SQL, here is how to get the top-3 selling items by country:

SELECT id, item, country, total_sold
    row_number() OVER (PARTITION BY country ORDER BY total_sold DESC) as rn
  FROM aggregated_sales
) t
WHERE rn = 1;

Bonus info: the table sample was generated from a CSV file with the command column -s , -t filename.csv.