Python script for geocoding a text file

Assume you have a text file of locations, with one location per line.

For example, here are some school names in Copenhagen, Denmark, stored in schools.csv:

Hyltebjerg Skole
Heibergskolen
Ellebjerg Skole
Katrinedals Skole
Peder Lykke Skolen
Amager Fælled Skole
Tingbjerg Heldagsskole
Øster Farimagsgades Skole
Sankt Annæ Gymnasiums Grundskole
Lykkebo Skole
Randersgades Skole
Strandvejsskolen
Sortedamskolen
Grøndalsvængets Skole
Sølvgades Skole
Skolen ved Sundet
Hanssted Skole
Holbergskolen
Den Classenske Legatskole
Tove Ditlevsens Skole
Lergravsparkens Skole
Vigerslev Allés Skole
Bavnehøj Skole
Ålholm Skole
Langelinieskolen
Guldberg Skole
Husum Skole
Nyboder Skole
Vanløse Skole
Kirkebjerg Skole
Christianshavns Skole
Bellahøj Skole
Kildevældsskolen
Korsager Skole
Nørrebro Park Skole
Utterslev Skole
Skolen på Islands Brygge
Brønshøj Skole
Kirsebærhavens Skole
Rødkilde Skole
Vesterbro Ny Skole
Blågård Skole
Sønderbro Skole
Højdevangens Skole
Oehlenschlægersgades Skole
Vibenshus Skole
Valby Skole
Rådmandsgades Skole
Lundehusskolen
Tagensbo Skole

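Here is a script, geocode.py, that attempts to geocode each location read from standard input. It prints delimiter-separated output to stdout (pipe-separated by default, rather than comma-separated) with the fields input_line, input_line_no, result_no, place, latitude, longitude. A location can produce several rows, one per geocoding result; a location with no result produces a single row with empty place, latitude and longitude: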

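from geopy import geocoders
import sys
import time

# Google's geocoding service may require an API key; geopy accepts one
# as geocoders.GoogleV3(api_key=...).
geocoder = geocoders.GoogleV3()

# Tab also works as a separator. Comma is a poor choice, since the place
# will most likely contain a comma.
SEPARATOR = '|'
# Placeholder result, printed when the geocoder finds nothing.
dummy = ['', ['', '']]

header = ['input_line', 'input_line_no', 'result_no', 'place', 'latitude', 'longitude']
print(SEPARATOR.join(header))

for i, line in enumerate(sys.stdin):
    line = line.strip()
    # geocode() returns None when there is no match, so fall back to the dummy.
    results = geocoder.geocode(line, exactly_one=False) or [dummy]
    for j, res in enumerate(results):
        place = res[0]
        lat = str(res[1][0])
        lon = str(res[1][1])
        out = SEPARATOR.join([line, str(i), str(j), place, lat, lon])
        print(out)
    # Brief pause between requests to avoid hammering the service.
    time.sleep(0.05)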
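
The separator comment in the script is about a quoting problem: joining fields with a plain comma breaks as soon as a place name contains one. If you do want real comma-separated output, one option is Python's standard csv module, which quotes such fields for you. The following is only a sketch of that idea, not part of the original script; format_row is a hypothetical helper name, and the address and coordinates in the example row are made-up placeholders.

```python
import csv
import io


def format_row(fields, delimiter=","):
    """Return one delimited line, quoting any field that contains the delimiter."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    writer.writerow(fields)
    return buf.getvalue().rstrip("\n")


# Placeholder row: the place and coordinates are invented for illustration.
row = ["Valby Skole", 0, 0, "Example Street 1, Valby, Denmark", 0.0, 0.0]
print(format_row(row))
# prints: Valby Skole,0,0,"Example Street 1, Valby, Denmark",0.0,0.0
```

In geocode.py this would replace the SEPARATOR.join(...) calls; a reader using csv.reader gets the original fields back, commas included.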

Here is how you might use the script:

cat schools.csv | python geocode.py 
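
Since a location can yield several rows, you will often want to post-process the output. As a rough sketch, assuming the output was produced with the default '|' separator and the header shown above, the following keeps only the first result (result_no 0) for each input line and skips rows where nothing was found. The function name first_results and the sample rows are hypothetical; the coordinates are placeholders, not real geocoding results.

```python
import csv
import io


def first_results(stream, delimiter="|"):
    """Map each input_line to the (latitude, longitude) of its first result."""
    best = {}
    for row in csv.DictReader(stream, delimiter=delimiter):
        # Rows with an empty place are the "no result" placeholders.
        if row["result_no"] == "0" and row["place"]:
            best[row["input_line"]] = (float(row["latitude"]), float(row["longitude"]))
    return best


# Made-up sample in the same shape as the output of geocode.py.
sample = io.StringIO(
    "input_line|input_line_no|result_no|place|latitude|longitude\n"
    "School A|0|0|Place A, Denmark|1.5|2.5\n"
    "School A|0|1|Place A2, Denmark|3.5|4.5\n"
    "School B|1|0|||\n"
)
print(first_results(sample))
# prints: {'School A': (1.5, 2.5)}
```

To run this on real output, redirect the script's stdout to a file (for example by appending > geocoded.txt to the command above, where geocoded.txt is an arbitrary file name) and pass the opened file to first_results.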

Tip: you might want to
