-
(Tentative)
The symbiosis between humans and AI could transform the human being into a rational organism (cf. Daniel Kahneman, who has shown that humans on their own are not rational organisms). How so? Our behavior is increasingly tracked in minute detail across all the important areas of life. Artificial intelligence keeps getting better at estimating whether we […]
-
How to select top-k items from each group in SQL
Here is an analytical query that you (and I) will often need if you work in e-commerce, marketing or a similar domain. It answers the question: within each group of items (e.g. partitioned by territory, age group or something else), what are the top-k items according to some utility function over the items (e.g. the […]
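One common way to answer this kind of question is a window function: number the rows within each group by descending score and keep the first k. The sketch below is only an illustration, not necessarily the query from the post. It runs the SQL through Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer), and the table name `items`, its columns and the sample data are all made up for the example.

```python
import sqlite3


def top_k_per_group(rows, k):
    """Return the k highest-scoring items in each group, best first.

    rows is a list of (grp, item, score) tuples. The table and column
    names are hypothetical and exist only for this illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (grp TEXT, item TEXT, score REAL)")
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    query = """
        SELECT grp, item, score
        FROM (
            SELECT grp, item, score,
                   -- rank rows inside each group; ties broken by item name
                   ROW_NUMBER() OVER (
                       PARTITION BY grp ORDER BY score DESC, item
                   ) AS rn
            FROM items
        )
        WHERE rn <= ?
        ORDER BY grp, rn
    """
    result = conn.execute(query, (k,)).fetchall()
    conn.close()
    return result


# Made-up sales figures per territory.
sales = [
    ("DK", "apples", 30.0),
    ("DK", "pears", 50.0),
    ("DK", "plums", 10.0),
    ("US", "apples", 70.0),
    ("US", "pears", 20.0),
    ("US", "plums", 40.0),
]
top2 = top_k_per_group(sales, 2)
for row in top2:
    print(row)
```

ROW_NUMBER() returns exactly k rows per group even when scores tie. If tied items should all be kept, RANK() or DENSE_RANK() could be swapped in instead.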
-
How to get structured Wikipedia data via DBPedia
Wikipedia contains a wealth of knowledge. While some of that knowledge consists of natural language descriptions, a rich share of information on Wikipedia is encoded in machine-readable format, such as “infoboxes” and other specially formatted parts. An infobox is rendered as a table that you typically see on the right-hand side of an article. While […]
-
How to compute the pagerank of almost anything
Whenever two things have a directional relationship to each other, you can compute the pagerank of those things. For example, you can observe a directional relationship between web pages that link to each other, scientists that cite each other, and chess players that beat each other. The relationship is directional because it matters in […]
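To make the idea concrete, here is a minimal power-iteration sketch in plain Python. It is an illustration under stated assumptions rather than the code from the post: the function name, the damping factor of 0.85, the fixed iteration count and the chess results are all made up, and each edge is assumed to point from the loser to the winner so that winning earns rank.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Compute pagerank scores for a list of directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over all nodes.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly,
        # so the scores keep summing to 1.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n
            for node in nodes
        }
        # Every node passes its rank on in equal shares along its edges.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical results: bob lost to alice, carol lost to alice and to bob.
games = [("bob", "alice"), ("carol", "alice"), ("carol", "bob")]
ranks = pagerank(games)
for player in sorted(ranks, key=ranks.get, reverse=True):
    print(player, round(ranks[player], 3))
```

The same function applies to links between web pages or citations between scientists; only the meaning of an edge changes.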
-
Quick introduction to RabbitMQ and Celery
I like to code in Python. I also like the concept of asynchronous workers to build loosely coupled applications. Luckily, RabbitMQ and Celery can help me do exactly that. This post is based on a very nice YouTube video by Max Mautner (the one below). For easy repeatability, I have transcribed the video in this […]
-
How to Become a Web Scraping Pro with Python pt. 1
Scrapy is an excellent Python library for web scraping. For example, you could create an API with data that is populated via web scraping. This article covers some basic scrapy features, such as the shell and selectors. Install scrapy in a virtual environment on your machine:

    $ virtualenv venv
    $ source venv/bin/activate
    $ pip install scrapy

[…]
-
How to export CSV file to database with Python
Pandas and SQLAlchemy offer powerful conversions between CSV files and tables in databases. Here is a small example:

    import pandas as pd
    from sqlalchemy import create_engine

    df = pd.read_csv('mydata.csv')
    engine = create_engine('sqlite:///mydata.db')
    df.to_sql('mytable', engine)

Read more: pandas.DataFrame.to_sql, sqlalchemy engines
-
How to use non-default profile in boto3
Given an AWS credentials file that looks like this:

    [default]
    aws_access_key_id = DEFAULT
    aws_secret_access_key = SECRET1

    [dev]
    aws_access_key_id = DEV
    aws_secret_access_key = SECRET2

    [prod]
    aws_access_key_id = PROD
    aws_secret_access_key = SECRET3

You can use any profile, say dev, like this in Python:

    import boto3.session

    dev = boto3.session.Session(profile_name='dev')
    s3 = dev.resource('s3')
    for bucket in s3.buckets.all():
        print(bucket.name)
    print('')
-
What kind of Machine Learning person are you?
You may ask yourself, if I’m a machine learning person then what kind am I? See for yourself in Jason Eisner’s Three Cultures of Machine Learning.
-
How to work with spatial data in Amazon Redshift
While Redshift does not offer native support for spatial data, indexes and functions, there exists a partial workaround. Redshift supports Python UDFs and can also load custom Python libraries. Out of the box, Redshift has numpy, scipy, pandas and many other useful Python libraries. For spatial functionality, one saving grace is the high quality spatial […]