Double Machine Learning for Dummies as explained by ChatGPT

I asked ChatGPT to "explain double machine learning for dummies". Here is the answer: Double machine learning (DML) is a statistical method used to estimate treatment effects in the presence of high-dimensional controls. To put it simply, it's a way to answer questions about cause and effect using data and machine learning, even when there …
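To make the idea concrete, here is a minimal sketch of the "partialling-out" flavour of DML with two-fold cross-fitting. Everything in it is an assumption made for illustration: the simulated data, the true effect of 2.0, and the use of plain least-squares lines as the two nuisance models (a real application would normally plug in flexible machine learning models and many controls).

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def residuals(model, x, y):
    """Observed y minus the line's prediction."""
    a, b = model
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical data: x confounds both the treatment d and the outcome y.
rng = random.Random(0)
n, true_effect = 4000, 2.0
x = [rng.gauss(0, 1) for _ in range(n)]
d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [true_effect * di + 1.5 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]

# Cross-fitting: nuisance models are fit on one fold and
# used to residualize the other fold, then the roles swap.
folds = [list(range(0, n, 2)), list(range(1, n, 2))]
res_d, res_y = [], []
for k in (0, 1):
    train, test = folds[1 - k], folds[k]
    model_d = fit_line([x[i] for i in train], [d[i] for i in train])
    model_y = fit_line([x[i] for i in train], [y[i] for i in train])
    res_d += residuals(model_d, [x[i] for i in test], [d[i] for i in test])
    res_y += residuals(model_y, [x[i] for i in test], [y[i] for i in test])

# Final stage: regress outcome residuals on treatment residuals.
theta_hat = sum(rd * ry for rd, ry in zip(res_d, res_y)) / sum(rd * rd for rd in res_d)

# For comparison: regressing y on d directly ignores the confounder.
naive = fit_line(d, y)[1]

print(f"DML estimate:   {theta_hat:.3f}")
print(f"naive estimate: {naive:.3f}")
```

In this toy setup the naive slope absorbs part of the confounder's influence, while the residual-on-residual estimate should land close to the effect used in the simulation.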


How to Draw an Owl

Taken from lecture 1 of the Statistical Rethinking course (around the 44 minute mark). The course material is also on GitHub.

How to draw an "owl" version 1:

1. Create generative simulation (GS)
2. Write an estimator
3. Validate estimator using simulated data
4. Analyze real data: …
5. Reuse 1 to compute hypothetical interventions

How to draw an "owl" version …
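The first three steps of version 1 can be sketched in a few lines. This is only an illustration under assumptions of my own: a linear generative model with Gaussian noise, and a least-squares slope as the estimator (the course itself uses Bayesian estimators). The names `simulate` and `estimate_slope` are hypothetical.

```python
import random

def simulate(n, slope, rng):
    """Step 1: generative simulation with a known slope."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [slope * xi + rng.gauss(0, 1) for xi in x]
    return x, y

def estimate_slope(x, y):
    """Step 2: estimator (least-squares slope of y on x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

# Step 3: validate the estimator on simulated data where the truth is known.
rng = random.Random(1)
errors = []
for true_slope in (-1.0, 0.0, 0.5, 2.0):
    x, y = simulate(2000, true_slope, rng)
    estimate = estimate_slope(x, y)
    errors.append(abs(estimate - true_slope))
    print(f"true={true_slope:+.2f}  estimated={estimate:+.3f}")

max_error = max(errors)
```

Only once the estimator recovers the known slopes would it be applied to real data (step 4).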


How to sort numbers with an evolutionary algorithm (CMA-ES)

Yes, this is clearly nonsense. Sorting is not a hard problem, and standard algorithms such as quicksort and mergesort run in O(n log n) time (quicksort on average, with an O(n^2) worst case). But let me scratch this itch of sorting numbers using an evolutionary algorithm, specifically covariance matrix adaptation evolution strategy (CMA-ES). Technically, we will use what I think is the original …
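To show the general idea without any dependencies, here is a sketch of sorting by evolution. Note that this is not CMA-ES: it is a much simpler (1+1) evolutionary algorithm that mutates the list by swapping two random positions and keeps the child if it has no more inversions than the parent. The function names and the inversion-count fitness are my own choices for illustration.

```python
import random

def inversions(seq):
    """Fitness: number of out-of-order pairs (0 means sorted)."""
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )

def evolve_sort(values, steps=5000, seed=0):
    """(1+1) evolutionary algorithm with swap mutation."""
    rng = random.Random(seed)
    parent = list(values)
    cost = inversions(parent)
    for _ in range(steps):
        if cost == 0:
            break  # already sorted
        child = parent[:]
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
        child_cost = inversions(child)
        if child_cost <= cost:  # keep the child if it is not worse
            parent, cost = child, child_cost
    return parent

data = [5, 2, 9, 1, 7, 3, 8, 6]
result = evolve_sort(data)
print(result)
```

Each fitness evaluation alone costs O(n^2) here, which underlines the point that this is an exercise rather than a practical sorting method.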


How to draw lines on map in Databricks

Imports:

    import plotly.graph_objects as go

Plot:

    fig = go.Figure()

    fig.add_trace(go.Scattermapbox(
        mode="markers+lines",
        lon=[10, 20, 30],
        lat=[10, 15, 30],
        marker={'size': 10}))

    fig.add_trace(go.Scattermapbox(
        mode="markers+lines",
        lon=[-50, -60, 40],
        lat=[30, 10, -20],
        marker={'size': 10}))

    fig.update_layout(
        margin={'l': 0, 't': 0, 'b': 0, 'r': 0},
        mapbox={
            'center': {'lon': 10, 'lat': 10},
            'style': …


How to call an API from PySpark (in workers)

Tested in Databricks

    import pyspark.sql.functions as F
    import requests

    # create dataframe
    pokenumbers = [(i,) for i in range(100)]
    cols = ["pokenum"]
    df_pokenums = spark.createDataFrame(data=pokenumbers, schema=cols)

    # call API
    def get_name(rows):
        # take the first item in list (API doesn't support batch)
        first = rows[0]
        url = f'https://pokeapi.co/api/v2/pokemon-form/{first.pokenum}'
        try:
            resp = requests.get(url)
            name = resp.json()['pokemon']['name']
            …


How to use bnlearn to learn causal structures

This article on causal machine learning covers a practical example of how to learn structural causal models (SCM) directly from data. We will use bnlearn, which is an open-source library for learning the graphical structure of Bayesian networks in Python. Check out my GitHub repo for additional code examples. For other frameworks, check out my page …
