# NumPy

This section is devoted to NumPy tricks.

## Sliding window (1D)

I couldn't find a simple sliding window function in NumPy that also supports a stride (NumPy 1.20+ has `np.lib.stride_tricks.sliding_window_view`, but it takes no stride argument), so I implemented this one:

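```
import numpy as np

def sliding_1d(a, size, stride=1):
    # Start index of the last window
    last_i = len(a) - size
    # Number of windows; the last window must end exactly at the end of the array
    num_seq = (last_i / stride) + 1
    assert num_seq == np.round(num_seq)
    # Row k holds the indices stride*k, ..., stride*k + size - 1
    idx = np.arange(size)[None, :] + stride * np.arange(int(num_seq))[:, None]
    return a[idx]
```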

Use it like this:

```
a = np.arange(10)
# a = array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
sliding_1d(a, 2, stride=2)
# array([[0, 1],
#        [2, 3],
#        [4, 5],
#        [6, 7],
#        [8, 9]])
```
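For comparison, here is a minimal sketch of the same result using NumPy's built-in `sliding_window_view`, assuming NumPy 1.20 or newer. The built-in only produces step-1 windows, so the stride is emulated here by slicing the result; the variable names are just for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(10)
size, stride = 2, 2

# All windows of length `size` (step 1), then keep every `stride`-th window.
windows = sliding_window_view(a, size)[::stride]
print(windows)
```

Two differences worth keeping in mind: the result is a read-only view rather than a copy, and nothing checks that the last window ends exactly at the end of the array, so leftover elements are silently dropped.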

# Pandas

This section is devoted to Pandas tricks.

## Basics

Imports:

```
import pandas as pd
import numpy as np
```

Shuffle rows:

`df.reindex(np.random.permutation(df.index))`

Make index incremental again (e.g. after a shuffle):

`df.reset_index(drop=True)`
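The two one-liners above can be combined into a reproducible shuffle. This is a small sketch on a made-up dataframe; the seed value and the column name `x` are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [10, 20, 30, 40]})

# Seeded generator so the shuffle is reproducible (the seed itself is arbitrary).
rng = np.random.default_rng(0)
shuffled = df.reindex(rng.permutation(df.index.to_numpy()))

# Alternative: sample all rows without replacement, then rebuild the index.
shuffled2 = df.sample(frac=1, random_state=0).reset_index(drop=True)
```

`shuffled` keeps the original index labels in their shuffled order, while `shuffled2` has a fresh 0..n-1 index.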

Drop rows that contain NaN:

`df.dropna()`
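By default `dropna()` removes a row if any of its values is NaN. A short sketch of the `how` and `subset` parameters, on a made-up dataframe, in case that default is too aggressive:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

any_nan = df.dropna()             # drop rows with at least one NaN
all_nan = df.dropna(how="all")    # drop rows only if every value is NaN
subset = df.dropna(subset=["a"])  # only look at column "a"
```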

Split dataframe in half:

`df1,df2 = np.array_split(df, 2)`
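An alternative that stays within pandas indexing is to slice with `iloc`. This is only a sketch on a made-up dataframe; like `np.array_split`, it gives the first half the extra row when the number of rows is odd.

```python
import pandas as pd

df = pd.DataFrame({"x": range(5)})

# Ceiling division: the first half gets the extra row for odd lengths.
half = (len(df) + 1) // 2
df1, df2 = df.iloc[:half], df.iloc[half:]
```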

## Restructuring

From matrix to i,j,value:

```
from string import ascii_lowercase
import numpy as np
import pandas as pd

# Create nrows x ncols dataframe
nrows = 5
ncols = 3
a = np.random.rand(nrows, ncols)
df = pd.DataFrame(a)
df.columns = list(ascii_lowercase)[0:ncols]
df.index = list(ascii_lowercase.upper())[0:nrows]

# Restructure to i,j,value dataframe
df.T.unstack().reset_index(name='value')

# Note that I use .T because I like row-by-row enumeration
```
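To make the result easier to check, here is a small deterministic sketch of the same restructuring, followed by the reverse step with `pivot`. The column names `i` and `j` are my own labels for illustration; by default `reset_index` names them `level_0` and `level_1`.

```python
import numpy as np
import pandas as pd

# 2 x 3 matrix with known values, rows A-B and columns a-c.
df = pd.DataFrame(np.arange(6).reshape(2, 3),
                  index=["A", "B"], columns=["a", "b", "c"])

# Matrix -> i,j,value (row-by-row, thanks to .T)
long = df.T.unstack().reset_index(name="value")
long.columns = ["i", "j", "value"]

# i,j,value -> matrix again
wide = long.pivot(index="i", columns="j", values="value")
```

`long` enumerates the cells as (A, a), (A, b), (A, c), (B, a), and so on, and `wide` holds the same values as the original `df`.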

## Merging

Basic merge on shared columns (inner, outer, left, right):

```
df1 = pd.DataFrame({'A': [1,2,3], 'B': [1,2,3]})
"""
   A  B
0  1  1
1  2  2
2  3  3
"""

df2 = pd.DataFrame({'A': [3,4,5], 'C': [1,2,3]})
"""
   A  C
0  3  1
1  4  2
2  5  3
"""

df1.merge(df2)  # inner (default)
"""
   A  B  C
0  3  3  1
"""

# Columns that receive NaN are upcast to float
df1.merge(df2, how='outer')
"""
   A    B    C
0  1  1.0  NaN
1  2  2.0  NaN
2  3  3.0  1.0
3  4  NaN  2.0
4  5  NaN  3.0
"""

df1.merge(df2, how='left')
"""
   A  B    C
0  1  1  NaN
1  2  2  NaN
2  3  3  1.0
"""

df1.merge(df2, how='right')
"""
   A    B  C
0  3  3.0  1
1  4  NaN  2
2  5  NaN  3
"""
```
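When it is unclear where each row of a merge came from, the `indicator` parameter of `merge` can help. A minimal sketch on the same two dataframes, with the join column spelled out via `on` instead of relying on the shared column name:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]})
df2 = pd.DataFrame({"A": [3, 4, 5], "C": [1, 2, 3]})

# indicator=True adds a `_merge` column: left_only, right_only or both.
merged = df1.merge(df2, on="A", how="outer", indicator=True)
print(merged)
```

Here the rows with `A` equal to 1 and 2 are marked `left_only`, 3 is `both`, and 4 and 5 are `right_only`.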
