Deep Learning Tutorial


These examples were run on the ESI HPC cluster, which is why we use esi_cluster_setup() to set up a parallel computing client. They can be reproduced on any other cluster or on a local machine by using slurm_cluster_setup() or local_cluster_setup() instead, respectively.

The following Python code demonstrates how to use ACME to fit deep learning models in parallel with PyTorch in order to find the best model for a dataset. This is a toy example in which we vary the model architecture, namely the number of units in each layer. Nevertheless, the same approach can be used to perform a grid search over any set of parameters. The problem is inspired by a fantastic deep learning course by Mike X. Cohen.

First, we import the necessary packages:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from sklearn.model_selection import train_test_split
import numpy as np
import scipy.stats as stats
import pandas as pd
import matplotlib.pyplot as plt
from IPython import display
from acme import cluster_cleanup, esi_cluster_setup, ParallelMap
import itertools

Getting the Data Ready

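We load the data, z-score the features, binarize the quality ratings, and wrap the train and test splits in PyTorch DataLoaders. Each DataLoader is later passed to ParallelMap along with the model parameters.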

url  = ""
data = pd.read_csv(url,sep=';')
data = data[data['total sulfur dioxide']<200] # drop a few outliers

# z-score all columns except for quality
norm_cols = data.keys().drop('quality')
data[norm_cols] = data[norm_cols].apply(stats.zscore)

# create a new column for binarized (boolean) quality
data['binqual'] = (data['quality'] > 5).astype(int)
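
To make the two preprocessing steps above concrete, here is a minimal sketch on a made-up toy array (the numbers are illustrative only and are not taken from the dataset). It uses plain NumPy and assumes the population standard deviation (ddof=0), which is also the default of scipy.stats.zscore.

```python
import numpy as np

# Toy stand-in (hypothetical values): 5 samples x 2 features, plus integer quality ratings
features = np.array([[7.4, 34.0], [7.8, 67.0], [7.8, 54.0], [11.2, 60.0], [7.4, 40.0]])
quality = np.array([5, 5, 6, 7, 4])

# Column-wise z-score using the population standard deviation (ddof=0)
z = (features - features.mean(axis=0)) / features.std(axis=0)

# Vectorized binarization: 1 for ratings above 5, otherwise 0
binqual = (quality > 5).astype(int)

print(z.round(2))
print(binqual)
```

After z-scoring, every feature column has mean 0 and standard deviation 1, so features measured on very different scales contribute comparably to the gradient updates.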

X_train, X_test, y_train, y_test = train_test_split(
    torch.tensor(data[norm_cols].values).float(),
    torch.tensor(data['binqual'].values).float()[:, None],
    test_size=.1)

# then convert them into PyTorch Datasets (note: already converted to tensors)
trainLoader = DataLoader(TensorDataset(X_train,y_train),batch_size=32,shuffle=True)
testLoader  = DataLoader(TensorDataset(X_test,y_test),batch_size=X_test.shape[0],shuffle=True)
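
The ParallelMap call in this tutorial refers to a function named parallel_model_eval whose definition is not shown in this section. The following is a hypothetical sketch of what such a function might look like, not the original implementation. It assumes that each parameter tuple lists the hidden layer widths of a fully connected network and that the function returns train accuracy, test accuracy and losses per epoch. The ReLU activations, the SGD optimizer, and the n_epochs and lr defaults are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def parallel_model_eval(layer_sizes, train_loader, test_loader, n_epochs=500, lr=0.01):
    """Train one binary classifier and return its learning curves.

    Hypothetical stand-in: architecture, optimizer, n_epochs and lr are assumptions.
    """
    # Number of input features, read from the wrapped TensorDataset
    n_features = train_loader.dataset.tensors[0].shape[1]

    # Build input -> one hidden layer per entry in layer_sizes -> single output logit
    layers, n_in = [], n_features
    for n_units in layer_sizes:
        layers += [nn.Linear(n_in, n_units), nn.ReLU()]
        n_in = n_units
    layers.append(nn.Linear(n_in, 1))
    model = nn.Sequential(*layers)

    loss_fn = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    train_acc, test_acc, losses = [], [], []
    for _ in range(n_epochs):
        model.train()
        batch_acc, batch_loss = [], []
        for X, y in train_loader:
            logits = model(X)
            loss = loss_fn(logits, y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            batch_loss.append(loss.item())
            # A positive logit means predicted class 1; accuracy in percent
            batch_acc.append(((logits > 0) == (y > 0.5)).float().mean().item() * 100)
        # Average the per-batch values to get one number per epoch
        losses.append(sum(batch_loss) / len(batch_loss))
        train_acc.append(sum(batch_acc) / len(batch_acc))

        # Evaluate on the test set (a single batch in this tutorial)
        model.eval()
        with torch.no_grad():
            X, y = next(iter(test_loader))
            test_acc.append(((model(X) > 0) == (y > 0.5)).float().mean().item() * 100)

    return train_acc, test_acc, losses
```

Each call only needs one parameter tuple and the two DataLoaders, so the calls are independent of one another and can be distributed across workers.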

Here we generate the inputs to our parallel function. We vary the number of units per layer over the powers of 2 from 16 to 512 and use every permutation of these six values, which gives 6! = 720 parameter tuples.

# Prepare inputs for parallelization
params = list(itertools.permutations([2**i for i in range(4,10)]))

# set up client
client = esi_cluster_setup(partition="8GBS",n_workers=200)

# compute
with ParallelMap(parallel_model_eval, params, trainLoader, testLoader, n_inputs=len(params), write_worker_results=False) as pmap:
    results = pmap.compute()
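
To get a feel for the size of the input list built above, the following sketch uses only the standard library to count the permutations. For comparison it also builds a small Cartesian grid with itertools.product, which is one way to set up a grid search. The two-layer grid is a hypothetical illustration and is not used elsewhere in this tutorial.

```python
import itertools
import math

layer_sizes = [2**i for i in range(4, 10)]   # [16, 32, 64, 128, 256, 512]

# Permutations: every ordering of the six sizes, each size used exactly once
perms = list(itertools.permutations(layer_sizes))
n_perms = len(perms)                          # 6! = 720

# Hypothetical alternative: a grid in which each of two layers may take any size
grid = list(itertools.product(layer_sizes, repeat=2))
n_grid = len(grid)                            # 6**2 = 36

print(n_perms, perms[0], perms[-1])
print(n_grid, grid[0], grid[-1])
```

Because n_inputs=len(params), the function is called once per tuple, that is 720 times in this example.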


In this example we do not write the results to disk because write_worker_results=False, so pmap.compute() returns them directly. If you want to keep the trained models, or if the output becomes large, it is strongly recommended to write the results to disk instead of collecting them in local memory.

After the computation is done, we can inspect the different outcome parameters that were returned:

  • train set accuracy time courses (as a function of epochs)

  • test set accuracy time courses

  • losses

for i, param in enumerate(params):
    trainAcc,testAcc,losses = results[i]

Which model performed best over the last 50 epochs?

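# index 0 is the train accuracy under the unpacking order above; use model[1] to rank by test accuracy
bestModel = np.argmax([np.mean(model[0][-50:]) for model in results])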
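
The index returned by np.argmax is only useful once it is mapped back to a parameter tuple. The sketch below shows this on synthetic data: the results list is randomly generated and merely mimics the (trainAcc, testAcc, losses) layout used above, so its numbers carry no meaning. Ranking by test accuracy rather than train accuracy is an assumption about what "best" should mean here.

```python
import itertools
import numpy as np

params = list(itertools.permutations([16, 32, 64]))   # 6 toy architectures

# Synthetic stand-in for `results`: one (trainAcc, testAcc, losses) tuple per entry in params
rng = np.random.default_rng(0)
n_epochs = 200
results = [
    (rng.uniform(70, 90, n_epochs),    # fake train accuracy per epoch
     rng.uniform(60, 80, n_epochs),    # fake test accuracy per epoch
     rng.uniform(0.3, 0.7, n_epochs))  # fake loss per epoch
    for _ in params
]

# Mean test accuracy over the last 50 epochs, one score per model
scores = np.array([np.mean(testAcc[-50:]) for _, testAcc, _ in results])
bestModel = int(np.argmax(scores))

# Full ranking from best to worst, in case the runners-up are of interest
ranking = np.argsort(scores)[::-1]

print("best architecture:", params[bestModel])
print("ranking:", [params[i] for i in ranking])
```

Averaging over the last 50 epochs rather than taking the final epoch alone makes the comparison less sensitive to epoch-to-epoch fluctuations.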