Optimizing Millions of Hyperparameters by Implicit Differentiation

This repository is an implementation of Optimizing Millions of Hyperparameters by Implicit Differentiation (Lorraine, Vicol, and Duvenaud, AISTATS 2020).
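The method computes hypergradients via the implicit function theorem, approximating the inverse training Hessian with a truncated Neumann series so that only Hessian-vector products are needed. Below is a minimal PyTorch sketch of that computation; the function name and interface are illustrative, not this repository's actual API.

import torch

def neumann_hypergradient(train_loss, val_loss, params, hparams, steps=3, alpha=0.1):
    # dLval/dw: the vector whose inverse-Hessian product we need
    v = torch.autograd.grad(val_loss, params, retain_graph=True)
    # dLtrain/dw with a graph, so Hessian-vector products cost one more backward
    dtrain = torch.autograd.grad(train_loss, params, create_graph=True)
    p = [vi.clone() for vi in v]
    acc = [vi.clone() for vi in v]          # i = 0 term of the Neumann series
    for _ in range(steps):
        hvp = torch.autograd.grad(dtrain, params, grad_outputs=p, retain_graph=True)
        p = [pi - alpha * hi for pi, hi in zip(p, hvp)]   # (I - alpha*H) p
        acc = [ai + pi for ai, pi in zip(acc, p)]
    ihvp = [alpha * ai for ai in acc]       # approx H^{-1} v
    # Mixed partial d/dlambda of (dLtrain/dw . ihvp), negated per the IFT
    hyper = torch.autograd.grad(dtrain, hparams, grad_outputs=ihvp)
    return [-g for g in hyper]

The paper relates a k-term Neumann approximation to differentiating through k steps of unrolled optimization, which is why more terms buy a better hypergradient at the cost of extra Hessian-vector products.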

Running Experiments

Setup Environment

Create a Python 3.7 environment and install required packages:

conda create -n ift-env python=3.7
source activate ift-env
conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
pip install -r requirements.txt

Install JupyterLab:

conda install -c conda-forge jupyterlab

Simple test

Run the following test to verify the environment is set up correctly:

mnist_test.py

python mnist_test.py
  --datasize <training set size>
  --valsize <validation set size>
  --lrh <hyperparameter learning rate (must be negative)>
  --epochs <minimum number of epochs for training the model>
  --hepochs <number of hyperparameter update iterations>
  --l2 <initial log weight decay>
  --restart <whether to reinitialize model weights after each hyperparameter update>
  --model <cnn for a LeNet-like model, mlp for logistic regression and MLP>
  --dataset <CIFAR10 or MNIST>
  --num_layers <number of hidden layers for the mlp>
  --hessian <KFAC: K-FAC estimate; direct: true Hessian and its inverse>
  --jacobian <direct: true Jacobian; product: use d_L/d_theta * d_L/d_lambda>

After each hyperparameter update, the trained model is saved to the folder defined at line 627 of mnist_test.py. To use conjugate gradient (CG) to compute the inverse Hessian, change the hyperparameter updater at line 660.
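For reference, here is a minimal sketch of the conjugate-gradient solve such an updater performs: it approximates the inverse-Hessian-vector product H^{-1} v without ever forming H, assuming an hvp_fn that returns H @ x via double backprop (as in the sketch above; all names here are illustrative).

import torch

def conjugate_gradient(hvp_fn, v, iters=10, tol=1e-10):
    # Solve H x = v with plain CG; H is only accessed through hvp_fn
    x = torch.zeros_like(v)
    r = v.clone()              # residual v - H @ x (x starts at 0)
    p = v.clone()              # search direction
    rs = r.dot(r)
    for _ in range(iters):
        Hp = hvp_fn(p)
        step = rs / p.dot(Hp)
        x = x + step * p
        r = r - step * Hp
        rs_new = r.dot(r)
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

An example invocation of mnist_test.py: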

python mnist_test.py --datasize 40000 --valsize 10000 --lrh 0.01 --epochs=100 --hepochs=10 --l2=1e-5 --restart=10 --model=mlp --dataset=MNIST --num_layers=1 --hessian=KFAC --jacobian=direct
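Since --l2 is an initial log weight decay, the regularizer it parameterizes looks like the following sketch (the function name is mine; the log-space assumption follows the flag's description):

import torch

def l2_penalty(params, log_wd):
    # exp keeps the effective decay positive while log_wd ranges freely
    return torch.exp(log_wd) * sum((p ** 2).sum() for p in params)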

Deployment

First, make sure you are on the master node:

ssh <USERNAME>@q.vectorinstitute.ai

Submit a job to the Slurm scheduler:

srun --partition=gpu --gres=gpu:1 --mem=4GB python mnist_test.py

Or, submit a batch of jobs defined by srun_script.sh:

sbatch --array=0-2 srun_script.sh
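Each array task reads its index from the standard SLURM_ARRAY_TASK_ID environment variable and picks its own configuration. A hypothetical Python dispatcher showing the pattern srun_script.sh relies on (the configurations below are made up for illustration):

import os
import subprocess

# One entry per array index passed to sbatch --array=0-2
CONFIGS = [
    {"l2": "1e-5", "model": "mlp"},
    {"l2": "1e-4", "model": "mlp"},
    {"l2": "1e-5", "model": "cnn"},
]

task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
cfg = CONFIGS[task_id]
subprocess.run(
    ["python", "mnist_test.py", f"--l2={cfg['l2']}", f"--model={cfg['model']}"],
    check=True,
)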

View queued jobs for a user:

squeue -u $USERNAME

Cancel jobs for a user:

scancel -u $USERNAME

Cancel a specific job:

scancel $JOBID

Experiments

This section collects the commands for deploying experiments, both with and without Slurm.

To deploy data generation for all of the experiments:

sbatch run_all.sh

Train Data Augmentation Network and/or Loss Reweighting Network

Data Augmentation Network

python train_augment_net2.py --use_augment_net
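The idea is a network whose parameters are treated as hyperparameters: it perturbs training inputs and is tuned on the validation loss. A toy stand-in for the interface (models/unet.py suggests the real network is a U-Net; everything below is illustrative):

import torch
import torch.nn as nn

class AugmentNet(nn.Module):
    def __init__(self, channels=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels + 1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1),
        )

    def forward(self, x):
        # A noise channel drives stochastic augmentation of the input
        noise = torch.randn_like(x[:, :1])
        return x + self.body(torch.cat([x, noise], dim=1))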

Loss Reweighting Network

python train_augment_net2.py --use_reweighting_net --loss_weight_type=softmax
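Loss reweighting follows the same recipe: a small network produces per-example weights, normalized with a softmax when --loss_weight_type=softmax. A minimal sketch, assuming the network scores each example by its loss value (the repository's network may condition on other inputs):

import torch
import torch.nn as nn
import torch.nn.functional as F

class LossReweightNet(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, per_example_losses):
        # Detach so the weights depend on loss values, not their gradients
        scores = self.net(per_example_losses.detach().unsqueeze(1)).squeeze(1)
        return F.softmax(scores, dim=0)   # weights sum to 1 over the batch

# usage: weights = net(losses); weighted_loss = (weights * losses).sum()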

Regularization Experiments

LSTM Experiments

The LSTM code in this repository is built on the AWD-LSTM codebase. These commands should be run from inside the rnn folder.

First, download the PTB dataset by running:

./getdata.sh

Tune LSTM hyperparameters with 1-step unrolling

python train.py
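1-step unrolling approximates the hypergradient by differentiating the validation loss through a single differentiable training step. A self-contained sketch of that computation (function names are mine, not train.py's):

import torch

def one_step_unrolled_hypergrad(train_loss_fn, val_loss_fn, params, hparams, lr=0.1):
    # Differentiable SGD step on the training loss
    grads = torch.autograd.grad(train_loss_fn(params, hparams), params, create_graph=True)
    new_params = [p - lr * g for p, g in zip(params, grads)]
    # Backprop the validation loss through that step into the hyperparameters
    return torch.autograd.grad(val_loss_fn(new_params), hparams)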

STN Comparison

To train an STN, run the following command from inside the stn folder:

python hypertrain.py --tune_all --save
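STNs (Self-Tuning Networks) make layer weights a function of the current hyperparameters, so the validation loss is differentiable with respect to them; stn/hypermodels implements such layers. A schematic HyperLinear layer, illustrative rather than the repository's actual API:

import torch
import torch.nn as nn

class HyperLinear(nn.Module):
    def __init__(self, in_dim, out_dim, n_hparams):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)            # ordinary weights
        self.delta = nn.Linear(in_dim, out_dim)           # response direction
        self.scale = nn.Linear(n_hparams, 1, bias=False)  # hyperparameter gate

    def forward(self, x, hparams):
        # The output shifts smoothly as the hyperparameters change
        return self.base(x) + self.scale(hparams) * self.delta(x)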

Train a baseline model to get a checkpoint

python train_checkpoint.py --dataset cifar10 --model resnet18 --data_augmentation

Finetune the trained checkpoint

python finetune_checkpoint.py --load_checkpoint=baseline_checkpoints/cifar10_resnet18_sgdm_lr0.1_wd0.0005_aug1.pt --num_finetune_epochs=10 --wdecay=1e-4

Experiment 1

Explain what the experiment does and which figure in the paper it corresponds to.

To run the Python script:

python script.py

To deploy with Slurm:

srun ...

Project Structure

.
├── HAM_dataset.py
├── README.md
├── cutout.py
├── data_loaders.py
├── finetune_checkpoint.py
├── finetune_ift_checkpoint.py
├── grid_search.py
├── images
├── inverse_comparison.py
├── isic_config.py
├── isic_loader.py
├── kfac.py
├── kfac_utils.py
├── minst_ref.py
├── mnist_test.py
├── models
│   ├── __init__.py
│   ├── resnet.py
│   ├── resnet_cifar.py
│   ├── simple_models.py
│   ├── unet.py
│   └── wide_resnet.py
├── papers
│   ├── haoping_project
│   │   ├── main.tex
│   │   ├── neurips2019.tex
│   │   ├── neurips_2019.sty
│   │   └── references.bib
│   └── nips
│       ├── main.tex
│       ├── neurips_2019.sty
│       └── references.bib
├── random_search.py
├── requirements.txt
├── rnn
│   ├── config_scripts
│   │   ├── dropoute_ift_no_lrdecay.yaml
│   │   ├── dropouto
│   │   │   ├── dropouto_2layer_lrdecay.yaml
│   │   │   ├── dropouto_2layer_no_lrdecay.yaml
│   │   │   ├── dropouto_ift_lrdecay.yaml
│   │   │   ├── dropouto_ift_neumann_1_lrdecay.yaml
│   │   │   ├── dropouto_ift_neumann_1_no_lrdecay.yaml
│   │   │   ├── dropouto_ift_no_lrdecay.yaml
│   │   │   ├── dropouto_lrdecay.yaml
│   │   │   ├── dropouto_no_lrdecay.yaml
│   │   │   └── dropouto_perparam_ift_no_lrdecay.yaml
│   │   └── wdecay
│   │       ├── ift_wdecay_per_param_no_lrdecay.yaml
│   │       ├── wdecay_ift_lrdecay.yaml
│   │       └── wdecay_ift_neumann_1_lrdecay.yaml
│   ├── create_command_script.py
│   ├── data.py
│   ├── embed_regularize.py
│   ├── getdata.sh
│   ├── locked_dropout.py
│   ├── logger.py
│   ├── model_basic.py
│   ├── plot_utils.py
│   ├── rnn_utils.py
│   ├── run_grid_search.py
│   ├── train.py
│   ├── train2.py
│   └── weight_drop.py
├── search_configs
│   ├── cifar100_wideresnet_bern_dropout_sep.yaml
│   ├── cifar100_wideresnet_gauss_dropout_sep.yaml
│   ├── cifar10_resnet32_data_aug.yaml
│   ├── cifar10_resnet32_grid.yaml
│   ├── cifar10_resnet32_random.yaml
│   ├── cifar10_resnet32_wdecay_per_layer.yaml
│   ├── cifar10_wideresnet_bern_dropout.yaml
│   ├── cifar10_wideresnet_bern_dropout_sep.yaml
│   ├── cifar10_wideresnet_gauss_dropout.yaml
│   ├── cifar10_wideresnet_gauss_dropout_sep.yaml
│   ├── isic_grid.yaml
│   └── isic_random.yaml
├── search_scripts
│   ├── cifar100_wideresnet_bern_dropout_sep
│   ├── cifar100_wideresnet_gauss_dropout_sep
│   ├── cifar100_wideresnet_random
│   ├── cifar10_wideresnet_bern_dropout
│   ├── cifar10_wideresnet_bern_dropout_sep
│   ├── cifar10_wideresnet_gauss_dropout
│   └── cifar10_wideresnet_gauss_dropout_sep
├── srun_script.sh
├── stn
│   ├── datasets
│   │   ├── __init__.py
│   │   ├── cifar.py
│   │   └── loaders.py
│   ├── hypermodels
│   │   ├── __init__.py
│   │   ├── alexnet.py
│   │   ├── hyperconv2d.py
│   │   ├── hyperlinear.py
│   │   └── small.py
│   ├── hypertrain.py
│   ├── models
│   │   ├── __init__.py
│   │   ├── alexnet.py
│   │   └── small.py
│   └── util
│       ├── __init__.py
│       ├── cutout.py
│       ├── dropout.py
│       └── hyperparameter.py
├── train.py
├── train_augment_net2.py
├── train_augment_net_graph.py
├── train_augment_net_multiple.py
├── train_augment_net_slurm.py
├── train_baseline.py
├── train_checkpoint.py
└── utils
    ├── csv_logger.py
    ├── discrete_utils.py
    ├── logger.py
    ├── plot_utils.py
    └── util.py

17 directories, 103 files

Authors
