Jacob Kelly

I'm a Research Engineer at DeepMind. I completed my undergrad in Computer Science, Math, and Stats at the University of Toronto, where I was fortunate to work with Roger Grosse and David Duvenaud at the Vector Institute. My goal is to use machine learning to understand biology. I'm interested in energy-based models, latent variable models, neural ODEs, and genomics.

Previously, I was a Machine Learning Research Intern at Deep Genomics. Before that, I did computational biology research with Benjamin Haibe-Kains at the Princess Margaret Cancer Centre. My hobbies include running, rock climbing, and reading. Feel free to get in touch if you'd like to chat.


Directly Training Joint Energy-Based Models for Conditional Synthesis and Calibrated Prediction of Multi-Attribute Data

Multi-attribute classification generalizes classification, presenting new challenges for making accurate predictions and quantifying uncertainty. We build upon recent work and show that architectures for multi-attribute prediction can be reinterpreted as energy-based models (EBMs). We propose a simple extension that allows us to directly maximize the likelihood of data and labels under the unnormalized joint distribution. Our models are capable of both accurate, calibrated predictions and high-quality conditional synthesis of novel attribute combinations.

Jacob Kelly, Richard Zemel, Will Grathwohl

ICML 2021 Workshop on Uncertainty & Robustness in Deep Learning

pdf | poster | code | bibtex

No MCMC for me: Amortized Sampling for Fast and Stable Training of Energy-Based Models

Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty. In this work, we present a simple method for training EBMs at scale which uses an entropy-regularized generator to amortize the MCMC sampling typically used in EBM training. We apply our estimator to the recently proposed Joint Energy Model (JEM), where we match the original performance with faster and more stable training. This allows us to extend JEM to semi-supervised classification on tabular data from a variety of continuous domains.

Will Grathwohl*, Jacob Kelly*, Milad Hashemi, Mohammad Norouzi, Kevin Swersky, David Duvenaud

International Conference on Learning Representations (ICLR), 2021

pdf | poster | code | bibtex

Learning Differential Equations that are Easy to Solve

Neural ODEs become expensive to solve numerically as training progresses. We introduce a differentiable surrogate for the time cost of standard numerical solvers using higher-order derivatives of solution trajectories. These derivatives are efficient to compute with Taylor-mode automatic differentiation. Optimizing this additional objective trades model performance against the time cost of solving the learned dynamics.

Jacob Kelly*, Jesse Bettencourt*, Matthew James Johnson, David Duvenaud

Neural Information Processing Systems (NeurIPS), 2020

pdf | poster | slides | code | bibtex
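
As a rough illustration of the surrogate described above, one way to write a penalty on higher-order derivatives of the solution trajectory is shown below. The symbols (state z(t), dynamics f, order K, weight λ, task loss L_task) are notation assumed here for the sketch; see the paper for the exact objective.

```latex
% Learned dynamics (assumed notation)
\frac{dz(t)}{dt} = f\bigl(z(t), t, \theta\bigr), \qquad t \in [t_0, t_1]

% Penalty on the K-th total derivative of the trajectory
\mathcal{R}_K(\theta) = \int_{t_0}^{t_1} \left\lVert \frac{d^K z(t)}{dt^K} \right\rVert_2^2 \, dt

% Combined objective: \lambda trades task performance against solver cost
\min_\theta \; \mathcal{L}_{\text{task}}(\theta) + \lambda \, \mathcal{R}_K(\theta)
```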

*Equal contribution.


I was a teaching assistant for the following courses:


Elastic Net Regression to Predict Drug Response from Gene Expression

We develop drug-specific models to detect biomarkers in gene expression data for predicting the area under the drug-dose response curve (AUC) of cell lines for acute myeloid leukemia (AML). We apply our model to two pharmacogenomic datasets: one from immortalized cell lines, the other from ex vivo primary cell lines from patients. We use linear regression with elastic net regularization as our model and compare methods for feature selection.

Jacob Kelly, Arvind Mer, Sisira Nair, Hassan Mahmoud, Benjamin Haibe-Kains
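
For reference, the standard elastic net objective for linear regression can be written as follows. This is the textbook form, not necessarily the exact parameterization used in the project; the symbols (n cell lines, expression vector x_i, response y_i, penalty strength λ, mixing weight α) are assumptions for illustration.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\, \beta}
    \frac{1}{2n} \sum_{i=1}^{n} \bigl( y_i - \beta_0 - x_i^\top \beta \bigr)^2
    + \lambda \Bigl( \alpha \lVert \beta \rVert_1
    + \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \Bigr),
  \qquad \lambda \ge 0, \; 0 \le \alpha \le 1
```

With α = 1 this reduces to the lasso and with α = 0 to ridge regression; the L1 term is what drives coefficients to exactly zero, so the surviving genes act as candidate biomarkers.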


Genomic Sequencing

To learn about genomic sequencing, I implemented Boyer-Moore for performing exact sequence alignment. I also implemented the Z algorithm to efficiently generate indices of the query string needed for the Boyer-Moore algorithm.

Jacob Kelly
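
A minimal sketch of the Z algorithm, not the original project code. It computes the Z array in linear time and, purely for illustration, uses it directly for exact matching rather than as preprocessing for Boyer-Moore. The function names and the separator character are assumptions.

```python
def z_array(s: str) -> list[int]:
    """Z[i] is the length of the longest substring starting at i
    that matches a prefix of s."""
    n = len(s)
    z = [0] * n
    if n == 0:
        return z
    z[0] = n
    left, right = 0, 0  # s[left:right] is the rightmost prefix match found so far
    for i in range(1, n):
        if i < right:
            # Reuse the value computed for the mirrored position in the prefix.
            z[i] = min(right - i, z[i - left])
        # Extend the match by explicit character comparison.
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


def find_all(pattern: str, text: str) -> list[int]:
    """Return every start index of pattern in text (exact matches only)."""
    if not pattern:
        return []
    sep = "\x00"  # assumed not to occur in pattern or text
    z = z_array(pattern + sep + text)
    m = len(pattern)
    return [i - m - 1 for i in range(m + 1, len(z)) if z[i] == m]


print(z_array("aabxaab"))
print(find_all("ACG", "ACGTACGACG"))
```

In Boyer-Moore, the same Z values are typically computed on the reversed pattern to build the good-suffix shift table.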


ICLR 2019 Reproducibility Challenge

We implemented the ICLR 2019 submission (later accepted) "Initialized Equilibrium Propagation for Backprop-Free Training" by O'Connor et al. as part of the ICLR 2019 Reproducibility Challenge.

Matthieu Chan Chee, Jad Ghalayini, Jacob Kelly, Winnie Xu



I implemented a sequence-to-sequence model to learn to sort numbers. I used an encoder-decoder LSTM architecture with a "pointer" attention mechanism developed by Vinyals et al. (Pointer Networks) to exploit the fact that the output of the model is a permutation of its input.

Jacob Kelly

blog | code
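
A minimal sketch of a single pointer-attention step in plain Python, assuming the additive scoring form from Pointer Networks. It is not the project's code: the LSTM encoder and decoder are omitted, and the weight names (`w_enc`, `w_dec`, `v`) and dimensions are hypothetical.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_attention(encoder_states, decoder_state, w_enc, w_dec, v):
    """Return a distribution over input positions.

    score_j = v . tanh(w_enc @ e_j + w_dec @ d), followed by a softmax
    across the input positions j, so the output "points" at an input.
    """
    projected_dec = matvec(w_dec, decoder_state)
    scores = []
    for state in encoder_states:
        projected_enc = matvec(w_enc, state)
        hidden = [math.tanh(a + b) for a, b in zip(projected_enc, projected_dec)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    return softmax(scores)


# Toy example with random (untrained) weights: 3 inputs, hidden size 4.
rng = random.Random(0)
dim = 4
rand_matrix = lambda: [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
w_enc, w_dec = rand_matrix(), rand_matrix()
v = [rng.uniform(-1, 1) for _ in range(dim)]
encoder_states = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(3)]
decoder_state = [rng.uniform(-1, 1) for _ in range(dim)]
probs = pointer_attention(encoder_states, decoder_state, w_enc, w_dec, v)
```

Because the softmax runs over input positions rather than a fixed vocabulary, the output size grows with the input, which is what makes the mechanism a natural fit for emitting permutations.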

Pocket Guide

We developed an Android app for indoor navigation through SickKids hospital using Bluetooth signals from Estimote Beacons. We use the Shortest Path Faster Algorithm (SPFA) on a map of beacon locations to efficiently generate the shortest path from the user's current location to their destination. We estimate the current location using signal strengths from nearby beacons and track it over time using an exponentially weighted average. Developed in summer 2016 at Cossette.

Bruno Almeida, Jacob Kelly, Gabriel Yeung

slides | code
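
A minimal Python sketch of the two techniques mentioned above, SPFA and exponentially weighted averaging. The app itself was written for Android, so this is not its code; the beacon names, distances, and smoothing factor are made up for illustration.

```python
from collections import deque


def spfa(graph, source):
    """Shortest paths from source via queue-based edge relaxation.

    graph maps each node to a list of (neighbor, weight) pairs.
    Assumes no negative-weight cycles (true for physical distances).
    """
    dist = {node: float("inf") for node in graph}
    prev = {node: None for node in graph}
    dist[source] = 0.0
    queue = deque([source])
    in_queue = {source}
    while queue:
        u = queue.popleft()
        in_queue.discard(u)
        for v, weight in graph[u]:
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
                prev[v] = u
                if v not in in_queue:  # only re-queue nodes whose distance improved
                    queue.append(v)
                    in_queue.add(v)
    return dist, prev


def path_to(prev, target):
    """Walk predecessor links back from target (assumed reachable)."""
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def ewma(values, alpha):
    """Exponentially weighted average: new = alpha * x + (1 - alpha) * old."""
    smoothed, current = [], None
    for x in values:
        current = x if current is None else alpha * x + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


# Hypothetical beacon map: edge weights are walking distances in metres.
EXAMPLE_MAP = {
    "lobby": [("hall", 4.0), ("cafe", 10.0)],
    "hall": [("lobby", 4.0), ("cafe", 3.0), ("ward", 8.0)],
    "cafe": [("lobby", 10.0), ("hall", 3.0), ("ward", 2.0)],
    "ward": [("hall", 8.0), ("cafe", 2.0)],
}
dist, prev = spfa(EXAMPLE_MAP, "lobby")
route = path_to(prev, "ward")
```

A larger `alpha` makes the location estimate react faster to new readings at the cost of more jitter; a smaller one smooths noisy signal strengths but lags behind a moving user.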

Alphabetical order.