Talks

Open Source Bayesian Inference in Python with PyMC3

FOSSCON, August 25, 2017

Slides

Jupyter Notebook

Abstract: In the last ten years, there have been a number of advancements in the study of Hamiltonian Monte Carlo algorithms that have enabled effective Bayesian statistical computation for much more complicated models than were previously feasible. These algorithmic advancements have been accompanied by a number of open source probabilistic programming packages that make them accessible to programmers and statisticians. PyMC3 is one such package written in Python and supported by NumFOCUS. This workshop will give an introduction to probabilistic programming with PyMC3. No preexisting knowledge of Bayesian statistics is necessary; a working knowledge of Python will be helpful.

Empowering Marketers with Bionic AI

phlAI, August 15, 2017

With David Brussin, Founder & Chief Product Officer, Monetate

Abstract: Marketers have for many years worked to use data to improve the business outcomes from the experiences they deliver. Statistical discipline, and then AI, have markedly improved the ability to drive these improvements. As we have entered what Forrester calls ‘the age of the customer,’ customer expectations have in some ways begun to exceed competitive pressures in marketing, leading to a desire to align business outcomes more directly with customer outcomes. In this talk, we will focus on the use of AI in empowering marketers to provide each of their individual customers with better experiences. AI has been previously used to automate actions taken by humans, often enabling new scale. Solutions that replace human creative input altogether are frequently imagined, but hardly imminent. We will survey, from marketer, customer, and data scientist perspectives, this progression in marketing, resulting in new ‘bionic’ techniques that combine marketer creativity with machine-driven scale.


Probabilistic Programming in Python with PyMC3

ODSC East 2017, May 5, 2017

Slides

Jupyter Notebook

Abstract: Probabilistic programming is a paradigm in which the programmer specifies a generative probability model for observed data and the language/software library infers the distributions of unobserved quantities. By separating model specification from inference, probabilistic programming allows the modeler to “tell the story” of how the data were generated and then perform inference without explicitly developing an inference algorithm. This separation makes inference more accessible for many complex models. PyMC3 is a Python package for probabilistic programming built on top of Theano that provides advanced sampling and variational inference algorithms and is undergoing rapid development. This talk will give an introduction to probabilistic programming using PyMC3 and will conclude with a brief overview of the wider probabilistic programming ecosystem.
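
The separation of model specification from inference described in this abstract can be illustrated with a small pure-Python sketch. It does not use PyMC3 or reproduce code from the talk: the coin-flip data, the function names `log_joint` and `grid_posterior`, and the use of a simple grid approximation in place of a real sampler are all assumptions made for illustration.

```python
import math


def log_joint(p, heads, flips):
    """Model specification: Uniform(0, 1) prior on p, Bernoulli likelihood."""
    log_prior = 0.0  # the Uniform(0, 1) density is 1 on the unit interval
    log_likelihood = heads * math.log(p) + (flips - heads) * math.log(1.0 - p)
    return log_prior + log_likelihood


def grid_posterior(log_density, n_grid=1000):
    """Generic inference: normalize any 1-D log density on (0, 1) over a grid."""
    grid = [(i + 0.5) / n_grid for i in range(n_grid)]  # midpoints avoid log(0)
    log_weights = [log_density(p) for p in grid]
    max_log = max(log_weights)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - max_log) for lw in log_weights]
    total = sum(weights)
    return grid, [w / total for w in weights]


# Hypothetical data: 7 heads in 10 flips.
heads, flips = 7, 10
grid, probs = grid_posterior(lambda p: log_joint(p, heads, flips))
posterior_mean = sum(p * w for p, w in zip(grid, probs))
print(posterior_mean)  # close to 2/3, the mean of the exact Beta(8, 4) posterior
```

Only `log_joint` knows anything about the model; `grid_posterior` would work unchanged for any other one-parameter model on the unit interval. A probabilistic programming library offers the same division of labor, with far more capable inference algorithms.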


Variational Inference in Python

PyData DC 2016, October 8, 2016

Slides

Jupyter Notebook

Abstract: Bayesian inference has proven to be a valuable approach to many machine learning problems. Unfortunately, many interesting Bayesian models do not have closed-form posterior distributions. Simulation via the family of Markov chain Monte Carlo (MCMC) algorithms is the most popular method of approximating the posterior distribution for these analytically intractable models. While these algorithms (appropriately used) are guaranteed to converge to the posterior given sufficient time, they are often difficult to scale to large data sets and hard to parallelize. Variational inference is an alternative approach that approximates intractable posteriors through optimization instead of simulation. By restricting the class of approximating distributions, variational inference allows control of the tradeoff between computational complexity and accuracy of the approximation. Variational inference can also take advantage of stochastic and parallel optimization algorithms to scale to large data sets. One drawback of variational inference is that in its most basic form, it can require a lot of model-specific manual calculations. Recent mathematical advances in black box variational inference (BBVI) and automatic differentiation variational inference (ADVI) along with advances in open source computational frameworks such as Theano and TensorFlow have made variational inference more accessible to non-specialists. This talk will begin with an introduction to variational inference, BBVI, and ADVI, then illustrate some of the software packages (PyMC3 and Edward) that make these variational inference algorithms available to Python developers.
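
The idea of approximating a posterior through optimization can be sketched in a few lines of pure Python. This is not the talk's code and uses neither PyMC3 nor Edward. It assumes a toy model (a Normal(0, 10) prior on a mean `theta`, with unit-variance normal observations), restricts the approximating family to a single Gaussian, and climbs a Monte Carlo estimate of the evidence lower bound (ELBO) using reparameterized draws. The data, the function name `fit_gaussian_vi`, and all tuning constants are hypothetical.

```python
import math
import random


def fit_gaussian_vi(data, prior_sd=10.0, n_draws=1000, n_steps=500, lr=0.01, seed=0):
    """Fit q(theta) = Normal(mu, sigma) by gradient ascent on a Monte Carlo ELBO."""
    n, total = len(data), sum(data)

    def grad_log_joint(theta):
        # d/dtheta of log Normal(theta | 0, prior_sd) + sum_i log Normal(y_i | theta, 1)
        return (total - n * theta) - theta / prior_sd ** 2

    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]  # fixed base draws
    mu, rho = 0.0, 0.0  # rho = log(sigma) keeps sigma positive
    for _ in range(n_steps):
        sigma = math.exp(rho)
        # Reparameterization: theta = mu + sigma * eps with eps ~ Normal(0, 1)
        grads = [grad_log_joint(mu + sigma * e) for e in eps]
        grad_mu = sum(grads) / n_draws
        # The "+ 1.0" is the gradient of the Gaussian entropy with respect to rho
        grad_rho = sigma * sum(g * e for g, e in zip(grads, eps)) / n_draws + 1.0
        mu += lr * grad_mu
        rho += lr * grad_rho
    return mu, math.exp(rho)


# Hypothetical observations.
data = [1.1, 0.4, 1.7, 0.9, 1.3, 0.6, 1.0, 1.5, 0.8, 1.2]
vi_mean, vi_sd = fit_gaussian_vi(data)

# Exact conjugate posterior, available here only because the model is so simple.
precision = len(data) + 1.0 / 10.0 ** 2
exact_mean, exact_sd = sum(data) / precision, math.sqrt(1.0 / precision)
print(vi_mean, exact_mean)
print(vi_sd, exact_sd)
```

The toy model is conjugate, so the variational fit can be checked against the exact posterior. The gradient of the log joint is written by hand here; automating that step with a framework's automatic differentiation is the kind of convenience the abstract attributes to ADVI.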


An Introduction to Probabilistic Programming

DataPhilly Meetup, July 13, 2016

Slides

Jupyter Notebook

Abstract: Probabilistic programming is a paradigm in which the programmer specifies a generative probability model for observed data and the language/software library infers the (approximate) values/distributions of unobserved parameters. By separating the task of model specification from inference, probabilistic programming allows the modeler to “tell the story” of how the data were generated without explicitly developing an inference algorithm. This separation makes inference in many complex models more accessible.

This talk will give an introduction to probabilistic programming in Python using PyMC3 and will also give a brief overview of the wider probabilistic programming ecosystem.


Bayesian Optimization with Gaussian Processes

DataPhilly Meetup, February 18, 2016

Slides

Jupyter Notebook

Abstract: Bayesian optimization is a technique for finding the extrema of functions that are expensive, difficult, or time-consuming to evaluate. Its applications include tuning the hyperparameters of machine learning models and optimizing the inputs to real-world experiments and processes. This talk will introduce the Gaussian process approach to Bayesian optimization, with sample code in Python.
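
A minimal pure-Python sketch of the Gaussian process approach follows. It is not the talk's sample code. It assumes a one-dimensional search over a grid on [0, 1], a squared-exponential kernel with a fixed length scale, and an upper-confidence-bound acquisition rule chosen for brevity (other acquisition functions, such as expected improvement, are common). The objective `expensive_function` and every constant are hypothetical.

```python
import math


def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel with unit prior variance."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def cholesky_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, cholesky_solve(L, ys)))
    var = rbf(x_new, x_new) - sum(
        k * w for k, w in zip(k_star, cholesky_solve(L, k_star)))
    return mean, math.sqrt(max(var, 0.0))


def bayes_opt(objective, n_iter=8, kappa=2.0):
    """Maximize objective on [0, 1] with an upper-confidence-bound rule."""
    grid = [i / 100 for i in range(101)]
    xs = [0.0, 0.5, 1.0]  # initial design
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        candidates = [x for x in grid if x not in xs]

        def ucb(x):
            mean, sd = gp_posterior(xs, ys, x)
            return mean + kappa * sd  # favor high predictions and high uncertainty

        x_next = max(candidates, key=ucb)
        xs.append(x_next)
        ys.append(objective(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]


def expensive_function(x):
    """Stand-in for a costly black-box objective; its maximum is at x = 0.7."""
    return -(x - 0.7) ** 2


best_x, best_y = bayes_opt(expensive_function)
print(best_x, best_y)
```

Each iteration refits the Gaussian process to all evaluations so far and spends the next evaluation where the model is either optimistic or uncertain, which is how the method limits the number of calls to the expensive function.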