Learning Bayesian Statistics podcast | Listen online for free

209 episodes

#162 Bayesian Hydrology & GPU AI, with Christopher Krapu
2026-07-28 | 1h 4 mins.
Support & Resources
→ Support the show on Patreon
→ Bayesian Modeling Course (first 2 lessons free)

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work

Takeaways:

Q: How does putting a Gaussian process on unknown coordinates fix noisy location data in mineral prospecting?
A: In mining and geostatistics, the classic Gaussian process model, known there as kriging, assumes you know exactly where each sample was taken. Chris’ project broke that assumption on purpose: the recorded coordinates for each core sample were only accurate to within a rough radius. By treating the true locations as latent variables and putting a Gaussian process over them jointly with the measurements, the model could still reconstruct the underlying gold-concentration field, even though the exact sampling locations were never known precisely. It's a demonstration that Gaussian processes can absorb structural uncertainty that looks, at first glance, like it should make the problem impossible.
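To make the idea of latent sampling locations concrete, here is a minimal sketch of the joint log-density such a model could use. It is not Chris' implementation: the isotropic Gaussian prior around each recorded coordinate, the squared-exponential kernel, and every function name and default value below are assumptions chosen for illustration.

```python
import math

def rbf_kernel(points, length_scale, amplitude, noise):
    """Squared-exponential covariance matrix with noise variance on the diagonal."""
    n = len(points)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            cov[i][j] = amplitude ** 2 * math.exp(-0.5 * d2 / length_scale ** 2)
        cov[i][i] += noise ** 2
    return cov

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = math.sqrt(s) if i == j else s / lower[j][j]
    return lower

def gp_log_likelihood(values, points, length_scale=1.0, amplitude=1.0, noise=0.1):
    """Zero-mean GP log-density of the measurements at the given locations."""
    lower = cholesky(rbf_kernel(points, length_scale, amplitude, noise))
    z = []  # forward substitution: solve lower @ z = values
    for i, row in enumerate(lower):
        z.append((values[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(len(values)))
    n = len(values)
    return -0.5 * (sum(v * v for v in z) + log_det + n * math.log(2 * math.pi))

def location_log_prior(true_points, recorded_points, radius):
    """Assumed prior: each true coordinate is Gaussian around the recorded one."""
    total = 0.0
    for true, rec in zip(true_points, recorded_points):
        for t, r in zip(true, rec):
            total += -0.5 * ((t - r) / radius) ** 2
            total -= math.log(radius * math.sqrt(2 * math.pi))
    return total

def log_joint(true_points, recorded_points, values, radius):
    """Latent locations and measurements scored together."""
    return (location_log_prior(true_points, recorded_points, radius)
            + gp_log_likelihood(values, true_points))
```

A sampler would treat `true_points` as unknowns and explore them alongside any kernel hyperparameters, so the uncertainty about where each core was drilled propagates into the reconstructed field.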

Q: What is "Poverty Bayes," and what did it cost to train a two-million-parameter Bayesian model?
A: Poverty Bayes was Chris’ experiment in seeing how cheaply a large Bayesian model could be trained using modern cloud infrastructure. He fit a hierarchical logistic regression with close to two million parameters, using PyMC's Hamiltonian Monte Carlo on a single A100 GPU rented through Modal, a serverless platform that deploys a Python script straight to GPU hardware with almost no setup. He'd originally guessed it would cost around five dollars, the price of a Big Mac, but the real bill came in an order of magnitude lower. A model that would take a Gibbs sampler weeks to run, and that once required a research lab's dedicated GPU, now costs pocket change and a few minutes of setup.
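For readers who want to see what "hierarchical logistic regression" means as a function a sampler can evaluate, the following is a small sketch of an unnormalised log-posterior with one intercept per group. The priors, the function names, and the data layout are assumptions for illustration, not the model from the episode; in a model of this shape the parameter count grows with the number of groups, which is how such models reach millions of parameters.

```python
import math

def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))

def normal_logpdf(x, mean, sd):
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

def log_posterior(global_mean, group_sd, group_effects, data):
    """Unnormalised log-posterior; data is a list of (group_index, outcome in {0, 1})."""
    if group_sd <= 0:
        return float("-inf")
    # Assumed hyperpriors: Normal(0, 2) on the global mean, HalfNormal(1) on the spread.
    lp = normal_logpdf(global_mean, 0.0, 2.0)
    lp += math.log(2) + normal_logpdf(group_sd, 0.0, 1.0)
    # Partial pooling: every group intercept is drawn around the global mean.
    lp += sum(normal_logpdf(e, global_mean, group_sd) for e in group_effects)
    # Bernoulli likelihood with a logit link.
    for group, outcome in data:
        logit = group_effects[group]
        lp += log_sigmoid(logit) if outcome == 1 else log_sigmoid(-logit)
    return lp
```

Hamiltonian Monte Carlo needs this quantity and its gradient at every step, which is the work a GPU backend parallelises across groups and observations.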

Q: What's the current bottleneck in Bayesian-at-scale tooling?
A: Chris argues the software has largely caught up: PyMC's JAX backend and NumPyro make GPU-accelerated Bayesian modeling work out of the box for most problems. What's missing is common knowledge. Companies are clearly running large Bayesian models in production, but the results stay behind corporate firewalls. Chris’ proposal is a community benchmark effort that documents which frameworks can handle, for example, a million-parameter Markov random field on a given GPU out of the box. Benchmarks like this are expensive and slow to run, which makes them a poor fit for standard CI pipelines but valuable for the field to know.

Chapters:
22:57 When does GPU acceleration actually pay off for a Bayesian model?
26:33 What did it cost to train a two-million-parameter model on Modal?
30:36 What happened when Chris asked 200 different LLMs to flip a coin?
34:50 Where do Bayesian ideas show up in the agentic AI systems Chris builds at Nvidia?
40:16 Are statisticians being made obsolete by large language models?
41:19 How does putting a Gaussian process on unknown coordinates fix noisy data in mineral prospecting?
58:05 What is Chris looking forward to working on next?

Thank you to my Patrons for making this episode possible!

Links from the show here
The Next Step Beyond LLMs: Foundation Models for Inference
2026-07-22 | 5 mins.
Today's clip is from episode 161, featuring Luigi Acerbi. In this conversation, Luigi explains one of the biggest engineering bottlenecks facing transformer-based probabilistic models—and how his group found a way around it.

The core challenge is that many inference models treat data as an unordered set, making them naturally permutation invariant. That's statistically elegant, but computationally painful: every time a new data point arrives, the model has to recompute attention over the entire dataset from scratch, preventing the kind of KV caching that makes modern language models so efficient.

Luigi walks through his team's solution: a hybrid architecture that keeps the original context fully set-based while introducing a causal-attention buffer for newly arriving data. The result is dramatically faster inference (up to 100× faster in some settings), opening the door to applications like reinforcement learning, active data acquisition, and, ultimately, Luigi's long-term vision of a foundation model for Bayesian inference.
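The trade-off between set-based and causal attention can be seen in a toy example. The sketch below is not the architecture from the paper; it is a single-head dot-product attention written from scratch, and the function names `attend`, `encode_set` and `encode_causal` are made up for illustration. It shows that set-style attention ignores the order of the context, that every set-based representation changes when a point is added, and that representations in a causal buffer stay valid (and so could be cached) as the buffer grows.

```python
import math

def attend(query, keys, values):
    """Single-head dot-product attention of one query over key/value pairs."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]

def encode_set(points):
    """Set-based: every point attends to every point, in no particular order."""
    return [attend(p, points, points) for p in points]

def encode_causal(context, buffer):
    """Buffer point i sees the fixed context plus buffer points 0..i only."""
    outputs = []
    for i, point in enumerate(buffer):
        visible = context + buffer[: i + 1]
        outputs.append(attend(point, visible, visible))
    return outputs

context = [[1.0, 0.0], [0.0, 1.0]]
buffer = [[1.0, 1.0], [0.5, -0.5], [-1.0, 2.0]]

# Appending a point changes every set-based representation ...
before = encode_set(context)
after = encode_set(context + [buffer[0]])

# ... whereas earlier causal-buffer outputs are untouched by later arrivals.
first_two = encode_causal(context, buffer[:2])
all_three = encode_causal(context, buffer)
```

Here `first_two` matches the first two entries of `all_three`, while `before[0]` and `after[0]` differ, which is the recomputation cost the paragraph above describes.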

Get the full discussion here

#161 Amortized Inference & Neural Processes, with Luigi Acerbi
2026-07-16 | 1h 32 mins.
Takeaways:
Q: What is Variational Bayesian Monte Carlo (VBMC) and how is it different from Bayesian optimization?
A: VBMC borrows the machinery of Bayesian optimization but aims at a different target. Bayesian optimization fits a Gaussian process surrogate to an expensive function and uses it to hunt for the optimum. VBMC instead treats the log-posterior as the function to model, evaluates it at a small number of carefully chosen points, and keeps the whole reconstructed shape rather than just its peak. That gives you the full posterior, not a single best-fit value. Where MCMC might need tens of thousands to millions of evaluations, VBMC often reconstructs a good posterior approximation from a few hundred, which matters when each evaluation is slow.
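The difference between "find the peak" and "keep the whole shape" can be illustrated with a deliberately simplified surrogate. The sketch below is not VBMC: it swaps the Gaussian process and variational machinery for an exact quadratic through three evaluations, which only recovers the posterior when the target is itself Gaussian. The stand-in target `expensive_log_posterior` and the helper names are assumptions.

```python
import math

def fit_quadratic(xs, ys):
    """Exact quadratic a*x^2 + b*x + c through three (x, y) points (Lagrange form)."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    d0 = y0 / ((x0 - x1) * (x0 - x2))
    d1 = y1 / ((x1 - x0) * (x1 - x2))
    d2 = y2 / ((x2 - x0) * (x2 - x1))
    a = d0 + d1 + d2
    b = -(d0 * (x1 + x2) + d1 * (x0 + x2) + d2 * (x0 + x1))
    c = d0 * x1 * x2 + d1 * x0 * x2 + d2 * x0 * x1
    return a, b, c

def gaussian_from_quadratic(a, b):
    """Read a mean and standard deviation off a concave quadratic log-density."""
    if a >= 0:
        raise ValueError("surrogate is not concave, so it is not a valid log-density")
    return -b / (2 * a), math.sqrt(-1 / (2 * a))

def expensive_log_posterior(x):
    """Stand-in for a slow model: unnormalised log-density of Normal(3, 2)."""
    return -0.5 * ((x - 3.0) / 2.0) ** 2

xs = (0.0, 1.0, 5.0)                                # three evaluations in total
ys = tuple(expensive_log_posterior(x) for x in xs)
a, b, c = fit_quadratic(xs, ys)
mean, sd = gaussian_from_quadratic(a, b)            # location and spread, not just the argmax
```

An optimizer working from the same three evaluations would report only the location of the maximum; the surrogate also yields the spread, which is the part of the posterior that carries the uncertainty.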

Q: When should you reach for PyVBMC, and when is it the wrong tool?
A: Two symptoms tell you PyVBMC might help. First, speed: if a single evaluation of your log density takes on the order of a second, running MCMC over tens of thousands of evaluations becomes painful, and PyVBMC's few-hundred-evaluation budget pays off. Second, dimensionality: because it leans on a Gaussian process surrogate, it works well up to roughly 10 to 15 parameters and degrades beyond that. If your model already runs fine in Stan or PyMC, you do not need it. It shines for expensive, low-dimensional models common in science and engineering, where you are modeling a process rather than composing nice distributions.
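The speed argument is simple arithmetic, sketched below. The evaluation counts (50,000 and 300) are illustrative stand-ins for "tens of thousands" and "a few hundred", and the helper `pyvbmc_might_help` is a hypothetical, crude encoding of the two symptoms above, not part of PyVBMC. The estimate also ignores the overhead of fitting the surrogate itself.

```python
def wall_clock_hours(seconds_per_evaluation, n_evaluations):
    """Time spent purely on log-density evaluations."""
    return seconds_per_evaluation * n_evaluations / 3600.0

def pyvbmc_might_help(seconds_per_evaluation, n_parameters,
                      slow_threshold_seconds=1.0, max_parameters=15):
    """Rule of thumb: slow evaluations and a low-dimensional parameter space."""
    return seconds_per_evaluation >= slow_threshold_seconds and n_parameters <= max_parameters

mcmc_hours = wall_clock_hours(1.0, 50_000)   # roughly 14 hours of evaluations
vbmc_hours = wall_clock_hours(1.0, 300)      # 5 minutes of evaluations
```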

Full takeaways here

Chapters:
00:18:13 What is Variational Bayesian Monte Carlo (VBMC) and how does it differ from Bayesian optimization?
00:30:21 When should you use VBMC versus BADS in practice?
00:31:20 What is Bayesian Adaptive Direct Search (BADS) and how does its hybrid optimization strategy work?
00:39:18 What are neural processes, and why are transformers a natural neural process architecture?
00:45:54 What is the Amortized Conditioning Engine (ACE) and what problem does it unify?
00:55:42 What do PriorGuide and the new autoregressive buffer paper solve for amortized inference?
01:02:03 How does the new autoregressive buffer speed up predictions in transformer probabilistic models?
01:06:11 What is Luigi Acerbi's vision for a foundation model for inference?
01:09:26 What is ALINE and how does it add active data acquisition to amortized inference?
01:12:43 How does Luigi Acerbi connect LLM agents, Bayesian decision theory, and the nature of intelligence?
01:18:44 For a PyMC, Stan, or NumPyro user, where should you start with VBMC, BADS, or BayesFlow?

Thank you to my Patrons for making this episode possible!

Links from the show here
Bayesian Statistics vs Epistemology, with Vaden Masrani
2026-06-29 | 1h 40 mins.
Takeaways:
Q: What's the difference between Bayesian statistics and Bayesian epistemology?
A: Bayesian statistics uses Bayes' theorem on actual data: you put a prior over parameters, combine it with a likelihood, and the data is allowed to tell you your model is wrong. Vaden loves it. Bayesian epistemology, in his tongue-in-cheek phrase, is "Bayesian statistics minus the statistics" - taking Bayes' theorem as a general account of how anyone should reason under uncertainty, including about events where there is nothing to count. The first is falsifiable and grounded; the second, he argues, lets people attach authoritative-sounding numbers to pure belief.

Q: Why is it a problem to put a probability on a one-off future event like human extinction?
A: Because there are no statistics behind it. Vaden's trigger example is Toby Ord's The Precipice, where a data-derived probability (supervolcanoes per millennium) is placed side by side with a probability of extinction-by-superintelligence that came from no data at all. His reaction is the statistician's first instinct: where are the numbers coming from, and what could ever make them come out differently? A subjective degree of belief is fine as a hunch. The trouble starts when it is communicated as though it were an objective, data-grounded frequency.

Q: What does Vaden Masrani actually like about Bayesian statistics?
A: The freedom to encode domain knowledge as a prior and have the result respect common sense - estimating an average human height, you can rule out zero and a hundred feet before seeing a single measurement. But the part he keeps stressing is falsifiability: you fit the model, compare it to data, and the data can tell you the model was bad. That contact with reality is exactly what makes the statistics legitimate and what the epistemology lacks. On Bayesian-versus-frequentist for engineering problems, he says he has no dog in the fight -- both are useful, and any working statistician uses both.
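The height example translates directly into a few lines of code. In this sketch the prior Normal(5.5 ft, 0.5 ft), the measurement noise, and the three observations are all invented for illustration; the update is the standard conjugate result for a normal mean with known noise.

```python
from statistics import NormalDist

# Assumed prior on average adult height, in feet.
prior = NormalDist(mu=5.5, sigma=0.5)

# Prior probability of an average height below 0 ft or above 100 ft.
implausible_mass = prior.cdf(0.0) + (1.0 - prior.cdf(100.0))

def update_normal_mean(prior, observations, noise_sd):
    """Conjugate update for a normal mean when the measurement noise is known."""
    prior_precision = 1.0 / prior.variance
    data_precision = len(observations) / noise_sd ** 2
    post_variance = 1.0 / (prior_precision + data_precision)
    post_mean = post_variance * (prior.mean * prior_precision
                                 + sum(observations) / noise_sd ** 2)
    return NormalDist(mu=post_mean, sigma=post_variance ** 0.5)

posterior = update_normal_mean(prior, [5.9, 6.1, 6.0], noise_sd=0.3)
```

The prior excludes absurd values before any data arrive, and the posterior moves toward the measurements, which is the contact with data that the answer above stresses.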

Full takeaways here

Chapters:

00:24:01 What's the difference between Bayesian statistics and Bayesian epistemology?
00:33:12 How can Bayesian epistemology lead to bad real-world decisions?
00:36:36 Is Bayesian or frequentist statistics better for real-world problems?
00:39:31 What is the problem of induction, and how does Bayesian epistemology try to solve it?
00:43:50 What are the main logical problems with Bayesian epistemology?
00:48:40 What is Popper's critical rationalism, and how does falsifiability fit in?
00:52:31 How does critical rationalism work when you can't run a clean experiment?
01:15:03 Why should you treat criticism as a gift, even when it hurts?
01:19:54 How do Stoicism and equanimity help you handle criticism?
01:23:19 Why does critical rationalism apply to everyday life, not just science?

Thank you to my Patrons for making this episode possible!

Links from the show here
Why Bayesian Statistics Is More Computational Than Ever
2026-06-19 | 4 mins.
Today's clip is from Episode 158 featuring Stefan Radev. In this conversation, Alex Andorra and Stefan break down a core argument from their paper: Bayesian statistics has never been more computational than it is now, and simulation is the thread that ties the whole workflow together.

Stefan divides the Bayesian workflow into four stages, and this clip covers the first two. Stage one is model specification, where the workflow community has long recommended prior predictive checks. You can do this informally, just running simulations from your model and eyeballing whether the output meets your expectations, or formally, à la Michael Betancourt, by pushing your model's high-dimensional output through a transformation into a low-dimensional, interpretable space and checking it against reality.

The punchline: a surprising number of models can be discarded before you've even seen real data, yet Stefan notes these checks remain underused in practice.
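An informal prior predictive check of the kind described above can be sketched in a few lines. The model (adult heights in centimetres), both sets of priors, and the plausibility bounds of 50 cm and 250 cm are assumptions made up for this example, and the min/max summary is a simple stand-in for a more carefully designed summary statistic.

```python
import random

def prior_predictive_heights(n_datasets, n_people, mean_prior, sd_prior_scale, rng):
    """Simulate datasets from the prior and keep each one's (min, max) height."""
    summaries = []
    for _ in range(n_datasets):
        mu = rng.gauss(*mean_prior)                   # prior on the population mean
        sigma = abs(rng.gauss(0.0, sd_prior_scale))   # half-normal prior on the spread
        heights = [rng.gauss(mu, sigma) for _ in range(n_people)]
        summaries.append((min(heights), max(heights)))
    return summaries

def fraction_implausible(summaries, low=50.0, high=250.0):
    """Share of simulated datasets containing a height outside [low, high] cm."""
    bad = sum(1 for smallest, largest in summaries if smallest < low or largest > high)
    return bad / len(summaries)

rng = random.Random(1)
vague = fraction_implausible(
    prior_predictive_heights(500, 50, (170.0, 100.0), 100.0, rng))
informed = fraction_implausible(
    prior_predictive_heights(500, 50, (170.0, 10.0), 10.0, rng))
```

Comparing `vague` with `informed` shows how a model with overly wide priors can be flagged as unreasonable before any real data are involved.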

Stage two is model verification, where the question shifts to whether your inferences are well calibrated. This is the territory of simulation-based calibration and parameter recovery studies, classic tools that have always carried a steep computational price. You simulate thousands of synthetic datasets and run inference on every single one, which is exactly why these checks are so often skipped in papers, even though doing one well can be a contribution in its own right.
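Simulation-based calibration can be sketched on a model where the posterior is known exactly. The example below assumes a Normal(0, 1) prior on a mean and unit-variance observations, so no sampler is needed; the `posterior_sd_scale` argument is a hypothetical knob for deliberately breaking the inference to show what miscalibration looks like. With a real model, each loop iteration would require a full inference run, which is where the computational price comes from.

```python
import random

def sbc_ranks(n_simulations, n_posterior_draws, n_obs, rng, posterior_sd_scale=1.0):
    """Rank of each true parameter among draws from its (exact) posterior."""
    ranks = []
    for _ in range(n_simulations):
        theta = rng.gauss(0.0, 1.0)                          # draw from the prior
        data = [rng.gauss(theta, 1.0) for _ in range(n_obs)]  # simulate a dataset
        post_var = 1.0 / (1.0 + n_obs)                       # conjugate normal posterior
        post_mean = post_var * sum(data)
        post_sd = posterior_sd_scale * post_var ** 0.5
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_posterior_draws)]
        ranks.append(sum(d < theta for d in draws))
    return ranks

def extreme_fraction(ranks, n_posterior_draws, width=10):
    """Share of ranks in the lowest or highest `width` positions."""
    hits = sum(1 for r in ranks if r < width or r > n_posterior_draws - width)
    return hits / len(ranks)

rng = random.Random(0)
calibrated = sbc_ranks(1000, 99, 5, rng)
overconfident = sbc_ranks(1000, 99, 5, rng, posterior_sd_scale=0.5)
```

For a calibrated procedure the ranks are uniform on 0 to 99, so about a fifth fall in the extreme positions counted by `extreme_fraction`; a posterior that is too narrow pushes many more ranks to the edges.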

Here's where amortized simulation-based inference changes the math entirely. Checks that used to take days now take seconds, and instead of laboriously running inference dataset by dataset, you get millions of posterior samples essentially for free. The calibration checks that the field has always known it should be doing finally become cheap enough to actually do.

Get the full discussion here

About Learning Bayesian Statistics

Are you a researcher or data scientist / analyst / ninja? Do you want to learn Bayesian inference, stay up to date or simply want to understand what Bayesian inference is? Then this podcast is for you! You'll hear from researchers and practitioners of all fields about how they use Bayesian statistics, and how in turn YOU can apply these methods in your modeling workflow. When I started learning Bayesian methods, I really wished there were a podcast out there that could introduce me to the methods, the projects and the people who make all that possible. So I created "Learning Bayesian Statistics", where you'll get to hear how Bayesian statistics are used to detect dark matter in outer space, forecast elections or understand how diseases spread and can ultimately be stopped. But this show is not only about successes -- it's also about failures, because that's how we learn best. So you'll often hear the guests talking about what *didn't* work in their projects, why, and how they overcame these challenges. Because, in the end, we're all lifelong learners! My name is Alex Andorra, by the way. By day, I'm a senior data scientist. By night, I don't (yet) fight crime, but I'm an open-source enthusiast and core contributor to the Python packages PyMC and ArviZ. I also love Nutella, but I don't like talking about it – I prefer eating it. So, whether you want to learn Bayesian statistics or hear about the latest libraries, books and applications, this podcast is for you -- just subscribe! You can also support the show and unlock exclusive Bayesian swag on Patreon!

Podcast website

Education Science Technology
