Bayesian Data Analysis, Third Edition (2013) [pdf]
Posted 3 months ago · Active 3 months ago
Source: sites.stat.columbia.edu
Key topics
- Bayesian Statistics
- Data Analysis
- Statistics Education
The post shares a PDF of 'Bayesian Data Analysis, Third Edition' by Andrew Gelman, sparking a discussion on Bayesian statistics, its applications, and resources for learning.
Snapshot generated from the HN discussion
Discussion Activity
- Very active discussion
- First comment: 1h after posting
- Peak period: 36 comments in 0-6h
- Average: 9.7 comments per period
- Comment distribution: 68 data points (based on 68 loaded comments)
Key moments
1. Story posted: Sep 28, 2025 at 1:23 PM EDT (3 months ago)
2. First comment: Sep 28, 2025 at 2:35 PM EDT (1h after posting)
3. Peak activity: 36 comments in 0-6h, the hottest window of the conversation
4. Latest activity: Oct 2, 2025 at 3:01 PM EDT (3 months ago)
Want the full context? Read the primary article or dive into the live Hacker News thread:
https://hn.algolia.com/?q=statmodeling.stat.columbia.edu
- https://statmodeling.stat.columbia.edu/2025/08/25/what-writi...
- https://statmodeling.stat.columbia.edu/2025/09/04/assembling...
You need 16 times the sample size to estimate an interaction than to estimate a main effect https://statmodeling.stat.columbia.edu/2018/03/15/need16/
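The arithmetic behind that title, as the post lays it out: an interaction is a difference of differences, so its standard error is twice that of a main effect, and if you further assume the interaction is half the size of the main effect, you need (2 × 2)² = 16 times the sample size for the same power. A quick simulation (hypothetical numbers, not code from the post) confirms the doubled standard error:

```python
import numpy as np

# Hypothetical 2x2 design: four cells of pure noise, replicated many times.
rng = np.random.default_rng(0)
reps, n_cell = 4000, 250
m = rng.normal(0.0, 1.0, size=(reps, 4, n_cell)).mean(axis=2)  # cell means

main = (m[:, 0] + m[:, 1]) / 2 - (m[:, 2] + m[:, 3]) / 2  # main-effect estimate
inter = (m[:, 0] - m[:, 1]) - (m[:, 2] - m[:, 3])         # difference of differences

print(inter.std() / main.std())  # ~2.0: the interaction's standard error is doubled
```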
Debate over effect of reduced prosecutions on urban homicides; also larger questions about synthetic control methods in causal inference. https://statmodeling.stat.columbia.edu/2023/10/12/debate-ove...
Bayesians moving from defense to offense: “I really think it’s kind of irresponsible now not to use the information from all those thousands of medical trials that came before. Is that very radical?” https://statmodeling.stat.columbia.edu/2023/12/23/bayesians-...
[0] https://en.wikipedia.org/wiki/Statistical_Rethinking
1) Do you want "theoretical" knowledge (math background required)? If so, then you need to get a decent mathematical statistics book like Casella-Berger. I think a good US CS degree grad could handle it, but you might need to go a bit slow, google around, and maybe fill in some gaps in probability/calculus.
2) Introduction to Statistical Learning is unironically a great intro to "applied" stats. You get most of the "vanilla" models/algorithms, the theoretical background behind each (but not too much), an R version you can follow along with to see how stuff actually works, and exercises that vary in difficulty.
With regards to Gelman and Bayesian data analysis, I should note that in my experience the Bayesian approach is first-year MS / fourth-year bachelor's material in the US. It's very useful to know and have in your toolbox, but IMO it should be left aside until you are confident in the "frequentist" basics.
https://www.inference.org.uk/itprnn/book.pdf
It's a little dated now but it connects Bayesian statistics with neural nets and information theory in an elegant way.
I also recommend Regression and Other Stories, by Gelman (one of the authors of the linked book), as a more approachable text for this content.
Think Bayes and Bayesian Methods for Hackers are introductory books for a beginner coming from a programming background.
If you want something more from the ML world that heavily emphasizes the benefits of probabilistic (Bayesian) methods, I highly recommend Kevin Murphy’s Probabilistic Machine Learning. I have only read the first edition before he split it into two volumes and expanded it, but I’ve only heard good things about the new volumes too.
https://www.oreilly.com/library/view/bayesian-methods-for/97...
It took me about a year to work through this book on the side (including the exercises), and it provided the foundation for years of fruitful research into hierarchical Bayesian models. It's definitely not an introductory read, but for anyone looking to advance their statistical toolkit, I cannot recommend this book highly enough.
As a starting point, I’d strongly suggest the first 5 chapters for an excellent introduction to Gelman’s modeling philosophy, and then jumping around the table of contents to any topics that look interesting.
There is an example in his book discussing efficacy trials across seven hospitals. If you stratify the data, you lose a lot of confidence; if you aggregate the data, you end up just modeling the difference between hospitals.
Hierarchical modeling allows you to split your dataset under a single unified model. This is really powerful for extracting signal from noise, because you can split your dataset according to potential confounding variables, e.g. the hospital from which the data was collected.
I am writing this on my phone so apologies for the lack of links, but in short the approach in this book is extremely relevant to medical testing.
Once you realize this you can easily develop very sophisticated testing models (if necessary) that are also easy to understand and reason about. This dramatically simplifies the analysis.
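For a concrete flavor of what this partial pooling looks like in code, here is a minimal sketch in PyMC (BDA's own examples use Stan and R; the per-hospital numbers below are made up to echo the setup described above):

```python
import numpy as np
import pymc as pm

# Hypothetical per-hospital effect estimates and their standard errors
y = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0])
se = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0])

with pm.Model() as hospitals:
    mu = pm.Normal("mu", 0.0, 20.0)       # shared population-level effect
    tau = pm.HalfNormal("tau", 10.0)      # between-hospital variation
    theta = pm.Normal("theta", mu, tau, shape=len(y))  # each hospital's true effect
    pm.Normal("obs", theta, se, observed=y)            # measurement model
    idata = pm.sample()
```

When tau is estimated to be small, the per-hospital estimates shrink toward the shared mean (close to aggregating); when it is large, they stay near their raw values (close to stratifying). The data picks the compromise.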
If you're looking for a specific book recommendation, Statistical Rethinking does a good job covering this at length, and Bayesian Statistics the Fun Way is a more beginner-friendly book that covers the basics of Bayesian hypothesis testing.
Edit: Haha, I just found the textbook and I'm remembering now that I actually worked through sections of it when I was working through BDA several years ago.
Statistical Rethinking is a good option too.
It goes through the fundamentals of Bayesian ideas in the context of applications to communication and machine learning problems. I find his explanations uncluttered.
First learn some basic probability theory: Peter K. Dunn (2024). The theory of distributions. https://bookdown.org/pkaldunn/DistTheory
Then frequentist statistics: Chester Ismay, Albert Y. Kim, and Arturo Valdivia - https://moderndive.com/v2/ Mine Çetinkaya-Rundel and Johanna Hardin - https://openintrostat.github.io/ims/
Finally Bayesian: Johnson, Ott, Dogucu - https://www.bayesrulesbook.com/ This is a great book; it will teach you everything from the very basics to advanced hierarchical Bayesian modeling, all using reproducible code and Stan/rstanarm.
Once you master this, the next level may be using brms; Solomon Kurz has done the full Regression and Other Stories book using tidyverse/brms. His knowledge of tidyverse and brms is impressive and demonstrated in his code. https://github.com/ASKurz/Working-through-Regression-and-oth...
After that, Statistical Rethinking will take you much deeper into more complex experimental design using linear models and beyond, as well as deepening your understanding of the other areas of math required.
It’s just a relatively dense book. There are some other really good suggestions in this thread, most of which I’ve heard good things about. If you have a background in programming, I’d suggest Bayesian Methods for Hackers as a really good starting point. But you can also definitely tackle this book head on, and it will be very rewarding.
That course is a good balance between theory and practice. It gave me a practical intuition for why the posterior distributions of parameters and data are important and how to compute them.
I took the course in 2016 so a lot could have changed.
Also, IMHO, his best work has been done describing how to do statistics. He has written somewhere (I cannot find it now) that he sees himself as a user of mathematics, not as a creator of new theories. His book Regression and Other Stories is elementary but exceptionally well written. He describes how great Bayesian statisticians think and work, and this is invaluable.
He is updating Data Analysis Using Regression and Multilevel/Hierarchical Models to the same standard, and I guess BDA will eventually come next. As part of the refresh, I imagine everything will be ported to Stan. Interestingly, Bob Carpenter and others working on Stan are now pursuing ideas on variational inference to scale things further.
[1] https://sites.stat.columbia.edu/gelman/research/unpublished/...
I would say his work with Stan and his writings, along with theorists like Radford Neal, really opened the door to a computational approach to hierarchical modeling. And I think this is a meaningfully different field.
Before Stan existed we used BUGS [1] and then JAGS [2]. And most of the work on computation (by Neal and others) was entirely independent of Gelman.
[1] https://en.wikipedia.org/wiki/Bayesian_inference_using_Gibbs...
[2] https://en.wikipedia.org/wiki/Just_another_Gibbs_sampler
One example is rare-class prediction on long-form text data, e.g. phone calls, podcasts, transcripts. Other methods, including neural networks and LLMs, are either not flexible enough or require far too much data to achieve the necessary performance. Structured hierarchical modeling is the balance between those two extremes.
Another example is in genomic analysis: similarly high-dimensional and noisy, with little data. Additionally, you don’t actually care about the predictions; you want to understand what genes or sets of genes are driving phenotypic behaviors.
I’d be happy to go into more depth via email or chat if this is something you are interested in (on my profile).
Some useful reads
[1] https://sturdystatistics.com/articles/text-classification
[2] https://pmc.ncbi.nlm.nih.gov/articles/PMC5028368/
What was surprising, though, was how reluctant the engineers were to learn such basic techniques. It's not like the math was hard. They all went through first-year college math and I'm sure they did reasonably well.
Plenty of engineers have to take an introductory stats course, but it's not clear why you'd want your engineers to learn Bayesian statistics. I would be surprised if they could correctly interpret a p-value or regression coefficient, let alone one with interaction effects. (It'd be wholly useless if they could, fwiw.)
It'd be nice if the statisticians/'data scientists' on my team learned their way around the CI/CD pipelines, understood kubernetes pods, and could write their own distributed training versions of their pytorch models, but division-of-labor is a thing for a reason, and I don't expect them to nor need them to.
On a side note, I believe it is an individual's responsibility to find the coolness in their project. What's the fun of building a dashboard that I have done a thousand times? What's the fun of carrying out a routine that does not challenge me? But solving a problem in a most rigorous and generalized way? That is something in which an engineer can find some fun. Or maybe it's just me.
Also, nobody fits neural networks with variational inference using any priors that aren’t of some standard form that makes the algorithm easy.
[0] The rule of thumb that signal-to-noise improves with the square root of the number of measurements (the standard error of a mean of N independent measurements falls as 1/√N). Also, as my dad put it: "The more bad data we average together, the closer we get to the wrong answer."
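A minimal numerical illustration of that rule of thumb (hypothetical numbers, nothing from the comment):

```python
import numpy as np

rng = np.random.default_rng(0)
for n in (10, 100, 1000, 10000):
    # Average n noisy measurements of a true signal of 1.0,
    # repeated 1000 times to estimate the error of the average.
    means = rng.normal(1.0, 1.0, size=(1000, n)).mean(axis=1)
    print(n, round(means.std(), 4))  # shrinks roughly like 1/sqrt(n)
```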
You can use BDA for forward problems too, via posterior predictive samples. The benefit over neural networks for this task is that with BDA you get dependable uncertainty quantification about your predictions. The disadvantage is that the modalities are somewhat limited to simple structured data.
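As a sketch of that forward direction, assuming PyMC as the modeling library (a toy model, not an example from the book): fit the posterior, then push draws back through the likelihood to simulate replicated data with parameter uncertainty included.

```python
import numpy as np
import pymc as pm

y = np.array([1.2, 0.8, 1.1, 0.9, 1.4])  # toy observations

with pm.Model() as model:
    mu = pm.Normal("mu", 0.0, 5.0)
    sigma = pm.HalfNormal("sigma", 2.0)
    pm.Normal("y", mu, sigma, observed=y)
    idata = pm.sample()
    # Forward problem: posterior predictive samples are simulated datasets
    # drawn from the fitted model, carrying the posterior's uncertainty.
    idata.extend(pm.sample_posterior_predictive(idata))
```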
You can also use neural networks for inverse problems, such as for example with Neural Posterior Estimation. This approach shows promise since it can tackle more complex problems than the standard BDA approach of Markov Chain Monte Carlo and with much faster results, but the accuracy and dependability are still quite lacking.
I also wrote a book on the topic, which is focused on a code-and-example approach. It's available for open access here: https://bayesiancomputationbook.com
Bayesian Data Analysis, Third Edition [pdf] - https://news.ycombinator.com/item?id=23091359 - May 2020 (48 comments)
https://mlu-explain.github.io/linear-regression/
It cited Regression and Other Stories (though not the Bayesian chapters, which I'm now inspired to dig into before checking this out).