Reading List (Textbooks)

The ML/AI field is huge and spans a great many fields and subfields. Any number, whether directly recorded, derived from physical observations, or even drawn from psychological perceptions, can be considered “data”, so far too many subjects end up tagged “data science”.

Below I am compiling a list of books (some are manuscripts from professors) that are well known, widely read, and/or frequently cited, to help Ph.D. students grasp the noteworthy theories and practices. I will update this list frequently, so please feel free to come back often.

“Bird’s-eye-view” textbooks:

Pattern Recognition and Machine Learning. Christopher Bishop.

Machine Learning: A Probabilistic Perspective. Kevin P. Murphy.

Deep Learning. Ian Goodfellow, Yoshua Bengio, Aaron Courville.

Computer Age Statistical Inference: Algorithms, Evidence and Data Science. Bradley Efron, Trevor Hastie.

The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Trevor Hastie, Robert Tibshirani, Jerome Friedman.

Subject-focused textbooks:

Graphical Models

Graphical Models, Exponential Families, and Variational Inference. Martin J. Wainwright, Michael I. Jordan.

Discrete Models

Categorical Data Analysis. Alan Agresti.


Optimization

Introductory Lectures on Convex Optimization. Yurii Nesterov.

Convex Optimization. Stephen Boyd, Lieven Vandenberghe.

Probability Theory / Measure Theory

Introduction to Probability Models. Sheldon M. Ross.

Measure Theory and Fine Properties of Functions. Lawrence Craig Evans, Ronald F. Gariepy.

Probability Essentials. Jean Jacod, Philip Protter.

Probabilistic Symmetries and Invariance Principles. Olav Kallenberg.

Stochastic Processes / Stochastic Differential Equations

Poisson Processes. J. F. C. Kingman.

Stochastic Methods. Crispin Gardiner.

An Introduction to Stochastic Differential Equations. Lawrence Craig Evans.

Stochastic Differential Equations: An Introduction with Applications. Bernt Øksendal.

Optimal Transport

Computational Optimal Transport. Gabriel Peyré, Marco Cuturi.

Linear Algebra


Real Analysis


Complex Analysis


Functional Analysis


Ordinary / Partial Differential Equations

Partial Differential Equations. Lawrence Craig Evans.

Differential Geometry


Statistical Inference (classical)

Statistical Inference. George Casella, Roger L. Berger.

Testing Statistical Hypotheses. Erich L. Lehmann, Joseph P. Romano.

Bayesian Statistics

Bayesian Data Analysis. Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, Donald B. Rubin.

Bayesian Approximate Inference

Handbook of Markov Chain Monte Carlo. Steve Brooks, Andrew Gelman, Galin L. Jones, Xiao-Li Meng.

Reinforcement Learning

Reinforcement Learning: An Introduction. Richard S. Sutton, Andrew G. Barto.

Causal Inference

Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Guido W. Imbens, Donald B. Rubin.

Causality: Models, Reasoning and Inference. Judea Pearl.

Counterfactuals and Causal Inference: Methods and Principles for Social Research. Stephen L. Morgan, Christopher Winship.

Information Retrieval

Introduction to Information Retrieval. Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze.

Data Mining



Econometrics

Mostly Harmless Econometrics: An Empiricist’s Companion. Joshua D. Angrist, Jörn-Steffen Pischke.

Mathematical Finance


Quantum Physics / Chemistry


Algebraic Game Theory