The most recommended data science books

Who picked these books? Meet our 18 experts.

18 authors created a book list connected to data science, and here are their favorite data science books.


Book cover of Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures

Valliappa Lakshmanan Author Of Data Science on the Google Cloud Platform: Implementing End-To-End Real-Time Data Pipelines: From Ingest to Machine Learning

From my list on if you want to become a data scientist.

Why am I passionate about this?

I started my career as a research scientist building machine learning algorithms for weather forecasting. Twenty years later, I found myself at a precision agriculture startup creating models that provided guidance to farmers on when to plant, what to plant, etc. So, I am part of the movement from academia to industry. Now, at Google Cloud, my team builds cross-industry solutions and I see firsthand what our customers need in their data science teams. This set of books is what I suggest when a CTO asks how to upskill their workforce, or when a graduate student asks me how to break into the industry.

Valliappa's book list on if you want to become a data scientist

Valliappa Lakshmanan Why did Valliappa love this book?

It is not enough for a data scientist to be able to analyze data and build ML models. You have to be able to communicate the insights to decision-makers concisely and accurately. This book shows you bad and good visualizations — you’ll be surprised by how often you would have defaulted to the bad way without the guidance provided by this book!

By Claus O. Wilke

Why should I read it?

1 author picked Fundamentals of Data Visualization as one of their favorite books, and they share why you should read it.

What is this book about?

Effective visualization is the best way to communicate information from the increasingly large and complex datasets in the natural and social sciences. But with the increasing power of visualization software today, scientists, engineers, and business analysts often have to navigate a bewildering array of visualization choices and options.

This practical book takes you through many commonly encountered visualization problems, and it provides guidelines on how to turn large datasets into clear and compelling figures. What visualization type is best for the story you want to tell? How do you make informative figures that are visually pleasing? Author Claus O. Wilke…


Book cover of Effective Pandas

Valliappa Lakshmanan Author Of Data Science on the Google Cloud Platform: Implementing End-To-End Real-Time Data Pipelines: From Ingest to Machine Learning

From my list on if you want to become a data scientist.

Valliappa Lakshmanan Why did Valliappa love this book?

Even if you are ultimately going to be working with terabytes of data, you’ll start out doing exploratory data analysis. The tool that you’ll use for that is most likely going to be Pandas. One of the best investments that you can make when becoming a data scientist is to become a Pandas expert, and there is no better book than Harrison’s to help you get there. Plus, many of the interview questions you will face during the hiring process will probably involve Pandas. Blow your interviewers out of the water by showing them corners of the Pandas library they didn’t even know!

By Matt Harrison

Why should I read it?

1 author picked Effective Pandas as one of their favorite books, and they share why you should read it.

What is this book about?

Best practices for manipulating data with Pandas. This book will arm you with years of knowledge and experience that are condensed into an easy to follow format. Rather than taking months reading blogs and websites and searching mailing lists and groups, this book will teach you how to write good Pandas code.

It covers: Series manipulation; creating columns; summary statistics; grouping, pivoting, and cross-tabulation; time series data; visualization; chaining; debugging code; and more...


Book cover of R For Dummies

Tilman M. Davies Author Of The Book of R: A First Course in Programming and Statistics

From my list on intro to programming and data science with R.

Why am I passionate about this?

I’m an applied statistician and academic researcher/lecturer at New Zealand’s oldest university – the University of Otago. R facilitates everything I do – research, academic publication, and teaching. It’s the latter part of my job that motivated my own book on R. From first-year statistics students who have never seen R to my own Ph.D. students using R to implement novel and highly complex statistical methods and models, my experience is that all ultimately love the ease with which the R language permits exploration, visualisation, analysis, and inference of one’s data. The ever-growing need in today’s society for skilled statisticians and data scientists means there's never been a better time to learn this essential language.

Tilman's book list on intro to programming and data science with R

Tilman M. Davies Why did Tilman love this book?

A gentle yet detailed book for beginner programmers. A great book for those who know they'll be getting up to some programming in R but who are very new to programming in general. The book's chapters are filled with content on syntax, usage, and 'best practice' guidelines. The examples guide the reader in a step-by-step fashion to maximise understanding. One of its stand-out features is a distinctive chapter of examples showing things you can do in R that you might've otherwise done in Excel.

By Andrie De Vries, Joris Meys

Why should I read it?

1 author picked R For Dummies as one of their favorite books, and they share why you should read it.

What is this book about?

Mastering R has never been easier. Picking up R can be tough, even for seasoned statisticians and data analysts. R For Dummies, 2nd Edition provides a quick and painless way to master all the R you'll ever need. Requiring no prior programming experience and packed with tons of practical examples, step-by-step exercises, and sample code, this friendly and accessible guide shows you how to know your way around lists, data frames, and other R data structures, while learning to interact with other programs, such as Microsoft Excel. You'll learn how to reshape and manipulate data, merge data sets, split and…


Book cover of All-in On AI: How Smart Companies Win Big with Artificial Intelligence

Roger W. Hoerl Author Of Statistical Thinking: Improving Business Performance

From my list on AI and data science that are actually readable.

Why am I passionate about this?

As a professional statistician, I am naturally interested in AI and data science. However, in our current information age, everyone, in all segments of society, needs to understand the basics of AI and data science. These basics include such things as what these disciplines are, what they can contribute to society, and perhaps most importantly, what can go wrong. However, I have found that much of the literature on these topics is highly technical and beyond the reach of most readers. These books are specifically selected because they are readable by virtually everyone, and yet convey the key concepts needed to be data-literate in the 21st century. Enjoy!

Roger's book list on AI and data science that are actually readable

Roger W. Hoerl Why did Roger love this book?

Books on AI often go to extremes, either promoting it as the solution to all the world’s problems, or depicting it as an evil that will destroy humanity.

This book is much more practical and is based on experience using AI in actual business applications. It is the result of considerable research, investigating applications not only in Silicon Valley but also across various business sectors, at companies such as Airbus, Ping, Progressive Insurance, and Capital One Bank.

Don’t let the title fool you; this book is not simply a promotion of AI, but addresses the practical issues that have to be considered if success is to be achieved. For example, they argue that “the most important aspect in AI success is not machinery, but human leadership, behavior, and change.”

By Thomas H. Davenport, Nitin Mittal

Why should I read it?

1 author picked All-in On AI as one of their favorite books, and they share why you should read it.

What is this book about?

A Wall Street Journal bestseller

A Publishers Weekly bestseller

A fascinating look at the trailblazing companies using artificial intelligence to create new competitive advantage, from the author of the business classic, Competing on Analytics, and the head of Deloitte's US AI practice.

Though most organizations are placing modest bets on artificial intelligence, there is a world-class group of companies that are going all-in on the technology and radically transforming their products, processes, strategies, customer relationships, and cultures.

Though these organizations represent less than 1 percent of large companies, they are all high performers in their industries. They have better business…


Book cover of Competing on Analytics: The New Science of Winning

Jeremy Adamson Author Of Minding the Machines: Building and Leading Data Science and Analytics Teams

From my list on for data science and analytics leaders.

Why am I passionate about this?

I am a leader in analytics and AI strategy and have a broad range of experience in aviation, energy, financial services, and the public sector. I have worked with several major organizations to help them establish a leadership position in data science and to unlock real business value using advanced analytics.

Jeremy's book list on for data science and analytics leaders

Jeremy Adamson Why did Jeremy love this book?

This is a foundational book on analytics and data science as a business function and helped to shape the development of the practice. It provides a view of the discipline through a business lens and avoids deep technical examinations. Though much has changed in the 15 years since it was originally published, it is still essential reading for a leader in the field. No book since has captured as well the competitive differentiation that analytics provides.

By Thomas H. Davenport, Jeanne G. Harris

Why should I read it?

1 author picked Competing on Analytics as one of their favorite books, and they share why you should read it.

What is this book about?

You have more information at hand about your business environment than ever before. But are you using it to "out-think" your rivals? If not, you may be missing out on a potent competitive tool. In Competing on Analytics: The New Science of Winning, Thomas H. Davenport and Jeanne G. Harris argue that the frontier for using data to make decisions has shifted dramatically. Certain high-performing enterprises are now building their competitive strategies around data-driven insights that in turn generate impressive business results. Their secret weapon? Analytics: sophisticated quantitative and statistical analysis and predictive modeling. Exemplars of analytics are using new…


Book cover of R in Action: Data Analysis and Graphics with R

Tilman M. Davies Author Of The Book of R: A First Course in Programming and Statistics

From my list on intro to programming and data science with R.

Tilman M. Davies Why did Tilman love this book?

This provides a superb balance between the technical aspects of R coding and the statistical methods that motivate its use. It's rare to find a book on topics like this that is written in Kabacoff's easygoing yet precise style, which makes it ideal for beginners. From my own experience, it is obvious the author has spent many years teaching this type of content, knowing where things deserve extra explanation up front and where other more technical details can be relegated to more advanced texts.

By Robert I. Kabacoff

Why should I read it?

1 author picked R in Action as one of their favorite books, and they share why you should read it.

What is this book about?

R is a powerful language for statistical computing and graphics that can handle virtually any data-crunching task. It runs on all important platforms and provides thousands of useful specialized modules and utilities. This makes R a great way to get meaningful information from mountains of raw data.

R in Action, Second Edition is a language tutorial focused on practical problems. Written by a research methodologist, it takes a direct and modular approach to quickly give readers the information they need to produce useful results. Focusing on realistic data analyses and a comprehensive integration of graphics, it follows the steps that…


Book cover of Python for Everyone

Daniel Zingaro Author Of Learn to Code by Solving Problems: A Python Programming Primer

From my list on for a rock solid python programming foundation.

Why am I passionate about this?

Some programmers learn through online articles, videos, and blog posts. Not me. I need a throughline—a consistent, expert distillation of the material to take me from where I am to where I want to be. I am not good at patching together information from disparate sources. I need a great book. I have a PhD in computer science education, and I want to know what helps people learn. More importantly, I want to know how we can use such discoveries to write more effective books. The books I appreciate most are those that demonstrate not only mastery of the subject matter but also mastery of teaching.

Daniel's book list on for a rock solid python programming foundation

Daniel Zingaro Why did Daniel love this book?

I used this book for several years starting in 2013 when the first edition came out. It absolutely holds up today. Learning the Python language (the syntax) is one thing. Learning how to design programs using this syntax is another. We need both but, unfortunately, many books forgo the latter for the former. Not this book! I like the Problem Solving and Worked Example sections: they help learners apply a disciplined, step-by-step strategy to programming projects. There are multiple, varied contexts here as well, which helps capture a broader base of learners. Bonus feature: the Computing & Society boxes.

By Cay S. Horstmann, Rance D. Necaise

Why should I read it?

1 author picked Python for Everyone as one of their favorite books, and they share why you should read it.

What is this book about?

Python for Everyone, 3rd Edition is an introduction to programming designed to serve a wide range of student interests and abilities, focused on the essentials, and on effective learning. It is suitable for a first course in programming for computer scientists, engineers, and students in other disciplines. This text requires no prior programming experience and only a modest amount of high school algebra. Objects are used where appropriate in early chapters and students start designing and implementing their own classes in Chapter 9. New to this edition are examples and exercises that focus on various aspects of data science.


Book cover of The R Book

Tilman M. Davies Author Of The Book of R: A First Course in Programming and Statistics

From my list on intro to programming and data science with R.

Tilman M. Davies Why did Tilman love this book?

An authoritative tome on R. This book is the ultimate reference guide, heavy on statistical methods from the simple to the advanced. Of the 29 chapters, only the first five chapters or so have R syntactical and programming skills as their main focus; the remaining content highlights the many and varied statistical techniques R is capable of. I think this is a fantastic book to have on the shelf for people who are likely to need R and its contributed packages for a variety of different statistical analyses, but might not know where to initially start for any given statistical method.

By Michael J. Crawley

Why should I read it?

1 author picked The R Book as one of their favorite books, and they share why you should read it.

What is this book about?

Hugely successful and popular text presenting an extensive and comprehensive guide for all R users.

The R language is recognized as one of the most powerful and flexible statistical software packages, enabling users to apply many statistical techniques that would be impossible without such software to help implement such large data sets. R has become an essential tool for understanding and carrying out research. This edition:

* Features full colour text and extensive graphics throughout.
* Introduces a clear structure with numbered section headings to help readers locate information more efficiently.
* Looks at the evolution of R over the…


Book cover of A First Course in Statistical Programming with R

Tilman M. Davies Author Of The Book of R: A First Course in Programming and Statistics

From my list on intro to programming and data science with R.

Tilman M. Davies Why did Tilman love this book?

From well-known authorities in the R-sphere (including a former R Core Team member), this is a long-standing text whose first edition was one of the early books intended to teach R to beginners. It provides concise instructions and examples on how R is used as a programming language before focusing on 'number-crunching' statistical methods that are typically seen as computationally intensive. One notable feature of this book is that the statistical methods at hand are not just illustrated using 'black-box' code: readers who are so inclined are given the necessary mathematical detail to understand what's going on behind the scenes.

By W. John Braun, Duncan J. Murdoch

Why should I read it?

1 author picked A First Course in Statistical Programming with R as one of their favorite books, and they share why you should read it.

What is this book about?

This third edition of Braun and Murdoch's bestselling textbook now includes discussion of the use and design principles of the tidyverse packages in R, including expanded coverage of ggplot2, and R Markdown. The expanded simulation chapter introduces the Box-Muller and Metropolis-Hastings algorithms. New examples and exercises have been added throughout. This is the only introduction you'll need to start programming in R, the computing standard for analyzing data. This book comes with real R code that teaches the standards of the language. Unlike other introductory books on the R system, this book emphasizes portable programming skills that apply to most…


Book cover of Computer Age Statistical Inference: Algorithms, Evidence, and Data Science

Ron S. Kenett Author Of The Real Work of Data Science: Turning Data into Information, Better Decisions, and Stronger Organizations

From my list on how numbers turn into information.

Why am I passionate about this?

I was trained as a mathematician but have always been motivated by problem-solving challenges. Statistics and analytics combine mathematical models with statistical thinking. My career has always focused on this combination and, as a statistician, you can apply it in a wide range of domains. The advent of big data and machine learning algorithms has opened up new opportunities for applied statisticians. This perspective complements computer science views on how to address data science. The Real Work of Data Science covers 18 areas (18 chapters) that need to be pushed forward in order to turn data into information, better decisions, and stronger organizations.

Ron's book list on how numbers turn into information

Ron S. Kenett Why did Ron love this book?

The text covers classic statistical inference, early computer-age methods, and twenty-first-century topics. This offers a unique perspective on current analytic technologies labeled machine learning, artificial intelligence, and statistical learning. The examples provide a powerful description of the methods covered, and the compare-and-contrast sections highlight the evolution of analytics. This book by Efron and Hastie is a natural follow-up source for readers interested in more details.

By Bradley Efron, Trevor Hastie

Why should I read it?

2 authors picked Computer Age Statistical Inference: Algorithms, Evidence, and Data Science as one of their favorite books, and they share why you should read it.

What is this book about?

The twenty-first century has seen a breathtaking expansion of statistical methodology, both in scope and influence. 'Data science' and 'machine learning' have become familiar terms in the news, as statistical methods are brought to bear upon the enormous data sets of modern science and commerce. How did we get here? And where are we going? How does it all fit together? Now in paperback and fortified with exercises, this book delivers a concentrated course in modern statistical thinking. Beginning with classical inferential theories - Bayesian, frequentist, Fisherian - individual chapters take up a series of influential topics: survival analysis, logistic…