The most recommended data science books

Who picked these books? Meet our 18 experts.

18 authors created a book list connected to data science, and here are their favorite data science books.

Python for Everyone

Daniel Zingaro, author of Learn to Code by Solving Problems: A Python Programming Primer

From my list of books for a rock-solid Python programming foundation.

Why am I passionate about this?

Some programmers learn through online articles, videos, and blog posts. Not me. I need a throughline—a consistent, expert distillation of the material to take me from where I am to where I want to be. I am not good at patching together information from disparate sources. I need a great book. I have a PhD in computer science education, and I want to know what helps people learn. More importantly, I want to know how we can use such discoveries to write more effective books. The books I appreciate most are those that demonstrate not only mastery of the subject matter but also mastery of teaching.

Daniel's list of books for a rock-solid Python programming foundation

Why did Daniel love this book?

I used this book for several years starting in 2013 when the first edition came out. It absolutely holds up today. Learning the Python language (the syntax) is one thing. Learning how to design programs using this syntax is another. We need both but, unfortunately, many books forgo the latter for the former. Not this book! I like the Problem Solving and Worked Example sections: they help learners apply a disciplined, step-by-step strategy to programming projects. There are multiple, varied contexts here as well, which helps capture a broader base of learners. Bonus feature: the Computing & Society boxes.

By Cay S. Horstmann, Rance D. Necaise

Why should I read it?

1 author picked Python for Everyone as one of their favorite books, and they share why you should read it.

What is this book about?

Python for Everyone, 3rd Edition is an introduction to programming designed to serve a wide range of student interests and abilities, focused on the essentials, and on effective learning. It is suitable for a first course in programming for computer scientists, engineers, and students in other disciplines. This text requires no prior programming experience and only a modest amount of high school algebra. Objects are used where appropriate in early chapters and students start designing and implementing their own classes in Chapter 9. New to this edition are examples and exercises that focus on various aspects of data science.


All-in On AI: How Smart Companies Win Big with Artificial Intelligence

Roger W. Hoerl, author of Statistical Thinking: Improving Business Performance

From my list on AI and data science that are actually readable.

Why am I passionate about this?

As a professional statistician, I am naturally interested in AI and data science. However, in our current information age, everyone, in all segments of society, needs to understand the basics of AI and data science. These basics include such things as what these disciplines are, what they can contribute to society, and perhaps most importantly, what can go wrong. However, I have found that much of the literature on these topics is highly technical and beyond the reach of most readers. These books are specifically selected because they are readable by virtually everyone, and yet convey the key concepts needed to be data-literate in the 21st century. Enjoy!

Roger's book list on AI and data science that are actually readable

Why did Roger love this book?

Books on AI often go to extremes, either promoting it as the solution to all the world’s problems, or depicting it as an evil that will destroy humanity.

This book is much more practical and is based on experience using AI in actual business applications. It is the result of considerable research into applications not only in Silicon Valley but also at companies across various business sectors, such as Airbus, Ping, Progressive Insurance, and Capital One Bank.

Don’t let the title fool you; this book is not simply a promotion of AI, but addresses the practical issues that have to be considered if success is to be achieved. For example, they argue that “the most important aspect in AI success is not machinery, but human leadership, behavior, and change.”

By Thomas H. Davenport, Nitin Mittal

Why should I read it?

1 author picked All-in On AI as one of their favorite books, and they share why you should read it.

What is this book about?

A Wall Street Journal bestseller

A Publishers Weekly bestseller

A fascinating look at the trailblazing companies using artificial intelligence to create new competitive advantage, from the author of the business classic, Competing on Analytics, and the head of Deloitte's US AI practice.

Though most organizations are placing modest bets on artificial intelligence, there is a world-class group of companies that are going all-in on the technology and radically transforming their products, processes, strategies, customer relationships, and cultures.

Though these organizations represent less than 1 percent of large companies, they are all high performers in their industries. They have better business…


The Golem: What You Should Know about Science

Aubrey Clayton, author of Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science

From my list of books for data scientists trying to be ethical people.

Why am I passionate about this?

I studied statistics and data science for years before anyone ever suggested to me that these topics might have an ethical dimension, or that my numerical tools were products of human beings with motivations specific to their time and place. I’ve since written about the history and philosophy of mathematical probability and statistics, and I’ve come to understand just how important that historical background is and how critically important it is that the next generation of data scientists understand where these ideas come from and their potential to do harm. I hope anyone who reads these books avoids getting blinkered by the ideas that data = objectivity and that science is morally neutral.

Aubrey's list of books for data scientists trying to be ethical people

Why did Aubrey love this book?

The thing you should know about science is that it’s a human enterprise. As a result, it’s dependent on human factors like social consensus and prejudice. In this series of case studies of famously expensive and difficult-to-replicate experiments probing the limits of scientific understanding from biology to theoretical physics, Collins and Pinch show how scientific knowledge gathering is rarely straightforward because there are always alternative explanations available for the data. Was the phenomenon real or was the experiment set up badly? We can never know for sure, but we decide collectively what we believe. Scientists are experts participating in human culture, they argue, not mysterious clergy issuing declarations of absolute truth.

By Harry M. Collins, Trevor Pinch

Why should I read it?

1 author picked The Golem as one of their favorite books, and they share why you should read it.

What is this book about?

Harry Collins and Trevor Pinch liken science to the Golem, a creature from Jewish mythology, powerful yet potentially dangerous, a gentle, helpful creature that may yet run amok at any moment. Through a series of intriguing case studies the authors debunk the traditional view that science is the straightforward result of competent theorisation, observation and experimentation. The very well-received first edition generated much debate, reflected in a substantial new Afterword in this second edition, which seeks to place the book in what have become known as 'the science wars'.


The Practice of Management

Jeremy Adamson, author of Minding the Machines: Building and Leading Data Science and Analytics Teams

From my list of books for data science and analytics leaders.

Why am I passionate about this?

I am a leader in analytics and AI strategy, and have a broad range of experience in aviation, energy, financial services, and the public sector. I have worked with several major organizations to help them establish a leadership position in data science and to unlock real business value using advanced analytics.

Jeremy's list of books for data science and analytics leaders

Why did Jeremy love this book?

Management as a skill is typically established and honed by osmosis, mimicry, and corporate crash courses. Data scientists pursuing management roles need to understand management from base principles to create meaningful change and establish productive team conventions. After almost 70 years, Drucker’s book still stands up as a foundational piece of reading.

By Peter F. Drucker

Why should I read it?

1 author picked The Practice of Management as one of their favorite books, and they share why you should read it.

What is this book about?

A classic since its publication in 1954, The Practice of Management was the first book to look at management as a whole and being a manager as a separate responsibility. The Practice of Management created the discipline of modern management practices. Readable, fundamental, and basic, it remains an essential book for students, aspiring managers, and seasoned professionals.


Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals

Jeremy Adamson, author of Minding the Machines: Building and Leading Data Science and Analytics Teams

From my list of books for data science and analytics leaders.

Jeremy's list of books for data science and analytics leaders

Why did Jeremy love this book?

Data scientists and analytics specialists are great at building models and algorithms, but often wrap them in a presentation or dashboard that diminishes their value and reduces the likelihood of their work being adopted. This book encourages practitioners to always consider the last mile and to pay as much attention to presentation and aesthetics as they do to the model itself.

By Brent Dykes

Why should I read it?

1 author picked Effective Data Storytelling as one of their favorite books, and they share why you should read it.

What is this book about?

Master the art and science of data storytelling, with frameworks and techniques to help you craft compelling stories with data.

The ability to effectively communicate with data is no longer a luxury in today's economy; it is a necessity. Transforming data into visual communication is only one part of the picture. It is equally important to engage your audience with a narrative, to tell a story with the numbers. Effective Data Storytelling will teach you the essential skills necessary to communicate your insights through persuasive and memorable data stories.

Narratives are more powerful than raw statistics, more enduring than pretty charts. When…


R in Action: Data Analysis and Graphics with R

Tilman M. Davies, author of The Book of R: A First Course in Programming and Statistics

From my list on intro to programming and data science with R.

Why am I passionate about this?

I’m an applied statistician and academic researcher/lecturer at New Zealand’s oldest university – the University of Otago. R facilitates everything I do – research, academic publication, and teaching. It’s the latter part of my job that motivated my own book on R. From first-year statistics students who have never seen R to my own Ph.D. students using R to implement novel and highly complex statistical methods and models, my experience is that all ultimately love the ease with which the R language permits exploration, visualisation, analysis, and inference of one’s data. The ever-growing need in today’s society for skilled statisticians and data scientists means there's never been a better time to learn this essential language.

Tilman's book list on intro to programming and data science with R

Why did Tilman love this book?

This provides a superb balance between technical aspects of R coding and the statistical methods that motivate its use. It's rare to find a book on topics like this that is written in Kabacoff's easygoing yet precise style, which makes it ideal for beginners. From my own experience, it is obvious the author has spent many years teaching this type of content, knowing where things deserve extra explanation up front and where other more technical details can be relegated to more advanced texts.

By Robert I. Kabacoff

Why should I read it?

1 author picked R in Action as one of their favorite books, and they share why you should read it.

What is this book about?

R is a powerful language for statistical computing and graphics that can handle virtually any data-crunching task. It runs on all important platforms and provides thousands of useful specialized modules and utilities. This makes R a great way to get meaningful information from mountains of raw data.

R in Action, Second Edition is a language tutorial focused on practical problems. Written by a research methodologist, it takes a direct and modular approach to quickly give readers the information they need to produce useful results. Focusing on realistic data analyses and a comprehensive integration of graphics, it follows the steps that…


Jumpstart Snowflake: A Step-by-Step Guide to Modern Cloud Analytics

Valliappa Lakshmanan, author of Data Science on the Google Cloud Platform: Implementing End-To-End Real-Time Data Pipelines: From Ingest to Machine Learning

From my list of books to read if you want to become a data scientist.

Why am I passionate about this?

I started my career as a research scientist building machine learning algorithms for weather forecasting. Twenty years later, I found myself at a precision agriculture startup creating models that provided guidance to farmers on when to plant, what to plant, etc. So, I am part of the movement from academia to industry. Now, at Google Cloud, my team builds cross-industry solutions and I see firsthand what our customers need in their data science teams. This set of books is what I suggest when a CTO asks how to upskill their workforce, or when a graduate student asks me how to break into the industry.

Valliappa's list of books to read if you want to become a data scientist

Why did Valliappa love this book?

In industry, your data is very likely to live within a data warehouse such as BigQuery, Redshift, or Snowflake. Therefore, to be an effective data scientist in the industry, you should learn how to use data warehouses effectively. 

Once you learn data warehousing and SQL with any one of these products, it is quite easy to pick up another. So which one do you start with?

You can use Snowflake on all three of the major public clouds. Because it’s a standalone product, it is the most similar to a “traditional” data warehouse and can be picked up easily even if you are not familiar with cloud computing. That makes it a good data warehouse to start with, and is the reason my second book pick is this book on Snowflake.

BigQuery is also available on all three major public clouds, but it works best (and is used most commonly)…

By Dmitry Anoshin, Dmitry Shirokov, Donna Strok

Why should I read it?

1 author picked Jumpstart Snowflake as one of their favorite books, and they share why you should read it.

What is this book about?

Explore the modern market of data analytics platforms and the benefits of using Snowflake computing, the data warehouse built for the cloud.

With the rise of cloud technologies, organizations prefer to deploy their analytics using cloud providers such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform. Cloud vendors are offering modern data platforms for building cloud analytics solutions to collect data and consolidate into single storage solutions that provide insights for business users. The core of any analytics framework is the data warehouse, and previously customers did not have many choices of platform to use.

Snowflake was…


A First Course in Statistical Programming with R

Tilman M. Davies, author of The Book of R: A First Course in Programming and Statistics

From my list on intro to programming and data science with R.

Tilman's book list on intro to programming and data science with R

Why did Tilman love this book?

From well-known authorities in the R-sphere (including a former R Core Team member), this is a long-standing text whose first edition was one of the early books intended to teach R to beginners. It provides concise instructions and examples on how R is used as a programming language before focusing on 'number-crunching' statistical methods that are typically seen as computationally intensive. One notable feature of this book is that the statistical methods at hand are not just illustrated using 'black-box' code: for those who are so inclined, the reader is provided with the necessary mathematical detail to understand what's going on behind the scenes.

By W. John Braun, Duncan J. Murdoch

Why should I read it?

1 author picked A First Course in Statistical Programming with R as one of their favorite books, and they share why you should read it.

What is this book about?

This third edition of Braun and Murdoch's bestselling textbook now includes discussion of the use and design principles of the tidyverse packages in R, including expanded coverage of ggplot2, and R Markdown. The expanded simulation chapter introduces the Box-Muller and Metropolis-Hastings algorithms. New examples and exercises have been added throughout. This is the only introduction you'll need to start programming in R, the computing standard for analyzing data. This book comes with real R code that teaches the standards of the language. Unlike other introductory books on the R system, this book emphasizes portable programming skills that apply to most…
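
For readers curious about the Box-Muller algorithm named in the description, here is a minimal sketch of the idea: two uniform random draws are transformed into two independent standard normal draws. It is written in Python rather than the R used in the book, it is not taken from the book, and the function names `box_muller` and `sample_normals` are hypothetical.

```python
import math
import random


def box_muller(rng: random.Random) -> tuple[float, float]:
    """Turn two uniform draws into two independent standard normal draws."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so that log(u1) is defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def sample_normals(n: int, seed: int = 0) -> list[float]:
    """Collect n standard normal draws, two at a time."""
    rng = random.Random(seed)
    draws: list[float] = []
    while len(draws) < n:
        draws.extend(box_muller(rng))
    return draws[:n]
```

With a large sample, the mean of `sample_normals(n)` should sit near 0 and the variance near 1, which is a quick sanity check on the transform.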


Effective Pandas

Valliappa Lakshmanan, author of Data Science on the Google Cloud Platform: Implementing End-To-End Real-Time Data Pipelines: From Ingest to Machine Learning

From my list of books to read if you want to become a data scientist.

Valliappa's list of books to read if you want to become a data scientist

Why did Valliappa love this book?

Even if you are ultimately going to be working with terabytes of data, you’ll start out doing exploratory data analysis. The tool that you’ll use for that is most likely going to be Pandas. One of the best investments that you can make when becoming a data scientist is to become a Pandas expert, and there is no better book than Harrison’s to help you get there. Plus, many of the interview questions you will face during the hiring process will probably involve Pandas. Blow your interviewers out of the water by showing them corners of the Pandas library they didn’t even know!

By Matt Harrison

Why should I read it?

1 author picked Effective Pandas as one of their favorite books, and they share why you should read it.

What is this book about?

Best practices for manipulating data with Pandas. This book will arm you with years of knowledge and experience that are condensed into an easy to follow format. Rather than taking months reading blogs and websites and searching mailing lists and groups, this book will teach you how to write good Pandas code.

It covers Series manipulation; creating columns; summary statistics; grouping, pivoting, and cross-tabulation; time series data; visualization; chaining; debugging code; and more...
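
As a rough illustration of a few of the topics listed above (creating columns, grouping, and chaining), the following sketch strings several Pandas operations into one chained expression. It is not taken from the book; the `sales` data and its column names are invented for the example.

```python
import pandas as pd

# Hypothetical sales records, invented purely for illustration.
sales = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south", "south"],
        "units": [10, 15, 7, 3, 20],
        "price": [2.0, 2.0, 3.0, 3.0, 1.5],
    }
)

summary = (
    sales
    # Create a new column from existing ones.
    .assign(revenue=lambda df: df["units"] * df["price"])
    # Group rows by region and compute summary statistics per group.
    .groupby("region", as_index=False)
    .agg(total_units=("units", "sum"), total_revenue=("revenue", "sum"))
    # Order regions from highest to lowest revenue.
    .sort_values("total_revenue", ascending=False)
    .reset_index(drop=True)
)
```

Each step returns a new DataFrame, so the whole transformation reads top to bottom without intermediate variables.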


Competing on Analytics: The New Science of Winning

Jeremy Adamson, author of Minding the Machines: Building and Leading Data Science and Analytics Teams

From my list of books for data science and analytics leaders.

Jeremy's list of books for data science and analytics leaders

Why did Jeremy love this book?

This is a foundational book on analytics and data science as a business function and helped to shape the development of the practice. It provides a view of the discipline through a business lens and avoids deep technical examinations. Though much has changed in the 15 years since it was originally published, it is still essential reading for a leader in the field. No book since has captured as well the competitive differentiation that analytics provides.

By Thomas H. Davenport, Jeanne G. Harris

Why should I read it?

1 author picked Competing on Analytics as one of their favorite books, and they share why you should read it.

What is this book about?

You have more information at hand about your business environment than ever before. But are you using it to "out-think" your rivals? If not, you may be missing out on a potent competitive tool. In Competing on Analytics: The New Science of Winning, Thomas H. Davenport and Jeanne G. Harris argue that the frontier for using data to make decisions has shifted dramatically. Certain high-performing enterprises are now building their competitive strategies around data-driven insights that in turn generate impressive business results. Their secret weapon? Analytics: sophisticated quantitative and statistical analysis and predictive modeling. Exemplars of analytics are using new…