Data Science from Scratch Review: Find Out What’s Under The Hood

I remember when I first decided that I wanted to learn data science.

I was fresh out of college and working a job as an engineer that I wasn’t in love with.

I knew I wanted a change, but I needed to figure out where to start.

So, I did what any 20-something-year-old does these days: I went to Google and looked for solutions.

Along with finding 10,000 random YouTube tutorials, I learned about Coursera, Udemy (I’ve taken SO many Udemy courses), and other online course providers.

And, for a while, I took advantage of them.

I completed dozens of courses in programming, math, and statistics.

But, after a while, I realized something wasn’t quite right.

Sure, I learned a lot of helpful information.

But, I wasn’t making the progress that I wanted to make.

Nothing was connecting.

I was learning (very disconnectedly) the “why it’s done” but not the “how it’s done” of data science.

That’s when a friend recommended a book to me: “Data Science From Scratch.”

And, let me tell you, it was a game-changer for me.


What is Data Science from Scratch?

Data Science from Scratch is a book by Joel Grus that takes you through data science from A to Z.

This book will (literally) teach you the basics of data science, starting with the very first step, without leaning on data science libraries (no pandas here!!) or shortcuts.
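
To give a feel for what “from scratch” means, here is a minimal sketch of my own (not code copied from the book) that computes a mean, a dot product, and a sample variance with nothing but plain Python lists. The function names are just ones I picked for illustration.

```python
from typing import List

Vector = List[float]

def mean(xs: Vector) -> float:
    """Average of a list of numbers."""
    return sum(xs) / len(xs)

def dot(v: Vector, w: Vector) -> float:
    """Sum of the element-wise products of two equal-length vectors."""
    assert len(v) == len(w), "vectors must be the same length"
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

def variance(xs: Vector) -> float:
    """Sample variance: summed squared deviations divided by n - 1."""
    assert len(xs) >= 2, "variance needs at least two values"
    x_bar = mean(xs)
    deviations = [x - x_bar for x in xs]
    return dot(deviations, deviations) / (len(xs) - 1)

print(mean([1, 2, 3, 4]))          # 2.5
print(dot([1, 2, 3], [4, 5, 6]))   # 32
print(variance([1, 2, 3, 4]))
```

Notice that the variance is built on top of the dot product. That kind of small-pieces-stacked-together style is what I mean by seeing under the hood.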

You’ll learn about fundamental concepts such as basic statistics, core data science algorithms, visualizations every data scientist uses, and so much more.

There are 27 chapters within the book; each is explained well, as Joel Grus provides many examples, exercises, and quick tips (like most O’Reilly books) to help reinforce your understanding.

Below, I’ve listed the chapters from the table of contents:

  1. Introduction
  2. A Crash Course In Python
  3. Visualizing Data
  4. Linear Algebra
  5. Statistics
  6. Probability
  7. Hypothesis and Inference
  8. Gradient Descent
  9. Getting Data
  10. Working With Data
  11. Machine Learning
  12. k-Nearest Neighbors
  13. Naive Bayes
  14. Simple Linear Regression
  15. Multiple Regression
  16. Logistic Regression
  17. Decision Trees
  18. Neural Networks
  19. Deep Learning
  20. Clustering
  21. Natural Language Processing
  22. Network Analysis
  23. Recommender Systems
  24. Databases and SQL
  25. MapReduce
  26. Data Ethics
  27. Next in Data Science

In just 27 chapters, Joel can take you from “What is Python” to NLP and some powerful machine learning models!

While this seems like a “long” book, it’s only about 350 pages, so most chapters are thoroughly explained in roughly 13 pages.

If you’re looking for an “under the hood” guide to data science, the one that shows you how these algorithms are made (and why), then Data Science from Scratch is worth picking up and reading.


A Couple of My Favorite Things About Data Science from Scratch

1. Learn the actual code that makes data science, data science.

2. Understand the principles, code, and ideas behind famous data science software, libraries, modules, processes, and toolkits.

3. Master the tools you need to be a data scientist.

4. Get familiar with the simple mathematics that data science is built upon.

5. Gain the “hacking” skills needed to thrive as a data scientist in the real world.

6. Follow a step-by-step progression through data science, from simple topics like visualizing your data through to recommender systems and natural language processing.

7. Familiarize yourself with reading other data scientists’ code.

8. Learn more Python!

9. Work through it all even without an awesome CPU/computer set-up.

As you can see, one of the top benefits of Data Science from Scratch is that you learn the actual code that makes data science work.

This is important because it allows you to understand the principles and ideas behind data science software, libraries, modules, processes, and toolkits.

Not only that, but while you’re taking all of this in – you’re gaining insights that many of your teammates won’t have.

You’re essentially re-building the tools you’ll use daily as a data scientist.

Think about it this way – would you trust a mechanic that doesn’t know how an engine works?

(I wouldn’t)

Learning these skills will come in handy as you move on to more complex concepts and algorithms later in your data science journey.

Finally, the book contains new material on deep learning, statistics, and natural language processing.

This makes it an excellent resource for staying up-to-date with the latest advancements in data science.


Best Features of Data Science from Scratch

  • Comprehensive guide to data science
  • Covers Python, linear algebra, statistics, probability, hypothesis testing, inference, machine learning, etc.
  • Well written and easy to understand
  • Includes exercises and examples

One of the best features of Data Science from Scratch is that it is a comprehensive guide to data science.

This means you will learn about all the essential topics and concepts related to data science, such as Python, linear algebra, statistics, probability, hypothesis testing, inference, machine learning, etc.

The whole book is basically a walk-through, so you won’t be short on “engaged learning.”


Data Science from Scratch Pros and Cons

While there are many reasons to purchase Data Science From Scratch, here are what I see as its main pros and cons.

Pros

  • Learn the actual code that makes data science work, giving you an informational advantage over your peers.
  • Understand the principles and ideas behind data science libraries, frameworks, modules, and toolkits
  • With an understanding of how things work, you’ll be able to add improvements and innovations to your data science projects.
  • Get comfortable with mathematics and statistics at the core of data science.
  • Fill any “gaps” you may not know you have, allowing you complete confidence in interviews.
  • Gain hacking skills needed to get started in data science
  • See how another data scientist writes clean, readable code.
  • Tons of Python 3 experience
  • Physical book (always my preference) that can be marked up and taken anywhere (write code on the couch!!)


Cons

  • The book does not go super in-depth on any particular topic, mainly showing how to code all parts of data science and then moving on.
  • Some readers may find the book too code-dense, as a bit of Python skill is required.
  • Some chapters need more exercises and examples.
  • More the “how” it’s done, not “why” it’s done (I liked this, but it could be a problem for some).
  • Not a quick look-up resource.


The book Data Science From Scratch is an excellent resource for those wanting to learn how to code in data science.

It covers a wide range of topics, moving quickly from one to the next, and provides readers with many worked examples.

However, some readers may find the book too code-dense, as a bit of Python skill is required.

Additionally, some chapters need more exercises and examples.

The book does provide readers with a wealth of knowledge on “how” data science is done, but it falls short of explaining “why” specific methods are used.

Ultimately, this isn’t a quick look-up resource, but it’s an excellent resource for those wanting to learn more about coding in data science.


My first experience with Data Science From Scratch

I was always one for understanding the how and why of things.

When I was younger, I’d always bug my dad about why things worked and was always interested in going deeper on topics than most.

That’s why I needed a book to show me how everything worked when I started learning data science.

When I started, there were so many different modules and packages that people were throwing at me, and it was all so overwhelming.

But once I found this book, everything started to make sense.

The author walked me through different algorithms and techniques and showed me how they could be applied in practice.

While I still use the packages that all those people were throwing out at me, I now know why they were throwing them at me – which is a HUGE difference.

And who knows – maybe one day I’ll write my own book on data science!


Is Data Science From Scratch Good For Interviews?

This book is great for data scientists who want to excel in technical interviews.

As the title suggests, it is highly technical and code-heavy.

This can be a benefit or a drawback depending on your learning style.

If you like to ease into topics and need a lot of hand-holding, this book might not be for you.

However, if you are a self-learner who likes to figure things out on your own, this book will prepare you very well for the types of questions you will face in technical interviews.

The book covers various topics, from basic statistics to machine learning algorithms.

The author provides code examples in each chapter to illustrate the key concepts.

On a personal note, I once had the famous XOR problem show up as an interview question, and I remembered from the Deep Learning chapter (Chapter 19) what had changed to make it solvable.
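
If you haven’t run into it before: XOR can’t be separated by a single straight line, so a single perceptron can’t compute it, but a network with one hidden layer can. Here is a small sketch of my own to show the idea. The weights are picked by hand for illustration (nothing here is trained, and it is not the book’s code), and the names `step`, `neuron`, and `xor_network` are just ones I made up.

```python
from typing import List

def step(x: float) -> int:
    """Threshold activation: fire (1) when the input is non-negative."""
    return 1 if x >= 0 else 0

def neuron(weights: List[float], bias: float, inputs: List[float]) -> int:
    """Weighted sum of the inputs plus a bias, passed through the step function."""
    return step(sum(w * i for w, i in zip(weights, inputs)) + bias)

def xor_network(x1: int, x2: int) -> int:
    """Two hidden neurons (OR and AND) feeding one output neuron."""
    h_or = neuron([1, 1], -0.5, [x1, x2])    # fires if at least one input is 1
    h_and = neuron([1, 1], -1.5, [x1, x2])   # fires only if both inputs are 1
    # Output fires when OR is on and AND is off, which is exactly XOR.
    return neuron([1, -1], -0.5, [h_or, h_and])

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

The hidden layer is the whole trick: neither the OR neuron nor the AND neuron can do XOR alone, but the output neuron can combine them.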

Since you can code your way through this book in a couple of weekends, it is an excellent resource for anyone who wants to brush up on their data science skills before an interview.


Conclusion: Should you buy Data Science From Scratch?

Overall, Data Science from Scratch is an excellent introduction to data science.

It covers a lot of ground and is packed with examples and exercises.

The book is well-written and easy to understand.

It is a great resource for learning some of the more timeless ideas in data science.

If you are interested in learning Python, linear algebra, statistics, probability, hypothesis testing, inference, machine learning, etc., this book is for you.

Let me know in the comments if you decide to pick up this book, and I’d love to hear your thoughts on it!!


Other Book Recommendations

At enjoymachinelearning.com, we only recommend six books.


Stewart Kaplan