# Category: General

## Before You Start: Why Are You Interested In Data Science?

When I was younger, I loved puzzles and spent hours trying to put together the jigsaw puzzle in my family’s living room. I always enjoyed the feeling of satisfaction when I finally completed it. Puzzles are an excellent analogy for…

## 37 Research Topics In Data Science To Stay On Top Of

As a data scientist, staying on top of the latest research in your field is essential. The data science landscape changes rapidly, and new techniques and tools are constantly being developed. To keep up with the competition, you need to…

## ML 101: 8 Heatmaps in Python (Full Code)

Heatmaps are great for quickly visualizing data that normally isn’t easy to digest. However, it sometimes feels impossible to find a coding resource that shows you how to code up these heatmaps in Python, and what they’ll look like when…

## ML 101: Gini Index vs. Entropy for Decision Trees (Python)

The Gini Index and Entropy are two important concepts in decision trees and data science. While both seem similar, underlying mathematical differences separate the two. Understanding these subtle differences is important as one may work better for your machine learning…
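To make the comparison in the excerpt above concrete, here is a minimal sketch of both impurity measures using only the standard library. It assumes the usual textbook definitions (Gini impurity as one minus the sum of squared class proportions, entropy as the negative sum of p·log2(p)); the function names are illustrative and are not taken from the linked post.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the share of each class among the labels at a node."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_i^2). 0 means the node is pure."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits: sum(-p_i * log2(p_i)). 0 means the node is pure."""
    return sum(-p * math.log2(p) for p in class_proportions(labels))


# A perfectly mixed two-class node versus a pure node.
mixed = ["yes", "yes", "no", "no"]
pure = ["yes", "yes", "yes", "yes"]

print(gini_impurity(mixed), entropy(mixed))
print(gini_impurity(pure), entropy(pure))
```

Both measures are zero for a pure node and largest for an evenly mixed one; for two classes the maximum is 0.5 for Gini impurity and 1 bit for entropy, so their raw values are not directly comparable.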

## Ultimate Guide: F1 Score In Machine Learning

While you may be more familiar with choosing Precision and Recall for your machine learning algorithms, there is a statistic that takes advantage of both. The F1 Score is a statistic also used to measure the performance of a machine-learning…
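As a rough illustration of how the F1 Score combines the two, the sketch below computes it as the harmonic mean of precision and recall from raw confusion-matrix counts. The function name and the example counts are hypothetical, and returning 0.0 when a denominator is zero is one common convention, assumed here rather than drawn from the post.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, from confusion-matrix counts."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives

    # Assumed convention: treat an undefined ratio as 0.0.
    precision = true_positives / predicted_positive if predicted_positive else 0.0
    recall = true_positives / actual_positive if actual_positive else 0.0

    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
print(round(f1_score(8, 2, 4), 4))
```

Because it is a harmonic mean, the score stays low whenever either precision or recall is low, which is why it is often preferred over a simple average of the two.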

## ML 101: Parameter Versus Variable [MUST KNOW]

Although the two terms are sometimes mistakenly used interchangeably, a parameter is not a variable. However, depending on the context, a parameter and a variable can mean many different things. In this post, we’ll break down the following:…

## Machine Learning 101: Criterion vs Predictor (With Coded Examples)

In data science, there are many different ways to slice the pie. While many refer to independent and dependent variables differently, they usually mean the same thing. Your predictor variables are your independent variables, and with these, you’ll (hopefully) be…

## Machine Learning 101: Normal Distribution vs Uniform Distribution

Normal Distribution vs. Uniform Distribution: Python Code Using Pandas

```python
import numpy as np
import pandas as pd

# generate an array of normal values
normal_distribution = np.random.normal(size=100000)

# generate an array of uniform values
uniform_distribution = np.random.uniform(size=100000)

# create a…
```

## ML 101: Feature Selection with SelectKBest Using Scikit-Learn (Python)

In some machine learning problems, it’s not uncommon to have thousands of variables. While this is nice, keeping that many variables in your dataset will lead to many problems down the road. But don’t worry, we’ll teach you how to…