What is Bootstrap Replication & How Do I Use it?

What is bootstrap replication? For those catching up here, bootstrap sampling refers to the process of sampling a given dataset ‘with replacement’… and this is where most people get lost. You draw many resamples, compute a statistic on each, and use the resulting distribution to mark your confidence interval. Let’s take a quick example. Crypto at College: Let’s say that you …
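The resample-and-summarize idea in the excerpt above can be sketched in a few lines. This is a minimal illustration, not the post's own code: it assumes the percentile method for the interval, and the function name `bootstrap_ci`, its parameters, and the sample values are all made up for the example.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    # Each replicate: draw n values WITH replacement, then compute the statistic.
    replicates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Cut off alpha/2 of the sorted replicates on each side.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return replicates[lo_idx], replicates[hi_idx]

# Made-up measurements, purely for illustration.
sample = [12.1, 9.8, 11.4, 10.3, 13.7, 8.9, 10.8, 12.5, 9.5, 11.0]
low, high = bootstrap_ci(sample)
print(f"95% bootstrap CI for the mean: ({low:.2f}, {high:.2f})")
```

Because sampling is done with replacement, each resample can repeat some observations and omit others, which is what gives the replicates their spread.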

Build your First Chatbot in three minutes

A 30-second explanation of how chatbots work: Whether you’re a data scientist, data analyst, or software engineer, and whether or not you have a strong handle on NLP tools and approaches, if you’re here, you’ve likely wondered how a chatbot works and how to build one but haven’t had the need or the chance. Well… you’re here …

Three Key Charts for Visualizing Proportion Data

Proportion data examples: Whatever your application of data analytics & data science, there are proportions everywhere. Proportions are all about understanding the different parts that make up a whole. Proportions are pretty much just a count of something across a given categorical variable. That could be the number of customers across different industries, the number of sales …
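As a rough sketch of "a count of something across a given categorical variable", the snippet below turns category counts into shares of the whole. The helper name `proportions` and the industry labels are invented for illustration and are not taken from the post.

```python
from collections import Counter

def proportions(values):
    """Return each category's share of the total (hypothetical helper)."""
    counts = Counter(values)           # count per category
    total = sum(counts.values())       # size of the whole
    return {category: count / total for category, count in counts.items()}

# Made-up customer industries, purely for illustration.
industries = ["retail", "tech", "retail", "finance", "tech", "retail"]
shares = proportions(industries)
for category, share in shares.items():
    print(f"{category}: {share:.0%}")
```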

Build Your First Neural-Network with Keras

What is Keras? Keras is a deep learning framework that sits on top of backend frameworks like TensorFlow. Why use Keras? Keras is excellent because it allows you to experiment with different neural nets with great speed! It sits atop other excellent frameworks like TensorFlow, and lends itself well to the experienced as well as novice data …

Random Forest for Classification in R

Introduction How are bagged trees & random forests similar? Random forests are similar to bagged trees in that each tree in a random forest or bagged tree model is trained on a random subset of the data. In fact, this process of sampling different groups of the data to train separate models is an ensemble method …

Learn Bagged Trees for Classification in R

Introduction Hi there! Get ready to become a bagged tree expert! Bagged trees are famous for improving the predictive capability of a single decision tree. The way we use & evaluate them in R is also very similar to decision trees. Check out my other post on decision trees if you aren’t familiar with them, …

Revolutionize Product with AB Testing in R

Introduction What is AB testing? When it comes to your typical product or engineering org, team members are often left wondering whether the thing they did had an impact, or whether the option they went with among many different design options was actually the best. As these organizations want to move towards data-informed design …

Learn Classification with Decision Trees in R

Introduction When it comes to classification, a decision tree classifier is one of the easiest to use. Why use a decision tree? It is incredibly easy to interpret; it handles missing data & outliers very well and as such requires far less up-front cleaning; you get to forgo the categorical variable encoding as decision …

How to do feature engineering with categorical data in R

Purpose: Machine learning models have difficulty interpreting categorical data; feature engineering allows us to re-contextualize our categorical data to improve the rigor of our machine learning models. Feature engineering also provides added layers of perspective to data analysis. The big question that feature engineering approaches solve is: how do I utilize my data in interesting …

Getting Started with Experimental Design in R

This quick blog is designed to help you get off to the races quickly in the world of data science, and here specifically, experimental design. Enjoy! When it comes to experimental design, there are three main steps it can be broken down into: planning, design, and analysis. Planning & Design Planning should always begin with a well-formed hypothesis. …