Machine Learning, Simplified. Be a Part of the Conversation.

What’s all the buzz about? Machine learning is a frequently dropped buzzword in today’s tech environment, yet it is rarely explained well. People often refer to machine learning algorithms as a black box, and while there may be certain aspects of machine learning that may lack …

A Must-have Algorithm for Your Machine Learning Toolbox: XGBoost

One of the most performant machine learning algorithms, XGBoost is a supervised learning algorithm that can be used for both regression & classification. Like all algorithms, it has its virtues & drawbacks, which we’ll be sure to walk through. For this post, we’ll just be learning about XGBoost in the context of classification problems. …

Build your First Chatbot in three minutes

30-second explanation of how chatbots work: Whether you’re a data scientist, data analyst, or software engineer, and whether or not you have a strong handle on NLP tools and approaches, if you’re here, you’ve likely wondered how a chatbot works and how to build one but haven’t had the need or the chance. Well… you’re here …

Three Key Charts for Visualizing Proportion Data

Proportion data examples: Whatever your application of data analytics & data science, there are proportions everywhere. Proportions are all about understanding the different parts that make up a whole. A proportion is essentially a count of something across a given categorical variable. That could be the number of customers across different industries, the number of sales …

Random Forest for Classification in R

Introduction: How are bagged trees & random forests similar? Random forests are similar to bagged trees in that each tree in a random forest or bagged tree model is trained on a random subset of the data. In fact, this process of sampling different groups of the data to train separate models is an ensemble method …

Learn Bagged Trees for Classification in R

Introduction Hi there! Get ready to become a bagged tree expert! Bagged trees are famous for improving the predictive capability of a single decision tree. The way we use & evaluate them in R is also very similar to decision trees. Check out my other post on decision trees if you aren’t familiar with them, …

Revolutionize Product with AB Testing in R

Introduction: What is AB testing? When it comes to your typical product or engineering org, team members are often left wondering whether the thing they did had an impact, or whether the option they went with among many different design options was actually the best. As these organizations want to move towards data-informed design …

Learn Classification with Decision Trees in R

Introduction: When it comes to classification, a decision tree classifier is one of the easiest models to use. Why use a decision tree? It is incredibly easy to interpret; it handles missing data & outliers very well and as such requires far less up-front cleaning; you get to forgo the categorical variable encoding as decision …

Getting Started with Experimental Design in R

This quick blog is designed to help you get off to the races quickly in the world of data science, and here specifically, experimental design. Enjoy! When it comes to experimental design, there are three main steps it can be broken down into: planning, design, and analysis. Planning & Design: Planning should always begin with a well-formed hypothesis. …

Principal Component Analysis in R

Hi there! Welcome to my blog on principal component analysis in R. Purpose: PCA is a dimensionality reduction technique, where each additional variable you include in your modeling process represents a dimension. What does it do?: In terms of what PCA actually does, it takes a dataset with high dimensionality and reduces it down …