Machine Learning, Simplified. Be a Part of the Conversation.

What’s all the buzz about? Machine learning is a frequently dropped buzzword in today’s tech environment, yet it is rarely explained well. People often refer to machine learning algorithms as a black box, and while there may be certain aspects of machine learning that lack …

A Must-have Algorithm for Your Machine Learning Toolbox: XGBoost

One of the most performant machine learning algorithms, XGBoost is a supervised learning algorithm that can be used for both regression & classification. Like all algorithms, it has its virtues & drawbacks, which we’ll be sure to walk through. For this post, we’ll just be learning about XGBoost in the context of classification problems. …

Random Forest for Classification in R

Introduction How are bagged trees & random forests similar? Random forests are similar to bagged trees in that each tree in a random forest or bagged tree model is trained on a random subset of the data. In fact, this process of sampling different groups of the data to train separate models is an ensemble method …

Learn Bagged Trees for Classification in R

Introduction Hi there! Get ready to become a bagged tree expert! Bagged trees are famous for improving the predictive capability of a single decision tree. The way we use & evaluate them in R is also very similar to decision trees. Check out my other post on decision trees if you aren’t familiar with them, …

Revolutionize Product with AB Testing in R

Introduction What is AB testing? When it comes to your typical product or engineering org, team members are often left wondering whether the thing they did had an impact, or whether the option they went with among many different design options was actually the best. As these organizations want to move towards data-informed design …

Learn Classification with Decision Trees in R

Introduction When it comes to classification, a decision tree classifier is one of the easiest to use. Why use a decision tree? It is incredibly easy to interpret. It handles missing data & outliers very well and as such requires far less up-front cleaning. You get to forgo the categorical variable encoding as decision …

How to do feature engineering with categorical data in R

Purpose: Machine learning models have difficulty interpreting categorical data; feature engineering allows us to re-contextualize our categorical data to improve the rigor of our machine learning models. Feature engineering also provides added layers of perspective to data analysis. The big question that feature engineering approaches answer is: how do I utilize my data in interesting …