Machine-learning

A Dashboard with Weak AI

Most dashboards display the same variables each day. I wanted something fresh, a dashboard that changed based on spikes in the data.

Lessons From My First Kaggle Contest

Kaggle is a forum for interacting with other data scientists and competing to see who can write code that will best predict features of data. It’s a way to test your skills at statistics and machine learning, and to do a lot of human learning in the process (sorry, bad pun). When I entered the contest to categorize crimes that occurred in San Francisco, my initial goal was to do better than random chance.

Can Machine Learning Tell Two Fictional Characters Apart?

I wanted to see if it was possible to train a model to detect the difference between two fictional authors created by the same novelist based only on the frequency of common stop words, e.g., “the,” “at,” and “is.”

Using Machine Learning to Detect Stylometric Differences Between Nick and Amy in Gone Girl

I wanted to see if it was possible to train a model to detect the difference between two fictional authors created by the same novelist based only on the frequency of common stop words, e.g., “the.” It worked: The randomForest model correctly selected Nick 93% of the time and Amy 91%.

Background

When I first started using R for data analysis, I was mesmerized by all of the packages and what they made possible.
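
To make the stop-word idea concrete, here is a minimal sketch of the feature-extraction step only: turning a passage of text into the relative frequency of each stop word. It is written in Python rather than R, and it is not the code from the post. The stop-word list, the tokenizing pattern, the function name `stop_word_frequencies`, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; the list used in the
# original analysis is not shown in this excerpt.
STOP_WORDS = ("the", "at", "is", "a", "and", "of", "to", "in")


def stop_word_frequencies(text, stop_words=STOP_WORDS):
    """Return each stop word's share of all word tokens in `text`."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        # Avoid dividing by zero on empty input.
        return {word: 0.0 for word in stop_words}
    counts = Counter(tokens)
    return {word: counts[word] / total for word in stop_words}


# Made-up sample sentence, not a quotation from the novel.
sample = "The cat is at the door and the dog is in the yard."
features = stop_word_frequencies(sample)
```

Each chapter or passage would yield one such set of frequencies, labeled with its narrator, and rows like these could then be handed to a classifier such as a random forest.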