My introduction to machine learning

29/12/2019 • 2 minute read

A friend at work recently showed me some work they'd been doing with machine learning using fast.ai, a high-level machine learning framework built on PyTorch. The results they'd got were really impressive, and it inspired me to go away and do some experimenting. They suggested I watch Practical Deep Learning for Coders, a course that assumes you know coding principles but not a lot about machine learning (me 🙋‍♂️).

After doing the first and second courses I wanted to get into the code and build something for myself, so I decided to build a machine learning classifier that could distinguish between two types of roller skates: aggressive skates, which have smaller wheels and a lower chassis, and inline / street skates, which have larger wheels and a smaller chassis. The differences are subtle, but distinct enough for a classifier to pick up on.

The first challenge was to collect a training set of images for each type; I used this guide to download a batch of training images. After some file renaming, the help of fast.ai, and the free tier of Gradient, I had a Jupyter notebook and all my images downloaded and ready to go.

Once I had everything set up, I could just modify the first lesson from the first course to fit the roller skate images, creating two classes, aggressive_skates and inline_skates, for the classifier to learn. The training function ran pretty quickly (~1 min 30 secs) and then I could see the error rate: about 6%, which was pretty good.

[Screenshot: training results, 29/12/2019, showing a ~6% error rate]

This was really cool to see. So cool, in fact, that I adapted the code from the next lesson to let me upload an image and ask the stored model to predict its class.

[Screenshot: prediction result]

And it worked! Granted, it had a 50/50 chance, but I tested it three more times with different images and it got it right every time.
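The error rate fast.ai reports is just the fraction of validation images the model labels wrongly (i.e. 1 − accuracy). Here's a toy sketch of the metric itself, independent of any fast.ai code; the label lists below are made up purely for illustration:

```python
def error_rate(predicted: list[str], actual: list[str]) -> float:
    """Fraction of predictions that don't match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("prediction and label lists must be the same length")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)

# Pretend validation set: 50 images, of which the model got 3 wrong.
predictions = ["aggressive_skates"] * 25 + ["inline_skates"] * 25
labels = predictions.copy()
labels[:3] = ["inline_skates"] * 3  # these three predictions now disagree

print(error_rate(predictions, labels))  # → 0.06, i.e. a 6% error rate
```

With only two classes, remember a coin flip already achieves a 50% error rate, which is why the ~6% figure is the interesting part.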

So these were my baby steps into machine learning, and I was amazed how abstracted and easy the process was with fast.ai. The only problem is that this level of abstraction makes it really easy to miss the theory and rely on magic, but it's all covered in later lessons of the course mentioned above. What the abstraction does give you, though, is a super accessible way to get started and begin making real-world applications with this tech.

Update: I've since created a repo with the learned model and a test that gives it a new image to predict the class of.

Update 2: I found a fresh image on Reddit of a pair of aggressive skates that the model predicted incorrectly; this is the first failure I've found. Need to train the model some more!

Rob is a web engineer working in London, who focusses on performance, simple code and accessible design.