In the following article, I will present a machine learning app I created from scratch.
I wanted to build an app that takes a brain MRI image as input and returns a prediction of whether or not a tumor is present in the image.
I found the idea interesting because the app could be used by anyone to determine the presence (or not) of a brain tumor. No need for coding skills or knowledge about brains.
To achieve this goal, three steps were needed, i.e. the creation of a model…
In early 2020, I was working on a school project using Keras/TensorFlow. Everything was working great. Then, I shared my code with my teammate so he could start working on it too. However, for some reason, it was not working for him.
We spent hours trying to figure it out. Eventually, we realized that the problem was that we had different package versions installed. Once we managed to make it all work, I decided to find a way to keep this problem from ever occurring again.
That’s where virtual environments come in.
If you are already familiar with…
Natural Language Generation (NLG) has made incredible strides in recent years. In early 2019, OpenAI released GPT-2, a huge pretrained model (1.5B parameters) capable of generating text of human-like quality.
Generative Pretrained Transformer 2 (GPT-2) is, like the name says, based on the Transformer. It therefore uses the attention mechanism, which means it learns to focus on previous words that are most relevant to the context in order to predict the next word (for more on this, go here).
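To make "focusing on previous words" concrete, here is a minimal sketch of causal (masked) scaled dot-product attention in plain Python. It is an illustration under simplifying assumptions, not GPT-2's actual code: the toy vectors are made up, there is a single head, and the same vectors serve as queries, keys, and values, whereas a real model applies learned projections first.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(queries, keys, values):
    """For each position i, attend only to positions 0..i (never the future)."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for i, query in enumerate(queries):
        # Similarity between the current word and each word seen so far.
        scores = [
            sum(q * k for q, k in zip(query, keys[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors of the visible positions.
        output = [
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-dimensional embeddings for a 3-word sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_attention(tokens, tokens, tokens)

for position, row in enumerate(weights):
    print(position, [round(w, 3) for w in row])
```

Each row of weights only covers the current word and the words before it, which is what lets the model predict the next word without looking ahead.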
The goal of this article is to show you how you can fine-tune GPT-2 to generate context-relevant text, based on…
In most countries, people are no longer allowed into stadiums, at least not in their normal capacity.
Any fan, of any sport, will tell you that watching games without fans is just not the same. There is a missing element.
While the spectacle might not be the same, this unusual situation has allowed us to test one thing: does home field advantage still matter if fans are not there?
We know through years of observation that home field advantage is a real thing in any sport. …
In this article, I will show you how to implement the value iteration algorithm to solve a Markov Decision Process (MDP). It is one of the first algorithms you should learn when getting into reinforcement learning and artificial intelligence.
Reinforcement learning is an area of Machine Learning that focuses on having an agent learn how to behave/act in a specific environment. MDPs are simply meant to be the framework of the problem, the environment itself.
MDPs are composed of 5 elements.
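As a rough illustration of value iteration, here is a minimal sketch in plain Python. It assumes the usual textbook formulation of an MDP (states, actions, transition probabilities, rewards, and a discount factor), since the excerpt does not list its five elements. The three-state MDP, its action names, and its rewards are hypothetical, and the update is done in place rather than from a copy of the previous values.

```python
# Hypothetical MDP, for illustration only.
# TRANSITIONS[state][action] = list of (probability, next_state, reward)
TRANSITIONS = {
    "s0": {
        "stay": [(1.0, "s0", 0.0)],
        "go": [(0.8, "s1", 0.0), (0.2, "s0", 0.0)],
    },
    "s1": {
        "stay": [(1.0, "s1", 0.0)],
        "go": [(0.9, "goal", 1.0), (0.1, "s0", 0.0)],
    },
    "goal": {},  # terminal state: no actions, value stays at 0
}


def action_value(outcomes, values, gamma):
    """Expected return of one action: sum of p * (reward + gamma * V(next))."""
    return sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)


def value_iteration(transitions, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Apply the Bellman optimality update until the values stop changing."""
    values = {state: 0.0 for state in transitions}
    for _ in range(max_iters):
        delta = 0.0
        for state, actions in transitions.items():
            if not actions:
                continue  # skip terminal states
            best = max(action_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    return values


def greedy_policy(transitions, values, gamma=0.9):
    """Pick, in each non-terminal state, the action with the best expected return."""
    return {
        state: max(actions, key=lambda a: action_value(actions[a], values, gamma))
        for state, actions in transitions.items()
        if actions
    }


values = value_iteration(TRANSITIONS)
policy = greedy_policy(TRANSITIONS, values)
print(values)
print(policy)
```

Once the values have converged, the policy is read off by acting greedily with respect to them, so no separate policy search is needed.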
Form is temporary, class is permanent. That used to be the saying at Arsenal. The club just lost 7 of its last 9 games. They just lost their 4th home game in a row. They haven’t scored more than one goal in a game since October.
I’m not sure if that form can still be classified as temporary. What’s certain is that class is gone.
Most Arsenal fans were attracted to the club because of its offensive, free-flowing football. Through all the defeats, humiliations, and disappointments, at least it was always exciting. …
If you are here to learn more about the movies, sadly, this is not the article you are looking for. I love Optimus Prime and Megatron as much as the next guy, but here, I will be talking about Transformer, the deep learning model!
The Transformer was first introduced in 2017 in the paper “Attention Is All You Need”, which can be found right here. As you will see, the title is revealing.
It really has revolutionized the NLP world, so you should definitely learn all about it. The issue? It’s not the easiest model to understand.
Thankfully, in this article…
PySpark is the Python API for Apache Spark, a distributed framework made for handling big data analysis. It’s an amazing framework to use when you are working with huge datasets, and it’s becoming a must-have skill for any data scientist.
In this tutorial, I will present how to use PySpark to do exactly what you are used to seeing in a Kaggle notebook (cleaning, EDA, feature engineering and building models).
I used a database containing information about customers for a telecom company. The objective is to predict which clients will leave (Churn) in the upcoming three months. …
About 5 months ago, I stumbled upon this article on TheScore. The summary: the traditional 5 positions are no longer enough to describe NBA players. The game has changed, after all. The authors come up with a way to classify players into 9 classes, based on the way they play the game.
In this article, I will take another shot at classifying players into clusters, depending on what they do on the court. However, I will do it using data science, and more precisely K-Means clustering.
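For readers new to K-Means, here is a minimal sketch of the algorithm in plain Python. The six stat lines (points per game, assists per game) are made up for illustration and are not the data used in the article. Starting from the first k points as centroids is a simplification; library implementations usually use random or k-means++ initialization.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm: assign points to the nearest centroid, then re-average."""
    centroids = list(points[:k])  # simplification: first k points as initial centroids
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster of its closest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical players: (points per game, assists per game).
players = [
    (25.0, 4.0), (27.0, 5.0), (26.0, 3.5),   # score-first profiles
    (8.0, 9.0), (7.0, 10.0), (9.0, 8.5),     # pass-first profiles
]
labels, centroids = kmeans(players, k=2)
print(labels)
print(centroids)
```

On this toy data the two groups are clearly separated, so the scorers and the passers end up in different clusters. Real player data has many more features and usually needs scaling before clustering.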
I will also take a deeper look at what makes a winning…
Since travelling has become practically impossible, I decided to think about how I could start preparing for my next trip. That’s right, I got travel on my mind.
As most people know, plane ticket prices vary a lot. That is because airline companies use yield management, a pricing strategy that aims to anticipate consumer behaviour to maximize profits.
For instance, with the code I will present below, I checked the price of a Montreal-Oslo plane ticket 48 times per day for 2 days. The price varied between $891 and $1,298 (all data was scraped from Kayak, right here). …
Master’s degree student and aspiring data scientist