My experience with Machine Learning
Tools
- Python
- Pandas
- Seaborn
Categories
- Machine Learning
- Data wrangling
- Data exploration
I was the runner-up in the Bosch-wide AI/ML challenge.
I got to work on data exploration and apply my understanding of machine learning concepts
to a real-world problem.
Motivation: I had participated in a small ML challenge a couple of
years earlier and was pretty far down in the rankings.
With this challenge, I wanted to put what I had learned into practice and get an objective evaluation
of my improvement.
Process: The challenge was to select 4 sets of data out of 82
to predict a certain value. While it is easy to predict the value
with all 82 sets of data, doing so is resource-intensive and hence had to be avoided.
The first task was to explore the data using plots and other methods to identify the
data sets that the value is most sensitive to.
Once the data sets were narrowed down to 4, the next task was to use a machine learning model to
predict the value as close to the actual values as possible.
All of the above steps were iterative.
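The two steps above can be sketched roughly as follows. This is a minimal illustration and not the method used in the challenge: it assumes the 82 data sets are columns of a pandas DataFrame, ranks them by absolute Pearson correlation with the value (one of many possible sensitivity measures), and uses plain least squares as a stand-in for the machine learning model. The data is synthetic, and the names `set_00` ... `set_81`, `value`, `rank_signals`, `fit_linear` and `predict` are hypothetical.

```python
import numpy as np
import pandas as pd


def rank_signals(df: pd.DataFrame, target: str) -> pd.Series:
    """Rank every non-target column by absolute Pearson correlation with the target."""
    corr = df.drop(columns=[target]).corrwith(df[target])
    return corr.abs().sort_values(ascending=False)


def fit_linear(df: pd.DataFrame, features: list, target: str) -> np.ndarray:
    """Ordinary least squares with an intercept term; returns the coefficients."""
    X = np.column_stack([df[features].to_numpy(), np.ones(len(df))])
    coef, *_ = np.linalg.lstsq(X, df[target].to_numpy(), rcond=None)
    return coef


def predict(df: pd.DataFrame, features: list, coef: np.ndarray) -> np.ndarray:
    """Apply the fitted coefficients (last entry is the intercept)."""
    X = np.column_stack([df[features].to_numpy(), np.ones(len(df))])
    return X @ coef


# Synthetic stand-in data (assumption): 82 random columns, 4 of which drive the value.
rng = np.random.default_rng(0)
n_rows, n_sets = 2000, 82
columns = [f"set_{i:02d}" for i in range(n_sets)]
data = pd.DataFrame(rng.normal(size=(n_rows, n_sets)), columns=columns)
informative = ["set_03", "set_17", "set_42", "set_64"]
weights = np.array([3.0, -2.5, 2.0, 1.5])
data["value"] = data[informative].to_numpy() @ weights + rng.normal(scale=0.1, size=n_rows)

# Hold out the last 500 rows so the error is measured on unseen data.
train, test = data.iloc[:1500], data.iloc[1500:]

# Step 1: narrow 82 data sets down to the 4 most correlated with the value.
ranking = rank_signals(train, "value")
selected = ranking.index[:4].tolist()

# Step 2: fit a model on the 4 selected sets and check the prediction error.
coef = fit_linear(train, selected, "value")
errors = predict(test, selected, coef) - test["value"].to_numpy()
rmse = float(np.sqrt(np.mean(errors ** 2)))

print("selected:", selected)
print("hold-out RMSE:", round(rmse, 3))
```

Correlation only captures linear relationships, so in practice the ranking would be cross-checked against plots (for example Seaborn scatter plots or a heatmap) before settling on the final 4, and the whole loop repeated as needed.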
Challenge: Almost everything was a challenge, from data
wrangling to understanding the theory behind the data sets to plotting the data.
One thing that helped was that the quantity of data, in terms of computer memory,
was not big, so I could train and run the ML models on my computer instead of having to
rely on servers.
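A quick way to check whether data fits comfortably in local memory is to ask pandas for its footprint. The sketch below uses a synthetic frame as a stand-in; the shape and the column names are assumptions, not the challenge data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in (assumption): 1,000 rows x 82 float64 columns.
columns = [f"set_{i:02d}" for i in range(82)]
frame = pd.DataFrame(np.zeros((1000, 82)), columns=columns)

# Bytes used by the column data. deep=True also counts the Python objects
# behind object-dtype columns (none here, but common with raw string data).
n_bytes = int(frame.memory_usage(index=False, deep=True).sum())

print(f"{n_bytes / 1e6:.2f} MB")
```

If the number is well below the machine's available RAM, training locally is usually feasible; models and intermediate copies need extra headroom on top of it.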
Lesson learnt: Data exploration and visualization are awesome. A lot of the time, the overlooked parts of the data are shown wonderfully during visualization. In a way, visualization is an art in itself.