I’ve finished my Data Science Bootcamp course and I’m a bit more free to focus on whatever I want now, so I’m going to take one more crack at the Titanic problem on Kaggle.

Now that I know a bit more than I did the first time around, I can do a slightly better job at data cleaning and feature engineering, but mainly I thought I’d try something technically known as “not throwing random models I don’t understand properly at the data”.

It’s something I think most beginners in the Titanic competition go through: after trying a basic logistic regression, we discover a mass of different classifier types we didn’t know existed, so we try a bunch from other people’s notebooks (most of which have themselves been cut & pasted from elsewhere without being understood either).
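For anyone who hasn’t tried it, here’s a minimal sketch of the kind of baseline I mean: a scikit-learn logistic regression on a handful of the standard Kaggle Titanic columns (Pclass, Sex, Age, Fare, Survived). The few rows of data here are made up so the snippet runs standalone; on Kaggle you’d load `train.csv` instead.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Fake stand-in for Kaggle's train.csv -- same column names, invented rows.
df = pd.DataFrame({
    "Pclass":   [1, 3, 3, 2, 1, 3, 2, 1],
    "Sex":      ["female", "male", "male", "female",
                 "male", "female", "male", "female"],
    "Age":      [29, 22, 35, 27, 54, 14, 40, 58],
    "Fare":     [211.3, 7.25, 8.05, 21.0, 51.9, 11.2, 13.0, 26.6],
    "Survived": [1, 0, 0, 1, 0, 1, 0, 1],
})

# One-hot encode the categorical column so the model gets numeric inputs.
X = pd.get_dummies(df[["Pclass", "Sex", "Age", "Fare"]], columns=["Sex"])
y = df["Survived"]

model = LogisticRegression(max_iter=1000)
model.fit(X, y)
print(model.score(X, y))  # accuracy on the (tiny, fake) training data
```

It’s not a good model, but it’s an honest one: you can actually explain what every line does, which is more than I could say for some of the classifiers I copied the first time around.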

I’m going to go and sit in the corner now and learn properly about how the different types of model work (hence why I was learning about the monkeys inside the random forest).

Then I’m going to try the ones which actually make some sense, then I’m going to have a good crack at a TensorFlow model, and then I’m going to get on with something else.

See you in a bit…