Encoding tabular data as images
Trolling Kaggle. Just because.
TL;DR. I’m in the top 9% on the Titanic leaderboard using a CNN. Yes rly.
I was playing with fastai.tabular (v1) the other day and I could not get it to be my friend.
All the time I could hear Jeremy’s voice in the back of my head saying “make things which aren’t pictures, into pictures”, so I had a quick look to see if there are any papers about encoding tabular data as images. I figured that must be A Thing people are doing.
I found a paper describing a method called SuperTML, which is based on something called Super Characters; both essentially involve drawing your non-image data (text or numeric values) onto an image and then using a CNN to process it.
I mean, literally like that. File this under “so stupid there’s no way it should work”, but it actually does.
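To make that concrete, here's roughly what the SuperTML idea looks like with PIL. This is my own minimal sketch, not code from the paper; the band layout and default font are arbitrary choices of mine:

```python
from PIL import Image, ImageDraw

def supertml_image(values, img_size=200):
    # draw each raw value as literal text into its own horizontal
    # band of a blank greyscale image; a CNN then trains on these
    img = Image.new("L", (img_size, img_size), color=255)
    drw = ImageDraw.Draw(img)
    band = img_size // len(values)
    for i, v in enumerate(values):
        drw.text((5, i * band + 5), str(v), fill=0)
    return img

# a made-up row of mixed text/number features
img = supertml_image(["male", 22, 7.25, "S"])
```

That's the whole trick: the network literally learns to read the rendered values.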
The first thing that occurred to me, though, is that the model is going to understand the underlying values an awful lot quicker if I just use them as color codes or greyscale values. In my limited experimenting so far this seems to be the case, and at the moment I'm generating images which look like this.
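To see the value-as-brightness idea in isolation, here's a toy sketch of the scaling step using the same `MinMaxScaler(feature_range=(0, 255))` trick the encoder uses. The column names and numbers are made up for illustration, not real Titanic data:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# scale every column into 0-255 so each cell becomes a pixel intensity
df = pd.DataFrame({"Age": [22.0, 38.0, 26.0],
                   "Fare": [7.25, 71.28, 7.92]})
scaler = MinMaxScaler(feature_range=(0, 255))
scaled = scaler.fit_transform(df)
# each row of `scaled` is now a list of greyscale values ready to
# be painted into the cells of an image
```

The column min maps to black (0) and the column max to white (255), so the CNN sees relative magnitudes directly instead of having to read digits.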
The second thing that occurred to me is that maybe I could feature-engineer the data to death, throw everything including the kitchen sink into the images, and have the model figure out which features mattered. That would have been so awesome. It did not work.
So far I’m finding simpler architectures are better for this. That’s not terribly surprising. It’s also very quick to train as there’s very little to figure out.
I managed to get into the top 9% on Titanic using ResNet-18 (again with fastai), with an accuracy of 0.79665. That's a PB for me (although I was very wet behind the ears when I first tried this competition). As best I can tell, the real record for this competition (once you get past all the 1.0 idiots) was around 0.85 (anything above about 0.80 is OK), and it involved ensembling quite a few different models. I did 5 epochs at 1e-2 and then 5 at 1e-3. I didn't even need to unfreeze; it's basically looking at Minecraft blocks.
If you want to play around with this, I’ve put the image encoder onto github.
github.com/joedockrill/DFToImageEncoder
The code is in the following cell. It's not complicated, and there may be other directions this could go in — other information that could be encoded into the images. Drop a comment or have a play with it if you can think of anything.
#collapse-hide
from PIL import Image as PImage
from PIL import ImageDraw as PImageDraw
import numpy as np
from math import sqrt, ceil
from sklearn import preprocessing
class DFToImageEncoder():
    def __init__(self):
        self.__scaler = None
        self.__encoders = None
        self.__data = None
        self.__mms_data = None
        self.__exclude_cols = None

    @property
    def data(self):
        return self.__data

    @data.setter
    def data(self, df):
        self.__data = df
        self.__mms_data = df.copy()
        mms = self.__mms_data

        # drop excluded cols
        if self.__exclude_cols is not None:
            mms.drop(self.__exclude_cols, axis=1, inplace=True)

        # fit if we haven't already
        if self.__scaler is None:
            self.fit(mms)

        # label encode any cat cols and scale from 0-255
        if self.__encoders is not None:
            for col, enc in self.__encoders.items():
                mms[col] = enc.transform(mms[col])

        mms[mms.columns] = self.__scaler.transform(mms[mms.columns])

    @property
    def exclude_cols(self):
        return self.__exclude_cols

    @exclude_cols.setter
    def exclude_cols(self, cols):
        # cols to exclude from the image (like your target)
        self.__exclude_cols = cols
        if self.data is not None:
            self.data = self.data  # re-run the setter with the new exclusions

    def fit(self, df):
        # fit to all your data then process train/val/test by changing .data
        df = df.copy()
        if self.__exclude_cols is not None:
            df.drop(self.__exclude_cols, axis=1, inplace=True)

        for col in df.columns:
            if df[col].dtype == object:
                if self.__encoders is None:
                    self.__encoders = {}
                enc = preprocessing.LabelEncoder().fit(df[col])
                self.__encoders[col] = enc
                df[col] = enc.transform(df[col])  # have to actually tfm here or the scaler can't fit

        self.__scaler = preprocessing.MinMaxScaler(feature_range=(0, 255))
        self.__scaler.fit(df)

    def iterrows(self):
        # index and row from the original df + generated image
        for index, row in self.__data.iterrows():
            img = self.create_image(self.__mms_data.loc[index].values)
            yield index, row, img

    @staticmethod
    def create_image(vals):
        # you can call this directly with an array of 0-255 values (floats or ints, i don't care)
        img_size = 200
        mtx_size = ceil(sqrt(len(vals)))
        div_size = img_size // mtx_size

        img = PImage.new("L", (img_size, img_size))
        drw = PImageDraw.Draw(img)

        i = 0
        for y in range(0, mtx_size):
            for x in range(0, mtx_size):
                x0 = x * div_size; x1 = x0 + div_size
                y0 = y * div_size; y1 = y0 + div_size

                if i < len(vals):
                    drw.rectangle([x0, y0, x1, y1], fill=int(vals[i]))
                else:
                    # cross out any unused cells
                    drw.line((x0+5, y0+5, x1-5, y1-5), fill=128, width=5)
                    drw.line((x0+5, y1-5, x1-5, y0+5), fill=128, width=5)
                i += 1

        # grid lines between cells
        for i in range(1, mtx_size):
            drw.line((i*div_size, 0, i*div_size, img_size), fill=0)
            drw.line((0, i*div_size, img_size, i*div_size), fill=0)

        return img

    @staticmethod
    def fastai_img(img):
        # for getting preds directly from a fastai model
        from fastai.vision.image import Image
        import torchvision.transforms as tfms
        img_tensor = tfms.ToTensor()(img)
        return Image(img_tensor)
You use it like this:
# setup
enc = DFToImageEncoder()
enc.exclude_cols = ["PassengerId", "Survived"]
enc.fit(df_all)  # fit to ALL the data

# create training images saved to disc
enc.data = df_train

for index, row, img in enc.iterrows():
    # exclude_cols are still returned for you to inspect
    if row.Survived:
        path = "images/Survived/"
    else:
        path = "images/Died/"
    img.save(path + str(row.PassengerId) + ".jpg")

# train your model...
train_model()

# get predictions, use in-memory images directly
enc.data = df_test  # switch to test data

for index, row, img in enc.iterrows():
    # helper function to convert to a fastai image
    fast_img = DFToImageEncoder.fastai_img(img)
    pred, _, _ = learn.predict(fast_img)
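If you're following along for Kaggle, that prediction loop naturally ends in a submission file. A minimal sketch with the stdlib csv module — the PassengerId/prediction pairs here are placeholders standing in for what learn.predict() would have returned, not real model output:

```python
import csv

# hypothetical (PassengerId, Survived) pairs collected from the loop
preds = [(892, 0), (893, 1), (894, 0)]

with open("submission.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["PassengerId", "Survived"])
    w.writerows(preds)
```

Upload submission.csv to the competition page and you're done.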