Some learning notes on transfer learning.
In the real world, we normally wouldn't train a brand-new CNN from scratch with random initialization, because doing so takes a long time and requires a large dataset. In most cases, we start from a CNN pretrained on a sufficiently large, well-known dataset (e.g. ImageNet, with 1.2 million samples) and use that as the starting point.
Use as a feature extractor
This is straightforward. Take a CNN pretrained on ImageNet, strip off the last fully connected layer, and treat the rest of the CNN as a fixed feature extractor: it then outputs a feature vector (encoding) for each input directly. Using the labels of your new dataset, you can train one more layer on top of those features with any technique we know: an SVM, a linear classifier, etc.
Fine-tuning some layers
Taking it one step further, we can also fine-tune the weights of the pretrained CNN by continuing backpropagation on the new dataset. The recommended order is to start from the later layers: this saves effort and reduces the risk of overfitting, since earlier layers capture generic features (e.g. edges) while later layers are more task-specific.