
Deep Learning - Image processing with VGG16 and Keras for house price prediction

Deep learning is a widely used method for image processing. It uses neural networks to extract features from images, which enables us to build classification or regression models on those features. While deep learning models work very well, training them from scratch requires millions of images and, in turn, enormous computational power. Researchers have already trained neural networks on huge image collections, and we can take advantage of their effort by reusing the pre-trained models on our own data. In this post, I scraped data from Redfin.com, used VGG16, developed by the Visual Geometry Group at Oxford, to extract features from house pictures, and built regression models to predict house prices.



1. Preprocessing


VGG16 requires input images of size 224 by 224 pixels. We can use the Keras library for preprocessing.

After preprocessing, X_pics.shape will be (n_samples, 224, 224, 3); the last dimension holds the three RGB color channels.
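As a sketch of this step (the helper name and file paths are my own, not from the original post), the images can be loaded, resized, and normalized with Keras:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.applications.vgg16 import preprocess_input

def load_house_pics(paths):
    """Load images, resize them to 224x224, and apply VGG16 preprocessing."""
    arrays = [img_to_array(load_img(p, target_size=(224, 224))) for p in paths]
    X = np.stack(arrays)        # shape: (n_samples, 224, 224, 3)
    return preprocess_input(X)  # subtracts the ImageNet per-channel means
```

`preprocess_input` applies the same normalization VGG16 was trained with, so the extracted features stay meaningful.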

2. Extract features from images with VGG16

VGG16 takes an argument 'include_top = True', which adds three fully connected layers at the end of the network to classify an input image. Since we are not classifying, we remove the last two of those layers and simply extract 4096 features per image using the pre-trained VGG16 structure. As the model summary shows, the last remaining layer (fc1) produces 4096 outputs, and the model has 117,479,232 parameters, all of them non-trainable, which means we do not have to train any parameters to extract image features. If we built our own model instead of using a pre-trained one, we would need to train all of those parameters, which would require very high computational power.
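A minimal sketch of the truncated extractor (the random stand-in batch is mine; the real input would be the preprocessed house pictures, and the ImageNet weights are downloaded on first use):

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model

# Load VGG16 with its fully connected head, then cut it at fc1:
# fc2 and the 1000-way softmax are dropped, leaving a 4096-dim extractor.
base = VGG16(weights="imagenet", include_top=True)
extractor = Model(inputs=base.input, outputs=base.get_layer("fc1").output)
extractor.trainable = False  # nothing is trained; the weights stay frozen

X_pics = np.random.rand(2, 224, 224, 3).astype("float32")  # stand-in batch
features = extractor.predict(X_pics)
print(features.shape)            # (2, 4096)
print(extractor.count_params())  # 117479232
```

The parameter count printed here matches the 117,479,232 from the model summary: the convolutional blocks plus fc1, with fc2 and the prediction layer removed.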

3. Feature extraction with PCA

If we include all 4096 features in a regression model, the model complexity is too high. I only have about 1,800 samples, so the number of features is much greater than the number of samples, which can lead to the curse of dimensionality. Principal Component Analysis (PCA) can reduce the dimensionality. I reduced the 4096 features to 150 principal components, a number chosen from the explained variance, which captured about 73% of the variation in the features.

output: 0.73
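A sketch of the PCA step with scikit-learn (the random array is a stand-in for the real 4096-dimensional VGG16 features, so the explained-variance number below only holds on the real data):

```python
import numpy as np
from sklearn.decomposition import PCA

# features: (n_samples, 4096) matrix from the VGG16 extractor
features = np.random.rand(1800, 4096)  # stand-in for the real features

pca = PCA(n_components=150)
X_reduced = pca.fit_transform(features)
print(X_reduced.shape)                      # (1800, 150)
print(pca.explained_variance_ratio_.sum())  # ~0.73 on the real features
```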

4. Regression modeling with image features

I wrote functions to build regression models. We are going to build four models: gradient boosted trees (GBT), random forest (RF), Ridge, and Lasso.
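A sketch of such a helper, assuming scikit-learn (the function name and the cross-validation setup are mine, not from the original post):

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge, Lasso
from sklearn.model_selection import cross_val_score

def evaluate_models(X, y):
    """Cross-validate four regressors and return each one's mean absolute error."""
    models = {
        "GBT": GradientBoostingRegressor(random_state=0),
        "RF": RandomForestRegressor(random_state=0),
        "Ridge": Ridge(),
        "Lasso": Lasso(),
    }
    return {
        name: -cross_val_score(model, X, y, cv=5,
                               scoring="neg_mean_absolute_error").mean()
        for name, model in models.items()
    }
```

Scoring with `neg_mean_absolute_error` keeps the error in dollars, which makes the results easy to interpret against house prices.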

The data was collected from the Seattle area, where average house prices vary widely by zip code. The regression models do not include zip code or any other metadata. The mean absolute error was about $174,000, which is not bad considering we analyzed only pictures of houses without metadata.


Analyzing only pictures has limitations, because the first picture in a listing is not always the front view of the house. Here are two examples. The first picture shows the front view of a house, and its price was predicted correctly. The second picture actually captures a lake, and its price was not predicted correctly.

This post showed how to use pre-trained models such as VGG16 and how to control them with Keras. I scraped the first picture of each house listing from Redfin.com, extracted features from the pictures using VGG16, Keras, and PCA, and built regression models on those features. The main limitations are that pictures carry no zip code information, which is a very significant feature for determining price, and that not all pictures capture the front view of the house. If we included metadata and had more consistent pictures, we would be able to reduce the prediction error.


© 2017 by Jay Kim.
