Last weekend I started playing around with Deep Dream, an artistic, surreal application of Google’s deep convolutional neural networks, and applied it to a picture from my colleague Steve Eldridge of Vixlet HQ’s inspiring view of downtown LA:

[image: steve-e]

Here’s what it looks like at one of the deepest layers of the Deep Dream neural network, inception_5a (I’ll explain what that means shortly):

[image: steve-e-deep-inception_5a]

Although a tutorial already exists, I thought I’d write this post because I had to make some modifications to the code there, and people were asking how I got it working.

First of all, the part I had to modify was the image-loading code. The tutorial uses IPython notebooks, which I’m not very familiar with, and it loads the image into a numpy array via PIL, the Python Imaging Library. I’m not sure whether the problem was PIL or the IPython notebooks, but that didn’t work for me. Instead, I loaded the image into a numpy array via OpenCV (called cv2 in Python). That’s an obvious thing to try if you’re familiar with computer vision in Python, but otherwise it’s an easy place to get stuck. So basically, I took the code from the tutorial, combined it into a single script, and changed a couple of lines:
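The full script isn’t reproduced here, but the substance of the change is small. Here’s a minimal sketch of the swapped-in loading and saving code, assuming the rest of the tutorial script is unchanged (the filenames are placeholders). Note that OpenCV reads images in BGR channel order while PIL produces RGB, so the channels need to be swapped:

```python
import numpy as np
import cv2  # OpenCV's Python bindings

# The tutorial loads the image with PIL:
#   img = np.float32(PIL.Image.open('sky1024px.jpg'))
# Load it with OpenCV instead, converting BGR -> RGB so the
# rest of the tutorial code sees the channel order it expects:
img = cv2.imread('steve-e.jpg')
img = np.float32(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

# To save a dreamed image, clip to valid 8-bit pixel values and
# convert back to BGR before writing:
def save_image(path, arr):
    out = np.uint8(np.clip(arr, 0, 255))
    cv2.imwrite(path, cv2.cvtColor(out, cv2.COLOR_RGB2BGR))
```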

The core Python library that performs the neural network operations is Caffe. If the code above doesn’t work for you, one difference to be aware of is that I used a version of Caffe that I compiled against the NVIDIA CUDA libraries. I don’t think that’s necessary, but I had been using it to speed up unrelated computer vision work.
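As an aside, if you do have a CUDA-enabled Caffe build, switching pycaffe onto the GPU is just a couple of calls; a sketch, assuming a single CUDA device:

```python
import caffe

caffe.set_device(0)   # select the first CUDA device
caffe.set_mode_gpu()  # run all subsequent net operations on the GPU
# On a machine without CUDA, use caffe.set_mode_cpu() instead.
```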

Now I’ll explain a little about the Inception neural network (GoogLeNet) that Deep Dream is based on. The network is visualized below.

[image: googlenet]

You may want to open the pdf (https://abewrites.files.wordpress.com/2015/07/tmp.pdf) to see it in more detail. The left side is the lowest level of the network, which processes the raw image input, and the right side is the deepest level. In between, the layers range from the shallow end, which picks up edges and simple shapes, through textures and patterns, to patches and whole objects at the deep end. The “end” named argument to the deepdream function specifies which layer’s output to amplify for a given input image.
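For example, with the net object and the deepdream() helper defined as in the tutorial, any of the layer names below can be passed as end; a quick sketch (the valid names can be listed with net.blobs.keys()):

```python
# Print every layer name the loaded GoogLeNet model exposes:
print(list(net.blobs.keys()))

# Dream at a shallow layer (edges and textures)...
shallow = deepdream(net, img, end='conv2/3x3')

# ...and at a deep layer (object-like patterns):
deep = deepdream(net, img, end='inception_5a/output')
```

Below is a selection of outputs from various layers, ordered from shallow to deep: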

conv1/7x7_s2
[image: steve-e-deep-conv1]

pool1/3x3_s2
[image: steve-e-deep-pool1]

pool1/norm1
[image: steve-e-deep-pool1norm1]

conv2/3x3_reduce
[image: steve-e-deep-conv2_3x3_reduce]

conv2/3x3
[image: steve-e-deep-conv2_3x3]

conv2/norm2
[image: steve-e-deep-conv2_norm2]

pool2/3x3_s2
[image: steve-e-deep-pool2_3x3_s2]

inception_3a/pool
[image: steve-e-deep-inception_3a_pool]

inception_3a/5x5_reduce
[image: steve-e-deep-inception_3a_5x5_reduce]

inception_3a/1x1
[image: steve-e-deep-inception_3a_1x1]

inception_3a/3x3_reduce
[image: steve-e-deep-inception_3a_3x3_reduce]

inception_3a/pool_proj
[image: steve-e-deep-inception_3a_pool_proj]

inception_3a/5x5
[image: steve-e-deep-inception_3a_5x5]

inception_3a/3x3
[image: steve-e-deep-inception_3a_3x3]

inception_3a/output
[image: steve-e-deep-inception_3a]

inception_3b/output
[image: steve-e-deep-inception_3b]

inception_4a/pool
[image: steve-e-deep-inception_4a_pool]

inception_4a/5x5_reduce
[image: steve-e-deep-inception_4a_5x5_reduce]

inception_4a/1x1
[image: steve-e-deep-inception_4a_1x1]

inception_4a/3x3_reduce
[image: steve-e-deep-inception_4a_3x3_reduce]

inception_4a/pool_proj
[image: steve-e-deep-inception_4a_pool_proj]

inception_4a/5x5
[image: steve-e-deep-inception_4a_5x5]

inception_4a/3x3
[image: steve-e-deep-inception_4a_3x3]

inception_4a/output
[image: steve-e-deep-inception_4a]

inception_4b/pool
[image: steve-e-deep-inception_4b_pool]

inception_4b/output
[image: steve-e-deep-inception_4b]

inception_4c/output
[image: steve-e-deep-inception_4c]

inception_4d/output
[image: steve-e-deep-inception_4d]

inception_4e/output
[image: steve-e-deep-inception_4e]

inception_5a/output
[image: steve-e-deep-inception_5a]

inception_5b/output
[image: steve-e-deep-inception_5b]

Here’s one where I ran Deep Dream several times:

[image: tmp10]
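Repeated dreaming just feeds each result back in as the next input; a minimal sketch, assuming the deepdream() helper from the tutorial and the save_image() function from the earlier snippet (the iteration count is arbitrary):

```python
frame = img
for i in range(10):
    frame = deepdream(net, frame)              # dream on the previous result
    save_image('dream-%02d.jpg' % i, frame)    # keep each intermediate frame
```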

