Sunday, November 20, 2016

Trip Report: Demystifying Deep Learning and Artificial Intelligence @ Oakland

This weekend, I was in Oakland, attending the Demystifying Deep Learning and Artificial Intelligence Workshop. The workshop was organized by Accel.AI. Its goal was to bring together people who are looking to get into Artificial Intelligence and Deep Learning with people who are a little further along in this journey. Some of us from the Deep Learning Enthusiasts meetup group in San Francisco, including myself, presented at this workshop. This is my trip report of the event.

There were Introductory and Advanced tracks that ran in parallel. I attended all the talks in the Advanced track and one in the Introductory track. I came away with the humbling realization that my knowledge is filled with more holes than a slice of Swiss cheese. While I understand Deep Learning well enough to build models and get results, each time I hear someone speak, I invariably come away with a fresh perspective on something I hadn't thought about before.

Below I provide a summary of the talks I attended. I don't have the slides and/or github repositories, but once they are made available I will update the post.

Day 1 (Saturday)

Internal workings of a convnet and the process of implementing it on Spark - by Jeremy Nixon

I got in late and missed the first 10-15 minutes of the talk. It was a good introduction to Convolutional Neural Networks (CNN). One thing I got from this talk was a different way of thinking about weight sharing in CNNs. Instead of each neuron in a layer connecting to all the neurons in the next layer, as happens in Fully Connected Networks, the talk described neurons in a CNN as connecting only to their corresponding neuron and its immediate neighbors in the next layer. I thought that was quite insightful compared to my prior mental model of alternating convolutions and pooling. Jeremy also briefly touched upon how a DL model would be implemented on Spark (using a parameter server). I spoke to him briefly after the talk, and it turns out that a Spark-based CNN is under development and should be available shortly as part of Spark.
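To make the local-connection picture concrete, here is a minimal sketch in plain Python. It is not from Jeremy's talk; the function names and the kernel values are made up for illustration. It shows that a 1-D convolution gives the same output as a fully connected layer whose weight matrix holds shifted copies of one small kernel, with zeros everywhere else.

```python
def dense_layer(inputs, weights):
    """Fully connected: every output has its own weight for every input."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def conv1d_layer(inputs, kernel):
    """Convolution: each output sees only a local window of the inputs,
    and every window reuses the same kernel weights.
    (As in most deep learning libraries, the kernel is not flipped.)"""
    k = len(kernel)
    return [
        sum(kernel[j] * inputs[i + j] for j in range(k))
        for i in range(len(inputs) - k + 1)
    ]


inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, 1.0, 0.5]  # 3 shared weights (hypothetical values)

conv_out = conv1d_layer(inputs, kernel)

# The same result from a dense layer needs a 3 x 5 weight matrix whose rows
# are shifted copies of the kernel, padded with zeros.
equivalent_dense = [
    [0.5, 1.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 1.0, 0.5],
]
dense_out = dense_layer(inputs, equivalent_dense)

print(conv_out)
print(dense_out)
print("conv parameters:", len(kernel))
print("dense parameters:", sum(len(row) for row in equivalent_dense))
```

The convolution learns 3 numbers where the equivalent dense layer stores 15, and the gap grows quickly with input size.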

Interactive Group Presentations

Attendees from both tracks got together and were broken up into 10 groups. Each group was given a topic and 5 minutes to come up with a short group presentation. We got Convolutional Neural Networks. Most groups presented their topics in non-ML terms, for example explaining K-Nearest Neighbors (KNN) as asking what your friends are getting for lunch and ordering based on that. Our presentation was a bit more specialized and computer science-y, mainly because we didn't anticipate that other groups would take that approach. We also couldn't think of a similar everyday analogy for CNNs.

Overfitting and regularization in Machine Learning - by Dmitry Lituev

Dmitry presented examples of overfitting and subsequent regularization for different Machine Learning (including Deep Learning) models. He demo-ed overfitting, and fixing it with Regularization and Dropout, on Scikit-Learn and Keras models. You can find his Jupyter Notebooks here. While I knew about regularization and dropout, Dmitry's presentation helped me truly appreciate what happens during the regularization process.
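As a rough illustration of what regularization does, here is a small sketch in plain Python. It is not taken from Dmitry's notebooks, which used Scikit-Learn and Keras; the data, the helper names and the penalty strength are all made up. It fits a degree-5 polynomial to 6 noisy points with and without an L2 penalty. For simplicity, the penalty here also applies to the intercept.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def poly_features(x, degree):
    return [x ** d for d in range(degree + 1)]


def fit_ridge(xs, ys, degree, lam):
    """Minimise sum of squared errors + lam * sum of squared weights."""
    rows = [poly_features(x, degree) for x in xs]
    n = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve(xtx, xty)


def mse(weights, xs, ys):
    degree = len(weights) - 1
    preds = [sum(w * f for w, f in zip(weights, poly_features(x, degree)))
             for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)


def sq_norm(weights):
    return sum(w * w for w in weights)


# Made-up data: roughly y = x, with some noise added by hand.
xs = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
ys = [-1.1, -0.4, -0.3, 0.3, 0.5, 1.2]

w_plain = fit_ridge(xs, ys, degree=5, lam=0.0)  # free to chase the noise
w_ridge = fit_ridge(xs, ys, degree=5, lam=0.1)  # large weights are penalised

print("plain: train mse", mse(w_plain, xs, ys), "weight norm^2", sq_norm(w_plain))
print("ridge: train mse", mse(w_ridge, xs, ys), "weight norm^2", sq_norm(w_ridge))
```

The unregularized fit passes through every training point, noise included. The penalized fit gives up a little training accuracy in exchange for smaller weights and a smoother curve. Whether that helps on held-out data depends on the data and the penalty strength, which is why it is usually tuned on a validation set.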

Transfer Learning and Fine Tuning for Cross-Domain Image Classification with Keras - by Sujit Pal

This was my presentation. I had done some work with Transfer Learning using Caffe pretrained models in the past, so I decided to try it again (including Fine Tuning) using Keras pretrained models and (a sample of) the Diabetic Retinopathy Competition data from Kaggle. I covered transfer learning and fine tuning with a VGG-16 network pretrained on ImageNet data. Here are the slides for the talk, and the GitHub repository for the code.

Day 2 (Sunday)

Deep Learning for Recommendation systems - by Rumman Chowdry

Once again, I missed the first 10-15 minutes. Fortunately, the first part of Rumman's talk covered Recommendation System basics in depth, so I got in before she started on the Deep Learning part, which was the reason I wanted to attend her talk in the first place. The Deep Learning models she discussed were Google's Wide and Deep Network, Spotify's DL-based Music Recommender and YouTube's Deep Neural Network based Recommender. Very interesting approaches, definitely something to look into in the coming months.

Introduction to Deep Learning for Images in Keras - by Stéphane Egly and Malaikannan Sankarasubbu

One of the speakers in the Advanced track had to reschedule, so I got to go to the second part of this talk. Malaikannan showed a very good visualization of convolutions, which I liked very much. He has been working with Deep Learning full time for at least the last two years. Thanks to him, I learned that his company has open-sourced Recurrent Shop, a framework for building complex recurrent neural networks with Keras. I also had a nice conversation with Stéphane Egly after the presentation.

Lightning Talks

This section had 4 lightning talks, each about 10-15 minutes long.

Did Big Data Fail us in the Presidential Elections? - by Rumman Chowdry

Rumman pointed out various errors in polling from the point of view of a Data Scientist. The discussion was mostly around error margins and how they didn't carry over into the media reports to the public. She concluded that we as Data Scientists failed Big Data by failing to educate the media and subsequently the general public.

Using Convolutional Neural Networks to classify Monet paintings - by Samuel Bozek

Samuel described his CNN that classifies Monet paintings with 85% accuracy. It is inspired by A Neural Algorithm of Artistic Style. In the talk I learned that Monet suffered from Macular Degeneration over his lifetime, and that his painting style reflects this and can be subdivided into 3 distinct periods. The classifier performs better on the early and middle periods than on the late period.

Developing Chatbots with AI - by Masha Kubyshina

Masha is an experienced chatbot consultant who decided to experiment with a different way to build chatbots. Rather than be driven by the development team or client, she decided to let users build their own chatbot. The participants in her experiment were her school-age daughter and her daughter's classmates. She demo-ed an ice-cream recipe chatbot, imagined and created by her daughter and friends, built on the Recast.AI platform.

Incorporating ML into Robotics and Computer Vision - by Carlos Uranga

Carlos is from Singularity University, and he spoke of the Singularity, the point at which artificial intelligence will be able to learn by itself without help from humans. He showed a wearable EEG device (a helmet) that can be used to train an artificial hand to move with the power of thought. He also described an AI bartender that can learn how to mix drinks based on your personality.

The lightning talks were followed by 2 regular talks.

In depth look at Word2Vec - by Andy Zhang

Andy's presentation was an attempt to describe Word2Vec from first principles. Andy led us through a series of simple examples and described how different pairs of words align differently in the Word2Vec vector space, and how these alignments match up with our intuitions, as demonstrated by word analogies. He also spoke very briefly about extending the idea to images (and images jointly with text).
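The word analogy idea can be sketched with a few lines of plain Python. The vectors below are hand-made 3-dimensional toys chosen so that the arithmetic works out. They are not real Word2Vec output and are not from Andy's talk; real embeddings are learned from text and usually have a few hundred dimensions.

```python
import math

# Hand-made toy "embeddings" for illustration only.
# The dimensions loosely mean (royalty, maleness, femaleness).
vectors = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' using vector arithmetic b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("man", "king", "woman"))   # man : king :: woman : ?
print(analogy("woman", "queen", "man"))  # woman : queen :: man : ?
```

The offset from "man" to "king" is the same direction as the offset from "woman" to "queen", which is the kind of alignment the analogies rely on.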

Exploding / Vanishing Gradient Problem - by Alex Shim

Alex described why exploding or vanishing gradients occur in Recurrent Neural Networks (RNN), and described the internals of Long Short Term Memory (LSTM) and the modifications that lead to the Gated Recurrent Unit (GRU). So far, I had taken the forget gate on faith (i.e., accepted that it does what it does without thinking too much about it), but Alex's talk gave me a good idea of how the forget gate operates to keep exploding and vanishing gradients in check.
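A back-of-the-envelope sketch of the effect, in plain Python. This is a deliberately simplified scalar picture, not Alex's derivation, and the weights and gate values are made up. Backpropagating through T time steps multiplies T per-step factors together, so factors slightly below or above 1 shrink or blow up very quickly.

```python
def backprop_factor(per_step_factors):
    """Multiply the per-time-step factors that the chain rule accumulates."""
    product = 1.0
    for factor in per_step_factors:
        product *= factor
    return product


steps = 50

# Simple RNN, scalar linear case h_t = w * h_{t-1}:
# d h_T / d h_0 = w ** T.
vanishing = backprop_factor([0.9] * steps)  # about 0.005
exploding = backprop_factor([1.1] * steps)  # about 117

# LSTM cell state c_t = f_t * c_{t-1} + i_t * g_t:
# along the direct cell-state path, d c_T / d c_0 is the product of the
# forget-gate values f_t, each of which lies between 0 and 1.
remember = backprop_factor([0.99] * steps)  # about 0.6

print("w = 0.9  :", vanishing)
print("w = 1.1  :", exploding)
print("f = 0.99 :", remember)
```

Because each forget-gate value is a sigmoid output, the product along the cell-state path cannot exceed 1, and when the network learns gate values close to 1 the gradient survives many steps. This ignores the other gradient paths through the gates, so treat it as intuition rather than a full analysis.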

Overall, I thought the event went very well. I learned quite a few things and made a number of new friends with whom I can compare notes in the future. Unlike more traditional conference/workshop settings, the presentations left lots of time for questions and interaction between the speakers and the audience. Congratulations to the organizer Laura Montoya and the volunteers for such a great job!

4 comments (moderated to prevent spam):

Ravi said...

Hi Sujit, thank you for posting your analyses and thoughts, which I find very interesting and informative. In particular, your insights often make me aware of exciting possibilities in the ML and NLP areas. I really appreciate the time and care you take in developing and posting your blogs. Please keep 'em coming!

Sujit Pal said...

Thanks for the kind words, Ravi.

mattc said...

Thanks Sujit ---I've always appreciated your honesty and hands on step by step or insights how to learn and the pitfalls / discoveries of technology that you dig into.


Sujit Pal said...

Thanks Matt, you are welcome.