Salmon Run: Trip Report: RecSys 2018

I attended the ACM Conference on Recommender Systems (RecSys 2018) in Vancouver last week. People who know me personally would probably be a bit surprised, since my claim to being interested in recommenders is based almost solely on having read Satnam Alag's Collective Intelligence in Action, and having taken the Coursera course on Recommender Systems conducted by Professors Joseph Konstan and Michael Ekstrand a few years ago. However, of late, I have been working on recommender systems with our Health Education group, and I figured that attending this conference, while akin to drinking from a firehose, would quickly give me an indication of the latest techniques in the field, as well as introduce me to ideas that I could adapt and reuse in my own domain. As you will see from my trip report, I was not disappointed.

One notable thing about RecSys is that attendees seemed friendlier in general than those at other (NLP oriented) ACM conferences I have been to earlier. Or maybe it's just me finally overcoming my imposter syndrome. In any case, I found my fellow attendees at RecSys much more willing to compare notes and share their expertise. I was also fortunate to meet up with several of my colleagues from Elsevier, some for the very first time, along with recommender systems experts and past RecSys attendees who I had met earlier at EMNLP 2017 in Copenhagen, and many others. Somehow, our friend graph must have achieved some sort of critical mass, since it kept expanding over the duration of the conference. I also got a chance to say hello to Professors Ekstrand and Konstan and thank them for their Coursera course, which has introduced so many people, including me, to recommender systems. And downtown Vancouver, where the conference was held, is something of a foodie paradise. So along with being a very educational experience, RecSys 2018 was also a lot of fun.

Tutorials

The conference started with 1 day of tutorials, followed by 3 days of conference presentations, followed by 2 days of workshops. I was curious about how Deep Learning (DL) was being used in Recommenders, and apparently many other attendees felt the same way, because the 3 tutorials I signed up for, all DL related, were all sold out. Shades of NIPS perhaps, where Soumith Chintala joked that AI researchers will have to discover time travel in order to get a ticket for 2019 before they are sold out. In any case, I found all my tutorials to be uniformly very interesting (discounting the almost universal urge by presenters to explain skip-gram or LSTM internals one more time).

The first tutorial I attended talked about Distributed Representations in Recommender Systems and covered architectures like prod2vec (aka item2vec) based on item-item cooccurrences from transaction sequences of co-purchased items, meta prod2vec which added product features as well, and content2vec which combines multiple embedding types, including image embeddings.
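To make the co-occurrence idea concrete, here is a minimal sketch of my own, not something from the tutorial. It is a count-based stand-in for prod2vec rather than the skip-gram model itself: each item gets a sparse vector of co-purchase counts, and two items count as similar when they are bought alongside the same other items. The baskets and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy baskets; each inner list is one transaction.
baskets = [
    ["tent", "sleeping_bag", "stove"],
    ["tent", "sleeping_bag"],
    ["stove", "fuel"],
    ["laptop", "mouse"],
    ["laptop", "mouse", "keyboard"],
]


def cooccurrence_vectors(baskets):
    """Map each item to a sparse vector of co-purchase counts."""
    vectors = defaultdict(lambda: defaultdict(int))
    for basket in baskets:
        # Count every unordered pair of distinct items in the basket once.
        for a, b in combinations(sorted(set(basket)), 2):
            vectors[a][b] += 1
            vectors[b][a] += 1
    # Return plain dicts so later lookups cannot silently add new keys.
    return {item: dict(counts) for item, counts in vectors.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(item, vectors, top_n=3):
    """Rank the other items by cosine similarity to the given item."""
    scores = [
        (other, cosine(vectors[item], vectors[other]))
        for other in vectors
        if other != item
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


vectors = cooccurrence_vectors(baskets)
print(most_similar("tent", vectors))
```

A real prod2vec model would instead learn dense vectors by training skip-gram over the purchase sequences, but the input signal is the same co-occurrence structure.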

My second tutorial was titled Modularizing Deep Neural Network Inspired Recommendation Engines and was a walkthrough of OpenRec by its creator. OpenRec is a modular library used to construct DL based Recommender algorithms. The API is quite elegant and reminded me a bit of Keras, except at a higher abstraction level.

My third tutorial was about Sequence aware Recommendation, which tries to use a sequence of user-item interactions to make richer user models than standard latent factor (matrix factorization based) models using single user-item interaction pairs.
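As a rough illustration of what "using the sequence" means, here is a sketch of about the simplest sequence-aware model I can think of, a first-order Markov chain that recommends whatever most often followed the last item seen. This is my own toy example under assumed data, not one of the models from the tutorial, which covered much richer approaches.

```python
from collections import Counter, defaultdict

# Hypothetical sessions: each list is one user's ordered item interactions.
sessions = [
    ["phone", "case", "charger"],
    ["phone", "case", "screen_protector"],
    ["phone", "charger"],
    ["laptop", "mouse"],
]


def fit_transitions(sessions):
    """Count how often each item immediately follows another item."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current_item, next_item in zip(session, session[1:]):
            transitions[current_item][next_item] += 1
    return transitions


def recommend_next(last_item, transitions, top_n=2):
    """Rank candidate next items by transition count from the last item seen."""
    followers = transitions.get(last_item, Counter())
    return [item for item, _ in followers.most_common(top_n)]


transitions = fit_transitions(sessions)
print(recommend_next("phone", transitions))
```

A latent factor model trained on the same data would only see that a user touched "phone" and "case", not that one came before the other.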

My overall impression at the end of the tutorials was that the field of Recommender Systems (RS) has borrowed and adapted a lot of ideas from Natural Language Processing (NLP), at least around distributional representations (word2vec, transfer learning for images using pretrained CNNs, etc) and DL architectures such as Autoencoders (AE) and Recurrent Neural Networks (RNN). Together, they served as a good background for many of the conference presentations and workshops.

Main Conference

The conference this year was single-track, which meant that I no longer had to start each day with a highlighter and the program to figure out which of two or three interesting presentations I would have to drop. On the flip side, it did mean that some of the presentations may not be interesting to everyone. For the latter case, there were always plenty of posters to look at and discuss with authors. There were also quite a few booths set up by sponsor companies. The full conference program can be found here.

Conference Day 1

The first day opened up with a keynote from Elizabeth Churchill of Google, who spoke about recommendation design principles as a set of five E's -- Explainable, Equitable, Ethical, Expedience and Exigence. This was followed by paper sessions on Explanations, Algorithms and Products.

The Explanation set of papers explored novel solutions for providing explanations from Recommender Systems. Interesting ideas included using a Generative Adversarial Network (GAN) to generate personalized reviews for users to explain why they should purchase the item; an attempt to quantify how much explanation is "enough"; the effect of providing explanations for reciprocal recommendations (in the context of matchmaking); and interpreting user inaction as a possible signal to the RS.

The Algorithms session focused on algorithms papers from industry. Interesting papers were on Variational Learning to Rank (VLTR), which attempts to balance explore/exploit decisions by shuffling product listings according to the model's relevance uncertainty for each product; the use of the (currently) state-of-the-art AWD-LSTM model along with session features to recommend real estate listings by Realtor.com; using CNN and RNN networks and leveraging contextual bandits to balance exploration/exploitation decisions at Hulu.com to keep users watching videos via autoplay; overlaying an ML model over a standard MF one to make related pins context dependent at Pinterest; and combating clickbait by combining content and usage signals for articles so they act as regularizers for each other at Flipboard.
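Since contextual bandits came up repeatedly, here is a minimal sketch of the explore/exploit idea. It is a tabular epsilon-greedy bandit of my own construction, not the approach from any of the papers above (which used neural models), and the contexts, arms and click-through rates are made up purely to simulate feedback.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Keeps a running mean reward per (context, arm) pair."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, arm) -> number of pulls
        self.values = defaultdict(float)  # (context, arm) -> mean reward so far

    def best_arm(self, context):
        """Pure exploitation: the arm with the highest estimated reward."""
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def select(self, context):
        """Explore with probability epsilon, otherwise exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return self.best_arm(context)

    def update(self, context, arm, reward):
        """Incrementally update the mean reward for the pulled arm."""
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


# Hypothetical click-through rates, used only to simulate user feedback.
TRUE_CTR = {
    ("morning", "news"): 0.9, ("morning", "comedy"): 0.1,
    ("evening", "news"): 0.1, ("evening", "comedy"): 0.9,
}


def simulate(rounds=5000, seed=0):
    """Run the bandit against the simulated users and return it."""
    env = random.Random(seed + 1)
    bandit = EpsilonGreedyContextualBandit(["news", "comedy"], epsilon=0.1, seed=seed)
    for _ in range(rounds):
        context = env.choice(["morning", "evening"])
        arm = bandit.select(context)
        reward = 1.0 if env.random() < TRUE_CTR[(context, arm)] else 0.0
        bandit.update(context, arm, reward)
    return bandit


bandit = simulate()
print(bandit.best_arm("morning"), bandit.best_arm("evening"))
```

The "contextual" part is just that the reward estimates are kept per context, so the same arm can be the best choice in one situation and the worst in another.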

This was followed by the Products set of papers, some of which were quite mathy and theoretical. Interesting ideas included an investigation of recommendations that are constrained to benefit not only the primary user, but other stakeholders as well; modeling sequential user behavior using translation based Factorization Machines (FM); an investigation into how much data the user can retract and still allow a recommender to make good predictions, something of renewed interest in the current age of GDPR; a DL model to predict complementary items; an extension of Matrix Factorization (MF) to work with sequences of behavior signals to make item recommendations; a Reinforcement Learning (RL) architecture to predict page-wise product recommendations; and a method to de-bias logging data so it can be reused in other models.

Conference Day 2

The second day opened with a keynote by Lise Getoor from UC Santa Cruz, who contended that we often flatten inherent structure in our data because we use matrix algebra. She then went on to propose a new language, PSL (Probabilistic Soft Logic), that allows you to make use of logical structure and handles uncertainty. The keynote was followed by paper sessions on Learning and Optimization, System Considerations and Travel and Entertainment.

The Learning and Optimization session consisted of some pretty novel DL architectures. The first one proposed a Neural Gaussian Mixture Model (NGMM) combining Gaussian Mixture Models with a pair of neural networks to produce rating predictions from reviews. The second one used Deep Neural Memory to record a sequence of user interactions, which is then used by contextual bandits for explore/exploit decisions. The next one was a non-DL model that calculates the optimal mix between both parties in a reciprocal RS to predict the best reward. The next one was a best paper candidate and tries to combine preferences from MF and graph based models by filtering the prediction based on graph distance (aka higher order proximity). The next one was also a best paper candidate and uses a variational AE (VAE) that combines user and item features to learn a latent space that can then be used to generate personalized item recommendations. The final paper in this set was about calibrated recommendations, which attempt to reflect the user's various interests in their recommendations in proportion to how much of each interest the user has shown.
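To illustrate calibration, here is a small sketch of one way to measure it: compare the genre distribution of a user's history with that of a recommendation list using KL divergence, where 0 means the list mirrors the history exactly. The smoothing term, the one-genre-per-item simplification and all the data are my own assumptions for illustration; I am not claiming this is exactly how the paper sets it up.

```python
from collections import Counter
from math import log


def genre_distribution(items, item_genres):
    """Fraction of items per genre, assuming exactly one genre per item."""
    counts = Counter(item_genres[item] for item in items)
    total = sum(counts.values())
    return {genre: count / total for genre, count in counts.items()}


def calibration_kl(history_dist, rec_dist, alpha=0.01):
    """KL(p || q_s), where q_s = (1 - alpha) * q + alpha * p.

    The smoothing keeps q_s positive wherever p is positive, so the
    logarithm is always defined. Lower values mean better calibration.
    """
    divergence = 0.0
    for genre, p in history_dist.items():
        q = rec_dist.get(genre, 0.0)
        q_smoothed = (1 - alpha) * q + alpha * p
        divergence += p * log(p / q_smoothed)
    return divergence


# Hypothetical catalog: items m0..m9 are romance, a0..a9 are action.
item_genres = {f"m{i}": "romance" for i in range(10)}
item_genres.update({f"a{i}": "action" for i in range(10)})

history = [f"m{i}" for i in range(7)] + [f"a{i}" for i in range(3)]    # 70% / 30%
all_romance = [f"m{i}" for i in range(10)]                             # ignores action
balanced = [f"m{i}" for i in range(3, 10)] + [f"a{i}" for i in range(3, 6)]

p = genre_distribution(history, item_genres)
print(calibration_kl(p, genre_distribution(all_romance, item_genres)))
print(calibration_kl(p, genre_distribution(balanced, item_genres)))
```

A recommender that only optimizes accuracy tends to produce lists like the all-romance one, because the majority interest crowds out the minority one.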

The next group of papers, in the System Considerations session, had to do with novel situations that dictate the design of the RS. Examples included a discussion of problems that can impact the performance of an RS in production; respecting privacy boundaries when recommending on Slack; deciding which video image to show on Netflix; incorporating intent for voice recommendations on Comcast X1; and techniques to standardize listings to facilitate communication between buyer and seller on eBay.

The next set of talks were about recommendations in the Travel and Entertainment industry. There were talks about bundling telecom services (for example, the channel lineup) personalized to the user; using a questionnaire to quickly gauge the user's preference to solve the cold start problem; a graph based public transport route planner that allows users to specify various parameters, including comfort (the city in the test case is Kolkata, and I guess I can relate, having used the public transport there); a very interesting study comparing user interactions with voice and visual recommendations; mapping out-of-stock items to similar in-stock items and combining their interaction history to create more accurate item recommendations; and a system to recommend the hero to choose for Multiplayer Online Battle Arena games.

In the evening there was a RecSys sponsored banquet by the Vancouver marina.

Conference Day 3

I missed the keynote and first session of the third day, because a few of us decided to take advantage of face time to meet over breakfast and do some brainstorming, and we lost track of time. I regret missing the keynote by Christopher Berry of the Canadian Broadcasting Corporation; I heard later that he covered some things about social responsibility that I care deeply about. In any case, we also missed the paper session on RecSys that Care, so I will only cover the next two sessions on Metrics and Evaluation, and Beyond Users and Items.

The Metrics and Evaluation session, as you can imagine, is mostly about measuring things. The talks covered predicting best answer in a community Q+A site based on not only content features, but also user features; an investigation into how various IR metrics vary with N for top-N recommendations; a framework for benchmarking stream based news recommenders; a comparative study of related video recommendations, between newest, most similar or most relevant; removing bias from offline recommender evaluation for missing-not-at-random implicit feedback; and an attempt to understand human perceptions of image similarity in the context of related item recommendations.
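For readers less familiar with top-N metrics, here is a small sketch of why they vary with N, using precision and recall on a single made-up ranked list. The item ids and relevance judgments are hypothetical.

```python
def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n recommended items that are relevant."""
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / n


def recall_at_n(ranked, relevant, n):
    """Fraction of all relevant items that appear in the top n."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / len(relevant)


# Hypothetical ranked recommendations and the items the user actually liked.
ranked = ["a", "b", "c", "d", "e", "f"]
relevant = {"a", "c", "f"}

for n in (1, 2, 3, 6):
    print(n, precision_at_n(ranked, relevant, n), recall_at_n(ranked, relevant, n))
```

Recall can only go up as N grows, while precision typically drifts down, so comparing two recommenders at a single N can give a different answer than comparing them at another.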

The Beyond Users and Items session had talks that dealt with extensions to the basic Collaborative Filtering (CF) model. Talks were about the use of RNNs to map Knowledge Graph (KG) paths to graph embeddings (RKGE); differences in how various types of preferences (CF, content, social, trust-based) should be applied for different types of products; SpectralCF, applying a spectral convolution on a user-item bipartite graph, that can alleviate the cold-start problem by benefiting from the rich connectivity information in the spectral domain; the benefits of using so-called side information (categorical item features) in recommender systems; the benefits of pairwise preference elicitations (asking questions to cold start users to figure out their preferences) over pointwise; and using topic modeling and streaming MF on text features in an online recommender environment.

Workshops

The next two days were workshops. I attended the Deep Learning (DLRS), Knowledge Transfer and Learning (KTL) and the Knowledge Aware and Conversational Recommender Systems (KARS) workshops. I also attended the HealthRecSys workshop for a while at the recommendation of someone I met at the banquet, but it wasn't what I was looking for.

This is going to be the last DLRS workshop, since the workshop was set up to encourage the use of DL techniques in RS, and it has reached a point where no more encouragement is needed. Information about papers at the workshop can be found at the DLRS website. There were two talks that used VAEs to do recommendations, a very thorough description of a DL based recommender for news called CHAMELEON used by Brazilian newspaper Globo.com, and an interesting idea of creating explainable recommender systems by replacing the middle layer of an autoencoder with properties from a KG.

The KTL workshop was about using transfer learning techniques. Talks in the workshop were about investigating if better ImageNet models are also better for transfer learning for image recommendations; using a hybrid VAE for CF instead of more traditional techniques like MF; a new embedding technique BB2vec that learns product embeddings for complementary item recommendations from baskets and browsing sessions; using information about venues from other cities to recommend the next venue to visit for a given city; and detecting change points in user preferences using HMM for sequential recommendation tasks.

The KARS workshop discussed ways to incorporate knowledge (from KGs) into RS. Talks in the workshop were about deriving item features from domain knowledge; merging user and content features to create personalized recommendations for scholarly papers; computing recommendations using a KG aware AE; KG aware RS for software development that leverages old APIs in the system to recommend new ones; and narrative driven book recommendations, where one's book reviews are used as the basis for recommending books.

Conclusion

While I have listed all the talks I attended above, I wanted to reflect a little bit on what I personally took away from the conference. In the past, especially with NLP conferences, I would find most of the topics super-interesting and applicable in some way or other to my work. In this case, perhaps because the RS I am working on don't really use cutting edge methods and probably never will, I found the actual techniques to be of limited utility, except perhaps as ideas to chase in specific situations. However, I did pick up on a few things that I plan to try to apply in my own domain (not necessarily just the RS I am working on). Here is a tentative list.

many custom ways of building embeddings to reflect various domain specific scenarios
imaginative use of DL structures such as VAE, GAN, Neural Memory, etc.
incorporation of graph embeddings from Knowledge Graphs using sequence modeling techniques such as HMM and RNN
mixing of traditional and DL techniques such as Neural GMM, SpectralCF, etc.

In addition, I finally understood what a Contextual Bandit is, thanks to the efforts of my co-attendees during a coffee break. There are also a few RS specific tools and ideas that I want to delve into in more detail, such as OpenRec, reco-gym and stream based recommenders. I also plan on looking at recommender specific DL architectures such as prod2vec, GRU4Rec, and Wide and Deep, to name a few, as well as look at the AWD-LSTM LM and the practical tips from Fast.ai as mentioned by Even Oldridge, the presenter from Realtor.com. I also want to take a look at techniques for Change Point detection as described by Prof Bamshad Mobasher (for a different application), and KG weighting strategies for scholarly paper recommendation (slides).