Recommendation Systems

“Sure. I do marathons… on Netflix.”

In the digital world, recommendation systems play a significant role for both users and companies. For users, they surface a new world of options that were previously hard to find. For companies, they drive up user engagement and satisfaction, directly impacting the bottom line. If you’ve shopped on an e-commerce site or watched a movie on an on-demand video platform, you will have seen options like “People who viewed this product also viewed…” or “Products similar to this one…”. These are the results of recommendation systems.

In this workshop, you will learn the different paradigms of recommendation systems and be introduced to machine-learning and deep-learning based approaches. By the end of the workshop, you will have enough practical, hands-on knowledge to build, select, deploy and maintain a recommendation system for your problem.

Workshop Objectives

The aim of the workshop is to provide a thorough introduction to the art and science of building recommendation systems. These are the main objectives:

Get a thorough introduction to recommendation systems and paradigms across domains
Gain an end-to-end view of machine-learning & deep-learning based recommendation and learning-to-rank systems
Understand practical considerations and guidelines for building and deploying recommendation systems for your own problems

Key Concepts

Theory: ML & DL Formulation, Prediction vs. Ranking, Similarity, Biased vs. Unbiased
Paradigms: Content-based, Collaborative filtering, Knowledge-based, Hybrid and Ensembles
Data: Tabular, Images, Text (Sequences)
Models: (Deep) Matrix Factorisation, Auto-Encoders, Wide & Deep, Rank-Learning, Sequence Modelling
Methods: Explicit vs. implicit feedback, User-Item matrix, Embeddings, Convolution, Recurrent, Domain signals (location, time, context, social)
Process: Setup, Encode & Embed, Design, Train & Select, Serve & Scale, Measure, Test & Improve
Tools: python-data-stack: numpy, pandas, scikit-learn, tensorflow, tfranking, implicit, spacy

Workshop Design

This would be a two-day instructor-led hands-on workshop to learn and implement an end-to-end deep learning model for recommendation systems. This is predominantly a hands-on course and will be 70% programming/coding and 30% theory. It would aim to cover the following topics.

Session #1: Introduction

Why build recommendation systems?
- Scope and evolution of recsys
- Prediction and Ranking
- Relevance, novelty, serendipity & diversity
Paradigms in recommendations:
- Content-based
- Collaborative filtering
- Knowledge-based
- Hybrid and Ensembles
Key concepts in recsys:
- Explicit vs. implicit feedback
- User-Item matrix
- Domain signals: location, time, context, social
Why use deep learning for recsys?
- Primer on deep learning
- Traditional vs deep learning approaches
- Examples and use-cases
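
The user-item matrix and the explicit vs. implicit feedback distinction listed under key concepts can be illustrated with a small sketch. The interaction log below is made up for illustration, and the variable names are assumptions rather than the workshop's own code.

```python
import numpy as np

# Hypothetical interaction log: one entry per (user, item, rating) event.
users = np.array(["u1", "u1", "u2", "u3", "u3"])
items = np.array(["a", "b", "a", "b", "c"])
ratings = np.array([5.0, 3.0, 4.0, 2.0, 5.0])

# Map user and item labels to row and column indices.
user_ids, user_idx = np.unique(users, return_inverse=True)
item_ids, item_idx = np.unique(items, return_inverse=True)

# Explicit feedback: cells hold the rating, unobserved pairs stay NaN.
explicit = np.full((len(user_ids), len(item_ids)), np.nan)
explicit[user_idx, item_idx] = ratings

# Implicit feedback: 1 if the user interacted with the item, else 0.
implicit = (~np.isnan(explicit)).astype(int)

print(explicit)
print(implicit)
```

In practice such matrices are very sparse, so a sparse representation is usually preferred over a dense array.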

Session #2: Content-Based

Introduction to the case #1: product recommendation
Environment setup for hands-on session
Feature extraction using deep learning: Embeddings for Heterogeneous data
Exercise: Recommending items using similarity measures
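
As a rough idea of what recommending items by similarity can look like, here is a minimal sketch using cosine similarity over item embeddings. The embeddings and the helper name `top_k_similar` are hypothetical; in the session the vectors would come from a learned feature extractor.

```python
import numpy as np

def top_k_similar(item_vectors, query_index, k=3):
    """Return indices of the k items most similar to the query item (cosine similarity)."""
    # Normalise each row to unit length so a dot product equals cosine similarity.
    norms = np.linalg.norm(item_vectors, axis=1, keepdims=True)
    unit = item_vectors / np.clip(norms, 1e-12, None)
    scores = unit @ unit[query_index]
    scores[query_index] = -np.inf  # never recommend the query item itself
    return np.argsort(-scores)[:k].tolist()

# Hypothetical 2-d embeddings for four products.
items = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.7, 0.7],
])
print(top_k_similar(items, query_index=0, k=2))  # -> [1, 3]
```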

Session #3: Collaborative-Filtering

Overview of traditional Collaborative-Filtering for recsys
Primer on deep learning approaches
- Deep matrix factorisation
- Auto-Encoders
Exercise: Recommending items using Collaborative-Filtering
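
To make the matrix factorisation idea concrete, the sketch below fits plain (non-deep) user and item factors to observed ratings with stochastic gradient descent. It is an illustrative baseline under assumed hyperparameters and a made-up ratings matrix, not the deep variant covered in the session.

```python
import numpy as np

def factorise(ratings, n_factors=2, n_epochs=500, lr=0.01, reg=0.02, seed=0):
    """Fit user and item factors to the observed (non-NaN) ratings with SGD."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    P = rng.normal(scale=0.1, size=(n_users, n_factors))  # user factors
    Q = rng.normal(scale=0.1, size=(n_items, n_factors))  # item factors
    observed = np.argwhere(~np.isnan(ratings))
    for _ in range(n_epochs):
        rng.shuffle(observed)  # visit observed ratings in random order
        for u, i in observed:
            p_u = P[u].copy()
            err = ratings[u, i] - p_u @ Q[i]
            P[u] += lr * (err * Q[i] - reg * p_u)
            Q[i] += lr * (err * p_u - reg * Q[i])
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error on the observed ratings only."""
    mask = ~np.isnan(ratings)
    pred = P @ Q.T
    return float(np.sqrt(np.mean((ratings[mask] - pred[mask]) ** 2)))

# Hypothetical ratings; NaN marks user-item pairs with no rating.
R = np.array([
    [5.0, 3.0, np.nan, 1.0],
    [4.0, np.nan, np.nan, 1.0],
    [1.0, 1.0, np.nan, 5.0],
    [np.nan, 1.0, 5.0, 4.0],
])
P, Q = factorise(R)
print(rmse(R, P, Q))
print(np.round(P @ Q.T, 1))  # predictions, including the previously empty cells
```

The filled-in cells of `P @ Q.T` are the predicted ratings from which recommendations would be drawn.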

Session #4: Learning-to-Rank

Why learning-to-rank? Prediction vs Ranking
Rank-learning approaches: pointwise, pairwise and listwise
Deep learning approach to combine prediction and ranking
Exercise: Recommending items using Learning-to-Rank
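
The difference between pointwise, pairwise and listwise approaches comes down to what the loss compares. The sketch below shows one common loss for each family (squared error, logistic pairwise loss, and a ListNet-style softmax cross-entropy). These particular choices are assumptions for illustration; other losses exist in each family.

```python
import numpy as np

def pointwise_loss(scores, labels):
    """Mean squared error between each predicted score and its relevance label."""
    return float(np.mean((scores - labels) ** 2))

def pairwise_loss(scores, labels):
    """Logistic loss over all pairs (i, j) where item i is more relevant than j.

    Assumes at least one such pair exists in the list.
    """
    diff = scores[:, None] - scores[None, :]    # s_i - s_j
    prefer = labels[:, None] > labels[None, :]  # i should rank above j
    return float(np.mean(np.log1p(np.exp(-diff[prefer]))))

def listwise_loss(scores, labels):
    """ListNet-style cross-entropy between softmax(labels) and softmax(scores)."""
    def softmax(x):
        e = np.exp(x - np.max(x))
        return e / e.sum()
    return float(-np.sum(softmax(labels) * np.log(softmax(scores))))

# Hypothetical relevance labels for three items, with a good and a bad scoring.
labels = np.array([3.0, 2.0, 0.0])
good = np.array([2.0, 1.0, -1.0])  # same order as the labels
bad = np.array([-1.0, 1.0, 2.0])   # reversed order

for loss in (pointwise_loss, pairwise_loss, listwise_loss):
    print(loss.__name__, loss(good, labels), loss(bad, labels))
```

Note that `good` has a non-zero pointwise loss even though it orders the items perfectly, which is one motivation for ranking-specific losses.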

Session #5: Hybrid Recommender

Introduction to the case #2: text recommendation
Combining content-based and collaborative filtering
Primer on Wide & Deep Learning for Recommender Systems
Exercise: Recommending items using Hybrid recommender

Session #6: Time and Context

Adding temporal component: window and decay-based
Adding context through group recommendations
Dynamic and Sequential modelling using Recurrent Neural Networks
Exercise: Recommending items using RNN recommender
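
The window and decay-based temporal components can be sketched as two ways of weighting past interactions by age. The half-life and window length below are illustrative assumptions, as are the function names.

```python
import numpy as np

def decayed_weights(age_days, half_life_days=30.0):
    """Exponential decay: an interaction loses half its weight every half-life."""
    return 0.5 ** (np.asarray(age_days, dtype=float) / half_life_days)

def windowed_weights(age_days, window_days=90.0):
    """Hard window: keep interactions inside the window, drop the rest."""
    return (np.asarray(age_days, dtype=float) <= window_days).astype(float)

ages = [0, 30, 60, 120]  # hypothetical ages of four interactions, in days
print(decayed_weights(ages))   # weights 1, 0.5, 0.25, 0.0625
print(windowed_weights(ages))  # weights 1, 1, 1, 0
```

Either set of weights can multiply the entries of a user-item matrix before training, so that recent behaviour counts for more.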

Session #7: Deployment & Monitoring

Deploying the recommendation system models
Measuring improvements from recommendation system
Improving the models based on the feedback from production
Architecture design for recsys: Offline, Nearline and Online

Session #8: Evaluation, Challenges & Way Forward

A/B testing for recommendation systems
Challenges in recsys:
- Building explanations
- Model debugging
- Scaling-out & up
- Fairness, accountability and trust
Bias in recsys: training data, UI → Algorithm → UI, private
When not to use deep learning for recsys
Recap and next steps, Learning Resources
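
For the A/B testing topic in this session, one simple way to check whether a new recommender changes click-through rate is a two-proportion z-test. The sketch below uses only the standard library; the counts are made up, and a real experiment would also need care with sample-size planning and repeated looks at the data.

```python
import math

def two_proportion_z_test(clicks_a, users_a, clicks_b, users_b):
    """Return (z, two-sided p-value) for the difference in click-through rates."""
    rate_a = clicks_a / users_a
    rate_b = clicks_b / users_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (clicks_a + clicks_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Hypothetical experiment: control (A) vs. new recommender (B).
z, p = two_proportion_z_test(clicks_a=100, users_a=1000, clicks_b=150, users_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```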

Workshop Details

Participant Profile — A programmer interested in adding data-driven recommendations to their products, or a beginner data scientist with experience in machine learning who is interested in building a deeper and more applied perspective on using ML & DL for recommendation systems.

Pre-requisite skills — This is a hands-on course, so participants should be comfortable programming in Python and have exposure to the Python data stack.

Participants should have a basic familiarity with Python. Specifically, we expect participants to know the first four sections of the Python Practice Book.
Prior knowledge of machine learning principles is needed. Participants should have practice with machine learning problems such as regression and classification. Specifically, participants should be able to work with the following Python libraries. You can refer to the Python Data Science Handbook to learn them.
- jupyter: For doing literate programming in notebooks
- numpy: For scientific computation
- pandas: For data wrangling and transformation of tabular data (dataframes)
- scikit-learn: For building machine learning models
While the deep learning concepts will be taught in an intuitive way, some prior knowledge of linear algebra and calculus would be helpful. You can refer to these visual explanation videos from @3Blue1Brown on Linear Algebra, Calculus and Deep Learning to get started.

Tools Used — The workshop principles are tool-agnostic and can be applied using any data stack and platform for building recommendation systems. However, for ease of doing the exercises, we will use the Python data stack during the workshop. All notebooks and required datasets will be provided in a cloud-hosted environment, so no additional downloads are required. Participants only need their own laptop with a browser and internet connectivity.

Number of Participants — The number of participants for the workshop would be capped at 30. The small class size would enable a more participative environment, with room for group interaction as well as one-to-one learning interactions.

Duration — The workshop would be conducted over 2 days from 0930 to 1730. There will be short breaks during the morning and afternoon sessions and a longer lunch break of around 45 minutes in the middle.

Facilitators’ Profile

Amit Kapoor teaches the craft of telling visual stories with data. He conducts workshops and trainings on Data Science in Python and R, as well as on Data Visualisation topics. His background is in strategy consulting, having worked with AT Kearney in India, then with Booz & Company in Europe, and more recently for startups in Bangalore. He did his B.Tech in Mechanical Engineering from IIT, Delhi and PGDM (MBA) from IIM, Ahmedabad. You can find more about him at amitkaps.com and tweet him at @amitkaps.

Bargava Subramanian is a practicing Data Scientist. He has 14 years of experience delivering business analytics solutions to Investment Banks, Entertainment Studios and High-Tech companies. He has given talks and conducted workshops on Data Science, Machine Learning, Deep Learning and Optimization in Python and R. He has a Master's in Statistics from the University of Maryland, College Park, USA. He is an ardent NBA fan. You can tweet him at @bargava.