Recommendation Systems
“Sure. I do marathons… on Netflix.”
In the digital world, recommendation systems play a significant role, both for users and for companies. For users, they open up a new world of options that were hitherto tough to find. For companies, they help drive up user engagement and satisfaction, directly impacting the bottom line. If you’ve shopped on an e-commerce site or watched a movie on an on-demand video platform, you would’ve seen options like: “People who viewed this product also viewed…” or “Products similar to this one…”. These are the results from recommendation systems.
In this workshop, you will learn the different paradigms of recommendation systems and get introduced to the usage of machine-learning and deep-learning based approaches. By the end of the workshop, you will have enough practical hands-on knowledge to build, select, deploy and maintain a recommendation system for your problem.
Workshop Objectives
The aim of the workshop is to provide a thorough introduction to the art and science of building recommendation systems. These are the main objectives:
- Get a thorough introduction to recommendation systems and paradigms across domains
- Gain an end-to-end view of machine-learning & deep-learning based recommendation and learning-to-rank systems
- Understand practical considerations and guidelines for building and deploying recommendation systems for your own problems
Key Concepts
- Theory: ML & DL Formulation, Prediction vs. Ranking, Similarity, Biased vs. Unbiased
- Paradigms: Content-based, Collaborative filtering, Knowledge-based, Hybrid and Ensembles
- Data: Tabular, Images, Text (Sequences)
- Models: (Deep) Matrix Factorisation, Auto-Encoders, Wide & Deep, Rank-Learning, Sequence Modelling
- Methods: Explicit vs. implicit feedback, User-Item matrix, Embeddings, Convolution, Recurrent, Domain Signals: location, time, context, social
- Process: Setup, Encode & Embed, Design, Train & Select, Serve & Scale, Measure, Test & Improve
- Tools: python-data-stack: numpy, pandas, scikit-learn, tensorflow, tfranking, implicit, spacy
Workshop Design
This would be a two-day instructor-led hands-on workshop to learn and implement an end-to-end deep learning model for recommendation systems. This is predominantly a hands-on course and will be 70% programming/coding and 30% theory. It would aim to cover the following topics.
Session #1: Introduction
- Why build recommendation systems?
- Scope and evolution of recsys
- Prediction and Ranking
- Relevance, novelty, serendipity & diversity
- Paradigms in recommendations:
- Content-based
- Collaborative filtering
- Knowledge-based
- Hybrid and Ensembles
- Key concepts in recsys:
- Explicit vs. implicit feedback
- User-Item matrix
- Domain signals: location, time, context, social
- Why use deep learning for recsys?
- Primer on deep learning
- Traditional vs deep learning approaches
- Examples and use-cases
Session #2: Content-Based
- Introduction to the case #1: product recommendation
- Environment setup for hands-on session
- Feature extraction using deep learning: Embeddings for Heterogeneous data
- Exercise: Recommending items using similarity measures
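To make the similarity-based exercise above concrete, here is a minimal sketch of recommending items by cosine similarity between item feature vectors. It is an illustration only, not the workshop's notebook code: the function names and the tiny `item_features` array are hypothetical, and in practice the rows would be embeddings extracted from the item data.

```python
import numpy as np

def cosine_similarity_matrix(features):
    """Item-by-item cosine similarity for a (n_items, n_features) array."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    normalised = features / np.clip(norms, 1e-12, None)
    return normalised @ normalised.T

def most_similar(features, item_index, k=2):
    """Indices of the k items most similar to item_index, excluding itself."""
    scores = cosine_similarity_matrix(features)[item_index].copy()
    scores[item_index] = -np.inf  # never recommend the query item itself
    return np.argsort(-scores)[:k].tolist()

# Hypothetical item embeddings: rows are items, columns are features.
item_features = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
])

print(most_similar(item_features, item_index=0, k=1))
```

Cosine similarity is only one possible choice here; dot product or Euclidean distance would slot into the same structure.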
Session #3: Collaborative-Filtering
- Overview of traditional Collaborative-Filtering for recsys
- Primer on deep learning approaches
- Deep matrix factorisation
- Auto-Encoders
- Exercise: Recommending items using Collaborative-Filtering
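As a reference point for this session, the sketch below shows plain (non-deep) matrix factorisation of a user-item matrix, trained with full-batch gradient descent in numpy. It is a simplified stand-in for the deep variants listed above, not the workshop's implementation: the toy ratings, the hyperparameters and the helper names are all assumptions chosen for illustration, and a rating of 0 is taken to mean "not observed".

```python
import numpy as np

def factorise(ratings, observed, n_factors=2, lr=0.01, reg=0.02,
              epochs=2000, seed=0):
    """Learn user and item factors so that their product fits observed ratings."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    user_factors = rng.normal(scale=0.1, size=(n_users, n_factors))
    item_factors = rng.normal(scale=0.1, size=(n_items, n_factors))
    for _ in range(epochs):
        # Error is only counted on observed cells of the user-item matrix.
        error = observed * (ratings - user_factors @ item_factors.T)
        # Descent directions for squared error with L2 regularisation.
        user_step = error @ item_factors - reg * user_factors
        item_step = error.T @ user_factors - reg * item_factors
        user_factors += lr * user_step
        item_factors += lr * item_step
    return user_factors, item_factors

def observed_rmse(ratings, observed, predictions):
    """Root mean squared error over observed cells only."""
    squared = observed * (ratings - predictions) ** 2
    return float(np.sqrt(squared.sum() / observed.sum()))

# Hypothetical explicit ratings (rows: users, columns: items, 0 = unobserved).
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)
observed = (ratings > 0).astype(float)

user_factors, item_factors = factorise(ratings, observed)
predictions = user_factors @ item_factors.T
print(observed_rmse(ratings, observed, predictions))
```

The unobserved cells of `predictions` are the scores one would rank to produce recommendations for each user.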
Session #4: Learning-to-Rank
- Why learning-to-rank? Prediction vs Ranking
- Rank-learning approaches: pointwise, pairwise and listwise
- Deep learning approach to combine prediction and ranking
- Exercise: Recommending items using Learning-to-Rank
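To illustrate the pairwise approach mentioned above, here is a minimal sketch of a pairwise logistic ranking loss: for every pair of items where one is more relevant than the other, the loss is small when the model scores the more relevant item higher. The function name and the example scores are hypothetical, and this is one common pairwise formulation rather than the specific loss used in the workshop.

```python
import numpy as np

def pairwise_logistic_loss(scores, relevance):
    """Mean of log(1 + exp(-(s_i - s_j))) over pairs where item i is more relevant than j."""
    scores = np.asarray(scores, dtype=float)
    relevance = np.asarray(relevance, dtype=float)
    score_diff = scores[:, None] - scores[None, :]
    prefer = relevance[:, None] > relevance[None, :]
    if not prefer.any():
        return 0.0  # no ordered pairs, so nothing to rank
    # logaddexp(0, -x) is a numerically stable log(1 + exp(-x)).
    return float(np.mean(np.logaddexp(0.0, -score_diff[prefer])))

relevance = [2, 1, 0]
well_ordered = pairwise_logistic_loss([3.0, 2.0, 1.0], relevance)
badly_ordered = pairwise_logistic_loss([1.0, 2.0, 3.0], relevance)
print(well_ordered < badly_ordered)
```

A pointwise approach would instead penalise each predicted score against its own label, while a listwise approach would score the whole ordering at once.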
Session #5: Hybrid Recommender
- Introduction to the case #2: text recommendation
- Combining content-based and collaborative filtering
- Primer on Wide & Deep Learning for Recommender Systems
- Exercise: Recommending items using Hybrid recommender
Session #6: Time and Context
- Adding temporal component: window and decay-based
- Adding context through group recommendations
- Dynamic and Sequential modelling using Recurrent Neural Networks
- Exercise: Recommending items using RNN recommender
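The decay-based temporal component listed in this session can be sketched as follows: each interaction is weighted by how old it is, so recent behaviour counts for more when scoring items. Exponential decay with a half-life is assumed here purely for illustration; the function names, the 30-day half-life and the sample interactions are hypothetical.

```python
import numpy as np

def decay_weights(ages_in_days, half_life_days=30.0):
    """Weight of 1.0 for a fresh interaction, halving every half_life_days."""
    ages = np.asarray(ages_in_days, dtype=float)
    return 0.5 ** (ages / half_life_days)

def decayed_item_scores(item_ids, ages_in_days, n_items, half_life_days=30.0):
    """Sum the decayed weights of all interactions for each item."""
    weights = decay_weights(ages_in_days, half_life_days)
    return np.bincount(item_ids, weights=weights, minlength=n_items)

# Hypothetical interaction log: item 0 seen today and 30 days ago, item 1 seen 60 days ago.
item_ids = [0, 0, 1]
ages_in_days = [0, 30, 60]
print(decayed_item_scores(item_ids, ages_in_days, n_items=3))
```

A window-based alternative would simply drop interactions older than a cutoff instead of down-weighting them.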
Session #7: Deployment & Monitoring
- Deploying the recommendation system models
- Measuring improvements from recommendation system
- Improving the models based on the feedback from production
- Architecture design for recsys: Offline, Nearline and Online
Session #8: Evaluation, Challenges & Way Forward
- A/B testing for recommendation systems
- Challenges in recsys:
- Building explanations
- Model debugging
- Scaling-out & up
- Fairness, accountability and trust
- Bias in recsys: training data, UI → Algorithm → UI, private
- When not to use deep learning for recsys
- Recap and next steps, Learning Resources
Workshop Details
Participant Profile — A programmer interested in adding data-driven recommendations to their products, or a beginner data scientist with experience in using machine learning who is interested in building a deeper and more applied perspective on using ML & DL for recommendation systems.
Pre-requisite skills — This is a hands-on course and hence, participants should be comfortable with programming in Python and have exposure to the Python data stack.
- Participants should have a basic familiarity with Python. Specifically, we expect participants to know the first four sections from the Python Practice Book
- Prior knowledge of machine learning principles is needed. Participants should have practice with machine learning problems e.g. regression, classification. Specifically, participants should be able to work with the following python libraries. You can refer to the Python Data Science Handbook to learn the same.
  - jupyter: For doing literate programming in notebooks
  - numpy: For scientific computation
  - pandas: For data wrangling and transformation of tabular data (dataframes)
  - scikit-learn: For building machine learning models
- While the deep learning concepts will be taught in an intuitive way, some prior knowledge of linear algebra and calculus would be helpful. You can refer to these visual explanation videos from @3Blue1Brown on Linear Algebra, Calculus and Deep Learning to get started.
Tools Used — The workshop principles are tool-agnostic and can be applied using any data stack and platform for building recommendation systems. However, for ease of doing the exercises, we will be using the Python data stack during the workshop. All notebooks and required data-sets will be provided in a cloud-hosted environment. No additional downloads are required. Participants will only need a browser with internet connectivity on their own laptop.
Number of Participants — The maximum number of participants for the workshop would be capped at 30. The small class size would enable a more participative environment with group interaction possible as well as opportunities to have one-to-one learning interactions.
Duration — The workshop would be conducted over 2 days from 0930 to 1730. There will be short breaks during the morning and afternoon session and a longer lunch break of around 45 minutes in the middle.
Facilitators’ Profile
Amit Kapoor teaches the craft of telling visual stories with data. He conducts workshops and trainings on Data Science in Python and R, as well as on Data Visualisation topics. His background is in strategy consulting having worked with AT Kearney in India, then with Booz & Company in Europe and more recently for startups in Bangalore. He did his B.Tech in Mechanical Engineering from IIT, Delhi and PGDM (MBA) from IIM, Ahmedabad. You can find more about him at amitkaps.com and tweet him at @amitkaps.
Bargava Subramanian is a practicing Data Scientist. He has 14 years of experience delivering business analytics solutions to Investment Banks, Entertainment Studios and High-Tech companies. He has given talks and conducted workshops on Data Science, Machine Learning, Deep Learning and Optimization in Python and R. He has a Masters in Statistics from University of Maryland, College Park, USA. He is an ardent NBA fan. You can tweet to him at @bargava.