A general & easily extensible Recommendation System that could be used in any real business
Background
Recommendation systems are widely used in companies today. However, because businesses differ so much, every company has to build its own recommendation system, wiring together much the same algorithms to solve its own business problems. This is painful.
I’ve worked as an ML engineer for more than 5 years, at big companies like Tencent as well as at small startups. Every time I started a recommendation system project at a new company, I had to write a completely new system, yet with much the same code and algorithms. So I keep wondering whether there is a good way to build a general recommendation system that ships with common, popular algorithms and tools. Anyone who wants a brand-new recommendation system could then simply run a few commands and get a good one.
Besides, I have looked into some open-source recommendation systems, and none of those I found is both simple to use and easy to extend.
The Issues
The issues we face when building a general recommendation system are:
- the data sources vary a lot
- the data might be structured or unstructured
- the meaning and significance of the data vary a lot from one business to another
- a recommendation system requires interpretability for the business
These issues look hard to solve, but they can be grouped into two categories:
- Various Data
- Interpretability
The Idea
Architecture
As we know, a typical recommendation system has a 2-layer architecture: one layer generates candidates and the other ranks them. Some companies also call these the offline and online layers. Of course, some companies, such as Netflix, have built a 3-layer architecture with an offline layer, a nearline layer, and an online layer. Don’t worry, these are just improvements on, or variants of, the classical design. So for our project, we will keep using the classical 2-layer architecture.
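To make the two layers concrete, here is a minimal sketch in plain Python. It is only an illustration, not the project’s actual code: the function names (`generate_candidates`, `rank`, `recommend`), the cosine-similarity recall, and the 0.8/0.2 blend with a `popularity` score are all assumptions made up for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def generate_candidates(user_vec: Vector, item_vecs: Dict[str, Vector], k: int) -> List[str]:
    """Layer 1 (hypothetical): cheap recall of the k items closest to the user."""
    by_similarity = sorted(
        item_vecs, key=lambda i: cosine(user_vec, item_vecs[i]), reverse=True
    )
    return by_similarity[:k]


def rank(
    user_vec: Vector,
    candidates: List[str],
    item_vecs: Dict[str, Vector],
    popularity: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Layer 2 (hypothetical): re-score the small candidate set with extra signals."""
    scored = [
        (i, 0.8 * cosine(user_vec, item_vecs[i]) + 0.2 * popularity.get(i, 0.0))
        for i in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def recommend(user_vec, item_vecs, popularity, k=3, n=2):
    """Run both layers: recall k candidates, then return the top n after ranking."""
    candidates = generate_candidates(user_vec, item_vecs, k)
    return rank(user_vec, candidates, item_vecs, popularity)[:n]
```

In a real system the first layer would typically use an approximate nearest-neighbor index and the second a trained model, but the split of responsibilities stays the same.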
Data Issues
Thanks to deep learning techniques, it is now easy to build an embedding for almost anything. Embeddings therefore give us a standard, simple way to solve the data issues.
To keep the recommendation system general, it will strictly require its input data to be embeddings. That way the system can make full use of modern recommendation algorithms.
But you might object: “This system still leaves the data issues alone; they remain unsolved.” Don’t worry, let me explain.
Adapters
Embedding is a proven technique, and embeddings are easy to build for any kind of data. So this general recommendation system will provide a set of basic data adapters for converting data into vectors. For example, it is easy to provide:
- text data adapter: word2vec / doc2vec, for embedding texts
- graph data adapter: DeepWalk / graph embeddings, for embedding graph data
- etc.
We will also make it easy to build adapter plugins, so everyone in the open-source community can join us and make it better.
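One way such an adapter plugin mechanism could look is sketched below. Everything here is hypothetical and not taken from the repo: the `Adapter` base class, the `register` decorator, the `ADAPTERS` registry, and the `embed_with` helper are names invented for illustration. The `HashingTextAdapter` is a toy stand-in for a real word2vec/doc2vec adapter, used only so that the example runs without any trained model.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class Adapter(ABC):
    """Hypothetical plugin interface: turn one raw record into a vector."""

    @abstractmethod
    def embed(self, record) -> List[float]:
        ...


# Hypothetical registry mapping an adapter name to its class.
ADAPTERS: Dict[str, Type[Adapter]] = {}


def register(name: str):
    """Class decorator that makes an adapter discoverable by name."""
    def decorator(cls: Type[Adapter]) -> Type[Adapter]:
        ADAPTERS[name] = cls
        return cls
    return decorator


@register("text-hash")
class HashingTextAdapter(Adapter):
    """Toy text adapter: a hashed bag of words, NOT a learned embedding."""

    def __init__(self, dim: int = 8):
        self.dim = dim

    def embed(self, record: str) -> List[float]:
        vec = [0.0] * self.dim
        for token in record.lower().split():
            # md5 gives a hash that is stable across runs, unlike built-in hash().
            digest = hashlib.md5(token.encode("utf-8")).digest()
            vec[digest[0] % self.dim] += 1.0
        return vec


def embed_with(name: str, record, **kwargs) -> List[float]:
    """Look up an adapter by name and embed a single record with it."""
    return ADAPTERS[name](**kwargs).embed(record)
```

With this shape, a community contributor would only need to subclass `Adapter` and decorate the class; the rest of the system keeps working with plain vectors.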
Interpretability
We tend to believe that a recommendation system needs strong interpretability. Actually, it doesn’t. A recommendation system used in a real business usually needs only basic interpretability: enough to attract customers to click the item and increase the product’s CTR.
Once you agree with this, you will find it is easy to build a strategy or algorithm for explaining the “recommendation behavior”; approaches of this kind can be found in some recent papers.
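As an illustration of how basic such an explanation strategy can be, the sketch below produces a “Because you liked …” message by finding the item in the user’s history whose embedding is closest to the recommended item. This is one simple strategy chosen for the example, not the approach of any particular paper or of the repo; the `explain` function, its parameters, and the fallback message are all assumptions.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def explain(
    recommended_id: str,
    history_ids: List[str],
    item_vecs: Dict[str, Sequence[float]],
    titles: Dict[str, str],
) -> str:
    """Hypothetical explainer: cite the most similar item the user interacted with."""
    if not history_ids:
        # No history to point at, so fall back to a generic reason.
        return "Popular right now"
    closest = max(
        history_ids,
        key=lambda h: cosine(item_vecs[h], item_vecs[recommended_id]),
    )
    return f"Because you liked {titles[closest]}"
```

Because it only needs the same embeddings the recommender already consumes, a strategy like this adds interpretability without any extra data requirements.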
Contribute
I have briefly introduced the idea here and created a repo on GitHub: https://github.com/Marcnuth/x-recommender
Feel free to open an issue there to discuss or ask anything.