How a simple metric can generate great results, why explicit feedback doesn’t cut it, and ways to accelerate complex ML models
At Bibblio Labs we recently organised our third RecSys meetup in London, hosted by the team at Redis Labs. The meetup took place at Skills Matter’s CodeNode, a venue dedicated to technology events - which includes a nice bar - near Moorgate Station.
The presenters of the evening were Yaqub Alwan (Data Sciences Engineer at Sony Interactive Entertainment Europe), Maciej Kula (Machine Learning Engineer at Ravelin) and David Maier (Technical Field Enablement Manager at Redis Labs).
Here’s a quick overview of their presentations from the evening, and you’ll find the links to their presentation slides too:
Show me what you want
“When you’re confronted with a choice between something simple and complex, and the complex option has a small upside, always choose simple.” - Yaqub Alwan
Yaqub works at Sony Interactive Entertainment Europe - or as most people like to call it: PlayStation. They organise bi-annual hackathon events for all employees.
During the last three-day hackathon, a colleague of Yaqub's had the idea of building a recommender system from game screenshots.
Screenshots are available early in a game's development, so a system built on them could help recommend pre-orders before any play or purchase data exists.
During the three-day hackathon Yaqub and a group of co-workers implemented a system using autoencoders to represent screenshots downloaded from the PlayStation Store as numeric vectors.
They were then able to group screenshots belonging to the same game, based on image similarity, to improve scoring. An impressive feat.
The result was a system that, when fed a screenshot from a known game, could correctly recognise that game in a lot of cases. On top of this, given the name of an existing game from the store, it could retrieve a list of similar games. It could also take a user-provided screenshot, perhaps representative of a game the user enjoyed, and return a list of similar games.
This would enable a user to discover similar games by aesthetic and scenes, or games which were only just being demoed at The Electronic Entertainment Expo.
The results were rather good. Yaqub concluded: “We used a very simple metric called ‘maximum inner product’ which would give you the most similar other pair of screenshots. For product-to-product recommendation the results were actually really sensible, given we only used this ‘dumb’ metric.”
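To make the ‘maximum inner product’ idea concrete, here is a minimal sketch in Python. It is not the hackathon team's code: the game names and three-dimensional vectors below are invented for illustration, whereas in their system the vectors came from an autoencoder. The sketch simply scores every catalogue item by its inner product with a query vector and returns the highest-scoring ones.

```python
from typing import Dict, List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def most_similar(
    query: Sequence[float],
    catalogue: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank catalogue items by inner product with the query, highest first."""
    scored = [(name, inner_product(query, vec)) for name, vec in catalogue.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings; real autoencoder codes would be much longer.
EMBEDDINGS = {
    "racing_game": [0.9, 0.1, 0.0],
    "rally_game": [0.8, 0.2, 0.1],
    "puzzle_game": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of a user-provided screenshot.
query = [1.0, 0.0, 0.0]
ranking = most_similar(query, EMBEDDINGS, top_k=2)
```

With these toy vectors the two racing-style games come out on top and the puzzle game is left out. Note that the inner product is not normalised, so vectors with a larger norm score higher; whether that matters depends on how the embeddings are produced.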
He emphasised that this was done as a hackathon project and has not been taken further as far as he knows. I personally wouldn’t mind this popping up as a discovery tool in game store interfaces soon!
Explicit vs. Implicit Recommenders
“Of course you should always use different metrics when evaluating recommender systems. I just argue RMSE should never be among them.” - Maciej Kula
Maciej recently joined Ravelin as a machine learning engineer, after having worked as a data scientist and a research engineer at Lyst and Netflix respectively.
He didn’t beat about the bush and immediately introduced the purpose of his talk: to convince the audience that a) root mean square error (RMSE) is never an appropriate evaluation metric for a recommender system, and b) that in most cases implicit feedback is far more valuable than explicit feedback.
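For readers who want a reminder of what RMSE measures, here is a small sketch in Python: it is the square root of the average squared difference between predicted and observed ratings. The ratings and the two ‘models’ below are invented, and the example shows one commonly raised objection to RMSE rather than anything taken from Maciej's slides: a model can score a lower RMSE while ranking a user's items in the wrong order.

```python
import math
from typing import Dict, List


def rmse(actual: Dict[str, float], predicted: Dict[str, float]) -> float:
    """Root mean square error over the items that have an observed rating."""
    squared_errors = [
        (predicted[item] - rating) ** 2 for item, rating in actual.items()
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def ranking(scores: Dict[str, float]) -> List[str]:
    """Items ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Invented ratings from one user on a 1-5 scale.
actual = {"item_a": 5.0, "item_b": 4.0, "item_c": 1.0}

# Model 1 is numerically close but swaps the user's top two items.
model_1 = {"item_a": 4.4, "item_b": 4.6, "item_c": 1.0}
# Model 2 is numerically far off but preserves the user's ordering.
model_2 = {"item_a": 3.0, "item_b": 2.0, "item_c": 0.0}

rmse_1 = rmse(actual, model_1)
rmse_2 = rmse(actual, model_2)
```

Here `rmse_1` is smaller than `rmse_2`, yet only `model_2` puts the user's favourite item first. A recommender ultimately shows a ranked list, so a metric that rewards the first model is measuring something other than what the user sees.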
After taking us through the definitions of RMSE and of explicit- and implicit-feedback recommender systems, Maciej pointed to The Netflix Challenge and its results as the reason why explicit recommenders and RMSE are still treated as a standard solution today.
But implicit feedback, he argued, is more useful than that. Or rather, there is a problem with recommender systems built solely on explicit feedback. The assumption behind models trained and evaluated only on observed ratings is that ratings that are not observed are missing at random. And that’s false.
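A toy simulation can show why the ‘missing at random’ assumption matters. The sketch below is an illustration under a loudly stated assumption, not a reproduction of any experiment mentioned in the talk: it supposes that the chance a user leaves a rating grows with how much they like the item (here, probability = rating / 5). Under that assumption the observed ratings paint a rosier picture than the full set of preferences.

```python
import random
from statistics import mean
from typing import Tuple


def simulate_ratings(n_pairs: int = 10_000, seed: int = 0) -> Tuple[float, float]:
    """Return (mean of all true ratings, mean of the ratings that get observed)."""
    rng = random.Random(seed)
    # True preference of each user-item pair, uniform on the 1-5 scale.
    true_ratings = [rng.randint(1, 5) for _ in range(n_pairs)]
    # Assumption for illustration: a rating r is observed with probability r / 5,
    # so well-liked items are far more likely to be rated than disliked ones.
    observed = [r for r in true_ratings if rng.random() < r / 5]
    return mean(true_ratings), mean(observed)


true_mean, observed_mean = simulate_ratings()
```

In expectation the true mean is 3, while the mean of the observed ratings is 11/3, roughly 3.67. A model trained and evaluated only on the observed ratings never sees most of the low ratings, which is exactly the information about what not to recommend.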
Referring to the experiment Steck ran in 2010 on the MovieLens dataset, Maciej concluded that implicit feedback alone is much better than explicit feedback alone, although putting the two together gives the best results. He raised the interesting point that Netflix doesn't use ‘stars’ any more, as part of the company’s efforts to move away from explicit feedback.
After his talk I opened the floor to questions. In the history of RecSys London - I appreciate we’ve only had 3 Meetups so far - it’s hard to remember having more of a debate than the one we had after this talk. So what do you think: Do you think that RMSE has value? And how about explicit feedback?
Redis for Production Recommender Systems
“Please raise your hands if you know Redis. [All hands go up]. Oh, this will be easy then!” - David Maier
David Maier is a technical field enablement manager at Redis Labs. He came all the way from Germany to our London Meetup to share how you can use Redis modules to put your recommender system into production.
Performance, simplicity and extensibility were mentioned as top differentiators of Redis. David paid special attention to the last of these, zooming in on the Redis modules or ‘lego blocks’ that could support recommenders, such as Neural Redis and Redis-ML.
Overall it was a great event, and it was fun chatting to everyone over drinks and pizza. If you want to join us and over 50 other RecSys enthusiasts at the next free London meetup on recommender systems, then join the group here! Keep an eye out for the next event, as we had a waiting list last time. Also, we're always on the lookout for speakers to light up our next meetups. So, if you know or are somebody who'd like to share a story - big or small - then please contact me.