Recommendation Agents
This component defines an agent that interacts with the previously defined environment by making recommendations and receiving the user’s feedback. As in Reinforcement Learning, an agent is represented by two main components: a Value Function and an Action Selection Policy. The value function represents the agent’s goals by quantifying the expected consequences of its decisions. Because the agent here is a recommender system, the value function represents the utility of each item for a user, as predicted by the algorithm. The reward is usually a scalar value representing the user’s explicit or implicit feedback. The action selection policy, in turn, is the rule the agent uses to choose one or more items to recommend.
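
The split between a value function and an action selection policy can be illustrated with a minimal sketch. The class and method names below (`MeanValueFunction`, `EpsilonGreedyPolicy`, `Agent`, `act`, `observe`) are hypothetical and are not iRec's actual API. The value function is assumed to be a simple running mean of the rewards per item, and the policy is a plain ε-greedy rule.

```python
import random


class MeanValueFunction:
    """Hypothetical value function: running mean reward per item."""

    def __init__(self, num_items):
        self.counts = [0] * num_items
        self.means = [0.0] * num_items

    def predict(self, user):
        # This toy version ignores the user and returns one score per item.
        return list(self.means)

    def update(self, user, item, reward):
        # Incremental mean update using the observed scalar reward.
        self.counts[item] += 1
        self.means[item] += (reward - self.means[item]) / self.counts[item]


class EpsilonGreedyPolicy:
    """Hypothetical policy: random item with probability epsilon, else the best-scored item."""

    def __init__(self, epsilon, seed=None):
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select(self, scores):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(scores))  # explore
        return max(range(len(scores)), key=scores.__getitem__)  # exploit


class Agent:
    """An agent is a value function paired with an action selection policy."""

    def __init__(self, value_function, policy):
        self.value_function = value_function
        self.policy = policy

    def act(self, user):
        # Score every item, then let the policy pick one to recommend.
        return self.policy.select(self.value_function.predict(user))

    def observe(self, user, item, reward):
        # Feed the user's feedback back into the value function.
        self.value_function.update(user, item, reward)
```

In this sketch, swapping the policy (for example, for a UCB-style rule) or the value function (for example, for a latent-factor model) changes the agent without touching the interaction loop, which is the design idea the two-component description suggests.
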
The Recommendation Agents supported by iRec are listed below.
Year | Model | Paper | Description |
---|---|---|---|
2002 | ε-Greedy | Link | Exploits the best-known items most of the time and, with probability ε (the diversification parameter), performs a random action instead. |
2013 | Linear ε-Greedy | Link | A linear exploitation of the items’ latent factors, defined by a PMF formulation, that also explores random items with probability ε. |
2011 | Thompson Sampling | Link | A basic item-oriented bandit algorithm that models items and users with Gaussian distributions and applies its prediction rule to samples drawn from them. |
2013 | GLM-UCB | Link | It follows a process similar to Linear UCB, based on the PMF formulation, but adds a sigmoid function in the exploitation step and uses time-dependent exploration. |
2018 | ICTR | Link | An interactive collaborative topic regression model that uses the TS bandit algorithm and controls item dependencies through a particle learning strategy. |
2015 | PTS | Link | It is a PMF formulation for the original TS based on a Bayesian inference around the items. This method also applies particle filtering to guide the exploration of items over time. |
2019 | kNN Bandit | Link | A simple multi-armed bandit elaboration of neighbor-based collaborative filtering. It is a variant of the nearest-neighbors scheme endowed with a controlled stochastic exploration of the users’ neighborhood through a parameter-free application of Thompson sampling. |
2017 | Linear TS | Link | An adaptation of the original Thompson Sampling to measure the latent dimensions by a PMF formulation. |
2013 | Linear UCB | Link | An adaptation of the original LinUCB (Lihong Li et al. 2010) to measure the latent dimensions by a PMF formulation. |
2020 | NICF | Link | An interactive method that combines neural networks and collaborative filtering and also performs meta-learning of the user’s preferences. |
2016 | COFIBA | Link | This method relies on upper-confidence-based tradeoffs between exploration and exploitation, combined with adaptive clustering procedures at both the user and the item sides. |
2002 | UCB | Link | It is the original UCB that calculates a confidence interval for each item at each iteration and tries to shrink the confidence bounds. |
2021 | Cluster-Bandit (CB) | Link | A new cluster-based bandit algorithm designed to address the cold-start problem. |
2002 | Entropy | Link | The entropy of an item i is calculated using the relative frequency of the possible ratings. In general, since entropy measures the spread of ratings for an item, this strategy tends to promote rarely rated items, which can be considerably informative. |
2002 | log(pop)*ent | Link | It combines popularity and entropy to identify potentially relevant items that can also add more knowledge to the system. As these two concepts are not strongly correlated, they can be combined by multiplying the logarithm of the popularity ρ of an item i by its entropy ε: score(i) = log(ρ_i) · ε_i. |
- | Random | Link | This method recommends totally random items. |
- | Most Popular | Link | It recommends the items with the highest number of ratings received (most popular) at each iteration. |
- | Best Rated | Link | Recommends top-rated items based on their average ratings in each iteration. |
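
The Entropy and log(pop)*ent scores described in the table can be sketched as follows. This is an illustrative sketch, not iRec's implementation: the function names are hypothetical, entropy is assumed to be Shannon entropy with the natural logarithm, and the popularity ρ of an item is assumed to be its number of ratings.

```python
import math
from collections import Counter


def rating_entropy(ratings):
    """Entropy of one item's ratings, from the relative frequency of each rating value."""
    total = len(ratings)
    if total == 0:
        return 0.0  # assumption: an unrated item gets zero entropy
    counts = Counter(ratings)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def log_pop_ent(ratings):
    """score(i) = log(rho_i) * entropy_i, with rho_i taken as the rating count."""
    if not ratings:
        return 0.0  # assumption: avoid log(0) for unrated items
    return math.log(len(ratings)) * rating_entropy(ratings)


# Example: ratings grouped per item (hypothetical data).
item_ratings = {
    "a": [5, 5, 5, 5],        # popular, but everyone agrees
    "b": [1, 5],              # few ratings, maximal disagreement
    "c": [1, 2, 4, 5, 3, 5],  # popular and spread out
}
ranking = sorted(item_ratings, key=lambda i: log_pop_ent(item_ratings[i]), reverse=True)
```

Under these assumptions, an item whose ratings all agree scores zero regardless of how popular it is, while the logarithm damps the effect of raw popularity so that widely rated items with spread-out ratings rank first.
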