A Bandit Method Using Probabilistic Matrix Factorization in Recommendation

Expand
  • (Key Laboratory of System Control and Information Processing of Ministry of Education; Department of Automation, Shanghai Jiaotong University, Shanghai 200240, China)

Online published: 2015-10-29

Abstract

In recommendation system, sparse data and cold-start user have always been a challenging problem. Using a linear upper confidence bound (UCB) bandit approach as the item selection strategy based on the user historical ratings and user-item context, we model the recommendation problem as a multi-arm bandit (MAB) problem in this paper. Enabling the engine to recommend while it learns, we adopt probabilistic matrix factorization (PMF) in this strategy learning phase after observing the payoff. In particular, we propose a new approach to get the upper bound statistics out of latent feature matrix. In the experiment, we use two public datasets (Netfilx and MovieLens) to evaluate our proposed model. The model shows good results especially on cold-start users.

Cite this article

TU Shi-tao* (涂世涛), ZHU Lan-juan (朱兰娟) . A Bandit Method Using Probabilistic Matrix Factorization in Recommendation[J]. Journal of Shanghai Jiaotong University(Science), 2015 , 20(5) : 535 -539 . DOI: 10.1007/s12204-015-1618-7

References

[1] Sarwar B, Karypis G, Konstan J, et al. Itembased collaborative filtering recommendation algorithms[C]//Proceedings of the 10th international conference on World Wide Web. Hong Kong, China:ACM, 2001: 285-295.
[2] Schein A I, Popescul A, Ungar L H, et al.Methods and metrics for cold-start recommendations[C]//Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. Tampere, Finland: ACM,2002: 253-260.
[3] Koren Y, Bell R, Volinsky C. Matrix factorization techniques for recommender systems [J]. Computer,2009, 42(8): 30-37.
[4] Lee D D, Seung H. Algorithms for non-negative matrix factorization [J]. Advances in Neural Information Processing Systems, 2001, 13: 556-562.
[5] Jamali M, Ester M. A matrix factorization technique with trust propagation for recommendation in social networks[C]//Proceedings of the 4th ACM Conference on Recommender Systems. Barcelona, Spain:ACM, 2010: 135-142.
[6] Ma H, Yang H, Lyu M R, et al. Sorec: Social recommendation using probabilistic matrix factorization[C]//Proceedings of the 17th ACM Conference on Information and Knowledge Management. Napa. Valley,California, USA: ACM, 2008: 931-940.
[7] Macready W G, Wolpert D H. Bandit problems and the exploration/exploitation tradeoff [J]. IEEE Transactions on Evolutionary Computation, 1998,2(1): 2-22.
[8] Auer P. Using confidence bounds for exploitationexploration trade-offs [J]. The Journal of Machine Learning Research, 2003, 3: 397-422.
[9] Salakhutdinov R, Mnih A. Probabilistic matrix factorization [C]//Advances in Neural Information Processing Systems. Cambridge, Massachusetts: MIT Press, 2007: 1257-1264.
[10] Golub G H, Reinsch C. Singular value decomposition and least squares solutions [J]. Numerische Mathematik,1970, 14(5): 403-420.
[11] Precup D, Sutton R S, Singh S. Eligibility traces for off-policy policy evaluation [C]// Proceedings of 17th International Conference on Machine Learning.San Francisco, CA, USA: Morgan Kaufmann, 2000:759-766.
[12] Li L, ChuW, Langford J, et al. A contextual-bandit approach to personalized news article recommendation[C]//Proceedings of the 19th International Conference on World Wide Web. Raleish, North Carolina, USA:ACM, 2010: 661-670.

Options
Outlines

/