Tag Archives: svd

Multiple users sharing one ID?

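Back during the Netflix Prize, our team (The Ensemble) discussed internally that if several people share one account and their interests differ, the recommendations may become inaccurate. So how can we separate the people who share an account? At the time our teammate Lester Mackey proposed some ideas, but because time was tight we were not able to implement the model. Recently Lester Mackey published a paper at ICML, "Mixed Membership Matrix Factorization", that discusses this idea in detail. He combines LDA with SVD: each user id may correspond to several people, so the model first samples the person behind an id from a multinomial distribution, and each person has their own latent factor. The RMSE improvement from his method is fairly noticeable: 0.9 => 0.896. The model is very interesting and worth a look, although my understanding of it may not be entirely accurate.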
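To make the description above concrete, here is a toy sketch of the generative story as I describe it: each account has a multinomial distribution over hidden "people", one person is sampled per rating, and the rating is the dot product of that person's latent factor with the item's latent factor. This is only an illustration of my summary, not the model from the paper. The function name `sample_ratings`, the Dirichlet prior on the mixing weights, the Gaussian factors, and all sizes are assumptions made for the example.

```python
import numpy as np

def sample_ratings(n_accounts=4, n_items=5, n_people=2, n_factors=3, seed=0):
    """Toy generative sketch: one account id hides several people (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    # Assumed prior: each account has mixing weights over its hidden people.
    theta = rng.dirichlet(np.ones(n_people), size=n_accounts)
    # One latent factor vector per (account, person) pair and per item.
    person_factors = rng.normal(scale=0.5, size=(n_accounts, n_people, n_factors))
    item_factors = rng.normal(scale=0.5, size=(n_items, n_factors))

    who = np.zeros((n_accounts, n_items), dtype=int)
    ratings = np.zeros((n_accounts, n_items))
    for u in range(n_accounts):
        for i in range(n_items):
            # Sample which person behind account u produced this rating.
            z = rng.choice(n_people, p=theta[u])
            who[u, i] = z
            # Noise-free score from that person's factor and the item's factor.
            ratings[u, i] = person_factors[u, z] @ item_factors[i]
    return theta, who, ratings

theta, who, ratings = sample_ratings()
```

In a real model the assignments in `who` are not observed and have to be inferred together with the factors; the sketch only shows the forward sampling direction.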

A trick of doing SVD on binary data

In the GitHub contest, some people applied SVD to the GitHub data. The GitHub data is binary: it only records "who watches what". Most previous research on SVD was done on rating data. In the Netflix Prize, many people used Funk-SVD, which is trained only on the observed data. However, in binary data, the label of observed data [...]

SVD and KNN top-N system with binary data

The GitHub contest is a top-N problem and it uses binary data. I have always wondered how to use SVD for a top-N problem. Jeremy placed #2 in the GitHub contest, and he used only SVD in his implementation. I sent him an email today to discuss SVD with him. The content of the email follows. [...]