Using the basic idea of RDT to compute user-based recommendations for 10M users

Now we have a problem: given 10 million users and their preferences on X items, how can we make recommendations for users with user-based collaborative filtering?

The key step of user-based CF is computing user-user similarity. However, computing the similarity between every pair of 10 million users on a computer with 4 GB of RAM is infeasible. Thus, we have to turn to a randomized method, such as the random decision tree (RDT).
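A quick back-of-the-envelope count shows why the all-pairs computation is out of reach. The sketch below assumes each similarity is stored as a single 4-byte float, which is an illustrative assumption and not a figure from a real system.

```python
# Rough size of a full user-user similarity table for 10 million users.
n_users = 10_000_000
bytes_per_similarity = 4      # assumption: one float32 per user pair
ram_bytes = 4 * 2**30         # 4 GB of RAM

# Similarity is symmetric, so only unordered pairs are counted.
n_pairs = n_users * (n_users - 1) // 2
table_bytes = n_pairs * bytes_per_similarity

print(f"user pairs: {n_pairs:,}")
print(f"storage needed: {table_bytes / 1e12:.0f} TB")
print(f"times larger than RAM: {table_bytes / ram_bytes:,.0f}")
```

That is roughly 5 x 10^13 pairs, or about 200 TB even at 4 bytes per pair, so the full table cannot be held in memory, and computing every pair would also be very slow.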

The basic idea of RDT is to split the space randomly, and then obtain more accurate results by bagging the results from different random splits. Following this idea, we can also design an efficient way to compute user-based recommendations for 10M users.
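One way the random-split-and-bag idea could be applied to user similarity is sketched below. This is not the method referred to in this post, only a guess at the general shape. Every name here (`random_split`, `approximate_neighbors`, `max_leaf`, `n_trees`) is hypothetical, and the split rule (whether a user has a preference for a randomly chosen item) is an assumption. Users are split randomly into small leaves, similarity is computed only inside each leaf, and the candidates from several random trees are merged.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse preference dicts {item: rating}."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def random_split(users, prefs, items, rng, max_leaf, depth=0, max_depth=20):
    """Split users on randomly chosen items until each leaf is small enough."""
    if len(users) <= max_leaf or depth >= max_depth:
        return [users]
    item = rng.choice(items)  # random split point, as in a random decision tree
    left = [u for u in users if item in prefs[u]]
    right = [u for u in users if item not in prefs[u]]
    if not left or not right:
        # Degenerate split: retry with another random item one level deeper.
        return random_split(users, prefs, items, rng, max_leaf, depth + 1, max_depth)
    return (random_split(left, prefs, items, rng, max_leaf, depth + 1, max_depth)
            + random_split(right, prefs, items, rng, max_leaf, depth + 1, max_depth))


def approximate_neighbors(prefs, n_trees=10, max_leaf=50, k=5, seed=0):
    """Merge within-leaf similarities over several random trees, keep top k."""
    rng = random.Random(seed)
    users = sorted(prefs)
    items = sorted({i for p in prefs.values() for i in p})
    sims = defaultdict(dict)
    for _ in range(n_trees):
        for leaf in random_split(users, prefs, items, rng, max_leaf):
            # Only users that share a leaf are compared with each other.
            for idx, u in enumerate(leaf):
                for v in leaf[idx + 1:]:
                    if v not in sims[u]:
                        s = cosine(prefs[u], prefs[v])
                        sims[u][v] = s
                        sims[v][u] = s
    return {u: sorted(sims[u].items(), key=lambda kv: -kv[1])[:k] for u in users}


# Tiny made-up example: "a" and "b" have identical preferences.
toy_prefs = {
    "a": {"x": 1.0, "y": 1.0},
    "b": {"x": 1.0, "y": 1.0},
    "c": {"z": 1.0},
}
neighbors = approximate_neighbors(toy_prefs, n_trees=3, max_leaf=10, k=1)
print(neighbors["a"])
```

With leaves of at most `max_leaf` users, each tree costs on the order of the number of users times `max_leaf` comparisons instead of the square of the number of users. The price is that true neighbors can land in different leaves, which is why several random trees are merged. For 10M users, a real version would also need to keep only the top k candidates per user as it goes, rather than every compared pair as this sketch does.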

The detailed method is a secret. However, I think many researchers can come up with better ideas than my solution.
