If you had a collection of users and the messages they've sent to each other over an arbitrary amount of time, how would you go about deriving the similarity of one user to another? I was thinking of checking word use, word length, message length, etc., but I'm not sure I'd get a useful spread that way.
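For what it's worth, the word-length / message-length idea could be sketched roughly like this. Everything here is an assumption for illustration: the sample users, the three features chosen, and the use of z-scores plus Euclidean distance as the comparison. It is one possible way to do it, not a tested method.

```python
import math
from statistics import mean, pstdev

def style_features(messages):
    """Return (avg word length, avg words per message, avg chars per message)."""
    words = [w for m in messages for w in m.split()]
    if not words:
        return (0.0, 0.0, 0.0)
    return (
        mean(len(w) for w in words),
        len(words) / len(messages),
        mean(len(m) for m in messages),
    )

def standardize(vectors):
    """Z-score each feature across users so no single feature dominates."""
    columns = list(zip(*vectors.values()))
    means = [mean(c) for c in columns]
    # Fall back to 1.0 when a feature has zero spread, to avoid dividing by zero.
    stdevs = [pstdev(c) or 1.0 for c in columns]
    return {
        user: tuple((x - mu) / sd for x, mu, sd in zip(vec, means, stdevs))
        for user, vec in vectors.items()
    }

def distance(a, b):
    """Euclidean distance between two feature vectors (smaller = more similar)."""
    return math.dist(a, b)

# Hypothetical sample data, made up for this sketch.
users = {
    "alice": ["hi there", "ok cool"],
    "bob": ["yo sup", "nah lol"],
    "carol": ["I would respectfully disagree with that characterization entirely"],
}
z = standardize({u: style_features(msgs) for u, msgs in users.items()})
print(distance(z["alice"], z["bob"]), distance(z["alice"], z["carol"]))
```

Whether features this coarse give a useful spread on real data is exactly the open question; the standardization step only keeps the raw character counts from swamping the other two features.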
You take their most frequently used words and classify them by topic: music, science, tech, video games, movies, etc. That describes a person's interests, which you can then compare with other users'.
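A minimal sketch of that idea, assuming hand-picked keyword lists per category (the `CATEGORIES` table below is entirely hypothetical) and cosine similarity as the way to compare two interest profiles:

```python
import math
import re
from collections import Counter

# Hypothetical keyword lists; picking these by hand is an assumption of this sketch.
CATEGORIES = {
    "music": {"album", "band", "guitar", "song"},
    "tech": {"linux", "python", "server", "compiler"},
    "games": {"steam", "console", "fps", "rpg"},
}

def interest_profile(messages, top_n=50):
    """Count how often a user's top_n most frequent words fall into each category."""
    words = re.findall(r"[a-z']+", " ".join(messages).lower())
    profile = Counter()
    for word, count in Counter(words).most_common(top_n):
        for category, keywords in CATEGORIES.items():
            if word in keywords:
                profile[category] += count
    # Fixed category order so profiles from different users line up.
    return [profile[c] for c in CATEGORIES]

def cosine(a, b):
    """Cosine similarity of two profiles; 0.0 if either profile is empty."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical messages, made up for this sketch.
a = interest_profile(["new album from that band", "learning guitar"])
b = interest_profile(["that song is great", "the band rocks"])
c = interest_profile(["installed linux on my server"])
print(cosine(a, b), cosine(a, c))
```

With a small `top_n`, common words like "the" and "that" would crowd out the topical ones, so a real version would probably want to drop stopwords first.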
>>9149638
The problem with that is now you have to get into the business of deciding what goes in which category.