Hello /sci/.
I'm making software to analyze boards and threads on 4chan.
How do I measure the intelligence of a board? I was thinking of something that includes image ratio, post length and how objective (as opposed to sentimental) the posts are.
Also, what other metrics should I consider?
>>8498102
The only thing considered a measure of intelligence is the intelligence quotient, and there are tests for that. If you want to determine the intelligence of a poster from their posts, you need labelled data to train a regressor. You don't have that. Everything else is just guesswork.
Anyway, if you want to look at interesting quantities by themselves you may look at features such as:
>Number of different words per thread
>Length of posts
>Number of misspelled words
>Number of unique posts
>Number of original posters per thread
>Number of "meme words"
>Number of images posted
>Number of unique images (images that haven't been posted so far)
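A rough sketch of how a few of the counts above could be computed per thread. The post layout (dicts with `text`, `poster_id` and `image_hash` keys) is made up for illustration, "original posters" is read here as distinct posters, and "unique images" only means unique within the thread, not across the whole board. Misspelled and meme words would need word lists, so they are left out.

```python
import re

WORD_RE = re.compile(r"[a-z']+")

def thread_features(posts):
    """Compute simple counts for one thread.

    posts: list of dicts with the hypothetical keys
    'text', 'poster_id' and 'image_hash' (None if no image).
    """
    words = []
    for post in posts:
        words.extend(WORD_RE.findall(post["text"].lower()))
    texts = [post["text"].strip() for post in posts]
    images = [post["image_hash"] for post in posts if post.get("image_hash")]
    posters = {post["poster_id"] for post in posts if post.get("poster_id")}
    n = len(posts)
    return {
        "distinct_words": len(set(words)),
        "mean_post_length": sum(len(t) for t in texts) / n if n else 0.0,
        "unique_posts": len(set(texts)),       # exact duplicates collapse
        "unique_posters": len(posters),
        "images": len(images),
        "unique_images": len(set(images)),     # unique within this thread only
        "image_ratio": len(images) / n if n else 0.0,
    }
```

Boards without poster IDs would simply give 0 for `unique_posters`, so that one only makes sense where IDs exist.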
It would be interesting to do the following by the way:
Look at only the OP and train a classifier to judge whether the thread will reach its bump limit (or a regressor to determine how many replies the thread will have).
>>8498115
Yeah, I already got half of those you said.
A really cool one would be something along the lines of keywords: words that are not on the list of the 10,000 most popular English words, but are the most popular on the board. I wonder what the outcome would be for different boards. I also want to include plotting graphs. And I don't want to do anything with machine learning just yet.
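The keyword idea needs no machine learning, just counting. A minimal sketch, assuming you already have the post texts as strings and some list of common English words loaded into a set (the function name and the `common_words` parameter are made up here, and the tokenizer is deliberately crude):

```python
import re
from collections import Counter

def board_keywords(post_texts, common_words, top_n=20):
    """Return the top_n most frequent words on a board that are
    not in common_words, as (word, count) pairs."""
    counts = Counter()
    for text in post_texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    filtered = [(w, c) for w, c in counts.most_common() if w not in common_words]
    return filtered[:top_n]
```

Running it once per board with the same `common_words` set would give comparable lists, and the (word, count) pairs are already in a shape that is easy to feed to a bar chart.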
>>8498102
You should include a function to quantify the dankness of memes posted.
>>8498102
You can try making a browser extension which tracks all the posts one makes and which includes an IQ test. After some time you'll have enough labeled data to train a regressor.
>>8498459
Also OP has too much free time if he considers doing this
>>8498462
I could see myself doing this.
I have nothing else to work on.
>>8498115
Count ad hominems as well.
>>8498102
You should also consider reply rate. Second, take all the posts on the board and break each one up into its words. See which words are commonly used there, the rate at which unique words are used (basically how much a board is shitposting and meme spouting), and tag certain words as flags or indicators of low IQ. I suspect posts that overuse phrases like "redpill me on X" or buzzwords like "sjw", "cuck" or "autist" come from lower-IQ posters, since a good number of posters are quite frankly incapable of expressing their opinion or engaging in proper debate. It also might be interesting to check reply chains to see how often discussions either contain or end in short single-sentence posts with these buzzwords.
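Something like this could cover the unique-word rate and the flag words. It is only a sketch: the flag list is just the examples from this post, the 5-word cutoff for a "short" post is an arbitrary assumption, and none of it says anything about actual IQ.

```python
import re

# Example flag terms taken from the post above; extend as needed.
FLAG_TERMS = ("redpill me on", "sjw", "cuck", "autist")

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def unique_word_rate(post_texts):
    """Distinct words divided by total words (type-token ratio)."""
    tokens = [t for text in post_texts for t in tokenize(text)]
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def has_flag(text, flag_terms=FLAG_TERMS):
    """True if any flag term starts at a word boundary in the text."""
    lowered = text.lower()
    return any(re.search(r"\b" + re.escape(term), lowered) for term in flag_terms)

def short_flagged_fraction(post_texts, max_words=5, flag_terms=FLAG_TERMS):
    """Fraction of posts that are short and contain a flag term."""
    if not post_texts:
        return 0.0
    hits = sum(
        1 for text in post_texts
        if len(tokenize(text)) <= max_words and has_flag(text, flag_terms)
    )
    return hits / len(post_texts)
```

Keep in mind the type-token ratio drops as the sample gets bigger, so boards should be compared on samples of similar size.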
How are you gonna get the data?
>>8498658
Cuck.
>>8498884
Maybe he's going to analyze the HTML code or something.
>>8498602
Good luck identifying those.
Number of swear words is another one worth looking at.
>>8498884
>refresh the page
>>8498250
I know another dude from a different imageboard (krautchan.net/int) who wrote software that kept records of everything, and we could see which country the most used words came from.
I will see if I can get in touch with him. It's been a few years since I last talked to him.
>>8498884
Python has many advanced libraries for that.
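For what it's worth, the standard library alone may be enough, since 4chan has a read-only JSON API and you don't have to scrape the HTML. A minimal sketch, assuming the thread endpoint and the `posts` / `com` field names as I remember them from the API docs (check them before relying on this); the helper names are made up:

```python
import html
import json
import re
import urllib.request

# Assumed endpoint format of the read-only JSON API.
API_THREAD_URL = "https://a.4cdn.org/{board}/thread/{thread_no}.json"

def fetch_thread(board, thread_no):
    """Download one thread and return its list of post dicts."""
    url = API_THREAD_URL.format(board=board, thread_no=thread_no)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["posts"]

def comment_to_text(com):
    """Turn the HTML in a post's 'com' field into plain text."""
    text = re.sub(r"<br\s*/?>", "\n", com)   # line breaks
    text = re.sub(r"<[^>]+>", "", text)      # strip remaining tags
    return html.unescape(text)               # &gt; -> > and so on
```

Something like `[comment_to_text(p.get("com", "")) for p in fetch_thread("sci", 8498102)]` would then give the plain-text posts of a thread, as long as it hasn't been pruned. Don't hammer the API; put a sleep between requests.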