Hey anons,
simple question for you.
I have multiple scatter plots and I want to know which of the plots are the most similar.
Right now I test by just taking the mean distance over multiple "test points", but there has to be something better.
bumpy the bump
>>8530024
If they have the same functional form, you could just try comparing their regression equations. Maybe you could take the difference of their slopes, intercepts, etc.
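A rough sketch of that idea, assuming both plots are reasonably described by a straight line and that NumPy is available. The helper names `fit_line` and `line_difference` are made up for illustration:

```python
import numpy as np

def fit_line(x, y):
    """Least-squares line y = slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

def line_difference(xa, ya, xb, yb):
    """Absolute differences in slope and intercept between two scatter plots."""
    slope_a, icpt_a = fit_line(xa, ya)
    slope_b, icpt_b = fit_line(xb, yb)
    return abs(slope_a - slope_b), abs(icpt_a - icpt_b)

# Toy data: two noise-free lines that differ only in slope.
x = np.linspace(0.0, 10.0, 50)
d_slope, d_icpt = line_difference(x, 2.0 * x + 1.0, x, 2.5 * x + 1.0)
print(d_slope, d_icpt)
```

How to weigh the slope difference against the intercept difference is a judgment call that depends on the data.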
Isn't there some kind of equation I can put some (10000) sample points in and calculate a difference factor?
Like: for n from 1 to 10000, compute sum(|f(n)| - |f2(n)|)/10000. This is what I do right now, but there has to be some "better" way to do it.
I'm doing this with a computer program, so comparing slopes, intercepts, etc. by hand isn't an option here.
>>8530177
What program
A program I'm implementing to compare the scatter plots. I want to compare lots of them
>>8530197
I don't know an exact formula for this type of thing. Maybe you could create some type of index to measure the similarity between various descriptive statistics of each dataset, i.e. how similar the means, variances, standard deviations, etc. of each graph are. After all, if you have the scatter plots you should have access to the data.
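One possible way to build such an index, as a sketch only: collect a few descriptive statistics per plot into a vector and take the distance between the vectors. The choice of statistics and the names `summary_vector` and `stats_distance` are assumptions, not a standard recipe:

```python
import numpy as np

def summary_vector(x, y):
    """Means, standard deviations and Pearson correlation of one scatter plot."""
    return np.array([np.mean(x), np.mean(y), np.std(x), np.std(y),
                     np.corrcoef(x, y)[0, 1]])

def stats_distance(xa, ya, xb, yb):
    """Euclidean distance between summary vectors (0 means identical summaries)."""
    return float(np.linalg.norm(summary_vector(xa, ya) - summary_vector(xb, yb)))

# Toy data: a noisy line, compared with itself and with a copy shifted up by 5.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = x + rng.normal(scale=0.1, size=200)
same = stats_distance(x, y, x, y)
shifted = stats_distance(x, y, x, y + 5.0)
print(same, shifted)
```

The statistics live on different scales, so in practice you would probably want to rescale or weight them before taking the distance.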
sure ... I have all the data
I think I can improve this by using the squared distance, like
mean squared distance (1 -> n) = sum((|f(x)| - |g(x)|)^2) / n. That way a small distance scores much better than a slightly larger one.
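A minimal sketch of both measures, assuming f and g have already been sampled at the same n points. Note one deliberate change: it takes the difference f - g before the absolute value or square, rather than |f| - |g| as written above, because with |f| - |g| two curves that are mirror images of each other would come out as identical. The function names are made up for illustration:

```python
import numpy as np

def mean_abs_diff(f_vals, g_vals):
    """Mean of |f - g| over the shared sample points."""
    f_vals = np.asarray(f_vals, dtype=float)
    g_vals = np.asarray(g_vals, dtype=float)
    return float(np.mean(np.abs(f_vals - g_vals)))

def mean_sq_diff(f_vals, g_vals):
    """Mean of (f - g)^2; large deviations are penalized more heavily."""
    f_vals = np.asarray(f_vals, dtype=float)
    g_vals = np.asarray(g_vals, dtype=float)
    return float(np.mean((f_vals - g_vals) ** 2))

# Toy data: the curves agree everywhere except at the last point.
f_vals = np.array([1.0, 2.0, 3.0, 4.0])
g_vals = np.array([1.0, 2.0, 3.0, 8.0])
print(mean_abs_diff(f_vals, g_vals))  # 1.0
print(mean_sq_diff(f_vals, g_vals))   # 4.0
```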
>>8530024
You can use something called a correlation in Python; it'll return a correlation factor.
>>8530272
https://en.m.wikipedia.org/wiki/Cross-correlation
Numpy has a function as does Scipy
https://docs.scipy.org/doc/numpy/reference/generated/numpy.correlate.html
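A small sketch of how `numpy.correlate` could be used here, assuming both series are sampled at the same, evenly spaced x values and have equal length. The normalization and the helper name `normalized_xcorr` are my own additions, not part of NumPy:

```python
import numpy as np

def normalized_xcorr(a, b):
    """Cross-correlation of two equal-length series, scaled so that the
    zero-lag value equals the Pearson correlation coefficient."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = (a - a.mean()) / (a.std() * len(a))
    b = (b - b.mean()) / b.std()
    return np.correlate(a, b, mode="full")

# Toy data: a sine wave correlated with itself.
t = np.linspace(0.0, 2.0 * np.pi, 100)
a = np.sin(t)
xc = normalized_xcorr(a, a)
best_lag = int(np.argmax(xc)) - (len(a) - 1)  # index len(a) - 1 is zero lag
print(best_lag, xc.max())
```

A peak near 1 at lag 0 means the shapes match without shifting; a peak at another lag means one series looks like a shifted copy of the other.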
Cross-correlation looks very promising.
thx anon
only problem with cross correlation is that the functions need to be integrable ... which is quite impossible for my scatter plots.
I might go for some Taylor series to approximate my scatter plots but that might be very very hard.
Correlation is 0
>>8530336
it can be used, and it is in fact used for discrete functions as well. Z-transformation maybe?
>>8530336
There is discrete cross-correlation. Also, if applicable, I would fit a curve to the scatter plot and cross-correlate that. If not, discrete cross-correlation will work, assuming you have a reasonable number of data points.
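A rough sketch of the fit-then-correlate route, assuming a low-degree polynomial is an acceptable fit and that the two plots overlap in x. It evaluates both fitted curves on a shared grid and returns the zero-lag normalized cross-correlation, which is the Pearson coefficient. The polynomial degree, grid size and function names are all assumptions:

```python
import numpy as np

def fitted_curve(x, y, grid, degree=3):
    """Fit a polynomial to the scatter points and evaluate it on `grid`."""
    coeffs = np.polyfit(x, y, deg=degree)
    return np.polyval(coeffs, grid)

def curve_similarity(xa, ya, xb, yb, degree=3, n_grid=200):
    """Correlation of the two fitted curves over the overlapping x range."""
    lo = max(np.min(xa), np.min(xb))
    hi = min(np.max(xa), np.max(xb))
    grid = np.linspace(lo, hi, n_grid)
    curve_a = fitted_curve(xa, ya, grid, degree)
    curve_b = fitted_curve(xb, yb, grid, degree)
    return float(np.corrcoef(curve_a, curve_b)[0, 1])

# Toy data: two noisy samples of y = x^2 taken at different x positions.
rng = np.random.default_rng(0)
xa = rng.uniform(0.0, 5.0, 300)
ya = xa ** 2 + rng.normal(scale=0.5, size=300)
xb = rng.uniform(0.0, 5.0, 300)
yb = xb ** 2 + rng.normal(scale=0.5, size=300)
sim_same = curve_similarity(xa, ya, xb, yb)
sim_flipped = curve_similarity(xa, ya, xb, -yb)
print(sim_same, sim_flipped)
```

Because the curves are evaluated on a common grid, the two scatter plots do not need to share x positions, which sidesteps the integrability worry for raw point clouds.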