
Statistics in Number Theory


File: youshouldbeabletosolvethis.jpg (71KB, 640x480)
Hi,

I was hoping someone could explain an idea I heard of before but never saw fully formalized. Basically, it is a philosophy for why the error terms in prime counting functions (in number fields, function fields, and some other places in number theory) "should" have exponent 1/2, (almost) as predicted by the Riemann hypothesis and its variants.

Here's a vague idea of what I have, if it helps prompt someone to properly explain/give a reference. The keyword is the Central Limit Theorem, which states (I believe...) that if [math]X_i[/math] are independent, identically distributed random variables with mean 0 and variance [math]\sigma^2[/math], then
[eqn] \frac{X_1 + \cdots + X_n}{\sqrt{n}} \to N(0, \sigma^2) [/eqn]
where [math]N(0, \sigma^2)[/math] is the normal distribution with mean 0 and variance [math]\sigma^2[/math].
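(If it helps to see this concretely, here is a quick Python simulation of the statement with fair ±1 coin flips, so [math]\sigma = 1[/math]; just my own illustration, not from any reference.)

[code]
import random
import statistics

# Simulate S_n / sqrt(n) for i.i.d. fair +/-1 flips (mean 0, variance 1)
# and check that the normalized sums look standard normal.
n, trials = 2_000, 2_000
samples = [sum(random.choice((-1, 1)) for _ in range(n)) / n ** 0.5
           for _ in range(trials)]

print("mean  (expect ~0):", statistics.mean(samples))
print("stdev (expect ~1):", statistics.stdev(samples))
# For N(0,1), about 95% of the mass lies in [-1.96, 1.96]:
print("fraction in [-1.96, 1.96]:",
      sum(abs(s) <= 1.96 for s in samples) / trials)
[/code]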

Now the Riemann hypothesis is equivalent to the mean value of the Mobius function equaling 0, with error term [math]O(x^{1/2 + \varepsilon})[/math] for the partial sums, and somehow this is "expected" by CLT reasoning, thinking of the values [math]\mu(1), \dots, \mu(n)[/math] as n samples whose sum is the numerator in the CLT.

So this is pretty shaky, but I hope someone can help me understand the general picture, and how to reason in the case of prime counting (not the Mobius function) so it can be generalized.

Pic unrelated.
>>
>>7654155
Let me try CLT again:
[eqn] \frac{X_1 + \cdots + X_n}{\sqrt{n}} \to N(0,\sigma^2).[/eqn]
>>
>>7654155
Just for completeness, in case a statistician shows up that is unfamiliar with the number theory: the Mobius function [math]\mu(n)[/math] takes value either [math]-1, 0, 1[/math] depending on the prime factors of n so that it is 0 when n is divisible by a square, -1 when divisible by an odd number of distinct primes, and 1 otherwise. So if there's no bias in number to have even or odd prime number of divisors, the mean of [math]\mu(n) = \lim_{x \to \infty} \frac{1}{x} \sum_{n \le x} \mu(n) [/math] should be 0, and the error term is given explicitly based on where [math]\zeta(s) \neq 0[/math], hence the relation to the Riemann hypothesis. But I'm hoping for a statistical heuristic that could explain why, in some sense.
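(In case anyone wants to compute with it, here is a direct, unoptimized Python translation of that definition; just my own sketch for experimenting.)

[code]
def mobius(n: int) -> int:
    """Mobius function by trial division: 0 if a squared prime divides n,
    else (-1)**(number of distinct prime factors)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    k = 0          # distinct prime factors found so far
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:   # p^2 divided the original n
                return 0
            k += 1
        p += 1
    if n > 1:                # one leftover prime factor
        k += 1
    return (-1) ** k

# Spot checks: mu(1)=1, mu(2)=-1, mu(4)=0, mu(6)=1, mu(30)=-1
print([mobius(n) for n in (1, 2, 4, 6, 30)])
[/code]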
>>
>>7654167
One more post, sorry.

It is a theorem (equivalent to the famous Prime Number Theorem!) that
[eqn] M(\mu) = \lim_{x \to \infty} \frac{1}{x} \sum_{n \le x} \mu(n) = 0[/eqn]
and the Riemann hypothesis is that we have the estimate
[eqn] \sum_{n \le x} \mu(n) = O(x^{1/2+\varepsilon}).[/eqn]
That should make more sense.
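(A numerical sanity check of both displays, sieving [math]\mu[/math] up to a bound and printing the partial sums against [math]\sqrt{x}[/math]; this is just my own quick experiment, not from a reference.)

[code]
def mobius_sieve(limit):
    """mu(n) for 0 <= n <= limit: a factor -1 per distinct prime,
    zeroed out whenever a prime square divides n."""
    mu = [1] * (limit + 1)
    is_comp = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_comp[p]:
            continue                      # composite, skip
        for m in range(p, limit + 1, p):  # p is prime
            if m > p:
                is_comp[m] = True
            mu[m] *= -1
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0
    return mu

limit = 10 ** 6
mu = mobius_sieve(limit)
M, checkpoints = 0, {10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6}
for n in range(1, limit + 1):
    M += mu[n]
    if n in checkpoints:
        # The mean M/x heads to 0 while |M|/sqrt(x) stays bounded
        # (well below 1 in this range).
        print(f"x={n:>8}  M(x)={M:>6}  M(x)/x={M/n:+.5f}  "
              f"M(x)/sqrt(x)={M/n**0.5:+.3f}")
[/code]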
>>
>>7654156
Intuitively speaking, this tells you that under certain conditions, the average of a sample tends to the true mean if your sample is large enough.

In other words, if certain conditions are satisfied, the sample average can be relied upon to predict the true average in the limiting case. The theorem is framed in terms of convergence of distributions, which is a weaker form of convergence.
>>
>>7654156
Now that my computer is working, I'll respond in more detail.
>>
>>7654171
All right, I'll take a stab at this. I know nothing of number theory and I should be embarrassed for having only a vague knowledge of O notation, but here goes.

[math] \mu(n) [/math] may be viewed as a random variable defined over [math] \mathbb{N} [/math] that takes on only 3 values; it's discrete. By no bias, I assume you mean there's no reason to assume an odd number of prime divisors is more likely than an even number.

We can compute its expected value in theory. I hope I'm TeX'ing correctly on here.
>>
>>7654265
Yes, I mean no bias as in a "random" integer should have an even number of prime divisors with the same probability as an odd number.
>>
>>7654171

Now, I'm not sure what you mean by

[math] M(\mu) [/math].

For the time being, I will assume that M is used to denote the mean, and that in essence, it is saying that the expected value of [math] \mu [/math] tends to 0 in the limiting case.

Why does this make sense heuristically?
Computing [math] \mathbb{E}\mu [/math] looks like it would be tedious because in this discrete case:

[math] \mathbb{E}\mu = \sum_x x\cdot P(\mu= x) [/math]. Okay. Well, x = -1, 0, or 1, but what is the probability that [math]\mu[/math] takes on each of those values? You seemed to be comfortable asserting that it's unbiased, so I'll assume each value occurs with probability [math] \frac{1}{3} [/math]. From this it's easy to see the expected value is 0. Of course, I'm guessing at the properties of this RV and at what you're asking, and looking for ways to simplify. Excuse me if I'm mistaken.
>>
>>7654276

So, if my reasoning isn't complete bullshit so far, what this says is that we expect a "random" integer to be divisible by a square about a third of the time. Is this sensible? Also, in your notation, is this saying that the series grows much slower than [math] x^{\frac{1}{2}+\varepsilon}[/math]?
>>
>>7654276
Well, technically it's not uniform in those three values. The "probability" that a "random" integer is squarefree (i.e. [math]\mu(n) \neq 0[/math]) can be computed, I believe, to be [math]6/\pi^2 = 1/\zeta(2)[/math], though I forget the details. But then the remaining cases shouldn't have a bias - this is a priori philosophy which needs to be proven carefully, and the no-bias statement is equivalent to the Prime Number Theorem (a difficult theorem).
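(That [math]6/\pi^2[/math] value is easy to sanity-check numerically; a quick sketch of mine, counting integers with no square factor:)

[code]
import math

def squarefree_density(limit):
    """Fraction of 1..limit divisible by no square > 1."""
    squarefree = [True] * (limit + 1)
    for d in range(2, int(limit ** 0.5) + 1):
        for m in range(d * d, limit + 1, d * d):
            squarefree[m] = False
    return sum(squarefree[1:]) / limit

print("empirical:", squarefree_density(10 ** 6))
print("6/pi^2   :", 6 / math.pi ** 2)   # ~0.607927
[/code]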

My question specifically is in the error term. When you estimate the sum [math]\sum_{n \le x} \mu(n)[/math] it looks like it should be approximately zero, but the error (for the reason I'm trying to find out) should have order [math]O(x^{1/2 + \varepsilon})[/math]; heuristically we can just pretend the exponent is exactly 1/2, if that makes it clearer.
>>
>>7654294
I'm slowly starting to see how probability could be useful here. In my argument above, I naively assumed the distribution was uniform to make a point. I did that even though it was clear to me that -1, 0, and 1 are not quite equally likely, because how to go about computing their actual probabilities isn't so obvious to me. I'll say more in my next post.
>>
>>7654294
So we will view [math] \mu [/math] as a discrete random variable that assumes values -1,0,1 with unknown probabilities.

The central limit theorem tells me that:

[math] \frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}}[/math] has a standard normal limiting distribution. (Here I'm using mu to denote the mean of an RV, not the Mobius function.) So assuming that we can somehow demonstrate that the expected value of [math] \mu(n) [/math] is finite, we can get an idea of how it is distributed. My guess here is that the [math] O(x^{\frac{1}{2}}) [/math] has to do with the rate of convergence of our given distribution to the standard normal. My buddy is a number theorist. Too bad you didn't ask this during the day; I could have gotten you a much better answer from him.
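To spell out how the standardized form connects to the sums in the thread (a heuristic rearrangement, not a proof): multiplying through by [math]\sigma\sqrt{n}[/math] and writing [math]\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i[/math],
[eqn] \frac{\bar{X} - m}{\sigma/\sqrt{n}} \to N(0,1) \quad \Longleftrightarrow \quad \sum_{i=1}^{n} X_i = nm + O_P(\sigma\sqrt{n}), [/eqn]
where m denotes the common mean. In the coin-flip model of [math]\mu[/math] (values [math]\pm 1[/math], mean m = 0, [math]\sigma \le 1[/math]) this reads [math]\sum_{n \le x} \mu(n) = O_P(\sqrt{x})[/math]: exponent 1/2, with the [math]\varepsilon[/math] left to absorb the fact that [math]O_P[/math] is only a probabilistic bound, not an absolute one.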
>>
>>7654321
Hmm. Is there anything you could point me to, to read? Maybe I can interpret the statistics a little better than you can. There isn't much sense in me trying to reinvent the wheel here. Also, my bad: we don't have any knowledge of [math] \sigma [/math].
>>
>>7654327
Yeah, I apologize. I'm working with too little knowledge of number theory to try and connect the dots. I'd be interested to know a little more though.
>>
>>7654327
>>7654334
You could try reading a bit more about the Mobius function, but from how I originally heard about this stats interpretation, it seemed to have little to do with number theory and more to do with interpreting things using the CLT.

Here's what I can give if you want more specifics. Every number n has a canonical factorization [math] n = p_1^{e_1} \cdots p_k^{e_k}[/math] with [math]p_i[/math] distinct. Define [math]\mu(n) = 0[/math] if [math]e_i > 1[/math] for any i (i.e. n is divisible by a square), and [math]\mu(n) = (-1)^k[/math] otherwise (so -1 for odd number of prime factors, 1 for even number, and [math]\mu(1) = 1[/math]).

It turns out there's a bit of structure here, namely [math]\mu(ab) = \mu(a)\mu(b)[/math] when gcd(a,b) = 1. But that's beside the point. Philosophically, analytic number theorists like to think of evaluating [math]\mu(n)[/math] as sampling a random variable [math]X_n[/math] which takes the value -1 or 1 (forget about the numbers divisible by squares for now, since they'll contribute 0 to the mean anyway).

Now the sum [math]\sum_{n \le x} \mu(n)[/math] looks a lot like the sum of random variables [math]\sum X_i[/math] in statistics, which the CLT says converges, after dividing by [math]\sqrt{n}[/math], to a normal distribution of some kind; that is, the sum itself is typically of size proportional to [math]\sqrt{n}[/math]. Here's where I'm stuck. The 1/2 in the exponent of [math]\sqrt{n}[/math] is supposed to be heuristically related to the conjecture (Riemann hypothesis) that [math]\sum X_i = \sum_{n \le x} \mu(n) = O(x^{1/2+\varepsilon})[/math], where big-O just means it essentially can't grow "faster" than that function.

More specifically, we're saying that
[eqn]\Big| \sum_{n \le x} \mu(n) \Big| \le Cx^{1/2+\varepsilon}[/eqn]
for some constant C and all [math]x \ge x_0[/math] ("eventually", in case it has a rough start, because small numbers are kinda special). This is looking closer to a CLT-type statement perhaps, but I'm having trouble closing this last gap.
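(One concrete way to picture the heuristic, and a sketch of my own to try it: compare the actual partial sums of [math]\mu[/math] to genuine ±1 random walks of the same length, both normalized by [math]\sqrt{N}[/math].)

[code]
import random

def mobius_sieve(limit):
    """mu(n) for 0 <= n <= limit: -1 per distinct prime, 0 on square factors."""
    mu = [1] * (limit + 1)
    is_comp = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_comp[p]:
            continue
        for m in range(p, limit + 1, p):
            if m > p:
                is_comp[m] = True
            mu[m] *= -1
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0
    return mu

N = 100_000
mu = mobius_sieve(N)
mertens = abs(sum(mu[1:])) / N ** 0.5

# Endpoints of 50 honest +/-1 random walks, same normalization:
walks = [abs(sum(random.choice((-1, 1)) for _ in range(N))) / N ** 0.5
         for _ in range(50)]

print("|M(N)| / sqrt(N)        :", mertens)
print("typical |walk| / sqrt(N):", sum(walks) / len(walks))
# Both should be O(1); for a random walk the mean is sqrt(2/pi) ~ 0.80.
[/code]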
>>
>>7654350

Let's see if I can get LaTeX to work. I hate typing in here.

[math] \frac{\bar{X}-\mathrm{mean}}{\frac{\sigma}{\sqrt{n}}} \longrightarrow N(0,1) [/math]
>>
>>7654350
Check this out:

In your case, we don't quite know the variance of [math] \mu [/math], nor its expected value. But from the central limit theorem written in this manner, my guess is that your
[math] Cx^{\frac{1}{2}+\varepsilon} [/math] is related to the [math] \frac{\sigma}{\sqrt{n}} [/math]

stat.yale.edu/Courses/1997-98/101/sampmn.htm
>>
>>7655581
If you read this page, you'll begin getting the intuition.

You should see the statement of the theorem toward the bottom, and notice that, intuitively, we would expect

[math] \frac{\sqrt{n}}{\sigma}(\bar{X}-\mu) [/math] to have its growth dominated by the [math] \frac{\sqrt{n}}{\sigma} [/math] factor, as intuitively the difference
[math] \bar{X}-\mu [/math] should be very small for large n, much smaller than [math] \frac{\sqrt{n}}{\sigma}[/math].

I'm not sure if I'm correct, but it seems reasonable to me. Note, we may not know the variance here, but we know it's finite, since [math] Var(X) \leq \mathbb{E}[X^{2}] \leq 1 [/math] here ([math]\mu[/math] only takes values in [math]\{-1,0,1\}[/math]); hence, we can ignore it and just call it C.
>>
>>7654155
Is the OP image supposed to be hard?
>>
Stats major here.

The most crucial (and sadly, often the most overlooked) part of the CLT is the requirement that the random variables be independent.
You can relax these conditions somewhat, for example by requiring only weak dependence (roughly, that we have independence "in the limit"), or by dropping the identical-distribution requirement and instead imposing a condition that kills the contribution of any single summand in the limit (the Lindeberg condition, as in the Lindeberg-Feller CLT).

Unfortunately, from my limited exposure to number theory I don't think we know enough about how very large numbers behave. In particular, the kind of statement OP wishes to prove in >>7654171 is precisely the sort of thing you need in order for the CLT to become meaningful.
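(To make the failure of independence concrete: [math]\mu[/math] is multiplicative, so [math]\mu(2n) = -\mu(n)[/math] for every odd n, which is perfect negative dependence along that subsequence. A quick empirical confirmation, sketch only:)

[code]
def mobius(n):
    """Trial-division Mobius: 0 on square factors, else (-1)^(#primes)."""
    k, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            k += 1
        p += 1
    return (-1) ** (k + (n > 1))

# For odd squarefree n, mu(2n) should equal -mu(n) every single time:
pairs = [(mobius(n), mobius(2 * n)) for n in range(1, 20001, 2)]
hits = sum(a == -b for a, b in pairs if a != 0)
total = sum(1 for a, _ in pairs if a != 0)
print(f"mu(2n) == -mu(n) for {hits}/{total} odd squarefree n <= 20000")
[/code]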
>>
>>7657318
Well, I'm not trying to prove that (that's the Riemann hypothesis!), of course; I'm just trying to understand a heuristic for why it "should" be true. Isn't there a way to see it that way?
>>
> The most crucial (and sadly, often the most overlooked) part of the CLT is the requirement that the random variables be independent.

I agree with this. An even more often overlooked fact is the finite population CLT, which is the one people should be using most of the time (since in many applications, the samples come from a finite population).
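To illustrate with a quick sketch (my own toy example, nothing canned): when you draw n of N items without replacement, the sd of the sample mean picks up the finite population correction [math]\sqrt{(N-n)/(N-1)}[/math] relative to the usual [math]\sigma/\sqrt{n}[/math].

[code]
import random
import statistics

# Sampling without replacement from a finite population: the sd of the
# sample mean shrinks by the correction sqrt((N - n) / (N - 1)).
N, n, trials = 10_000, 400, 5_000
population = [random.gauss(0, 1) for _ in range(N)]
sigma = statistics.pstdev(population)        # population sd (divide by N)

means = [statistics.fmean(random.sample(population, n))
         for _ in range(trials)]
fpc = ((N - n) / (N - 1)) ** 0.5

print("observed sd of sample mean:", statistics.stdev(means))
print("sigma/sqrt(n) * fpc       :", sigma / n ** 0.5 * fpc)
[/code]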