Variance of a bounded random variable

22

Suppose a random variable has a lower and an upper bound, [0,1]. How does one compute the variance of such a variable?

Piotr
8
The same way as for an unbounded variable - set the integration or summation limits accordingly.
Scortchi - Reinstate Monica
2
As @Scortchi said. But I'm curious why you thought it might be different?
Peter Flom - Reinstate Monica
3
Unless you know nothing about the variable (in which case an upper bound on the variance could be calculated from the existence of bounds), why would the fact that it is bounded enter into the calculation?
Glen_b
6
A useful upper bound on the variance of a random variable that takes on values in $[a,b]$ with probability 1 is $(b-a)^2/4$, and it is attained by a discrete random variable that takes on values $a$ and $b$ with equal probability $\frac{1}{2}$. Another point to keep in mind is that the variance is guaranteed to exist, whereas an unbounded random variable might not have a variance (some, such as Cauchy random variables, do not even have a mean).
Dilip Sarwate
7
There is a discrete random variable whose variance equals $\frac{(b-a)^2}{4}$ exactly: a random variable that takes on values $a$ and $b$ with equal probability $\frac{1}{2}$. So, at the least, we know that a universal upper bound on the variance cannot be smaller than $\frac{(b-a)^2}{4}$.
Dilip Sarwate

Answers:

46

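You can prove Popoviciu's inequality as follows. Use the notation $m=\inf X$ and $M=\sup X$. Define a function $g$ by

$$g(t)=\mathbb{E}\left[(X-t)^2\right].$$
Computing the derivative $g'$ and solving
$$g'(t)=-2\,\mathbb{E}[X]+2t=0,$$
we find that $g$ achieves its minimum at $t=\mathbb{E}[X]$ (note that $g''>0$).

Now consider the value of the function $g$ at the special point $t=\frac{M+m}{2}$. It must be the case that

$$\mathrm{Var}[X]=g(\mathbb{E}[X])\le g\left(\frac{M+m}{2}\right).$$
But
$$g\left(\frac{M+m}{2}\right)=\mathbb{E}\left[\left(X-\frac{M+m}{2}\right)^2\right]=\frac{1}{4}\,\mathbb{E}\left[\big((X-m)+(X-M)\big)^2\right].$$
Since $X-m\ge 0$ and $X-M\le 0$, we have
$$\big((X-m)+(X-M)\big)^2\le\big((X-m)-(X-M)\big)^2=(M-m)^2,$$
implying that
$$\frac{1}{4}\,\mathbb{E}\left[\big((X-m)+(X-M)\big)^2\right]\le\frac{1}{4}\,\mathbb{E}\left[\big((X-m)-(X-M)\big)^2\right]=\frac{(M-m)^2}{4}.$$
Therefore, we have proved Popoviciu's inequality
$$\mathrm{Var}[X]\le\frac{(M-m)^2}{4}.$$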
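
As a numerical sanity check (not part of the proof), here is a minimal Python sketch that draws random discrete distributions on an interval and compares their variances with $(M-m)^2/4$. The helper names `variance`, `random_distribution` and `popoviciu_bound` are made up for this illustration, and the interval endpoints stand in for $m$ and $M$, which can only make the bound looser.

```python
import random

def variance(values, probs):
    """Variance of a discrete distribution given support points and probabilities."""
    mean = sum(p * x for x, p in zip(values, probs))
    return sum(p * (x - mean) ** 2 for x, p in zip(values, probs))

def random_distribution(m, M, n, rng):
    """Hypothetical helper: n support points in [m, M] with random probabilities."""
    values = [rng.uniform(m, M) for _ in range(n)]
    weights = [rng.random() + 1e-12 for _ in range(n)]  # avoid an all-zero weight vector
    total = sum(weights)
    return values, [w / total for w in weights]

def popoviciu_bound(m, M):
    """The upper bound (M - m)^2 / 4 from the inequality."""
    return (M - m) ** 2 / 4

rng = random.Random(0)  # fixed seed so the run is reproducible
m, M = -2.0, 3.0

# Largest variance seen over many random distributions supported in [m, M].
worst = max(variance(*random_distribution(m, M, 5, rng)) for _ in range(10_000))

# The two-point distribution on {m, M} with equal weights attains the bound.
two_point = variance([m, M], [0.5, 0.5])

print(worst <= popoviciu_bound(m, M), two_point, popoviciu_bound(m, M))
```

Random search like this never proves the inequality; it only shows that no counterexample turned up and that the equal-weight two-point distribution sits exactly on the bound.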

Zen
3
Nice approach: it's good to see rigorous demonstrations of these kinds of things.
whuber
22
+1 Nice! I learned statistics long before computers were in vogue, and one idea that was drilled into us was that
$$\mathbb{E}\left[(X-t)^2\right]=\mathbb{E}\left[\big((X-\mu)-(t-\mu)\big)^2\right]=\mathbb{E}\left[(X-\mu)^2\right]+(t-\mu)^2$$
which allowed the variance to be computed by finding the sum of the squares of the deviations from any convenient point $t$ and then adjusting for the bias. Here, of course, this identity gives a simple proof of the result that $g(t)$ has its minimum value at $t=\mu$, without the need for derivatives etc.
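
The identity in this comment can be checked exactly on a small example. The sketch below uses Python's `fractions` module so that no rounding is involved; the three-point distribution and the test points `t` are arbitrary choices made for illustration.

```python
from fractions import Fraction

# An arbitrary three-point distribution on [0, 1] (illustrative values only).
values = [Fraction(0), Fraction(1, 3), Fraction(1)]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

def expect(f):
    """Expectation of f(X) under the discrete distribution above."""
    return sum(p * f(x) for x, p in zip(values, probs))

mu = expect(lambda x: x)
var = expect(lambda x: (x - mu) ** 2)

def g(t):
    """g(t) = E[(X - t)^2], the mean squared deviation from the point t."""
    return expect(lambda x: (x - t) ** 2)

# E[(X - t)^2] = Var[X] + (t - mu)^2 for every t, so g is smallest at t = mu.
for t in [Fraction(-1), Fraction(0), Fraction(1, 2), Fraction(2)]:
    assert g(t) == var + (t - mu) ** 2
    assert g(t) >= g(mu)
```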
Dilip Sarwate
18

Let F be a distribution on [0,1]. We will show that if the variance of F is maximal, then F can have no support in the interior, from which it follows that F is Bernoulli and the rest is trivial.

As a matter of notation, let $\mu_k=\int_0^1 x^k\,dF(x)$ be the $k$th raw moment of $F$ (and, as usual, we write $\mu=\mu_1$ and $\sigma^2=\mu_2-\mu^2$ for the variance).

We know $F$ does not have all its support at one point (the variance is minimal in that case). Among other things, this implies $\mu$ lies strictly between $0$ and $1$. In order to argue by contradiction, suppose there is some measurable subset $I$ in the interior $(0,1)$ for which $F(I)>0$. Without any loss of generality we may assume (by changing $X$ to $1-X$ if need be) that $F(J=I\cap(0,\mu])>0$: in other words, $J$ is obtained by cutting off any part of $I$ above the mean and $J$ has positive probability.

Let us alter $F$ to $F'$ by taking all the probability out of $J$ and placing it at $0$. In so doing, $\mu_k$ changes to

$$\mu'_k=\mu_k-\int_J x^k\,dF(x).$$

As a matter of notation, let us write $[g(x)]=\int_J g(x)\,dF(x)$ for such integrals, whence

$$\mu'_2=\mu_2-[x^2],\quad \mu'=\mu-[x].$$

Calculate

$$\sigma'^2=\mu'_2-\mu'^2=\mu_2-[x^2]-(\mu-[x])^2=\sigma^2+\big((\mu[x]-[x^2])+(\mu[x]-[x]^2)\big).$$

The second term on the right, $(\mu[x]-[x]^2)$, is non-negative because $\mu\ge x$ everywhere on $J$. The first term on the right can be rewritten

$$\mu[x]-[x^2]=\mu(1-[1])+([\mu][x]-[x^2]).$$

The first term on the right is strictly positive because (a) $\mu>0$ and (b) $[1]=F(J)<1$ because we assumed $F$ is not concentrated at a point. The second term is non-negative because it can be rewritten as $[(\mu-x)(x)]$ and this integrand is nonnegative from the assumptions $\mu\ge x$ on $J$ and $0\le x\le 1$. It follows that $\sigma'^2-\sigma^2>0$.

We have just shown that under our assumptions, changing $F$ to $F'$ strictly increases its variance. The only way this cannot happen, then, is when all the probability of $F$ is concentrated at the endpoints $0$ and $1$, with (say) values $1-p$ and $p$, respectively. Its variance is easily calculated to equal $p(1-p)$, which is maximal when $p=1/2$ and equals $1/4$ there.

Now when $F$ is a distribution on $[a,b]$, we recenter and rescale it to a distribution on $[0,1]$. The recentering does not change the variance, whereas the rescaling divides it by $(b-a)^2$. Thus an $F$ with maximal variance on $[a,b]$ corresponds to the distribution with maximal variance on $[0,1]$: it therefore is a Bernoulli$(1/2)$ distribution rescaled and translated to $[a,b]$, having variance $(b-a)^2/4$, QED.
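
To make the mass-shifting idea concrete, here is a small Python sketch on a discrete distribution over $[0,1]$. It repeatedly moves the interior mass at or below the mean to $0$ (or, in the mirrored case, the remaining interior mass to $1$) and records the variance after each move. The function names and the starting distribution are illustrative assumptions, and the sketch presumes the distribution is not concentrated at a single point.

```python
from fractions import Fraction as Fr

def mean_var(dist):
    """Mean and variance of a discrete distribution given as {point: probability}."""
    mu = sum(p * x for x, p in dist.items())
    var = sum(p * (x - mu) ** 2 for x, p in dist.items())
    return mu, var

def shift_step(dist):
    """One mass-shifting move: interior points at or below the mean go to 0;
    if there are none, the remaining interior points (all above the mean) go to 1."""
    mu, _ = mean_var(dist)
    interior = [x for x in dist if 0 < x < 1]
    low = [x for x in interior if x <= mu]
    moved, target = (low, Fr(0)) if low else (interior, Fr(1))
    new = {x: p for x, p in dist.items() if x not in moved}
    new[target] = new.get(target, Fr(0)) + sum(dist[x] for x in moved)
    return new

# Illustrative starting distribution on [0, 1] with some interior mass.
dist = {Fr(0): Fr(1, 10), Fr(1, 4): Fr(3, 10), Fr(1, 2): Fr(2, 10), Fr(9, 10): Fr(4, 10)}

variances = [mean_var(dist)[1]]
while any(0 < x < 1 for x in dist):
    dist = shift_step(dist)
    variances.append(mean_var(dist)[1])

print([float(v) for v in variances])
```

In this example each move strictly increases the variance, and the process ends at a two-point distribution on $\{0,1\}$ whose variance $p(1-p)$ cannot exceed $1/4$.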

whuber
Interesting, whuber. I didn't know this proof.
Zen
6
@Zen It's by no means as elegant as yours. I offered it because I have found myself over the years thinking in this way when confronted with much more complicated distributional inequalities: I ask how the probability can be shifted around in order to make the inequality more extreme. As an intuitive heuristic it's useful. By using approaches like the one laid out here, I suspect a general theory for proving a large class of such inequalities could be derived, with a kind of hybrid flavor of the Calculus of Variations and (finite dimensional) Lagrange multiplier techniques.
whuber
Perfect: your answer is important because it describes a more general technique that can be used to handle many other cases.
Zen
@whuber said - "I ask how the probability can be shifted around in order to make the inequality more extreme." -- this seems to be the natural way to think about such problems.
Glen_b -Reinstate Monica
There appear to be a few mistakes in the derivation. It should be
$$\mu[x]-[x^2]=\mu(1-[1])[x]+([\mu][x]-[x^2]).$$
Also, $[(\mu-x)(x)]$ does not equal $[\mu][x]-[x^2]$, since $[\mu][x]$ is not the same as $\mu[x]$.
Leo
13

If the random variable is restricted to $[a,b]$ and we know the mean $\mu=E[X]$, the variance is bounded by $(b-\mu)(\mu-a)$.

Let us first consider the case $a=0,b=1$. Note that for all $x\in[0,1]$, $x^2\le x$, wherefore also $E[X^2]\le E[X]$. Using this result,

$$\sigma^2=E[X^2]-(E[X])^2=E[X^2]-\mu^2\le\mu-\mu^2=\mu(1-\mu).$$

To generalize to intervals $[a,b]$ with $b>a$, consider $Y$ restricted to $[a,b]$. Define $X=\frac{Y-a}{b-a}$, which is restricted to $[0,1]$. Equivalently, $Y=(b-a)X+a$, and thus

$$\mathrm{Var}[Y]=(b-a)^2\,\mathrm{Var}[X]\le(b-a)^2\mu_X(1-\mu_X),$$
where the inequality is based on the first result. Now, by substituting $\mu_X=\frac{\mu_Y-a}{b-a}$, the bound equals
$$(b-a)^2\,\frac{\mu_Y-a}{b-a}\left(1-\frac{\mu_Y-a}{b-a}\right)=(b-a)^2\,\frac{\mu_Y-a}{b-a}\cdot\frac{b-\mu_Y}{b-a}=(\mu_Y-a)(b-\mu_Y),$$
which is the desired result.
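
A quick numerical check of this mean-dependent bound, as a sketch only: the code below samples random discrete distributions on $[a,b]$ and tests $\mathrm{Var}[Y]\le(b-\mu_Y)(\mu_Y-a)\le(b-a)^2/4$. The function names, the interval, and the small floating-point tolerances are assumptions made for the example.

```python
import random

def mean_and_variance(values, probs):
    """Mean and variance of a discrete distribution."""
    mu = sum(p * x for x, p in zip(values, probs))
    var = sum(p * (x - mu) ** 2 for x, p in zip(values, probs))
    return mu, var

def check_bounds(a, b, trials=5000, seed=1):
    """Return True if Var <= (b - mu)(mu - a) <= (b - a)^2 / 4 held in every trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        values = [rng.uniform(a, b) for _ in range(4)]
        weights = [rng.random() + 1e-12 for _ in range(4)]
        total = sum(weights)
        probs = [w / total for w in weights]
        mu, var = mean_and_variance(values, probs)
        mean_bound = (b - mu) * (mu - a)
        if var > mean_bound + 1e-9:               # mean-dependent bound violated
            return False
        if mean_bound > (b - a) ** 2 / 4 + 1e-9:  # looser than the (b - a)^2 / 4 bound
            return False
    return True

print(check_bounds(2.0, 7.0))

# A two-point distribution on {0, 1} meets the mean-dependent bound with equality.
mu01, var01 = mean_and_variance([0.0, 1.0], [0.75, 0.25])
print(var01, mu01 * (1 - mu01))
```

The second check reflects the step $E[X^2]\le E[X]$ in the argument: it is an equality exactly when all mass sits at the endpoints.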
Juho Kokkala
8

At @user603's request....

A useful upper bound on the variance $\sigma^2$ of a random variable that takes on values in $[a,b]$ with probability $1$ is $\sigma^2\le\frac{(b-a)^2}{4}$. A proof for the special case $a=0,b=1$ (which is what the OP asked about) can be found here on math.SE, and it is easily adapted to the more general case. As noted in my comment above and also in the answer referenced herein, a discrete random variable that takes on values $a$ and $b$ with equal probability $\frac{1}{2}$ has variance $\frac{(b-a)^2}{4}$ and thus no tighter general bound can be found.

Another point to keep in mind is that a bounded random variable has finite variance, whereas for an unbounded random variable, the variance might not be finite, and in some cases might not even be definable. For example, the mean cannot be defined for Cauchy random variables, and so one cannot define the variance (as the expectation of the squared deviation from the mean).

Dilip Sarwate
this is a special case of @Juho's answer
Aksakal
It was just a comment, but I could also add that this answer does not answer the question asked.
Aksakal
@Aksakal So??? Juho was answering a slightly different and much more recently asked question. This new question has been merged with the one you see above, which I answered ten months ago.
Dilip Sarwate
0

Are you sure that this is true in general - for continuous as well as discrete distributions? Can you provide a link to the other pages? For a general distribution on $[a,b]$ it is trivial to show that

$$\mathrm{Var}(X)=E\left[(X-E[X])^2\right]\le E\left[(b-a)^2\right]=(b-a)^2.$$
I can imagine that sharper inequalities exist ... Do you need the factor $1/4$ for your result?

On the other hand, one can find it with the factor $1/4$ under the name Popoviciu's inequality on Wikipedia.

This article looks better than the wikipedia article ...

For a uniform distribution it holds that

$$\mathrm{Var}(X)=\frac{(b-a)^2}{12}.$$
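
For comparison, the uniform value can be checked numerically with a simple midpoint rule. This is only a rough sketch: the function name `uniform_variance_midpoint`, the interval, and the grid size are illustrative choices.

```python
def uniform_variance_midpoint(a, b, n=100_000):
    """Approximate the variance of Uniform(a, b) from n equally spaced midpoints."""
    h = (b - a) / n
    mids = [a + (i + 0.5) * h for i in range(n)]
    mean = sum(mids) / n
    return sum((x - mean) ** 2 for x in mids) / n

a, b = 1.0, 4.0
approx = uniform_variance_midpoint(a, b)
exact = (b - a) ** 2 / 12   # closed form for the uniform distribution
bound = (b - a) ** 2 / 4    # bound with the factor 1/4

print(approx, exact, bound)
```

The uniform distribution therefore uses exactly one third of the $(b-a)^2/4$ bound.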
Ric
This page states the result with the start of a proof that gets a bit too involved for me as it seems to require an understanding of the "Fundamental Theorem of Linear Programming". sci.tech-archive.net/Archive/sci.math/2008-06/msg01239.html
Adam Russell
Thank you for putting a name to this! "Popoviciu's Inequality" is just what I needed.
Adam Russell
2
This answer makes some incorrect suggestions: 1/4 is indeed right. The reference to Popoviciu's inequality will work, but strictly speaking it applies only to distributions with finite support (in particular, that includes no continuous distributions). A limiting argument would do the trick, but something extra is needed here.
whuber
2
A continuous distribution can approach a discrete one (in cdf terms) arbitrarily closely (e.g. construct a continuous density from a given discrete one by placing a little Beta(4,4)-shaped kernel centered at each mass point - of the appropriate area - and let the standard deviation of each such kernel shrink toward zero while keeping its area constant). Such discrete bounds as discussed here will thereby also act as bounds on continuous distributions. I expect you're thinking about continuous unimodal distributions... which indeed have different upper bounds.
Glen_b -Reinstate Monica
2
Well ... my answer was the least helpful, but I will leave it here because of the nice comments. Cheers, R
Ric