Why probability density function can be seen as density of a measure with respect to another measure – an example of N(0,1)

This was originally written on Nov 3, 2013, for the probability theory course I was serving as TA.

Converted from .tex using latex2wp.

Usually, we say a random variable {X} follows a Normal(0,1) distribution if its cumulative distribution function can be expressed as:

\displaystyle P\{X\leq t\}=\int_{-\infty}^{t}\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}dx.

Now we formalize this in a more measure-theoretic way, in line with what we learned in the course. In particular, why is the factor {\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}} called the density of {X}? That is, how is the term “probability density function” that we use a lot in statistics related to the concept of “density” (the density of a measure with respect to another measure) that we learned in class?
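Before the formal treatment, here is a small numerical sketch (not part of the original note) of the displayed formula: the left-hand side {P\{X\leq t\}} has a closed form via the error function, {\Phi(t)=\frac{1}{2}(1+\mathrm{erf}(t/\sqrt{2}))}, and a plain midpoint Riemann sum of the density over {(-\infty,t]} (truncated at {-8}, where the tail is negligible) recovers the same values.

```python
import math

# The standard normal density from the post: (1/sqrt(2*pi)) * exp(-x^2/2)
def phi(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Closed-form CDF via the error function: P(X <= t) = (1 + erf(t/sqrt(2))) / 2
def Phi(t):
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

# Midpoint Riemann sum of the density over [-8, t], standing in for the
# integral over (-infty, t]; the mass below -8 is on the order of 1e-16.
def cdf_by_integration(t, n=200_000):
    a = -8.0
    h = (t - a) / n
    return sum(phi(a + (i + 0.5) * h) for i in range(n)) * h

for t in (-1.0, 0.0, 1.96):
    print(t, Phi(t), cdf_by_integration(t))
```

The two columns agree to several decimal places, which is all the formula {P\{X\leq t\}=\int_{-\infty}^{t}\frac{1}{\sqrt{2\pi}}e^{-x^{2}/2}dx} asserts numerically.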

First of all, we need to adopt a definition of a Normal(0,1) random variable. Say {X} is a random variable (i.e. a measurable function) from {(\Omega,\mathscr{F})} to {(\mathbb{R},\mathscr{R})}. Let {P} be a probability measure on {(\Omega,\mathscr{F})} and {\mu} the Lebesgue measure on {(\mathbb{R},\mathscr{R})}. We say {X} is a Normal(0,1) random variable if (this is the definition we adopt, i.e. the starting point for the arguments below)

\displaystyle P\{\omega:X(\omega)\leq t\}=\int_{(-\infty,t]}\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}d\mu(x). \ \ \ \ \ (1)

Now, how do we convert this into a statement that {\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}} is the density of some measure with respect to some other measure? Note that for a function {D} to be the density of a measure {\rho} with respect to another measure {\mu}, the two measures {\rho} and {\mu} must be defined on the same measurable space. Here {P} lives on {(\Omega,\mathscr{F})} while {\mu} lives on {(\mathbb{R},\mathscr{R})}, so at this point we cannot say {\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}} is the density of {P} with respect to {\mu}.

But now the distribution comes to the rescue. Recall that earlier in the course we learned the concept of the “distribution” of a random variable, which is a measure {L_{X}} on the target space (here {(\mathbb{R},\mathscr{R})}) defined as follows: for any {A\in\mathscr{R}},

\displaystyle L_{X}(A):=PX^{-1}(A)=P\{\omega:X(\omega)\in A\}. \ \ \ \ \ (2)
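Definition (2) is a pushforward: {L_{X}} measures a Borel set {A} by asking how much {P}-mass is carried into it by {X}. A quick Monte Carlo sketch (again, an illustration I am adding, not part of the original note): drawing many samples of {X} and counting how often they land in {A} approximates {L_{X}(A)}; for {A=(-\infty,0]} the answer should be close to {1/2} by symmetry.

```python
import random

random.seed(0)

# Empirical version of the pushforward measure L_X(A) = P{omega : X(omega) in A}:
# draw many realizations of X and record the fraction that fall in A.
def empirical_L_X(indicator_A, n=100_000):
    return sum(indicator_A(random.gauss(0.0, 1.0)) for _ in range(n)) / n

# A = (-infty, 0]: by the symmetry of N(0,1), L_X(A) should be near 0.5
p = empirical_L_X(lambda x: x <= 0.0)
print(p)
```

The point is only that {L_{X}} is a bona fide measure on {(\mathbb{R},\mathscr{R})}, living on the same space as {\mu}, which is exactly what the density statement requires.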

So by (1) we have

\displaystyle L_{X}((-\infty,t])=\int_{(-\infty,t]}\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}d\mu(x), \ \ \ \ \ (3)

or (by a careful treatment using the fact that {\mathscr{R}} is the sigma-field generated by the half-infinite intervals {(-\infty,t]}, together with the uniqueness properties of measures agreeing on such a generating class)

\displaystyle L_{X}(A)=\int_{A}\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}d\mu(x), \ \ \ \ \ (4)

for any {A\in\mathscr{R}}.

That is to say, {\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}} is the density of {L_{X}} (distribution of {X}, which is a probability measure) with respect to {\mu} (the Lebesgue measure on the real line).
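To close the loop, here is a numerical check of (4) for a generic interval {A=[a,b]} (my own sketch, with {a=-0.5}, {b=1.5} chosen arbitrarily): the left side {L_{X}(A)} is estimated by Monte Carlo over draws of {X}, and the right side is a midpoint Riemann sum of the density over {[a,b]}.

```python
import math
import random

random.seed(1)

density = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

a, b, n = -0.5, 1.5, 200_000

# Left side of (4): L_X([a, b]) estimated as the fraction of draws of X in [a, b]
lhs = sum(a <= x <= b for x in (random.gauss(0.0, 1.0) for _ in range(n))) / n

# Right side of (4): integral of the density over [a, b] by the midpoint rule
m = 10_000
h = (b - a) / m
rhs = sum(density(a + (i + 0.5) * h) for i in range(m)) * h

print(lhs, rhs)  # the two sides agree up to Monte Carlo error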

Leave a comment