# Why probability density function can be seen as density of a measure with respect to another measure – an example of N(0,1)

This was originally written on Nov 3, 2013, for the probability theory course I was serving as TA.

Converted from .tex using latex2wp.

Usually, we say a random variable ${X}$ follows a Normal(0,1) distribution if its cumulative distribution function can be expressed as:

$\displaystyle P\{X\leq t\}=\int_{-\infty}^{t}\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}dx.$
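As a quick numerical sanity check of this formula (a sketch using only the Python standard library; the function names are my own), we can verify that integrating the density up to ${t}$ recovers the standard normal CDF, using the identity ${\Phi(t)=\frac{1}{2}(1+\mathrm{erf}(t/\sqrt{2}))}$:

```python
import math

def phi(x):
    """Standard normal density: (1/sqrt(2*pi)) * exp(-x^2/2)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def cdf_by_integration(t, lo=-10.0, n=200_000):
    """Approximate P{X <= t} by a trapezoidal sum of the density on [lo, t].

    Truncating the lower limit at -10 is harmless: the tail mass below
    -10 is on the order of 1e-23.
    """
    h = (t - lo) / n
    total = 0.5 * (phi(lo) + phi(t))
    for i in range(1, n):
        total += phi(lo + i * h)
    return total * h

def cdf_closed_form(t):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

# the integral of the density agrees with the CDF at several points
for t in (-1.0, 0.0, 1.96):
    assert abs(cdf_by_integration(t) - cdf_closed_form(t)) < 1e-6
```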

Now we formalize this in a more measure-theoretic way, in correspondence to what we learned in the course, particularly, why the part ${\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}}$ is called the density of ${X}$: How is the term “probability density function” that we use a lot in statistics related to the concept “density” (density of a measure with respect to another measure) that we learned in class?

First of all, we need to adopt a definition of a Normal(0,1) random variable. Say ${X}$ is a random variable (i.e. a measurable function) from ${(\Omega,\mathscr{F})}$ to ${(\mathbb{R},\mathscr{R})}$. Let ${P}$ be a probability measure on ${(\Omega,\mathscr{F})}$ and ${\mu}$ the Lebesgue measure on ${(\mathbb{R},\mathscr{R})}$. We say ${X}$ is a Normal(0,1) random variable if we have (this is the definition we adopt, i.e. the starting point for the arguments below)

$\displaystyle P\{\omega:X(\omega)\leq t\}=\int_{(-\infty,t]}\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}d\mu(x). \ \ \ \ \ (1)$

Now, how do we convert this into the statement that ${\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}}$ is the density of some measure with respect to some other measure? Note that when we say a function ${D}$ is the density of a measure ${\rho}$ with respect to another measure ${\mu}$, ${\rho}$ and ${\mu}$ must be defined on the same measurable space. Here ${P}$ lives on ${(\Omega,\mathscr{F})}$ while ${\mu}$ lives on ${(\mathbb{R},\mathscr{R})}$, so at this point we cannot say that ${\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}}$ is the density of ${P}$ with respect to ${\mu}$.

But now the distribution comes to the rescue. Recall that earlier in the course we learned the concept of the “distribution” of a random variable, which is a measure ${L_{X}}$ on the target space (here ${(\mathbb{R},\mathscr{R})}$) defined as follows: for any ${A\in\mathscr{R}}$,

$\displaystyle L_{X}(A):=PX^{-1}(A)=P\{\omega:X(\omega)\in A\}. \ \ \ \ \ (2)$
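Definition (2) says ${L_{X}}$ is the pushforward of ${P}$ under ${X}$: to measure ${A}$, pull it back to the preimage ${X^{-1}(A)}$ and apply ${P}$. A minimal discrete sketch (the six-point sample space and the map ${X}$ are my own toy choices, not from the course) makes the preimage mechanics concrete:

```python
from fractions import Fraction

# Toy illustration of L_X(A) = P{omega : X(omega) in A},
# on a hypothetical six-point sample space with the uniform measure.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}   # uniform probability on Omega

def X(w):
    """A random variable Omega -> {0, 1, 2}."""
    return w % 3

def L_X(A):
    """Distribution (pushforward) of X: sum P over the preimage X^{-1}(A)."""
    return sum(P[w] for w in omega if X(w) in A)

assert L_X({0}) == Fraction(1, 3)    # preimage of {0} is {3, 6}
assert L_X({1, 2}) == Fraction(2, 3) # preimage is {1, 2, 4, 5}
assert L_X({0, 1, 2}) == 1           # L_X is itself a probability measure
```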

So by (1) we have

$\displaystyle L_{X}((-\infty,t])=\int_{(-\infty,t]}\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}d\mu(x), \ \ \ \ \ (3)$

or (extending from half-lines to general Borel sets: the intervals ${(-\infty,t]}$ form a $\pi$-system that generates ${\mathscr{R}}$, and two probability measures that agree on such a generating $\pi$-system agree on the whole sigma-field, by the $\pi$-$\lambda$ theorem)

$\displaystyle L_{X}(A)=\int_{A}\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}d\mu(x), \ \ \ \ \ (4)$

for any ${A\in\mathscr{R}}$.
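To see (4) at work beyond half-lines, here is a numerical sketch (standard library only; the set ${A}$, function names, and tolerance are my own choices): for the disjoint union ${A=(-1,0]\cup(1,2]}$, the value ${L_{X}(A)}$ computed from the CDF matches the integral of the density over ${A}$:

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(t):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

def integral_over_intervals(intervals, n=100_000):
    """Trapezoidal integral of the density over a disjoint union of (a, b]."""
    total = 0.0
    for a, b in intervals:
        h = (b - a) / n
        s = 0.5 * (phi(a) + phi(b)) + sum(phi(a + i * h) for i in range(1, n))
        total += s * h
    return total

# A Borel set that is not a half-line: A = (-1, 0] union (1, 2]
A = [(-1.0, 0.0), (1.0, 2.0)]
lhs = (Phi(0) - Phi(-1)) + (Phi(2) - Phi(1))  # L_X(A) from the CDF
rhs = integral_over_intervals(A)              # integral of the density over A
assert abs(lhs - rhs) < 1e-8
```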

That is to say, ${\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}}$ is the density of ${L_{X}}$ (distribution of ${X}$, which is a probability measure) with respect to ${\mu}$ (the Lebesgue measure on the real line).