My Knowledge Notes

digSelf

Probability Theory: The Origin of the Gaussian Distribution

2024-08-10
Probability Density Functions from the Maximum Entropy Perspective

The Maximum Entropy Principle

The Maximum Entropy Principle (hereafter MEP):

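Under given constraints, the probability distribution of a random variable should be the one that maximizes entropy. In other words, among all probability distributions that satisfy the given constraints, the one with the largest entropy best represents the current system. (Source: Wikipedia)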

Assumptions

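Suppose the mean \mu and the variance \nu of the underlying distribution of a random variable X are known. Let \rho(x) be a probability density function defined on \mathbb{R}. The information (differential) entropy of \rho(x) is defined as: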

s(x) = E[-\ln[\rho(x)]] = \int_{-\infty}^{\infty} -\rho(x)\ln[\rho(x)] \, \mathrm{d}x
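As a quick numerical illustration of this definition, the sketch below approximates the entropy integral with a midpoint rule and compares it with the well-known closed form \frac{1}{2}\ln(2\pi e \nu) for a Gaussian density. The helper names `differential_entropy` and `gaussian_pdf`, the integration window and the grid size are assumptions made for this example, not part of the derivation.

```python
import math

def differential_entropy(pdf, lo, hi, n=20_000):
    """Approximate -integral of pdf(x)*ln(pdf(x)) over [lo, hi] by the midpoint rule."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        p = pdf(lo + (i + 0.5) * h)
        if p > 0.0:  # p*ln(p) tends to 0 as p -> 0, so such points contribute nothing
            total -= p * math.log(p) * h
    return total

def gaussian_pdf(x, mu=0.0, nu=1.0):
    """Gaussian density with mean mu and variance nu."""
    return math.exp(-(x - mu) ** 2 / (2.0 * nu)) / math.sqrt(2.0 * math.pi * nu)

numeric = differential_entropy(gaussian_pdf, -12.0, 12.0)
closed_form = 0.5 * math.log(2.0 * math.pi * math.e)  # closed form for nu = 1
print(numeric, closed_form)
```

The two printed numbers should agree closely; the window [-12, 12] is wide enough that the truncated tails are negligible for unit variance.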

Because \rho(x) is a probability density function, it satisfies:

\int_{-\infty}^{\infty} \rho(x) \, \mathrm{d}x = 1

In addition, it is constrained by the expectation:

E(X)= \int_{-\infty}^{\infty} x\rho(x) \, \mathrm{d}x = \mu

and by the variance:

\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^{2}\rho(x) \, \mathrm{d}x = \nu

Together, normalization, mean and variance give three constraints on \rho(x).
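Many different densities satisfy these three constraints, which is why an extra selection rule such as MEP is needed. As a rough sketch, the code below checks the constraints numerically for a Laplace density whose scale is chosen to match a prescribed variance. The values `MU = 1.0` and `NU = 2.0`, the helper names and the integration window are hypothetical choices for illustration.

```python
import math

def integrate(f, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

def check_constraints(pdf, mu, lo, hi):
    """Return (total mass, mean, variance about mu) of pdf on [lo, hi]."""
    mass = integrate(pdf, lo, hi)
    mean = integrate(lambda x: x * pdf(x), lo, hi)
    var = integrate(lambda x: (x - mu) ** 2 * pdf(x), lo, hi)
    return mass, mean, var

MU, NU = 1.0, 2.0             # assumed target mean and variance
b = math.sqrt(NU / 2.0)       # Laplace scale, since its variance is 2*b^2

def laplace_pdf(x):
    return math.exp(-abs(x - MU) / b) / (2.0 * b)

mass, mean, var = check_constraints(laplace_pdf, MU, MU - 60.0, MU + 60.0)
print(mass, mean, var)
```

The printed values should be close to 1, `MU` and `NU`, so the Laplace density is an admissible candidate even though it is not the entropy maximizer.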

Constrained Optimization

By the method of Lagrange multipliers, define:

\begin{aligned} L(\rho, \alpha, \beta, \gamma) = &-\int_{-\infty}^{\infty} \rho(x)\ln[\rho(x)] \, \mathrm{d}x - \alpha\left[ \int_{-\infty}^{\infty} \rho(x) \, \mathrm{d}x -1 \right] \\&- \beta \left[ \int_{-\infty}^{\infty} x\rho(x) \, \mathrm{d}x -\mu \right] - \gamma \left[ \int_{-\infty}^{\infty} (x - \mu)^{2}\rho(x) \, \mathrm{d}x - \nu \right] \end{aligned}

The sign in front of each multiplier is a matter of convention; the minus signs are chosen so that every term in the exponent of the resulting density carries the same sign.

The Lagrangian L(\rho, \alpha, \beta, \gamma) can be maximized using functional differentiation. Let \delta \rho be a sufficiently small perturbation of the probability density function. For the integrand of the first term we have:

\begin{aligned} (\rho + \delta \rho)\ln(\rho + \delta \rho) &= (\rho + \delta \rho)\ln\left[ \rho \left( 1 + \frac{\delta \rho}{\rho} \right) \right] \\ &= (\rho + \delta \rho) \ln\rho + (\rho + \delta \rho)\ln\left( 1 + \frac{\delta \rho}{\rho} \right) \end{aligned}

Using the Taylor expansion of \ln(1 + x) at x = 0, truncated after the second-order term:

\begin{aligned} \ln\left( 1 + \frac{\delta \rho}{\rho} \right) & \approx \frac{\delta \rho}{\rho} - \frac{1}{2}\left( \frac{\delta \rho}{\rho} \right)^{2} \end{aligned}

Substituting, and then dropping all terms of second order and higher in \delta \rho:

\begin{aligned} (\rho + \delta \rho)\ln(\rho + \delta \rho) &\approx (\rho + \delta \rho) \ln\rho + (\rho + \delta \rho)\left[ \frac{\delta \rho}{\rho} - \frac{1}{2}\frac{(\delta \rho)^{2}}{\rho ^{2}} \right] \\ &= (\rho + \delta \rho)\ln\rho + \delta \rho + \frac{(\delta \rho)^{2}}{\rho} - \frac{1}{2}\frac{(\delta \rho)^{2}}{\rho}- \frac{1}{2} \frac{(\delta \rho)^{3}}{\rho ^{2}} \\ &\approx \rho \ln\rho + \delta \rho\ln\rho + \delta \rho \end{aligned}

The three constraint terms are linear in \rho, so their variations are exact. To first order in \delta \rho, this gives:

\begin{aligned} L(\rho + \delta \rho) - L(\rho) \approx -\int_{-\infty}^{\infty} \delta \rho \left( \ln\rho + 1 + \alpha + \beta x + \gamma (x - \mu)^{2} \right) \, \mathrm{d}x \end{aligned}
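The first-order expansion of (\rho + \delta\rho)\ln(\rho + \delta\rho) used above can be sanity-checked at a single point. In this minimal sketch, `rho` and `d` are plain numbers standing in for \rho(x) and \delta\rho(x) at one value of x; the function name and the sample values are assumptions for illustration. The neglected remainder should shrink like (\delta\rho)^{2}/(2\rho).

```python
import math

def expansion_error(rho, d):
    """Difference between the exact change of rho*ln(rho) and its first-order estimate."""
    exact = (rho + d) * math.log(rho + d) - rho * math.log(rho)
    first_order = d * (math.log(rho) + 1.0)
    return exact - first_order

rho = 0.3  # hypothetical density value at some point x
for d in (1e-1, 1e-2, 1e-3):
    err = expansion_error(rho, d)
    # The ratio err / d^2 should approach 1 / (2 * rho) as d shrinks.
    print(d, err, err / d ** 2, 1.0 / (2.0 * rho))
```

A ratio that settles near 1/(2\rho) confirms that only second-order terms were discarded.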

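Since this first-order variation must vanish for every perturbation \delta \rho \neq 0, and \rho > 0 so that \ln\rho is defined, the factor in parentheses must be zero: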

\begin{aligned} &\ln\rho + 1 + \alpha + \beta x + \gamma (x - \mu)^{2} = 0 \\ \implies &\ln \rho = -(1 + \alpha + \beta x+\gamma(x - \mu)^{2}) \\ \implies &\rho = \exp(-1-\alpha-\beta x-\gamma(x-\mu)^{2}) \end{aligned}

Simplifying \rho further by absorbing constant factors and completing the square in x:

\begin{aligned} \rho &= C_{1}\exp(-\alpha-\beta x-\gamma(x-\mu)^{2}) \\ &= C_{2} \exp(-\beta x-\gamma(x-\mu)^{2}) \\ &= C_3 \exp\left( -\gamma\left( x - \mu+\frac{\beta}{2\gamma} \right)^{2} \right) \end{aligned}

Here C_{1} = e^{-1}, C_{2} = e^{-1-\alpha} and C_{3} = C_{2}\exp\left( -\beta\mu + \frac{\beta^{2}}{4\gamma} \right), and \gamma > 0 is required for \rho to be integrable.

Once {C}_{3}, \beta and \gamma are known, the probability density function \rho(x) is determined. From the normalization constraint:

\int_{-\infty}^{\infty} \rho(x) \, \mathrm{d}x =1 \implies C_{3} \int_{-\infty}^{\infty} \exp\left( -\gamma\left( x - \mu+\frac{\beta}{2\gamma} \right)^{2} \right) \, \mathrm{d}x = 1

This is a Gaussian integral of the standard form \int_{-\infty}^{\infty} \exp(-a(x + b)^{2}) \, \mathrm{d}x = \sqrt{ \frac{\pi}{a} } with a = \gamma, b= -\mu + \frac{\beta}{2\gamma}. Therefore:

\begin{aligned} {C}_{3} &= \frac{1}{\sqrt{ \frac{\pi}{\gamma} }} \\ &= \sqrt{ \frac{\gamma}{\pi} } \end{aligned}

Similarly, the second constraint gives:

\begin{aligned} \int_{-\infty}^{\infty} x\rho(x) \, \mathrm{d}x = \mu \iff &\sqrt{ \frac{\gamma}{\pi} } \int_{-\infty}^{\infty} x \exp\left( -\gamma \left( x- \mu + \frac{\beta}{2\gamma} \right)^{2} \right) \, \mathrm{d}x = \mu \end{aligned}

Evaluating the left-hand side with the substitution t = x - \mu + \frac{\beta}{2\gamma}:

\begin{aligned} &\sqrt{ \frac{\gamma}{\pi} } \int_{-\infty}^{\infty} x \exp\left( -\gamma \left( x- \mu + \frac{\beta}{2\gamma} \right)^{2} \right) \, \mathrm{d}x \\ = & \sqrt{ \frac{\gamma}{\pi} }\left[ \int_{-\infty}^{\infty} \left( x - \mu + \frac{\beta}{2\gamma} \right)\exp\left( -\gamma \left( x- \mu + \frac{\beta}{2\gamma} \right)^{2} \right)\, \mathrm{d}x + \int_{-\infty}^{\infty} \left( \mu - \frac{\beta}{2\gamma} \right) \exp\left( -\gamma \left( x- \mu + \frac{\beta}{2\gamma} \right)^{2} \right) \, \mathrm{d}x \right] \\ =& \sqrt{ \frac{\gamma}{\pi} }\left[ \int_{-\infty}^{\infty} t\exp(-\gamma t^{2}) \, \mathrm{d}t + \left( \mu-\frac{\beta}{2\gamma} \right) \int_{-\infty}^{\infty} \exp\left( -\gamma \left( x- \mu + \frac{\beta}{2\gamma} \right)^{2} \right) \, \mathrm{d}x \right] \\ =& \sqrt{ \frac{\gamma}{\pi} }\left[ -\frac{1}{2\gamma} \exp(-\gamma t^{2}) \bigg|_{-\infty}^{\infty} + \left( \mu - \frac{\beta}{2\gamma} \right) \sqrt{ \frac{\pi}{\gamma} }\right] \\ =& \mu - \frac{\beta}{2\gamma} \end{aligned}

Since this must equal \mu, it follows that \beta = 0, which gives:

\rho(x) = \sqrt{ \frac{\gamma}{\pi} }\exp(-\gamma(x-\mu)^{2})
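The two steps just completed (normalization fixing C_{3}, and the mean constraint forcing \beta = 0) can be checked numerically. The sketch below, with hypothetical values \mu = 1 and \gamma = 0.25 and helper names invented for this example, integrates the density family proportional to \exp(-\beta x - \gamma(x - \mu)^{2}) and prints its mean for \beta = 0 and for a nonzero \beta.

```python
import math

def integrate(f, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

def family_mean(beta, gamma, mu, half_width=30.0):
    """Mean of the normalised density proportional to exp(-beta*x - gamma*(x - mu)^2)."""
    lo, hi = mu - half_width, mu + half_width
    weight = lambda x: math.exp(-beta * x - gamma * (x - mu) ** 2)
    z = integrate(weight, lo, hi)  # numerical normalising constant
    return integrate(lambda x: x * weight(x), lo, hi) / z

def normalisation(gamma, mu, half_width=30.0):
    """Integral of sqrt(gamma/pi) * exp(-gamma*(x - mu)^2); should be close to 1."""
    c3 = math.sqrt(gamma / math.pi)
    return integrate(lambda x: c3 * math.exp(-gamma * (x - mu) ** 2),
                     mu - half_width, mu + half_width)

MU, GAMMA = 1.0, 0.25  # assumed example values
print(normalisation(GAMMA, MU))       # normalisation check for C_3
print(family_mean(0.0, GAMMA, MU))    # beta = 0: mean matches MU
print(family_mean(0.5, GAMMA, MU))    # beta != 0: mean is shifted away from MU
```

Only the \beta = 0 member of the family reproduces the prescribed mean, which is the content of the mean constraint.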

From the last constraint, and using \mathrm{d}(\exp(-\gamma(x - \mu)^{2})) = -2\gamma(x - \mu)\exp(-\gamma(x - \mu)^{2}) \, \mathrm{d}x, we obtain:

\begin{aligned} &\int_{-\infty}^{\infty} (x - \mu)^{2}\rho(x) \, \mathrm{d}x = \nu \\ \implies & \sqrt{ \frac{\gamma}{\pi} } \int_{-\infty}^{\infty} (x - \mu)^{2}\exp(-\gamma(x - \mu)^{2}) \, \mathrm{d}x = \nu \\ \implies & - \frac{1}{2\gamma}\sqrt{ \frac{\gamma}{\pi} } \int_{-\infty}^{\infty} (x-\mu) \, \mathrm{d}(\exp(-\gamma(x - \mu)^{2})) = \nu \end{aligned}

Integrating \int_{-\infty}^{\infty} (x-\mu) \, \mathrm{d}(\exp(-\gamma(x - \mu)^{2})) by parts gives:

\begin{aligned} &\int_{-\infty}^{\infty} (x-\mu) \, \mathrm{d}(\exp(-\gamma(x - \mu)^{2})) \\= &(x - \mu)\exp(-\gamma(x - \mu)^{2})\bigg|_{-\infty}^{\infty} - \int_{-\infty}^{\infty} \exp(-\gamma(x - \mu)^{2}) \, \mathrm{d}x \\ =& 0 - \int_{-\infty}^{\infty} \exp(-\gamma(x-\mu)^{2}) \, \mathrm{d}x \\ =& - \sqrt{ \frac{\pi}{\gamma} } \end{aligned}

Therefore:

\begin{aligned} &\frac{1}{2\gamma} \cdot \sqrt{ \frac{\gamma}{\pi} } \cdot \sqrt{ \frac{\pi}{\gamma} } = \nu \\ \implies & \frac{1}{2\gamma} = \nu \\ \implies &\gamma = \frac{1}{2\nu} \end{aligned}
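The relation \gamma = \frac{1}{2\nu} can be checked numerically by computing the second central moment of \sqrt{\gamma/\pi}\exp(-\gamma(x - \mu)^{2}) for a few variances. This is only a sketch: the function name, the sample variances and the integration window are assumptions chosen for the example.

```python
import math

def second_central_moment(gamma, mu=0.0, half_width=40.0, n=200_000):
    """Midpoint-rule estimate of the integral of (x-mu)^2 * sqrt(gamma/pi) * exp(-gamma*(x-mu)^2)."""
    c3 = math.sqrt(gamma / math.pi)
    lo = mu - half_width
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        d = lo + (i + 0.5) * h - mu  # distance of the midpoint from the mean
        total += d * d * c3 * math.exp(-gamma * d * d)
    return total * h

for nu in (0.5, 1.0, 4.0):          # hypothetical target variances
    gamma = 1.0 / (2.0 * nu)        # the value of gamma derived above
    print(nu, second_central_moment(gamma))
```

Each printed pair should show the computed moment matching the target variance.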

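Finally, substituting \gamma = \frac{1}{2\nu} and writing \sigma^{2} = \nu for the variance, the final expression for \rho(x) is: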

\begin{aligned} \rho(x) &= \sqrt{ \frac{\gamma}{\pi} }\exp(-\gamma(x-\mu)^{2}) \\ &= \frac{1}{\sqrt{ 2\pi \nu }} \exp\left( -\frac{1}{2\nu}(x - \mu)^{2}\right) \\ &= \frac{1}{\sigma\sqrt{ 2\pi }} \exp\left( -\frac{1}{2\sigma ^{2}}(x - \mu)^{2} \right) \end{aligned}

Summary

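Among all continuous distributions with a given mean and variance, the Gaussian distribution has the largest entropy. Under the maximum entropy principle, this makes the Gaussian the least "informative", or most "random", distribution consistent with those constraints. When little is known about the specific nature of the errors beyond their mean and variance, the Gaussian distribution is a natural choice.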
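To make the maximum-entropy claim concrete, the sketch below compares the differential entropies of three distributions that share the same variance, using their standard closed forms (Gaussian: \frac{1}{2}\ln(2\pi e\nu); Laplace with scale b: 1 + \ln(2b); uniform of width w: \ln w). These closed forms are textbook results rather than part of the derivation above, and the function names are assumptions for this example. The mean does not affect differential entropy, so it is omitted.

```python
import math

def entropy_gaussian(nu):
    """Differential entropy of a Gaussian with variance nu."""
    return 0.5 * math.log(2.0 * math.pi * math.e * nu)

def entropy_laplace(nu):
    """Differential entropy of a Laplace distribution with variance nu (scale b, variance 2*b^2)."""
    b = math.sqrt(nu / 2.0)
    return 1.0 + math.log(2.0 * b)

def entropy_uniform(nu):
    """Differential entropy of a uniform distribution with variance nu (width w, variance w^2/12)."""
    w = math.sqrt(12.0 * nu)
    return math.log(w)

for nu in (0.1, 1.0, 10.0):
    print(nu, entropy_gaussian(nu), entropy_laplace(nu), entropy_uniform(nu))
```

For every variance tried, the Gaussian entropy should be the largest of the three, in line with the result derived here.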

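Measurement error is usually regarded as the sum of many small, independent error sources. By the central limit theorem, such a sum tends to be approximately Gaussian.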
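The central limit theorem argument can be illustrated with a small simulation. In this sketch each measurement error is modelled as the sum of 12 independent Uniform(-0.5, 0.5) sources; that error model, the function names, the seed and the trial count are all assumptions made for illustration. The fraction of totals within one standard deviation is compared with the Gaussian value \mathrm{erf}(1/\sqrt{2}) \approx 0.683.

```python
import math
import random

def simulate_total_error(n_sources, rng):
    """Sum of n_sources independent Uniform(-0.5, 0.5) error contributions."""
    return sum(rng.uniform(-0.5, 0.5) for _ in range(n_sources))

def fraction_within_one_sigma(n_sources=12, n_trials=20_000, seed=0):
    """Fraction of simulated total errors that fall within one standard deviation of zero."""
    rng = random.Random(seed)
    sigma = math.sqrt(n_sources / 12.0)  # each Uniform(-0.5, 0.5) has variance 1/12
    hits = sum(abs(simulate_total_error(n_sources, rng)) <= sigma
               for _ in range(n_trials))
    return hits / n_trials

gaussian_fraction = math.erf(1.0 / math.sqrt(2.0))  # P(|Z| <= 1) for a standard Gaussian
print(fraction_within_one_sigma(), gaussian_fraction)
```

The simulated fraction should land close to the Gaussian value even though each individual source is uniform, not Gaussian.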
