scipy.stats.distributions is among other things a nice formula collection.
One of the parts that are missing are the characteristic functions for the distributions.
Wikipedia is very good for large areas of statistics, see for some details and examples http://en.wikipedia.org/wiki/Characteristic_function_%28probability_theory%29
Wikipedia lists the characteristic funtion also on the pages for many distributions.
I wrote something like the script below already several times (for the easy cases).
The characteristic function for the normal distribution is easy, but looking at the characteristic function
of the t-distribution, I wish someone had translated it into code already.
t-distribution
Since I haven't seen it yet, I sat down and tried it myself. I managed to code the characteristic function of the t-distribution, but it returns NaNs when it is evaluated close to zero for large df.
I didn't find a Bessel "k" function that works in this case
>>> special.kv(50/2., 1e-30)
inf
>>> special.kve(50/2., 1e-30)
inf
The t-distribution approaches the normal distribution as the shape parameter, the degrees of freedom,
gets large. So, the characteristic function of the t-distribution should be well behaved for large df.
However, the individual terms go to infinity or zero.
Since in my current case, I don't care about the behavior of the characteristic function around zero, I stopped trying to get a better implementation.
Warning: monkey patching included in the script
Aside: I cannot make up my mind whether the abbreviation for characteristic function should be chf or cf.
I have both versions in various scripts that I wrote.
The script
Here's my current script
# -*- coding: utf-8 -*-
"""Characteristic Functions
Created on Fri Nov 30 22:43:36 2012
Author: Josef Perktold
"""
import numpy as np
from scipy import stats, special
def chf_normal(t, loc=0, scale=1):
'''characteristic function of normal distribution
Parameters
----------
t : array_like
points at which characteristic function is evaluated
loc : float (or array_like ?)
mean of underlying normal distribution
scale : float (or array_like ?)
standard deviation, scale of normal distribution
Returns
-------
chfval : ndarray
characteristic function evaluated at x
Notes
-----
Can be used for higher dimensional arguments if ``t``, ``mean`` and ``var``
broadcast.
'''
t = np.asarray(t)
return np.exp(1j * t * loc - 0.5 * t**2 * scale**2)
def chf_t(t, df, loc=0, scale=1):
'''characteristic function of t distribution
breaks down for large df and t close to zero
'''
t = np.asarray(t)
vhalf = df / 2.
term = np.sqrt(df) * np.abs(t*scale)
cf = special.kv(vhalf, term) * np.power(term, vhalf)
cf = cf / special.gamma(vhalf) / 2**(vhalf - 1)
cf = cf * np.exp(1j * loc * t)
if cf.shape == () and t == 0:
#special case: kv(0) returns nan
#for df>15 or so, we also get nans in the result in neighborhood of 0
cf = np.ones((), cf.dtype) #return same dtype as cf would
else:
cf[t == 0] = 1
return cf
def chf_t_(t, df, loc=0, scale=1):
#not much, but a bit better with log
vhalf = df / 2.
term = np.sqrt(df) * np.abs(t*scale)
cf = np.log(special.kv(vhalf, term)) + vhalf * np.log(term)
cf = cf - special.gammaln(vhalf) - (vhalf - 1) * np.log(2)
cf = cf + 1j * loc * t
if cf.shape == () and t == 0:
#special case: kv(0) returns nan
#for df>15 or so, we also get nans in the result in neighborhood of 0
cf = np.zeros((), cf.dtype) #return same dtype as cf would
else:
cf[t == 0] = 0
return np.exp(cf)
def chfn(self, t, *args, **kwds):
return chf_normal(t, *args, **kwds)
#monkeypatch scipy.stats
stats.distributions.norm_gen._chf = chfn
t = np.linspace(-1, 1, 11)
print stats.norm._chf(t, loc=1, scale=2)
print chf_t(t, 50, loc=1, scale=2)
print chf_t_(t, 50, loc=1, scale=2)
Editorial note: I had written this initially for the scipy-user mailing list. (Old habits are difficult to break.)
But I remembered just before hitting Send that the recommendation is to put it in a blog.
Application and Plot: Wrapped Circular T Distribution
As explained in my previous post, once we have the characteristic function of a distribution
defined on the real line, it is simple to get the Fourier approximation for the
wrapped circular distribution. As an application of the characteristic function of the
t distribution, I constructed the wrapped circular distributions.
The following plot shows an example of the density functions of the wrapped Cauchy,
the wrapped normal distribution, and the wrapped t distribution for a few values of the
degrees of freedom. Normal and Cauchy distributions are the two extreme cases of the t distribution,
when the degrees of freedom go to infinity and when the degrees of freedom is one, respectively.
The distribution in the plot have the same location and scale parameter. However, this implies that
the variance of the distributions is not the same. As a different comparison we could have adjusted
the scale parameter to obtain distributions with identical variance. The latter is a more informative comparison
when we are estimating the parameters based on data, and the estimated distribution reflects a similar
variance as the data.
The fatter tails of Cauchy and t distributions with small t are clearly visible in the plot.