scipy.stats.distributions is among other things a nice formula collection.
One of the parts that are missing are the characteristic functions for the distributions.
Wikipedia is very good for large areas of statistics, see for some details and examples http://en.wikipedia.org/wiki/Characteristic_function_%28probability_theory%29 Wikipedia lists the characteristic funtion also on the pages for many distributions.
I wrote something like the script below already several times (for the easy cases).
The characteristic function for the normal distribution is easy, but looking at the characteristic function of the t-distribution, I wish someone had translated it into code already.
Since I haven't seen it yet, I sat down and tried it myself. I managed to code the characteristic function of the t-distribution, but it returns NaNs when it is evaluated close to zero for large df.
I didn't find a Bessel "k" function that works in this case
>>> special.kv(50/2., 1e-30) inf >>> special.kve(50/2., 1e-30) inf
The t-distribution approaches the normal distribution as the shape parameter, the degrees of freedom, gets large. So, the characteristic function of the t-distribution should be well behaved for large df. However, the individual terms go to infinity or zero.
Since in my current case, I don't care about the behavior of the characteristic function around zero, I stopped trying to get a better implementation.
Warning: monkey patching included in the script
Aside: I cannot make up my mind whether the abbreviation for characteristic function should be chf or cf. I have both versions in various scripts that I wrote.
Here's my current script
# -*- coding: utf-8 -*- """Characteristic Functions Created on Fri Nov 30 22:43:36 2012 Author: Josef Perktold """ import numpy as np from scipy import stats, special def chf_normal(t, loc=0, scale=1): '''characteristic function of normal distribution Parameters ---------- t : array_like points at which characteristic function is evaluated loc : float (or array_like ?) mean of underlying normal distribution scale : float (or array_like ?) standard deviation, scale of normal distribution Returns ------- chfval : ndarray characteristic function evaluated at x Notes ----- Can be used for higher dimensional arguments if ``t``, ``mean`` and ``var`` broadcast. ''' t = np.asarray(t) return np.exp(1j * t * loc - 0.5 * t**2 * scale**2) def chf_t(t, df, loc=0, scale=1): '''characteristic function of t distribution breaks down for large df and t close to zero ''' t = np.asarray(t) vhalf = df / 2. term = np.sqrt(df) * np.abs(t*scale) cf = special.kv(vhalf, term) * np.power(term, vhalf) cf = cf / special.gamma(vhalf) / 2**(vhalf - 1) cf = cf * np.exp(1j * loc * t) if cf.shape == () and t == 0: #special case: kv(0) returns nan #for df>15 or so, we also get nans in the result in neighborhood of 0 cf = np.ones((), cf.dtype) #return same dtype as cf would else: cf[t == 0] = 1 return cf def chf_t_(t, df, loc=0, scale=1): #not much, but a bit better with log vhalf = df / 2. term = np.sqrt(df) * np.abs(t*scale) cf = np.log(special.kv(vhalf, term)) + vhalf * np.log(term) cf = cf - special.gammaln(vhalf) - (vhalf - 1) * np.log(2) cf = cf + 1j * loc * t if cf.shape == () and t == 0: #special case: kv(0) returns nan #for df>15 or so, we also get nans in the result in neighborhood of 0 cf = np.zeros((), cf.dtype) #return same dtype as cf would else: cf[t == 0] = 0 return np.exp(cf) def chfn(self, t, *args, **kwds): return chf_normal(t, *args, **kwds) #monkeypatch scipy.stats stats.distributions.norm_gen._chf = chfn t = np.linspace(-1, 1, 11) print stats.norm._chf(t, loc=1, scale=2) print chf_t(t, 50, loc=1, scale=2) print chf_t_(t, 50, loc=1, scale=2)
Editorial note: I had written this initially for the scipy-user mailing list. (Old habits are difficult to break.) But I remembered just before hitting Send that the recommendation is to put it in a blog.
Application and Plot: Wrapped Circular T Distribution
As explained in my previous post, once we have the characteristic function of a distribution defined on the real line, it is simple to get the Fourier approximation for the wrapped circular distribution. As an application of the characteristic function of the t distribution, I constructed the wrapped circular distributions.
The following plot shows an example of the density functions of the wrapped Cauchy, the wrapped normal distribution, and the wrapped t distribution for a few values of the degrees of freedom. Normal and Cauchy distributions are the two extreme cases of the t distribution, when the degrees of freedom go to infinity and when the degrees of freedom is one, respectively.
The distribution in the plot have the same location and scale parameter. However, this implies that the variance of the distributions is not the same. As a different comparison we could have adjusted the scale parameter to obtain distributions with identical variance. The latter is a more informative comparison when we are estimating the parameters based on data, and the estimated distribution reflects a similar variance as the data.
The fatter tails of Cauchy and t distributions with small t are clearly visible in the plot.