This is another post on showing what a few lines of matplotlib can produce.
When I wrote the test suite for scipy.stats.distributions, I had to mark quite a few distributions as slow so they are skipped under the normal test runs, because they were very slow. One of the reasons that some distributions are slow is because the generic random number generation is very indirect if only the density function is available.
For some time I was looking at spline interpolation of the inverse cumulative distribution, ppf, as a approximate way of generating random numbers. However, since scipy has no splines that impose monotonicity, that did not work.
Finally, I wrote a script that just uses linear interpolation of the cdf of a distribution, using scipy.interpolate.interp1d so we can use standard inversion to create random numbers. To check whether the interpolation looks accurate enough, I went to "proof by plotting".
The interpolating random number generator takes about 0.3 seconds for one million random numbers, not counting the setup cost of creating the interpolator. The script is currently just a quick hack to see if it works.
As an example I took the t distribution with 5 degrees of freedom, which has somewhat heavy tails. I calculated the approximation for 1000 intervals, and then for 10 and 20 intervals as contrast.
Since a large part of the "action" is in the tails, and I want to get those to be resonably accurate, I could not look just at a regular histogram since the tails are not very visible. So I looked at two variations, one with log scale, the other where the bin width is chosen so each bin has equal probability instead of equal length.
The result are the following four plots, with equal-length bins in the first row, equal-probability bins in the second row, and linear scale on the left side and log scaled probabilites on the right side. With 1000 segments in the interpolation, I don't see any systematic deviation of the random numbers from the true distribution. Below is the qqplot, generated with statsmodels, that indicates that the random numbers are consistent with a t(5) distribution.
As contrast, below are the same kind of plots for 20 intervals in the interpolation, which is a symmetric step function density with 20 intervals, many of them close to zero. The histogram shows clearly the steps, the qqplot shows systematic curved segments, which are more visible in the qqplot for 10 intervals.