joepy: Starting a list of diagnostic and specification tests

I have already started several times on an inventory of statistical tests in Python (scipy.stats and statsmodels) compared to R.

Here is another try.

It is mainly a comparison with the R package lmtest. I just spent most of two days writing tests against R, and before that some days writing tests against Gretl, and before that outlier measures against SAS, as described in the previous posts.

I don't know yet how easy it will be to maintain a list like this. The current version is mainly based on parsing the lmtest index page. lmtest is not very complete; there are tests covered in other packages and additional tests covered in Gretl.

For now I just keep it in a boring python module, with a string that's easy to manipulate and convert.

#cols: category, name, statsmodels, r_lmtest, gretl

s4 = '''\
acorr | Breusch-Godfrey Test | acorr_breush_godfrey | bgtest
acorr | Durbin-Watson Test | location_no_pvalues | dwtest
het | Breusch-Pagan Test | het_breush_pagan | bptest
het | Goldfeld-Quandt Test | het_goldfeld_quandt | gqtest
het | Harrison-McCabe test | missing | hmctest
het | White test | het_white | - |
causality | Test for Granger Causality | grangercausalitytest | grangertest
linear | Harvey-Collier Test | missing | harvtest
linear | PE Test for Linear vs. Log-Linear Specifications | missing | petest
linear | Rainbow Test | missing | raintest
func form | RESET Test | with outliers | resettest
compare nonnested | Cox Test | compare_cox | coxtest
compare nonnested | J Test | compare_j | jtest
compare nonnested | Encompassing Test | missing | encomptest
compare nested | Likelihood Ratio Test nested | compare_lr | lrtest
compare nested | Wald Test nested | compare_ftest | waldtest
coef | Testing Estimated Coefficients | t_test | coeftest
coef | Testing Estimated Coefficients | missing | coeftest.breakpointsfull
'''

add another separator

print('\n'.join(line + '|' for line in s4.split('\n')))

convert to list of list

def str2list(ss, sep='|', keep_empty=4):

Unfortunately, copying into Blogger doesn't preserve indentation, so I skip this function here. Instead, convert the separator to tabs, so that Google Spreadsheet separates the cells:

print('\n'.join(line.replace('|', '\t') for line in s4.split('\n')))
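Since the body of str2list is not shown above, here is a minimal sketch of what such a conversion to a list of lists could look like. It is not the original function: the meaning of the `keep_empty=4` argument is not given in the post, so the sketch uses a hypothetical `n_cols` argument instead, and pads short rows with empty strings so that every row has the five columns named in the `#cols` comment.

```python
def str2list(ss, sep='|', n_cols=5):
    """Split a separator-delimited table string into a list of rows.

    n_cols is a hypothetical argument (not from the original post):
    rows with fewer cells are padded with empty strings.
    """
    rows = []
    for line in ss.split('\n'):
        if not line.strip():
            continue  # skip blank lines
        cells = [cell.strip() for cell in line.split(sep)]
        # pad short rows so that every row has n_cols cells
        cells += [''] * (n_cols - len(cells))
        rows.append(cells[:n_cols])
    return rows


# two rows taken from the table string above
sample = '''\
acorr | Breusch-Godfrey Test | acorr_breush_godfrey | bgtest
het | White test | het_white | - |
'''

table = str2list(sample)
for row in table:
    print(row)
```

Padding to a fixed width keeps the rows rectangular, which makes it easy to fill in the missing Gretl column later.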

Tomorrow, I will start looking for the diagnostic tests that are not yet on the list. R stats has some, for example (fm is the linear model result of my test case):

> names(ls.diag(fm))
[1] "std.dev" "hat" "std.res" "stud.res" "cooks" "dfits"
[7] "correlation" "std.err" "cov.scaled" "cov.unscaled"

I figured out that JSON works pretty well for transferring data from some R animals to Python.
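The Python side of such a transfer can be sketched as follows. The JSON text here is made up for illustration: the key names follow the ls.diag output above, but the numbers are placeholders and not results from my test case. It assumes the R object was written out as JSON text, for example with toJSON from the rjson package, which is only one possible choice.

```python
import json

# Placeholder JSON text standing in for an export from R
# (assumed to be written by something like rjson's toJSON;
# the values are invented for illustration).
r_json = '''
{"std.dev": 1.5,
 "hat": [0.2, 0.3, 0.5],
 "cooks": [0.01, 0.2, 0.05]}
'''

# json.loads turns the JSON object into a dict,
# and JSON arrays into Python lists
diag = json.loads(r_json)
hat = diag["hat"]

print(sorted(diag))
print(len(hat))
```

The resulting lists can be passed directly to numpy.array for comparison with the statsmodels results.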

In other news:
Basic R doesn't automatically save the session log or history. I was playing with Rcommander last night to see what default diagnostics it proposes, and it crashed R after a while. Unfortunately, I hadn't saved my session log and script file, so a day or two of work died with it. I no longer thought about safeguarding against crashes: Windows never crashes (not even the kids manage to turn it off anymore), and Spyder and Firefox always recover after a crash with an existing history or session log. R had never crashed before.


Friday, February 10, 2012
