The Testimator: Significance Day

A few more thoughts on JSM, from the Wednesday sessions:

I enjoyed the discussion on the US Supreme Court’s ruling regarding statistical significance. Some more details of the case are here.
In short, the company Matrixx claimed they did not need to tell investors about certain safety reports, since those results did not reach statistical significance. Matrixx essentially suggested that there should be a “bright line rule” that only statistically-significant results need to be reported.
However, the Supreme Court ruled against this view. All of the discussants seemed to agree that the Supreme Court made the right call in saying that statistical significance is not irrelevant, but we have to consider “the totality of the evidence.” That’s good advice for us all, in any context!

In particular, Jay Kadane and Don Rubin did not prepare slides and simply spoke well, which was a nice change of presentation style from most other sessions. Rubin brought up the fact that the p-value is not a property solely of the data, but also of the null hypothesis, test statistic, covariate selection, etc. So even if the court wanted a bright-line rule of this sort, how could they specify one in sufficient detail?
For that matter, while wider confidence intervals are more conservative when trying to show superiority of one drug over another, there are safety situations where narrower confidence intervals are actually the more conservative ones, but “everyone still screws it up.” And “nobody really knows how to do multiple comparisons right” for subgroup analysis to check if the drug is safe on all subgroups. So p-values are not a good substitute for human judgment on the “totality of the evidence”.
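
To make Rubin's point concrete, here is a minimal Python sketch (my own illustration, not something shown in the session). It takes one set of paired differences and one null hypothesis (each difference is equally likely to be positive or negative) and computes exact two-sided p-values for two different test statistics: the sum of the differences and the count of positive differences. The data values are made up, the function name `sign_flip_p_values` is hypothetical, and the code assumes no difference is exactly zero.

```python
from itertools import product

def sign_flip_p_values(diffs):
    """Exact two-sided p-values for two test statistics on the same data.

    Null hypothesis for both: each difference is equally likely to be + or -.
    Assumes no difference is exactly zero.
    """
    n = len(diffs)
    magnitudes = [abs(d) for d in diffs]
    # Observed statistics: |sum of differences| and |#positives - n/2|.
    obs_sum = abs(sum(diffs))
    obs_sign = abs(sum(d > 0 for d in diffs) - n / 2)

    extreme_sum = extreme_sign = total = 0
    # Enumerate all 2^n equally likely sign assignments under the null.
    for signs in product((1, -1), repeat=n):
        flipped = [s * m for s, m in zip(signs, magnitudes)]
        total += 1
        extreme_sum += abs(sum(flipped)) >= obs_sum
        extreme_sign += abs(sum(f > 0 for f in flipped) - n / 2) >= obs_sign
    return extreme_sum / total, extreme_sign / total

# Made-up paired differences (integers, to avoid floating-point ties):
# seven small positive values and one large negative value.
diffs = [4, 6, 9, 11, 13, 18, 22, -75]
p_mean, p_sign = sign_flip_p_values(diffs)
print(f"p-value using the sum statistic:  {p_mean:.3f}")
print(f"p-value using the sign statistic: {p_sign:.3f}")
```

With these made-up numbers, the sign statistic gives a much smaller p-value than the sum statistic, because the single large negative value dominates the sum. Same data, same null, different "significance."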

I also enjoyed Rubin’s quote from Jerzy Neyman: “You’re getting misled by thinking that the mathematics is the statistics. It’s not.” This reminded me of David Cox’s earlier comments that statistics is about the concepts, not about the math. In the next session, Paul Velleman and Dick DeVeaux continued this theme by arguing that “statistics is science more than math.”
(I also love DeVeaux and Velleman’s 2008 Amstat News article on how “math is music; statistics is literature.” Of course Andrew Gelman presented his own views about stats vs. math on Sunday; and Persi Diaconis talked about the need for conceptually-unifying theory, rather than math-ier theory, at JSM 2010. See also recent discussion at The Statistics Forum. Clearly, defining “statistics” is a common theme lately!)

In any case, Velleman presented a common popular telling of the history behind Student’s t test, and then proceeded to bust myths behind every major point in the story. Most of all, he argued that we commonly take the wrong lessons from the story. Perhaps it is not Gosset’s result (the t-test) that should be taught so much as the computationally-intensive method he first used, an approach that’s easier to carry out nowadays and may be more pedagogically valuable.
I’m also jealous of Gosset’s title at Guinness: “Head Experimental Brewer” would look great on a resume 🙂
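
As a rough idea of what a computationally-intensive approach can look like today, here is a minimal Python sketch. It is my own illustration under simple assumptions (a normal population, samples of size 4, a software random number generator), not a reconstruction of Gosset's procedure or of anything Velleman showed. The function names `simulate_t_statistics` and `tail_fraction` are hypothetical. It draws many small samples, computes the standardized mean using the sample standard deviation, and checks how often the result lands beyond the usual normal cutoff of 1.96.

```python
import random
import statistics

def simulate_t_statistics(n=4, reps=20000, mu=0.0, sigma=1.0, seed=1908):
    """Simulate (sample mean - mu) / (sample SD / sqrt(n)) for many small samples."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        sd = statistics.stdev(sample)  # sample SD with the n - 1 divisor
        stats.append((mean - mu) / (sd / n ** 0.5))
    return stats

def tail_fraction(values, cutoff):
    """Fraction of simulated statistics whose absolute value exceeds the cutoff."""
    return sum(abs(v) > cutoff for v in values) / len(values)

t_stats = simulate_t_statistics()
frac_beyond_normal_cutoff = tail_fraction(t_stats, 1.96)
print(f"Fraction beyond +/-1.96 with n = 4: {frac_beyond_normal_cutoff:.3f}")
print("A standard normal would put about 0.05 beyond that cutoff.")
```

Students can see directly that estimating the standard deviation from a tiny sample makes the tails heavier than the normal curve suggests, before ever meeting the formula for the t distribution.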

After their talks, I went to the session honoring Joe Sedransk in order to hear Rod Little and Don Malec talk about topics closer to my work projects. Little made a point about “inferential schizophrenia”: if you use direct survey estimates for large areas, and model-based estimates for small areas, your entire estimation philosophy jumps drastically at the arbitrary dividing line between “large” and “small.” Wouldn’t it be better to use a Bayesian approach that transitions smoothly, closely approaching the direct estimates for large areas and the model estimates in small areas?
Pfeffermann and Rao commented afterwards that they don’t feel things are as “schizophrenic” as Little claims, but are glad that Bayesians are now okay with measuring the frequentist properties of their procedures (and Little claimed that Bayesian models can often end up with better frequentist properties than classical models).
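
For readers who have not seen this kind of smooth transition, here is a minimal Python sketch of a standard normal-normal shrinkage weight, in the spirit of common small area models. It is my own illustration, not Little's specific proposal. The function name `composite_estimate` and all the numbers are made up, and the between-area model variance is treated as known for simplicity.

```python
def composite_estimate(direct, model, sampling_var, model_var):
    """Weighted blend of a direct survey estimate and a model-based estimate.

    weight = model_var / (model_var + sampling_var), so the weight on the
    direct estimate approaches 1 as the sampling variance shrinks (large areas)
    and approaches 0 as the sampling variance grows (small areas).
    """
    weight = model_var / (model_var + sampling_var)
    return weight * direct + (1.0 - weight) * model, weight

# Hypothetical areas: same direct (10.0) and model (20.0) estimates,
# differing only in the sampling variance of the direct estimate.
areas = [("large", 0.01), ("medium", 1.0), ("small", 100.0)]
results = {}
for name, sampling_var in areas:
    results[name] = composite_estimate(10.0, 20.0, sampling_var, model_var=1.0)
    estimate, weight = results[name]
    print(f"{name:>6}: weight on direct = {weight:.3f}, estimate = {estimate:.2f}")
```

There is no dividing line between "large" and "small" here: the weight changes continuously with the sampling variance, so the estimate slides from the direct value toward the model value as areas get smaller.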

In the afternoon, I sat in on Hadley Wickham’s talk about starting off statistics courses with graphical analysis. This less-intimidating approach lets beginners describe patterns right from the start.
He also commented that each new tool you introduce should be motivated by an actual problem where it’s needed: find an interesting question that is answered well by the new tool. In particular, when you combine a good dataset with an interesting question that’s well-answered by graphics, this gives students a good quick payoff for learning to program. Once they’re hooked, *then* you can move to the more abstract stuff.

Wickham grades students on their curiosity (what can we discover in this data?), skepticism (are we sure we’ve found a real pattern?), and organization (can we replicate and communicate this work well?). He provides practice drills to teach “muscle memory,” as well as many opportunities for mini-analyses to teach a good “disposition.”
This teaching philosophy reminds me a lot of Dan Meyer and Shawn Cornally’s approaches to teaching math (which I will post about separately sometime) (edit: which I have posted about elsewhere).
Wickham also collects interesting datasets, cleans them up, and posts them on GitHub along with his various R packages and tools, including the excellent ggplot2.

The last talks I attended (by Eric Slud and Ansu Chatterjee, on variance estimation) were also related to my work on small area modeling.
I was amused by the mixed metaphors in Chatterjee’s warning to “not use the bootstrap as a sledgehammer,” and Bob Fay’s discussion featured the excellent term “Testimator” 🙂
This reminds me that last year Fay presented on the National Crime Victimization Survey, and got a laugh from the audience for pointing out that, “From a sampling point of view, it’s a problem that crime has gone down.”

Overall, I enjoyed JSM (as always). I did miss a few things from past JSM years:
This year I did not visit the ASA Student Stat Bowl competition, and I’m a bit sad that as a non-student I can no longer compete and defend my 2nd place title… although that ranking may not have held up across repeated sampling anyway 😛
I was also sad that last year’s wonderful StatAid / Statistics Without Borders mixer could not be repeated this year due to lack of funding.
But JSM was still a great chance to meet distant friends and respected colleagues, get feedback on my research and new ideas on many topics, see what’s going on in the wider world of stats (there are textbooks on Music Data Mining now?!?), and explore another city.
(Okay, I didn’t see too much of Miami beyond Lincoln Rd, but I loved that the bookstore was creatively named Books & Books … and the empanadas at Charlotte Bakery were outstanding!)
I also appreciate that it was an impetus to start this blog — knock on wood that it keeps going.

I look forward to JSM 2012 in San Diego!