The tuba effect

The Jingle All The Way 8k results are up, and naturally I was curious how I stacked up against the other runners. I know I’m no sprinter, so I’ve just plotted the median times within each age-by-gender category. Apparently carrying a tuba gave me a race time comparable to the median among 70-to-74-year-old women.

Of course I already knew I’d lose a race against my grandmother, a strong Polish woman who taught PE for many years. But when I’m carrying a tuba, your grandmother could likely beat me too.

Statistical Rules of Thumb, Gerald van Belle

Gerald van Belle’s Statistical Rules of Thumb has piqued my curiosity at conferences. It turns out my work library has a copy, which has been fun to skim, or should I say, to thumb through.

The book’s examples focus largely on medical and environmental studies, but most of the book does apply to statistics in general.

The book starts off with good “rules of thumb” in the sense of quick calculations, e.g. for the approximate sample size you’d need to get suitably precise estimates in several common situations. But van Belle also offers more general advice, such as which models to start with: when to use Normal vs. Exponential vs. Poisson, etc., as your initial model.
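To give a flavor of the “quick calculation” kind of rule, here is a minimal sketch of one widely quoted sample-size shortcut, often called Lehr's rule: about 16/Δ² subjects per group to compare two means, where Δ is the difference in standard-deviation units. I'm assuming this is representative of the sort of rule meant; it isn't quoted from the book here, and the function names are my own. The sketch compares the shortcut with the usual normal-approximation formula for a two-sided 5% test at 80% power.

```python
from statistics import NormalDist


def lehr_n_per_group(delta):
    """Shortcut: roughly 16 / delta^2 subjects per group.

    delta is the true difference in means divided by the common
    standard deviation (an assumed, standardized effect size).
    """
    return 16 / delta ** 2


def normal_approx_n_per_group(delta, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sample,
    two-sided comparison of means with equal variances."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)           # about 0.84 for power = 0.80
    return 2 * (z_alpha + z_beta) ** 2 / delta ** 2


if __name__ == "__main__":
    for delta in (0.25, 0.5, 1.0):
        print(delta, lehr_n_per_group(delta), normal_approx_n_per_group(delta))
```

The shortcut works because 2 × (1.96 + 0.84)² is a little under 16, so rounding up to 16 gives a number that is easy to compute in your head and slightly conservative.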

Some of my favorite pithy or self-explanatory “rules”:

1.9: “Use p-values to determine sample size, confidence intervals to report results”
3.3: “Do not correlate rates or ratios indiscriminately”
i.e. even if X, Y, and Z are mutually independent, X/Z and Y/Z will generally be correlated with each other, purely because they share the denominator Z.
5.8: “Distinguish between variability and uncertainty”
i.e. “reduce uncertainty but account for variability”
5.13: “Distinguish between confidence, prediction, and tolerance intervals”
6.2: “Blocking is the key to reducing variability”
6.6: “Analysis follows design”
i.e. the possible analyses will depend on how the randomization was done
6.11: “Plan for missing data”
i.e. be explicit about how you intend to deal with it
6.12: “Address multiple comparisons before starting the study”
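The point of rule 3.3 is easy to check by simulation. Below is a minimal sketch, not taken from the book: the Uniform(1, 2) distributions, the sample size, the seed, and the function names are all my own arbitrary choices. It draws independent X, Y, and Z, then compares the correlation of X with Y against the correlation of X/Z with Y/Z.

```python
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def simulate(n=10_000, seed=0):
    """Return (corr(X, Y), corr(X/Z, Y/Z)) for independent X, Y, Z."""
    rng = random.Random(seed)
    # Assumed distributions: Uniform(1, 2) keeps Z safely away from zero.
    x = [rng.uniform(1, 2) for _ in range(n)]
    y = [rng.uniform(1, 2) for _ in range(n)]
    z = [rng.uniform(1, 2) for _ in range(n)]
    x_over_z = [xi / zi for xi, zi in zip(x, z)]
    y_over_z = [yi / zi for yi, zi in zip(y, z)]
    return pearson(x, y), pearson(x_over_z, y_over_z)


if __name__ == "__main__":
    r_raw, r_ratio = simulate()
    print(f"corr(X, Y)     = {r_raw:.3f}")
    print(f"corr(X/Z, Y/Z) = {r_ratio:.3f}")
```

The raw variables should come out nearly uncorrelated, while the two ratios show a clearly positive correlation, since a small Z inflates both ratios at once and a large Z shrinks both.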


Too close for bells, I’m switching to tubas

So when I’m not visualizing data or crunching small area estimates, I’ve been training to run DC’s Jingle All The Way 8k.

Most people wear little jingle bells as they run this race.
I decided to carry a tuba instead.

More photos here. The one above is thanks to a blog I found by googling the race name + tuba. Our team t-shirts said Tuba Awareness, and apparently people were indeed aware! 🙂

My time was super slow (although I placed 1st in the carrying-a-tuba category), but I did run the whole thing, and I had a blast playing carols along the way. I really need to find somewhere in DC to play regularly, though perhaps a bit more sedentary…

Moore method / inquiry-based learning in statistics?

Via Dave Richeson:

For the last 10+ years I’ve taught topology using a modified Moore method, also known as inquiry-based learning (IBL). The students are given the skeleton of a textbook; then they must prove all the theorems and solve all of the problems. They are forbidden from looking at outside sources. The class types up their work as they go. At the end of the semester they have a textbook that they wrote. It is a great way to learn, and at the end of the semester the students are thrilled to hold a bound copy of the textbook that they created.

I love this idea! Wikipedia lists several universities with math courses using the Moore method, but none in probability or mathematical statistics. Google doesn’t suggest much besides this blog post with the same idea, and this article, which seems to have good advice but is no longer accessible.

Have you ever seen the Moore approach used for a statistics course? Do you have any success stories or pitfalls to share?

A Theory of Data, Clyde Coombs

Earlier I quoted Leland Wilkinson in The Grammar of Graphics, where he recommends Clyde Coombs’ book A Theory of Data:

…in a landmark book, now out of print and seldom read by statisticians, Coombs (1964) … believed that the prevalent practice of modeling based on cases-by-variables data layouts often prevents researchers from considering more parsimonious structural theories and keeps them from noticing meaningful patterns in their data.

I checked out Coombs’ book through interlibrary loan and haven’t had time to read it thoroughly before the due date. But even from skimming it on the train for a few days, I can see why Wilkinson recommends it.
