A few friends have asked for self-study resources on learning (or brushing up on) basic statistics. I plan to keep updating this post as I find more good suggestions.
Of course the ideal case is to have a good teacher in a nice classroom environment:
For self-study, however, you might try an open course online. MIT has some OpenCourseWare for mathematics (including course 18.433, “Statistics for Applications”), and Carnegie Mellon offers free online courses in statistics. I have not tried them myself yet but hear good things so far.
As for textbooks: Freedman, Pisani, and Purves’ Statistics is a classic intro to the key concepts and seems a great way to get up to speed.
Two other good “gentle” conceptual intros are The Cartoon Guide to
Statistics and How to Lie with Statistics.
But I believe they all try to avoid equations, so you might need another source to show you how to actually crunch the numbers.
My undergrad statistics class used Devore and Farnum’s Applied Statistics for Engineers and Scientists. I haven’t touched it in years, so I ought to browse it again, but I remember it demonstrated the steps of each analysis quite clearly.
If you end up using the Devore and Farnum book, Jonathan Godfrey has converted the 2nd edtion’s examples into R.
[Edit: John Cook’s blog and his commenters have some good advice about textbooks. They also cite a great article by George Cobb about how to choose a stats textbook.]
Speaking of R, I would suggest it if you don’t already have a particular statistical software package in mind. It is open source and a free download, and I find that working in R is similar to the way I think about math while working it out on paper (unlike SPSS or SAS or Stata, all of which are expensive and require a very different mindset).
I list plenty of R resources in my R101 post. In particular, John Verzani’s simpleR seems to be a good introduction to using R, and reviews on a lot of basic statistics along the way (though not in detail).
People have also recommended some books on intro stats with R, especially Dalgaard’s Introductory Statistics with R or Maindonald & Braun’s Data Analysis and Graphics Using R.
For a very different approach to introductory stats, my former professor Allen Downey wrote a book called Think Stats aimed at programmers and using Python. I’ve only just read it, and I have a few minor quibbles that I want to discuss with him, but it’s a great alternative to the classic approach. As Allen points out, “standard statistical techniques are really computational shortcuts, which is less important when computation is cheap.” Both mindsets are good to have under your belt, but Allen’s is one of the few intro books so far for the computational route. It’s published by O’Reilly but Allen also makes a free version available online, as well as a related blog full of good resources.
Speaking of O’Reilly, apparently their book Statistics Hacks contains major conceptual errors, so I would have to advise against it unless they fix them in a future edition.