Springer Link
At the CMU library, or any other library with Springer Link access, you can download PDFs of these books for free.
- Cook & Swayne, Interactive and Dynamic Graphics for Data Analysis, With R and GGobi (Springer, Amazon). Great advice on plotting high-dimensional data, missing data, statistical and ML models, longitudinal data, networks, and visual inference. The advice is handy even if you don’t use R or GGobi.
- Wilkinson, The Grammar of Graphics (Springer, Amazon). Useful for a thorough understanding of the concepts underlying ggplot2 and Tableau.
- Wickham, ggplot2 (Springer, Amazon). Worth reading to really understand how the parts of ggplot2 all work together. Includes a handy reference of the available geoms, stats, and their related aes.
- Sarkar, lattice (Springer, Amazon). Another alternative to R base graphics, based on the work of Bill Cleveland (see Influential classics below). Several useful plots built into lattice are hard to replicate in ggplot2, so it’s worth knowing both.
- Bivand et al., Applied Spatial Data Analysis with R (Springer, Amazon). Mostly about spatial statistical models, but also covers plotting geographic data, working with shapefiles in R, etc.
- Chen et al., Handbook of Data Visualization (Springer, Amazon). I haven’t read this yet, but several articles look useful/interesting, especially Friendly’s chapter on the history of data visualization.
- Unwin et al., Graphics of Large Datasets (Springer, Amazon). Again, I haven’t read this yet, but the authors are well-respected researchers.
Research-based overviews and guidelines
Books that summarize and cite research-backed principles, not just the author’s own preferences.
- Cairo, The Functional Art. Our main course textbook. Advice on designing graphics for a general audience, written from a data journalism perspective. Besides “pure” dataviz, also covers other information visualizations such as scientific illustration. Includes great interviews with many prominent visualization creators about their process.
- Robbins, Creating More Effective Graphs. A gentler, non-statistician’s summary of the advice in Bill Cleveland’s books (see below). Uses real examples from businesses, non-profits, government, etc. and not just scientific graphs.
- Few, Now You See It, Show Me the Numbers, and Information Dashboard Design. Aimed at a broad business audience. The first book is about exploratory data analysis, the second is about explanatory / presentation graphics, and the third is about dashboards. The second book has an excellent discussion of table design, for when tables are necessary instead of graphs. The advice is great, but unfortunately most examples use fake datasets, which do not illustrate insight into a real problem.
- Kosslyn, Graph Design for the Eye and Mind. Summary of psychology research on visual perception and lessons we can draw. Many before-and-after examples of remade graphs (of fake data). Advice about margins of error shows the author is clearly not a statistician :) but the rest of the advice is solid.
- Ware, Information Visualization. Thorough, academic, but very readable textbook summarizing the state of dataviz/infoviz research. All the evidence-based advice throughout the book is compiled in a handy appendix.
- Meirelles, Design for Information. Good coverage of data structures beyond the statistician’s usual rows-and-columns data: networks, trees, spatial, temporal, and text data. Contains many well-annotated case studies, explaining the structure of each example graphic.
- Munzner, Visualization Analysis and Design. I haven’t read it yet, but it sounds very promising given early reviews and the author’s research credentials.
Influential classics
Older books that are still worth reading.
- Bertin, Semiology of Graphics. Important development in thinking about the mappings from data variables to visual variables.
- Cleveland, The Elements of Graphing Data and Visualizing Data. Hugely influential on base and lattice graphics in R. First book is mostly general advice on presenting graphics, based on Cleveland’s own empirical experimental research. Second book is mostly about exploratory data analysis for statistical modeling; many case studies show how graphics help you check assumptions and iteratively refine your models, and how this beats “rote data analysis.” Most of the graphics here are not beautiful, but they are extremely functional.
- Tufte, The Visual Display of Quantitative Information and Envisioning Information. If you’ve only heard of (or read) one dataviz book, it’s probably Tufte’s first book. Celebrates excellent examples of dataviz and information design. Hugely popular and beautifully designed, though largely based on the author’s personal taste, and infamous for his harsh criticism of “chartjunk.”
- Wainer, Visual Revelations. Case studies of real visualizations, with critiques and redesigns. Similar in spirit to Tufte’s books, but covering different graphics and lighter in tone. Wainer is also responsible for bringing back to print several classics including Bertin and Playfair.
Historical visualizations
Some original sources for visualizations still in use today.
Other
Programming graphics and a few special topics.
- Chang, R Graphics Cookbook. Solutions to specific problems when plotting in R. Mostly focused on ggplot2, but also covers general advice on R graphics and a few specialized plots that are easier in base R.
- Murrell, R Graphics. Not for beginners, but rather a guide to deep technical understanding of the R graphics engine: what’s going on under the hood with layouts, scales, etc.? Invaluable if you want to develop a super-customized R plot or ensure your plotting functions are clean enough for others to use.
- Murray, Interactive Data Visualization for the Web. This is the introductory book on D3.js; free to read online.
- Yau, Visualize This and Data Points. First book contains tutorials and general advice for several common types of graphs. Shows the wide range of tools that are useful (or even needed) for good dataviz work, from basic data manipulation through plotting and on to fine-tuning: R, Python, Illustrator, etc., plus data formats like CSV, XML, and JSON. The breadth means there’s not much depth on any given tool or topic. Second book collects best-practice advice and examples.
- Donahue, Fundamental Statistical Concepts in Presenting Data. Our other course textbook. Dataviz case studies in the spirit of Tufte and Wainer, but from the author’s own consulting work (as a practicing statistician) rather than graphs found in the wild. Good advice to show the data itself first and foremost, then overlay statistical summaries only as needed. Free PDF.
- Monmonier, How to Lie with Maps. Principles of geographic map-making and how easy it is to mislead with maps, even unintentionally. Covers important topics too often neglected by statisticians graphing spatial data, including map projections. Chapter 10 is especially useful, on data maps (including dangers inherent in aggregation and classification).
- Williams, The Non-Designer’s Design Book. Best beginner’s introduction to graphic design and typography I have seen so far.
- Norman, The Design of Everyday Things. Not about dataviz as such, but a classic introduction to interaction design, with many useful lessons we can apply to interactive graphics.
- Johnson, The Ghost Map. Not a dataviz book as such, but a fascinating history of the London epidemic that was visualized in John Snow’s famous cholera map.