PolicyViz episode on teaching data visualization

When I was still in DC, I knew Jon Schwabish’s work designing information and data graphics for the Congressional Budget Office. Now I’ve run across his podcast and blog, PolicyViz. There’s a lot of good material there.

I particularly liked a recent podcast episode that was a panel discussion about teaching dataviz. Schwabish and four other experienced instructors talked about course design, assignments and assessment, how to teach implementation tools, etc.

I recommend listening to the whole thing. Below are just notes-to-self on the episode, for my own future reference.

Tips for next time I teach:

  • I should put more emphasis on defining the audience for each assigned dataviz.
    Students may choose an audience themselves, or I may assign it for them… but either way, consideration of the audience should be part of my rubrics. That can also include critiquing an existing viz and figuring out who the intended audience seems to be.
    Focus on the audience is a hallmark of projects aiming for good design, rather than just showing off technique.
  • Hearst suggests having students evaluate each other in two ways: a rubric of heuristics / principles, and the ability to answer concrete questions about the data. If both approaches agree, it’s easy to grade. If they disagree, this “outlier” may be a novel design, and it’s certainly an interesting learning experience that might make you reconsider your rubric.

Course design:

  • Hearst teaches using “4 Ps”: principles, practice, peer learning, and programming.
    Her course is made of 4 modules: principles, storytelling/narrative, EDA, and advanced topics (like multidimensional viz, animation, etc.). The tools taught for each of the last 3 modules are: Illustrator, Tableau, and D3.
    It sounds like Hearst teaches principles foremost, and just picks specific tools to support that. (Though D3 was chosen by extreme popular demand, even if it’s not the best choice pedagogically.)
    There are short exercises and discussions during each lecture, plus individual pre-lecture exercises to prepare for each class, and a capstone final project.
    Hearst’s course also seems to be the only one focused on making static visualizations and graphic design; the others’ courses are more about interaction, animation, and/or research.
  • Munzner has students read a research paper (& a chapter from her book) for each class. She wants to develop them as researchers, so they read papers & write comments & present.
    There are not yet many in-class exercises, given the course’s graduate level and research goal.
    There’s a big final project, for which students may choose programming or data analysis as the focus.
  • Adar assigns two big projects: one visual explanation a la Bret Victor, and one group project. [The 1st seems clearly programming-focused; is the 2nd a data analysis?]
    Students watch videos in advance (flipped classroom). Then each lecture session has short in-class group exercises, and ends with a long in-class project (structured design exercise). The long one is based on a specific paper / solution he’s found but not shown them yet; their HW for next class is to read that paper and comment on its solution, after having had to try their own solution first. This gives students more appreciation for the paper’s proposed response to the challenge; otherwise they’d give a shallower critique without understanding the necessary tradeoffs.
  • Adar says interactivity is one of the key things he wants them to learn (in Information Visualization). So while static graphics are useful practice, they’re not the ultimate goal.
    Hearst instead treats static graphs as a very important baseline to understand before interaction.
  • Student audience (in terms of “What students will take my class?”) affects the course design.
    When students come in with different skillsets (programming vs user design vs subject-matter knowledge), it’s nice to mix them up in groups to get interesting projects. Diverse subject-matter backgrounds also bring in interesting new datasets. So, there are benefits to having a mix of people in the same room, if there’s only one dataviz course on campus.
    But there are also benefits to having more targeted courses within different departments, able to dig deeper as needed. “Let a thousand flowers bloom”…
    Either way, a broad campus-wide course for undergrads (teaching statistical & visual literacy and design thinking) would be really useful at many schools.
    Also, it’d be good to figure out modules that can be taught within different departments/courses. Visual presentation should be a cross-cutting skill across many settings.

Teaching software:

  • Munzner often teaches advanced CS students, so she feels OK saying, “You go learn D3 on your own, I’ll teach the rest.” [Maybe I could treat R the same way? I should be very up-front about it!]
  • Adar finds so many people demanding D3 that he has to teach it explicitly in lab sessions, so they can have an interactive viz for their online portfolio.
  • Kosara says that Tableau lets you get started thinking about the data sooner, without bogging down in the programming first (as in R or D3 or others). You can even do data aggregation and modeling in Tableau, then export the resulting CSV for use in a more-polished D3 final product.
    Tableau also lets you play around a lot more, a lot faster, which is both good & bad: it’s bad if you get stuck playing “just making pictures” instead of finding insights.

Finally, it’s funny to hear HCI folks talking about “this Data Science enthusiasm” as a mix of threat and opportunity. It’s not just the stats, ML, and database folks who worry, apparently!
Hearst worries that “the humans are getting lost a bit”; the panelists want “to make things understandable to people, and take people into account.” I think this means: teach design thinking and consider the audience, rather than just running automated analyses. Amen to that.