Category Archives: R

audiolyzR: Data sonification with R

In his talk “Give Your Data A Listen” at last summer’s useR! 2012 conference, Eric Stone presented joint work with Jesse Garrison on audiolyzR, an R package for “data sonification.” I thought this was a nifty and well-executed idea. Since I haven’t seen Eric and Jesse post any demos online yet, I’d like to share a summary and video clip here, so that I can point to them whenever I describe audiolyzR to other folks.


In August I invited Eric to my workplace to speak, and he gave us a great talk including demos of features added since the useR session. Here’s the post-event summary:

Eric Stone, a PhD student at Temple University, presented his co-authored work with Jesse Garrison on “data sonification”: using sound (other than speech) to visualize a dataset.
Eric demonstrated audiolizations of scatterplots and histograms using the statistical software R and the audio toolkit Max/MSP, as well as his ongoing research on time-series line plots. The software shows a visual display of the data and then plays an audio version, with the x-axis mapped to time and the y-axis to pitch. For instance, a positively-correlated scatterplot sounds like rising scales or arpeggios. Other variables are represented by timbre, volume, etc. to distinguish them. The analyst can also tweak the tempo and other settings while listening to the data repeatedly to help outliers stand out more clearly. A few training examples helped the audience to learn how to listen to these audiolizations and identify these outliers.
Eric believes that, even if the audiolization itself is no clearer than a visual plot, activating multiple cortices in the brain makes the analyst more attuned to the data. As a musician since childhood, he succeeded in making the results sound pleasant so that they do not wear out the listener.
The software will soon be released as an R package and linked to RExcel to expand its reach to Excel users. Future work includes: 1) supporting more data structures and more layers of data in the same audiolization; 2) testing the software with visually impaired users as a tool for accessibility; and 3) developing ways to embed the audiolizations into a website.
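To make the mapping in the summary concrete (x-axis to time, y-axis to pitch), here is a minimal base-R sketch. It is not audiolyzR's code or API: the function `sonify_xy`, its arguments, and the MIDI note range are all my own assumptions for illustration, and it only computes when each note would play and at what frequency, without producing any sound.

```r
# Hypothetical helper, NOT part of audiolyzR: turn a scatterplot into a
# note schedule, with x mapped to onset time and y mapped to pitch.
sonify_xy <- function(x, y, duration = 5, midi_range = c(48, 84)) {
  stopifnot(length(x) == length(y), length(x) > 1)

  # Linearly rescale a vector onto the interval given by `to`
  rescale <- function(v, to) {
    r <- range(v)
    if (r[1] == r[2]) return(rep(mean(to), length(v)))
    to[1] + (v - r[1]) / (r[2] - r[1]) * (to[2] - to[1])
  }

  ord   <- order(x)                            # play points left to right
  onset <- rescale(x[ord], c(0, duration))     # seconds from the start
  midi  <- round(rescale(y[ord], midi_range))  # nearest MIDI note number

  data.frame(
    onset   = onset,
    midi    = midi,
    freq_hz = 440 * 2^((midi - 69) / 12)       # MIDI note 69 is A4 = 440 Hz
  )
}

# A positively correlated scatterplot: pitch tends to rise over time
set.seed(1)
x <- runif(20)
y <- 2 * x + rnorm(20, sd = 0.2)
notes <- sonify_xy(x, y)
head(notes)
```

An outlier would show up in `notes` as a pitch far from its neighbors in time, which is the kind of thing the ear picks out quickly.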

Eric said he can imagine someone using this as part of an information dashboard, or for reviewing a zillion different data views in a row while multitasking: just set it to loop through each slice of the data while you work on something else. Your ears will alert you when you hit a data slice that’s unusual and worth investigating further.

Eric has kindly sent me a version of the package, and below I demonstrate a few examples using NHANES data:

I’ve asked Eric if there’s a public release coming anytime soon, but it may be a while:

I am nearly ready to release it, but it’s one of those situations where my advisor will come up with “just one more thing” to add, so, you know, it might be a while.. Anyway, if people are interested I can provide them with the software and everything. Just let me know if anyone is.

If you want to get in touch with Eric, his contact info is in the useR talk abstract linked at the top.

On a very loosely related note, consider also John Cook’s post on measuring evidence in decibels. Someday I’d like to re-read this after I’ve had my morning coffee and think about whether there’s any useful way to turn this metaphor into literal sonic hypothesis testing.

DC R Meetup: “Analyze US Government Survey Data with R”

I really enjoyed tonight’s DC R Meetup, presented by the prolific Anthony Damico. [Edit: adding link to the full video of Anthony's talk; review is below.]


I’ve met Anthony before to discuss whether the Census Bureau could either…

  • publish R-readable input statements for flat file public datasets (instead of only the SAS input statements we publish now); or…
  • cite his R package SAScii, which automatically processes a SAS input file and reads data directly into R (no actual SAS installation required!). Folks agree SAScii is an excellent tool and we’re working on the approvals to mention it on the relevant download pages.
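For readers who haven’t dealt with flat files: a SAS input statement is essentially a list of variable names, column widths, and types. The sketch below shows the same idea by hand in base R with `read.fwf`. It does not use SAScii or reproduce its interface; the records, the variable names, and the `layout` table are all made up for illustration.

```r
# Made-up fixed-width records: id (3 characters), age (2), income (5)
raw_lines <- c("A013452000",
               "B022731500")

# The information a SAS input statement would carry, written out by hand
# (hypothetical layout, not taken from any real survey)
layout <- data.frame(
  varname = c("id", "age", "income"),
  width   = c(3, 2, 5),
  is_char = c(TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

con <- textConnection(raw_lines)
survey <- read.fwf(
  con,
  widths     = layout$width,
  col.names  = layout$varname,
  colClasses = ifelse(layout$is_char, "character", "numeric")
)
close(con)

str(survey)
```

Writing out `layout` by hand for hundreds of variables is exactly the tedium that a tool which parses the SAS statement for you removes.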

Meanwhile, Anthony’s not just waiting around. He’s put together an awesome blog, asdfree.com (“Analyze Survey Data for Free”), where he posts complete R instructions for finding, downloading, importing, and analyzing each of several publicly-available US government survey datasets. These include, in his words, “obsessively commented” R scripts that make it easy to follow his logic and understand the analysis examples. Of course, “My syntax does not excuse you from reading the technical documentation,” but the blog posts point you to the key features of the tech docs. For each dataset on the blog, he also makes sure to replicate a set of official estimates from that survey, so you can be confident that R is producing the same results that it should.

Animated map of 2012 US election campaigning, with R and ffmpeg

(Video link here, in case the embedded player doesn’t work for you.)

Idea: see if I can mimic the approach behind Ben Schmidt’s lovely video of ocean shipping routes, and apply it to another dataset. But which?
“Hmm… what’s another interesting dataset about some competitors traveling around a mostly-fixed area at the same time?… Hey friends, stop giving me election news, I need to think of an idea… Oh.”

JavaScript and D3 for R users, part 2: running off the R server instead of Python

Thank you all for the positive responses to Basics of JavaScript and D3 for R Users! Quick update: last time we had to dabble in a tiny bit of Python to start a local server, in order to actually run JavaScript and D3 examples on our home computer… However, commenter Shankar had the great idea of using the R server instead. He provided some example code, but reported that it didn’t work with all the examples.

Here’s my alternative code, which works with all the D3 examples I’ve tried so far. Unlike Shankar’s approach with lower-level functions, I found it simpler to use Jeffrey Horner’s excellent Rook package.

# Load the Rook library
library(Rook)

# Where is your d3 directory located?
myD3dir <- 'C:/Downloads'

# Start the server
s <- Rhttpd$new()
s$start(quiet = TRUE)

# To view a different D3 example,
# change the directory and .html file names below
# and rerun s$add() and s$browse()
s$add(
  app = Builder$new(
    Static$new(
      # List all the subdirectories that contain
      # any files it will need to access (.js, .css, .html, etc)
      urls = c('/d3', '/d3/examples', '/d3/examples/choropleth'),
      root = myD3dir
    ),
    Redirect$new('/d3/examples/choropleth/choropleth.html')
  ),
  name = 'd3'
)
s$browse(2)
# browse(1) would load the default RookTest app instead

# When you're done,
# clean up by stopping and removing the server
s$stop()
s$remove(all = TRUE)
rm(s)

If I understand the Rook documentation correctly, you just can’t browse directories using R’s local server. So you’ll have to type in the exact directory and HTML file for each example separately. But otherwise, this should be a simple way to play with D3 for anyone who’d rather stick within R instead of installing Python.

Basics of JavaScript and D3 for R Users

Hadley Wickham, creator of the ggplot2 R package, has been learning JavaScript and its D3 library for the next iteration of ggplot2 (tentatively titled r2d3?)… so I suspect it’s only a matter of time before he pulls the rest of the R community along.

Below are a few things that weren’t obvious when I first tried reading JavaScript code and the D3 library in particular. (Please comment if you notice any errors.) Then there’s also a quick walkthrough for getting D3 examples running locally on your computer, and finally a list of other tutorials & resources. In a future post, we’ll explore one of the D3 examples and practice tweaking it.

Perhaps these short notes will help other R users get started more quickly than I did. Even if you’re a ways away from writing complex JavaScript from scratch, it can still be useful to take one of the plentiful D3 examples and modify it for your own purposes.

Carl Morris Symposium on Large-Scale Data Inference (2/3)

Continuing the summary of last week’s symposium on statistics and data visualization (see part 1 and part 3)… Here I describe Dianne Cook’s discussion of visual inference, and Rob Kass’ talk on statistics in cognitive neuroscience.

[Edit: I've added a few more related links throughout the post.]


Commandeering a map from PDF or EPS, using Inkscape and R

I love Nathan Yau’s tutorial on making choropleths from an SVG file. However, if you don’t have an SVG handy already and instead you want to repurpose a map from another vector format such as PDF or EPS, there are a few extra steps that can be done in the free tool Inkscape. And while I’m at it, how could I turn down the opportunity to replicate Nathan’s Python approach in R instead?

The following was inspired by the 300-page Census Atlas of the United States, full of beautiful maps of 2000 decennial census data. I particularly liked the small multiples of state maps, which were highly generalized (i.e. the fine detail was smoothed out) but still recognizable, and DC was enlarged to be big enough to see.

I have wanted a map like this for my own purposes, when mapping a variable for all 50 states and DC. Unfortunately, I haven’t been able to track down any colleagues who know where to find the original shapefiles for this map. Fortunately, several images from the Census Atlas are available in EPS format near the bottom of this page, under “PostScript Map Files.” With access to such vector graphics, we can get started.


Making R graphics legible in presentation slides

I only visited a few JSM sessions today, as I’ve been focused on preparing for my own talk tomorrow morning. However, I went to several talks in a row which all had a common problem that made me cringe: graphics where the fonts (titles, axes, labels) are too small to read.

You used R's default settings when putting this graph in your slides? Too bad I won't be able to read it from anywhere but the front of the room.

Dear colleagues: if we’re going to the effort of analyzing our data carefully, and creating a lovely graph in R or otherwise to convey our results in a slideshow, let’s PLEASE save our graphs so that the text is legible on the slides! If the audience has to strain to read your graphics, they are no easier to digest than a slide with dense equations or massive tables of numbers.

For those of us working in R, here are some very quick suggestions that would help me focus on the content of your graphics, not on how hard I’m squinting to read them.
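As one illustration of the general idea (a sketch of my own, not necessarily the exact settings recommended in the full post), you can raise the device’s base font size and the text scaling when saving a plot for slides. The file name, dimensions, and `cex` values below are arbitrary assumptions; `cars` is a dataset that ships with R.

```r
# Save to a temporary file; in practice, point this at your slides folder
out_file <- file.path(tempdir(), "slide_plot.pdf")

# A 4:3 page with a base font size well above the default of 12 points
pdf(out_file, width = 10, height = 7.5, pointsize = 20)

# Scale up labels, axis annotations, and the title a bit further,
# and widen the left margin so the larger y-axis label still fits
par(mar = c(5, 5, 4, 1), cex.lab = 1.2, cex.axis = 1.1, cex.main = 1.3, lwd = 2)

plot(cars$speed, cars$dist, pch = 19,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)",
     main = "Stopping distance vs. speed")

dev.off()
```

A quick check before the talk: shrink the saved figure on your screen to a few inches wide. If you can still read the axis labels, the back row probably can too.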


useR 2012: main conference braindump

I knew R was versatile, but DANG, people do a lot with it:

… I don’t think anyone actually believes that R is designed to make *everyone* happy. For me, R does about 99% of the things I need to do, but sadly, when I need to order a pizza, I still have to pick up the telephone. —Roger Peng

There are several chains of pizzerias in the U.S. that provide for Internet-based ordering (e.g. www.papajohnsonline.com) so, with the Internet modules in R, it’s only a matter of time before you will have a pizza-ordering function available. —Doug Bates

Indeed, the GraphApp toolkit … provides one (for use in Sydney, Australia, we presume as that is where the GraphApp author hails from). —Brian Ripley

So, heads up: the following post is super long, given how much R was covered at the conference. Much of this is a “notes-to-self” braindump of topics I’d like to follow up on further. I’m writing up the invited talks, the presentation and poster sessions, and a few other notes. The conference program has links to all the abstracts, and the main website should collect most of the slides eventually.


Maps of changes in area boundaries, with R

Today a coworker needed some maps showing boundary changes. I used what I learned last week in the useR 2012 geospatial data course to make a few simple maps in R, overlaid on OpenStreetMap tiles. I’m posting my maps and my R code in case others find them useful.

A change in Census block-groups from 2000 to 2010, in Mobile, AL
