My philosophy of visualizing data is to keep it simple: the audience should be able to look at a graph and know within a few moments what it is showing. The key is both accuracy and speed of interpretation, so we need to be careful not to distort the data, because a distorted graph is one the audience will read quickly and get wrong. An easy way to distort the data is to change the range of an axis, which can lead the audience to believe there are extreme changes in the data when it is actually stable overall.
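To make the axis-range point concrete, here is a minimal sketch in base R graphics. The series, the seed, and the ylim values are all made up for illustration; the only point is that the same nearly flat data looks volatile on a tight axis and stable on a wider one.

```r
# Hypothetical, nearly flat series: values hover around 100 (assumed data).
set.seed(1)
x <- 1:20
y <- 100 + rnorm(20, mean = 0, sd = 0.5)

# Two panels side by side; keep the old settings so they can be restored.
old_par <- par(mfrow = c(1, 2))

# Left: the default axis hugs the data, so small wiggles look dramatic.
plot(x, y, type = "l", main = "Default (tight) range")

# Right: an explicitly chosen, wider range shows how stable the series is.
plot(x, y, type = "l", ylim = c(0, 120), main = "Range starting at 0")

par(old_par)
```

Which range is "right" depends on the data and the question being asked; the sketch only shows how much the choice changes the impression.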
I recently read an article by Petra Isenberg, published in IEEE, reporting a study of the error and time involved in reading different types of graphs, including a superimposed graph, which is similar to a graph with dual y-axes. The study also looked at several ways of changing the scale of the x-axis, called lens and bifocal graphs. Of all the graphs studied, the superimposed graph performed the worst and had the lowest subjective ranking. A graph with two y-axes is similar to the superimposed graph because the two y-axes can have different scales, so the magnitude of changes in the data may look more dramatic than it is.
As an example, I created a data set in R where the x variable is an integer and the two y variables are drawn at random from normal distributions, one with a mean of 2000 and a standard deviation of 500, the other with a mean of 0.5 and a standard deviation of 0.2. I created a plot with both axes on the same graph:
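The original figure is not reproduced here, so the following is only a sketch of how a comparable plot could be drawn. It uses base R graphics (overlaying a second plot with par(new = TRUE)), which is not necessarily how the original was produced. The seed, the number of points (50), and the helper name draw_dual_axis are assumptions; only the means and standard deviations come from the description above.

```r
# Simulated data matching the description; seed and n are assumed values.
set.seed(42)
n  <- 50
x  <- seq_len(n)
y1 <- rnorm(n, mean = 2000, sd = 500)
y2 <- rnorm(n, mean = 0.5,  sd = 0.2)

# Hypothetical helper: draws y1 on the left axis and y2 on the right axis.
draw_dual_axis <- function(x, y1, y2) {
  # Widen the right margin to make room for the second axis label.
  old_par <- par(mar = c(5, 4, 4, 4) + 0.1)
  on.exit(par(old_par))

  # First series with its own left-hand axis.
  plot(x, y1, type = "l", col = "blue", xlab = "x", ylab = "y1")

  # Overlay the second series on the same region, without axes.
  par(new = TRUE)
  plot(x, y2, type = "l", col = "red", axes = FALSE, xlab = "", ylab = "")

  # Add the right-hand axis and label for the second series.
  axis(side = 4)
  mtext("y2", side = 4, line = 3)
  invisible(NULL)
}

draw_dual_axis(x, y1, y2)
```

Because each series is scaled to fill the plotting region independently, the two lines end up looking comparable in size even though one is in the thousands and the other is below one.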
In this example we have two variables plotted against a common x. The two series appear similar, but their scales are very different. A problem with this technique is that putting both series on the same plot can distort the data, which could ultimately lead to misinterpretation. The technique may have potential for showing data, but the possibility of distorting it is high. ggplot2 does not support independent dual axes of this kind (its sec_axis() only allows a secondary axis that is a one-to-one transformation of the primary), so the plot needs to be created using the grid package that ships with base R.
An alternative solution is to stack the graphs on top of each other, which allows each variable to be shown with an appropriate range on its own y-axis.
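A stacked version could be sketched as follows, again in base R graphics with par(mfrow = ...). The data are regenerated with the same assumed seed and size (50 points) so the block runs on its own. Starting the y1 axis at zero is my assumption about what an "appropriate range" might be, and the helper name draw_stacked is hypothetical.

```r
# Simulated data matching the description; seed and n are assumed values.
set.seed(42)
n  <- 50
x  <- seq_len(n)
y1 <- rnorm(n, mean = 2000, sd = 500)
y2 <- rnorm(n, mean = 0.5,  sd = 0.2)

# Hypothetical helper: one panel per series, stacked vertically.
draw_stacked <- function(x, y1, y2) {
  # Two rows, one column, with slightly tighter margins than the default.
  old_par <- par(mfrow = c(2, 1), mar = c(4, 4, 2, 1))
  on.exit(par(old_par))

  # Top panel: y1 with an axis starting at zero (an assumed choice).
  plot(x, y1, type = "l", col = "blue", xlab = "", ylab = "y1",
       ylim = c(0, max(y1)))

  # Bottom panel: y2 with the default range chosen by plot().
  plot(x, y2, type = "l", col = "red", xlab = "x", ylab = "y2")
  invisible(NULL)
}

draw_stacked(x, y1, y2)
```

Each panel keeps its own axis, so the reader can judge the size of the changes in each series separately instead of comparing two lines drawn on unrelated scales.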
I feel that this is a good alternative to the first graph because you can see that the changes in y1 (blue) are not as extreme as they look in the first graph, while y2 (red) looks similar to the first. The difference between the two graphs could lead to two different conclusions about the data. Some may argue that this graph is difficult to read because of the way the two graphs are stacked.
I personally would try to avoid the superimposed graph if possible, but of course a situation may arise where it is appropriate. It is a good idea to explore other plotting options before using a superimposed graph. Stacking the graphs may be a good solution, but you will have to decide what is best for your business needs.