Parallel Coordinates

Introduction

Parallel coordinates are a way to visualize multi-dimensional data. In the visualization, there is one axis for each dimension. Each object in the dataset represented with a polyline that connects the object's values across each axis. Parallel coordinates are meant to help users detect patterns and trends in multi-dimensional data.

Multi-dimensional data often does not have an inherent ordering of the dimensions, so the axes in parallel coordinates can have any arrangement. However, this order has a large impact on what the visualization looks like. Certain trends or patterns in the data can be easier to detect with some axis orderings than others.

The parallel coordinates plot below is for a dataset on cars. Above the visualization, there is a "Shuffle Axes" button. You can click that button to randomly arrange the axes, which will give you a sense of the impact that axis order has on the visualization.

Given this, we can try to find the order of the axes that optimizes some desired property. This process involves calculating a weight or score for each pair of axes and then finding the ordering of axes that minimizes or maximizes the sum of the weights. There are many possible scoring methods. Dasgupta and Korsara propose metrics including number of edge crossings and angle of edge crossing [2], Hurley suggests maximizing the correlation between neighboring axes or minimizing total line length [5], Ankerst et al. propose similarity based measures [1], and Peng et al. define a clutter metric based on number of outliers [6]. The axis ordering section of Heinrich and Weiskopf's "State of the Art of Parallel Coordinates" was particularly helpful starting point for this project [3].

I will focus on minimizing the number of between axis edge crossings. The intuition behind this is that having a lot of edge crossings can make the visualization difficult to read.

Edge Crossings

The first step is to calculate the number of edge crossings between each pair of axes.

Following the notation used by Dasgupta and Kosara, \((l_i, r_i)\) represents a line segment from position \(l_i\) on the left axis to position \(r_i\) on the right axis [2]. If we have two lines like shown below,

then in constant time, we know there is a line crossing if \((l_1 > l_2 \wedge r_1 < r_2) \vee (l_1 < l_2 \wedge r_1 > r_2))\). We are only concerned with intersections that occur between the axes. A brute force approach to get the total number of edge crossings between a pair of axes can therefore be done in \(O(n^2)\), where \(n\) is the number of elements in the dataset.

Dasgupta and Korsara propose another approach, which depends on the height of the visualization rather than on the size of the dataset. Parallel coordinates are visualized on screens that have a finite number of discrete pixels. If an axis is \(h\) pixels high, then \(l_i\) and \(r_i\) are in the range \([0, h)\). We can then create a two-dimensional histogram representing the lines between a pair of axes, with bins defined as follows.

\[b_{ij} = |\{ k | \lfloor l_k \rfloor = i \wedge \lfloor r_k \rfloor = j\}|\]

\(b_{ij}\) represents the number of lines that go from pixel \(i\) on the left axis to pixel \(j\) on the right axis. The authors state that this histogram can be used to calculate the number of edge crossings between the two axes:

\[\text{# crossings} = \sum_{i=0}^{h-1} \sum_{j=0}^{h-1} \sum_{k=i+1}^{h-1} \sum_{l=j+1}^{h-1} b_{ij}b_{kl}\]

However, as best I can tell, this is incorrect. Consider the case when you have two parallel lines, \((0, 0)\) and \((1, 1)\). That is, one line goes from pixel 0 on the left axis to pixel 0 on the right axis and the other goes from pixel 1 on the left axis to pixel 1 on the right axis. In this situation, \(b_{00} = 1\) and \(b_{11} = 1\). The first iteration will add \(b_{00}b_{11} = 1\) to the count, even though there is no crossing. I realized this while making the below animation. It seems the algorithm can be corrected by looping over the columns in reverse order. If we take \(\sum_{j=h-1}^{0}\) to mean that \(j\) counts down from \(h-1\) to \(0\), then we have:

\[\text{# crossings} = \sum_{i=0}^{h-1} \sum_{j=h-1}^{0} \sum_{k=i+1}^{h-1} \sum_{l=j-1}^{0} b_{ij}b_{kl}\]

This algorithm is \(O(h^4)\), though as the authors note, \(b_{ij}\) is often 0, which reduces the amount of work needed to be done, so it can be sensible when \(n >> h\).

Below is an animation to see how the algorithm works, inspired by Fig. 1 in [2]. In this example, there are two axes, A and B, that are each 5 "pixels" high. To add an edge, you first click on a pixel in one axis and then click on a pixel in the other axis. Or, you can click the "Random" button to add random edges. As you add edges, the histogram to the right will update accordingly. Clicking "Calculate Crossings" will animate the steps the algorithm takes. \(b_{ij}\) is shown in blue and \(b_{kl}\) is shown in red when there is no crossing or green when there is a crossing.

We can run our edge crossing algorithm of choice to get the number of crossings between each possible pair of axes. The table below shows the results for the cars dataset. You can click on a row of the table to visualize the edges between that pair of axes.

Axis 1	Axis 2	Crosses

Optimizing Order

Complexity

Once we know the number of edge crossings between each pair of axes, the next step is to find the optimal order of the axes that minimizes the total number of crossings. Ankerst et al. prove that this problem is NP-complete [1]. Hurley and Oldford represent this as a graph traversal problem [4]. To avoid confusion when describing this graph, I will refer to its edges as links. Consider a complete graph where the vertices represent the axes of our parallel coordinates and the weight of the links represent the number of edge crossings between the two axes. For example, say we have five axes, A-E, then \(w_{AB}\) represents the number of edge crossings between axis A and axis B. Our graph would then look like this:

Finding the order of the axes that minimizes the total number of edge crossings then becomes finding the shortest path through this graph that visits every node exactly once. That is, we want to find the shortest hamiltonian path. In [5], Hurley notes that if we add an extra vertex to this graph and connect it to every other vertex with a weight of 0, then this becomes the traveling salesman problem.

Algorithms

Given the complexity of the task, a brute force solution of trying every ordering may be too expensive. There are many algorithms that can be used to get an approximate solution.

Clustering

In [5], Hurley notes that for interactive visualization, showing the data quickly and being responsive is important, so it is better to find a good ordering quickly than a near-optimal ordering slowly. She proposes an algorithm based on clustering. This algorithm starts with each axis in its own cluster. At each step, you merge the two clusters that have the pair of available axes with the lowest weight. Each cluster is ordered, so you can only join two clusters at their end points.

In the below demo, the possible pairs are listed in ascending order of edge crossings. The clusters are represented using colors. You can use the "Next" and "Back" buttons to step through the clustering. At each step, the algorithm attempts to join the next pair of axes. This isn't always possible, such as when one of the axes is in the middle of a cluster or when the axes are already in the same cluster. After a pair is attempted, it is grayed out.

If we instead sort the axis pairs in descending order of edge crossings, then the algorithm will attempt to maximize the number of crossings. You can compare the two outcomes by toggling the radio button below.

Min Max

Potential ways to improve this project

Explain more algorithms for how to find a good axis ordering once we have the weights. For example, [2] uses a branch and bound, best first search algorithm and [1] uses an algorithm inspired by how ants find shortest paths. Other algorithms for the traveling salesman problem could be explained as well.
Go into more depth on how this problem and ones similar to it relate to graph theory. [4] goes into more depth on this. For example, if we extended the parallel coordinates visualization to show all possible pairs of axes, how could we find a good ordering?
Explore more efficient algorithms for counting the number of edge crossings.

References

Papers

Ankerst, Mihael, et al. “Similarity Clustering of Dimensions for an Enhanced Visualization of Multidimensional Data.” Proceedings IEEE Symposium on Information Visualization (Cat. No.98TB100258), 1998, doi:10.1109/infvis.1998.729559.
Dasgupta, Aritra, and Robert Kosara. “Pargnostics: Screen-Space Metrics for Parallel Coordinates.” IEEE Transactions on Visualization and Computer Graphics, vol. 16, no. 6, 2010, pp. 1017–1026., doi:10.1109/tvcg.2010.184.
Heinrich, Julian, and Daniel Weiskopf. “State of the Art of Parallel Coordinates.” Eurographics, 2013, doi:10.2312/conf/EG2013/stars/095-116.
Hurley, C. B., and R. W. Oldford. “Pairwise Display of High-Dimensional Information via Eulerian Tours and Hamiltonian Decompositions.” Journal of Computational and Graphical Statistics, vol. 19, no. 4, 2010, pp. 861–886., doi:10.1198/jcgs.2010.09136.
Hurley, Catherine B. “Clustering Visualizations of Multidimensional Data.” Journal of Computational and Graphical Statistics, vol. 13, no. 4, 2004, pp. 788–806., doi:10.1198/106186004x12425.
Peng, Wei, et al. “Clutter Reduction in Multi-Dimensional Data Visualization Using Dimension Reordering.” IEEE Symposium on Information Visualization, 2004, doi:10.1109/infvis.2004.15.

Software

The code for this project is in the following GitHub repository.
I made the interactive visualizations using D3.js. I referenced the parallel coordinates implementation by Mike Bostock, specifically for how to style the axis tick marks. My implementation of parallel coordinates follows Bostock's reusable charts pattern.
The design of this site is based on a guide by Jeremy Thomas.

Datasets

Cars