4-Point Plan for Effective Data Visualization

Designing Charts that Help Make Informed Decisions

Pratik Hegde
Dec 21, 2020

Big data analytics is being adopted rapidly across industries, and making sense of huge volumes of data in time is becoming more challenging than ever. Many companies depend on data visualization to get instant, actionable insights from the underlying data and to access complex data sets in a user-friendly way that opens up opportunities for improvement.

Data Visualization

Data visualization is the process of representing data in a meaningful and intuitive fashion. The goal is simple: enable users to comprehend information easily so that their decisions are not only faster but also better informed.

Great visualization consists of a clear message followed by the most appropriate chart

The two most important aspects of any data visualization are a Clear Message and an Appropriate Chart. Simply put, a Clear Message means defining one unambiguous, fundamental message that the visualization should communicate. An Appropriate Chart means the chosen form is meaningful and useful. Coming up with these is not rocket science that only a chosen few can do. It requires answering some very basic questions about the underlying data to be visualized.

Tip: Never overdo visualizations. Hit the sweet spot where they are catchy, informative, and easy to navigate.

As important as it is to use data visualization, it is even more important to use it wisely. Before we deal with designing proper charts, let us understand some common pitfalls that cause visualizations to fail.

1. Bad data

It goes without saying that bad data leads to bad visualization. Common examples of bad data include uncleaned data, duplicated data, missing data, unmarked NA values, and so on. It is good practice to track data lineage and ensure that the data to be visualized yields the desired insights.

The chart below is an example of duplicate data. A pie or donut chart should add up to 100%, but in this case the total comes out to 130.8%. Insights derived from such a chart cannot support a sound decision.
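
A quick sanity check before plotting can catch this kind of problem. The sketch below is only an illustration: the function name `audit_pie_rows`, the tolerance, and the category values are hypothetical (the values were chosen so that the total matches the 130.8% mentioned above) and are not read from the original chart.

```python
from collections import Counter
import math


def audit_pie_rows(rows, tolerance=0.5):
    """Check (label, percent) rows before they are drawn as a pie or donut chart."""
    counts = Counter(label for label, _ in rows)
    # Labels that appear more than once are likely duplicated records.
    duplicates = sorted(label for label, n in counts.items() if n > 1)
    total = math.fsum(percent for _, percent in rows)
    return {
        "total": round(total, 1),
        "duplicates": duplicates,
        "sums_to_100": abs(total - 100.0) <= tolerance,
    }


# Hypothetical rows: "Category C" was loaded twice, which inflates the total.
rows = [
    ("Category A", 40.0),
    ("Category B", 29.2),
    ("Category C", 30.8),
    ("Category C", 30.8),
]

report = audit_pie_rows(rows)
print(report)
```

If `sums_to_100` is false or `duplicates` is not empty, the data should be cleaned before any chart is drawn from it.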

2. Choosing inappropriate charts

This is the second biggest reason visualizations fail. A wrong choice of chart fails to communicate the objective of plotting the data. It also increases the time needed to derive insights, because of confusion between the title and what is depicted.

Consider the pie chart below, plotted to compare ‘Aggregate Hospital Costs’ and ‘Hospital Stays’ by Payer. The visualization lets the user read the cost distribution and roughly compare which setting each Payer had to cover more. However, a pie chart is inappropriate here because the primary objective, comparing costs by Payer, is not communicated effectively. Multiple shades of green and blue depict the costs covered by different Payers, so the same Payer appears in different colors within the chart. This makes it difficult for the end user to compare the costs covered by individual Payers.

Instead, a simple clustered bar chart with only two colors, one for Aggregate Hospital Costs and another for Hospital Stays, mapped against Payers on the X-axis would have made the chart more consumable for the end user.

3. Incorrect proportions

Effective data visualization makes it simpler and faster for the reader to make sense of the data. Plotting data with incorrect proportions forces the user to spend additional effort: first ignoring the visual representation, and then comparing categories by reading just the numbers. In short, neglecting proportions in a visualization leads the user to misinterpret the underlying data.

The chart snippet below compares the care gaps identified across Heart Failure, Diabetes and Cancer.

Skimming through the chart, a user would interpret it as follows:

  • All three diseases have an equal total number of gaps
  • GIC gaps are highest for diabetes
  • HCC gaps in cancer are almost twice those in diabetes and heart failure

But is this reading correct? For a moment, ignore the chart and pay attention to the numbers. First, heart failure has the most gaps, almost twice as many as cancer. Cancer has 10 HCC gaps while heart failure has 9 and diabetes has 6, which is nowhere near our inference. The scheduling gaps of 8, 3, and 1 paint a completely different picture. Simply put, the incorrect proportions in the chart push the user toward erroneous interpretations of simple data.
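
To make the point concrete, here is a minimal sketch of what proportional drawing means, using the HCC gap counts quoted above (heart failure 9, diabetes 6, cancer 10). The function name `scaled_lengths` and the 40-character bar width are assumptions made for this illustration.

```python
def scaled_lengths(values, max_width=40):
    """Return bar lengths proportional to the values, with the largest value at max_width."""
    peak = max(values.values())
    if peak <= 0:
        return {label: 0 for label in values}
    return {label: round(value * max_width / peak) for label, value in values.items()}


# HCC gap counts as quoted in the text.
hcc_gaps = {"Heart failure": 9, "Diabetes": 6, "Cancer": 10}

lengths = scaled_lengths(hcc_gaps)
for label, length in lengths.items():
    # Each bar is drawn from a zero baseline, so its length mirrors its value.
    print(f"{label:<14}{'#' * length} {hcc_gaps[label]}")
```

Drawn this way, the cancer bar is only slightly longer than the heart failure bar, which is what the numbers say and not what the original chart suggests.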

4. Not following conventions

Sometimes project stakeholders go so far in being creative that they lose sight of the usability of the visualization. Users have a mental model of how graphs are read, and deviating from it risks the information being read wrongly. In addition, every new type of chart that is introduced brings a learning curve for the user reading it.

A golden rule of thumb is to stick to a minimal number of chart types when designing for a user group.

5. Misleading axes

Data visualizations are designed to tell stories. A very subtle choice, such as changing the range of the axes, can have a huge impact by misleading the user into reading a completely different story.

In the chart below, the quality compliance of Commercial health insurance is perceived to be almost 2 times that of Medicaid and more than 4 times that of Medicare. However, the numbers show a difference of just 5% and 10% respectively. What led the user to such a bad interpretation? With the Y axis missing, there is no proper baseline, so users do not realize that the bars do not start from 0. Adding the Y axis provides a clear representation of the data (chart on the right).
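
The size of this distortion can be estimated with simple arithmetic. The sketch below is an illustration under stated assumptions: the compliance rates (88, 83, 78) and the truncated baseline (75) are hypothetical values, chosen only so that the gaps are 5 and 10 points as described above. They are not read from the original chart.

```python
def visual_ratio(a, b, baseline=0.0):
    """Ratio of the drawn heights of two bars when the axis starts at `baseline`."""
    if min(a, b) <= baseline:
        raise ValueError("both values must be above the baseline")
    return (a - baseline) / (b - baseline)


# Hypothetical compliance rates in percent; only the 5 and 10 point gaps come from the text.
rates = {"Commercial": 88.0, "Medicaid": 83.0, "Medicare": 78.0}
truncated_baseline = 75.0  # hypothetical axis start

for other in ("Medicaid", "Medicare"):
    drawn = visual_ratio(rates["Commercial"], rates[other], truncated_baseline)
    actual = visual_ratio(rates["Commercial"], rates[other])
    print(f"Commercial vs {other}: drawn {drawn:.2f}x, actual {actual:.2f}x")
```

With these assumed numbers the truncated axis makes Commercial look about 1.6 times Medicaid and more than 4 times Medicare, while the actual ratios stay close to 1.1.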

Another example shows how removing an axis jeopardizes the chart and misleads users into drawing incorrect conclusions. The line for Cancer Screening & Prevention services and the line for Abortions are drawn arbitrarily. If we include the Y axis, the visualization changes drastically: the slopes change and the lines no longer overlap each other.

Lines drawn arbitrarily due to missing Y axis. The chart is misleading.

Having seen the major pitfalls that lead to data visualization failures, it is important to ask whether a graph has been designed to tell a story that accurately reflects the underlying data, or a story that aligns with what the designer would like us to believe.

4-Point Plan for Effective Data Visualization

1. Defining clear message for the data to be visualized

Defining a clear message starts with a question: “What is the purpose of this chart?” We need to understand the data that needs to be visualized before we can decide how to represent it. Understanding the objective of the KPI from a provider's perspective guides us toward a visualization that communicates a clear message, the purpose for which the chart has been designed.

“Data should drive the type of chart we use and not the other way round”

However, it can sometimes be challenging to define the purpose of a chart. To help with that, here is a list of the most common reasons for which charts are created.

  • Comparison: For example, as a Medical Director, I want to compare the operation cost of different procedures performed at my facility.
  • Distribution: For example, as a Payer, I want to understand the distribution of my diabetic population across age groups so that I can run relevant awareness programs.
  • Composition: For example, as an IT admin working for a healthcare organization, I want to know the different types of data quality issues that occurred in 2018.
  • Relationship: For example, as a Medical Director, I want to understand the relationship between average length of stay (ALOS) and readmission rate.
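
The four purposes above can serve as a lookup key when shortlisting chart types. The sketch below is a hypothetical helper: the mapping reflects widely used charting conventions and is an assumption for illustration, not a table taken from this article.

```python
# Assumed mapping based on common charting conventions, not taken from the article.
CHARTS_BY_PURPOSE = {
    "comparison": ["bar chart", "clustered bar chart"],
    "distribution": ["histogram", "box plot"],
    "composition": ["pie chart", "donut chart", "stacked bar chart"],
    "relationship": ["scatter plot", "bubble chart"],
}


def suggest_charts(purpose):
    """Return candidate chart types for one of the four purposes listed above."""
    key = purpose.strip().lower()
    if key not in CHARTS_BY_PURPOSE:
        raise ValueError(f"unknown purpose: {purpose!r}")
    # Return a copy so callers cannot modify the shared mapping.
    return list(CHARTS_BY_PURPOSE[key])


# Example: ALOS versus readmission rate is a relationship question.
print(suggest_charts("Relationship"))
```

The shortlist is only a starting point. The final choice still depends on the data and on the users, as the next steps describe.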

Tip: Try to keep only one message per chart as far as possible. If a chart has more than one purpose, try breaking it up into two or more charts, or use a combination chart.

2. Understanding the graphic literacy of the users

Before we start filling screens with high-end, fancy charts, remember that the purpose of designing a visualization is to make the underlying data easily consumable and to empower our audience to make informed decisions. To ensure this, before choosing a visualization, we must gauge how comfortable our users are with different types of graphs.

For financial analysts or stock market brokers, a box chart may be a handy choice. For physicians or nurses who need to understand the information buried in medical records, we need to minimize complexity and present the information in easily digestible ways so that they can make smarter decisions.

Tip: Avoid less familiar representations such as Sankey charts or Nightingale's rose chart. Use basic graphs like bar charts, line charts, pie or donut charts wherever possible.

3. Choosing a visualization (How)

Having done the background work of understanding the users and the purpose of creating the chart, isn’t it easy now to choose a visualization? Not at all! This is where it gets complicated and messy. Choosing a visualization takes more than good-looking charts. Sometimes the best-looking charts are the least usable ones.

Chart A

In the example above, the chart looks high-end and design portfolio-worthy. However, if we have to read it, what should we make of it? The overlapping waves and the choice of colors make it difficult to tell the story categories apart. This data can be made easily comprehensible by using a clustered bar chart.

Rectified version of Chart A

Some questions that can help us arrive at the correct chart are:

  • How does the chart work with big numbers?
  • How does the chart behave with maximal and minimal data together?
  • How does the chart behave if one of the attributes has no values?
  • How does the chart behave with change in number of categories?
  • Is the chart usable on all screen sizes?
  • Does the chart require legends?

Charts that can be used in different scenarios

Sometimes the best way to go might be to not use a chart at all and instead use a grid, a map, or a single number. It all comes down to what brings the most value to the end user, not what looks better.

4. Formatting the chosen visualization

While most of the formatting depends on the context and the business for which the charts are designed, there are some best practices that enhance the usefulness and meaningfulness of the visualization. They are:

  • Chart captions: Do not leave the captions of the charts as Y axis v/s X axis. Write meaningful captions that align with the user's objective in viewing the chart.
  • Axes labels: Ensure all the axes have meaningful labels. Include units and multipliers if applicable. For example, Revenue (in thousand dollars).
  • Values on axes: Based on the context, provide appropriate periodic or stepped values on the axes. For example, when comparing weekly performance over a quarter, label the weekly data as April 01, April 02, April 03, April 04, May 01, etc. instead of Week1, Week2, Week3, etc.
  • Legends: Include appropriate legends wherever needed. Use individual or stepped legends to assist absolute reading. Use gradient legends where relative reading is not a problem.
  • Order of data series: Chart viewers are generally skimmers. Sorting the data series reduces the motor load involved and improves the scannability of the chart. However, not all charts can be sorted. For example, charts showing quantitative data over time must keep their chronological order.
  • Highlights: Make smart use of highlights in the visualization to communicate insights effectively. For example, while comparing the sugar levels of a diabetic patient before and after a meal, highlight the part that exceeds the normal sugar level.
  • Colors: Ensure there are enough colors to represent all the data series that could potentially appear on the charts throughout the page. For highlights, use universally accepted functional color hues. Green for good and red for bad should never be reversed. It is through such standardization that we free up limited attentional resources to learn and interpret new information.
  • Remove unnecessary precision: Round off the values wherever precise data is not needed, to reduce cognitive load. For example, suppose claim denials for a provider are worth $202,000.21. This can be rounded off to 202K in the chart, and the precise value can be part of the tooltip and displayed on demand.
  • Grid lines: Provide grid lines if and only if the chart requires the differences between distinct items to stand out. This is because we visually perceive differences in the heights of components as ratios and not as absolute values.
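
The "Remove unnecessary precision" practice above can be sketched in a few lines. The function names `abbreviate` and `tooltip_value` are hypothetical, and the K/M/B suffixes are an assumed convention. The sketch shows the rounded label for the chart and the precise value for the tooltip, using the $202,000.21 example from the list.

```python
def abbreviate(value):
    """Round a number to a short chart label such as 202K or 3M."""
    suffixes = ["", "K", "M", "B"]
    scaled = float(value)
    index = 0
    # Step up one suffix at a time until the rounded number has at most three digits.
    while abs(round(scaled)) >= 1000 and index < len(suffixes) - 1:
        scaled /= 1000.0
        index += 1
    return f"{round(scaled)}{suffixes[index]}"


def tooltip_value(value):
    """Keep the full precision for the on-demand tooltip."""
    return f"${value:,.2f}"


claim_denials = 202000.21
print(abbreviate(claim_denials))     # 202K
print(tooltip_value(claim_denials))  # $202,000.21
```

Keeping the two formats separate lets the chart stay easy to scan while the exact figure remains one hover away.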

Conclusion

The usefulness of a data visualization is dictated by strong usability. As UX designers, we have a responsibility to ensure that every element we design and every interaction we introduce to charts, or to any other form of data visualization, has a purpose, adds value, and is easy to use and understand.

We’re living in a data-driven business era, and UX designers have an important role to play: making that data accessible and meaningful for end users.

If you liked reading this, take a look at

https://pratikhegde.medium.com/wireframes-vs-prototypes-what-is-the-best-design-deliverable-fd226eb95393
