UX metrics: why not measure aesthetics?

Extending proven UX metrics to account for the aesthetic-usability effect.

Designers conduct user research, build prototypes, and run usability tests in the hope of creating products that provide a great user experience. But how do we know whether we are actually improving the product?

As renowned management theorist Peter Drucker put it:

“If you cannot measure it, you cannot improve it.”

So … we measure, in the hope of continually improving UX. We use a variety of mechanisms to determine if the prototypes and products we create provide more value than their previous versions or those of competitors. But are we measuring the right characteristics?

System Usability Scale

One tried and true UX measurement tool that many readers are likely familiar with is the System Usability Scale (SUS). SUS has been around since the 1980s and is considered a reliable indicator of a user’s subjective assessment of a product’s usability. Unfortunately, it has drawbacks.

SUS asks users to respond to only 10 statements. That may not seem like a long list, but it can lead users to abandon the survey or to answer without considering each statement carefully, which produces inaccurate answers that can skew your results. The risk is greatest when SUS is appended (often at the end) to a larger survey, usability test, or user study, where respondents may already be experiencing decision fatigue.

In addition, although the specific wording of SUS statements varies from version to version, they often alternate between positive and negative tone. For example:

  • I think I would like to use this system often. (Positive tone)
  • I found the system overly complex. (Negative tone)

This leads to several troublesome side effects. First, research has shown that mixing tones in SUS provides no practical benefit. Second, a user who does not read carefully may assume that all statements share the same tone (usually positive) and misreport their agreement or disagreement. Finally, the alternation of positive and negative wording somewhat complicates the calculation of the overall SUS score.
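To illustrate that last point, here is a minimal Python sketch of the commonly used SUS scoring procedure. It assumes the standard layout (odd-numbered statements positively worded, even-numbered statements negatively worded, each rated from 1 to 5); the function name `sus_score` is only illustrative.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for ten ratings on a 1-5 agreement scale.

    Assumes the standard SUS ordering: statements 1, 3, 5, 7, 9 are
    positively worded and statements 2, 4, 6, 8, 10 are negatively worded.
    """
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 responses")
    if any(rating < 1 or rating > 5 for rating in responses):
        raise ValueError("each response must be between 1 and 5")

    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Positively worded statement: higher agreement is better.
            total += rating - 1
        else:
            # Negatively worded statement: reverse the scale.
            total += 5 - rating

    # Each statement contributes 0-4 points; scale the 0-40 sum to 0-100.
    return total * 2.5


# A respondent who answers 3 (neutral) to every statement lands mid-scale.
print(sus_score([3] * 10))  # 50.0
```

The reversal step for the negatively worded statements is the extra complication that a single-tone questionnaire would not need.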

In short, SUS is a great tool, but it is not without its drawbacks.

Is there an easier way?

UMUX-Lite says: “YES!” It shortens the 10 SUS statements to two:

  • [This system's] capabilities meet my requirements.
  • [This system] is easy to use.

The first statement measures perceived utility: does it add value for me? Is it helpful to me? The second measures usability: can I figure it out? Does it make sense to me?

Fantastic! We can measure UX with just two answers. But wait … what about aesthetics?

There is compelling evidence that product aesthetics affect the perception of user experience. So why not measure it?

Conduct a UUA survey

To account for the impact of aesthetics on the overall UX rating, I extended UMUX-Lite with a third statement. The result is the UUA survey (Utility, Usability, Aesthetics).

UUA survey

  • [This system's] capabilities meet my requirements.
  • [This system] is easy to use.
  • [This system] is aesthetically pleasing and acceptable.

The aesthetics statement focuses on the product's appearance. It asks respondents to judge whether the product appeals to them and whether its style fits the context. What counts as aesthetically appealing and acceptable will obviously differ greatly between a financial system and a children's game.

All statements are rated on a scale of 0 to 5, where 0 means strongly disagree and 5 means strongly agree. This gives a clear indication of how users perceive the product's utility, usability, and aesthetics. Each metric can provide useful insight on its own, but how do they fit together into an overall user experience score?

Determining the overall score

We could calculate the overall UX rating by simply averaging the three metrics, but are utility, usability, and aesthetics really equally important?

If a product lacks usefulness (i.e., I don’t use it), usability and aesthetics don’t really matter. For example, if I need to hammer in a nail, I’ll use an ugly hammer, not the world’s best screwdriver.

Likewise, utility is more important than usability. Fred Davis (creator of the Technology Acceptance Model) showed that utility is 1.5 times as important as ease of use as a predictor of actual product use.

To account for this imbalance among the three metrics, the overall UX rating is calculated by weighting each individual result: utility counts 3x, usability 2x, and aesthetics 1x. The 3:2 ratio between utility and usability preserves the factor of 1.5 noted above.

This is expressed as an equation (resulting in scores of 0 to 5):

Total = ((Utility * 3) + (Usability * 2) + Aesthetics) / 6
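As a quick sanity check, the weighted formula can be sketched in a few lines of Python. The function name `uua_overall` and the sample ratings are hypothetical and used only for illustration.

```python
def uua_overall(utility, usability, aesthetics):
    """Return the weighted overall UX score on the same 0-5 scale.

    Weights follow the equation above: utility 3x, usability 2x,
    aesthetics 1x, divided by the sum of the weights (6).
    """
    for rating in (utility, usability, aesthetics):
        if rating < 0 or rating > 5:
            raise ValueError("ratings must be between 0 and 5")
    return (utility * 3 + usability * 2 + aesthetics) / 6


# Hypothetical respondent: useful (4), average to use (3), beautiful (5).
print(round(uua_overall(4, 3, 5), 2))  # 3.83
```

A plain average of the same three ratings would be 4.0, so the weighting pulls the overall score toward utility and usability, as intended.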

A pyramid of utility, usability and aesthetics

Targets

For individual responses, a rating of 2 or lower is considered negative and a rating of 3 or higher positive. However, only ratings of 4 and 5 are considered design targets.

When looking at average scores, whether for individual statements or for the calculated overall UX score, the following ranges apply:

  • Poor (2 or lower): Any score in this range calls for further study to identify and address the root cause.
  • Needs work (2.1 to 3.4): Not negative, but not very positive either.
  • Good (3.5 to 4.4): A score of 3.5 or higher is in line with the results you want to see.
  • Excellent (4.5 to 5): A score of 4.5 or higher indicates exceptional user satisfaction.
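The ranges above can be applied to an averaged score with a small Python helper. This is only a sketch: the band labels are shorthand for the four ranges, the sample scores are hypothetical, and because the listed ranges leave small gaps (for example between 2 and 2.1), the code assumes that anything above 2 and below 3.5 falls into the second band.

```python
from statistics import mean


def uua_band(score):
    """Map an averaged 0-5 score to one of the four target ranges.

    Assumption: boundaries are treated as continuous, so values that fall
    in the gaps between the listed ranges (e.g. 2.05) go to the higher
    neighbouring band.
    """
    if score <= 2:
        return "Poor"
    if score < 3.5:
        return "Needs work"
    if score < 4.5:
        return "Good"
    return "Excellent"


# Hypothetical overall scores from five respondents.
overall_scores = [4.0, 3.5, 4.5, 5.0, 3.0]
average = mean(overall_scores)
print(average, uua_band(average))  # 4.0 Good
```

The same helper works for the average of a single statement (utility, usability, or aesthetics) as well as for the overall score.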

This graph illustrates targets averaged across a group of respondents.

Target Ranges for UUA Means

Aesthetics are important, but should not be overestimated

UMUX-Lite completely ignores the aesthetic dimension, which (given the aesthetic-usability effect) can be problematic. However, beauty is not everything. We need products to be useful and easy to use, perhaps even more than we need them to look stunning.

The UUA survey strives to properly balance the impact of aesthetics, usability, and utility on user experience measurements. It does so in a short, simple format that users are more likely to complete.

I want to thank Didier Chincholle, Jeff Patton, and Catherine Chiodo for sharing unique ideas that greatly influenced my understanding of this topic.
