What Is the DMOS Scale?

This is a continuation from our Understanding the JND Scale White Paper.

Mean opinion score (MOS) is the most straightforward way of measuring video quality. A group of people is asked to rate a video sequence relative to a reference (full reference). The general methodology for conducting subjective tests is outlined in ITU-R BT.500. The measurement gives a numeric value on a 1-5 scale. The ITU recommends MOS (or DMOS) under ITU-T P.910. The heuristic, nominal values for MOS are listed below:
  • 4.4-5.0 – Very Satisfied
  • 4.0-4.3 – Satisfied
  • 3.0-3.9 – Some Users Satisfied
  • 2.0-2.9 – Many Users Dissatisfied
  • 1.0-1.9 – Most Users Dissatisfied
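
As a minimal sketch of how a panel's ratings turn into one of the bands above, the following Python averages a set of 1-5 ratings and looks up the nominal label. The function names and the sample ratings are hypothetical, and the sketch assumes that a value falling between two listed bands (for example 4.35) belongs to the lower band.

```python
from statistics import mean

# Bands copied from the list above as (lower bound, label), highest first.
# Assumption: a value between two listed bands falls into the lower one.
MOS_BANDS = [
    (4.4, "Very Satisfied"),
    (4.0, "Satisfied"),
    (3.0, "Some Users Satisfied"),
    (2.0, "Many Users Dissatisfied"),
    (1.0, "Most Users Dissatisfied"),
]


def mean_opinion_score(ratings):
    """Average a panel's 1-5 ratings for one video sequence."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")
    return mean(ratings)


def mos_label(mos):
    """Return the nominal satisfaction band for a MOS value."""
    for lower, label in MOS_BANDS:
        if mos >= lower:
            return label
    raise ValueError("MOS must be at least 1.0")


panel = [5, 4, 4, 3, 4, 5, 4, 4]  # hypothetical ratings from eight viewers
score = mean_opinion_score(panel)
print(score, mos_label(score))  # prints: 4.125 Satisfied
```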

We chose a more modern and complementary type of analysis that differs in philosophy from the bottom-up approaches discussed in our JND paper (referenced above). Those algorithms count the number of anomalies found in an image.

The structural similarity approach provides an alternative and complementary way to tackle the problem of video quality assessment. It is based on a top-down assumption that the human visual system (HVS) is highly adapted for extracting structural information from the scene, and therefore a measure of structural similarity should be a good approximation of perceived image quality. The idea is that the eye can recognize a shape even if part of it is missing, blurred, or blocked. It has been shown that a simple implementation of structural similarity (SSIM) outperforms state-of-the-art perceptual image quality metrics. However, the SSIM index achieves its best performance only when applied at an appropriate scale (i.e. viewer distance/screen height). Calibrating parameters such as viewing distance and picture resolution is the main challenge of this approach. To address this, multi-scale structural similarity (MS-SSIM) has been defined. In MS-SSIM, the picture is evaluated at various resolutions and the result is an average of these calibrated steps. It has been shown that MS-SSIM outperforms simple SSIM even when the SSIM is correctly calibrated to the environment and dataset.
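
To make the multi-scale idea concrete, here is a deliberately simplified Python sketch. It computes SSIM from whole-image statistics, halves the resolution by 2x2 averaging, and averages the per-scale scores, following the description above. This is an illustration under stated assumptions, not the algorithm used in any product: the published MS-SSIM uses local sliding windows and combines the scales as a weighted product rather than a plain average. The constants 0.01 and 0.03 are the usual SSIM stabilizers; all function names and the test images are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def global_ssim(ref, test, dynamic_range=255.0):
    """SSIM from whole-image statistics (simplification: no sliding window)."""
    x = [p for row in ref for p in row]
    y = [p for row in test for p in row]
    if not x or len(x) != len(y):
        raise ValueError("images must be non-empty and the same size")
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    mx, my = _mean(x), _mean(y)
    vx = _mean([(a - mx) ** 2 for a in x])
    vy = _mean([(b - my) ** 2 for b in y])
    cov = _mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return numerator / denominator


def downsample(img):
    """Halve the resolution by averaging each 2x2 block of pixels."""
    height = len(img) // 2 * 2
    width = len(img[0]) // 2 * 2
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
         for c in range(0, width, 2)]
        for r in range(0, height, 2)
    ]


def multiscale_ssim_average(ref, test, scales=3):
    """Average the simplified SSIM over successively halved resolutions."""
    scores = []
    for _ in range(scales):
        scores.append(global_ssim(ref, test))
        if len(ref) < 2 or len(ref[0]) < 2:
            break  # too small to halve again
        ref, test = downsample(ref), downsample(test)
    return _mean(scores)


# Hypothetical 8x8 grayscale images: a gradient and a flat gray frame.
reference = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
flat = [[126] * 8 for _ in range(8)]
print(multiscale_ssim_average(reference, reference))  # identical frames score about 1
print(multiscale_ssim_average(reference, flat))       # lost structure scores far lower
```

The flat frame has the same average brightness as the gradient, so the low score comes entirely from the missing structure, which is the property the top-down approach relies on.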

The MS-SSIM algorithm generates a number per frame (or field), and this number is correlated with the subjective data collected by the University of Texas' LIVE studies. The number is reported on the DMOS scale. DMOS is simply the MOS of the reference video minus the MOS of the processed video; the 'D' stands for difference (or subtraction). A lower DMOS therefore means the processed video is closer in quality to the reference.
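
The subtraction can be written out as a short Python sketch. It follows the definition given here (reference MOS minus processed MOS); the function name and the ratings are hypothetical, and subjective studies may apply further normalization that this sketch does not attempt.

```python
from statistics import mean


def dmos(reference_ratings, processed_ratings):
    """Difference MOS: MOS of the reference minus MOS of the processed video."""
    return mean(reference_ratings) - mean(processed_ratings)


# Hypothetical 1-5 ratings from the same four viewers.
reference_ratings = [5, 5, 4, 5]  # MOS 4.75
processed_ratings = [3, 4, 3, 4]  # MOS 3.5
print(dmos(reference_ratings, processed_ratings))  # prints: 1.25
```

Because both scores sit on the 1-5 scale, the difference cannot exceed 4, and it is 0 when viewers rate the processed video the same as the reference.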

Video Clarity created many different test cases and scored the results. The following graph was created using the uncompressed football sequence as the source and a 15 Mbps MPEG-2 compression of it as the processed video. The areas with better scores are I-frames as discussed here. In general, this video is considered broadcast quality.

DMOS Plot

Using ClearView, you can automatically reject frames that fail to meet your perceived video quality thresholds. Setting these thresholds depends on the goals of your organization. The following chart, which we use internally, is offered as an example guideline.

DMOS             Description
4-3.5            Unwatchable
3.4999-3.0       Annoying
2.9999-0.4       Broadcast Quality
0.3999-0.0001    Production Quality
0                No Defects
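
A threshold check of this kind can be sketched in a few lines of Python. This is not ClearView's interface; the function names, the default rejection threshold, and the per-frame scores are all hypothetical. The sketch treats each band's lower bound as inclusive and counts anything at or above 3.5 as unwatchable.

```python
def classify_dmos(score):
    """Map a per-frame DMOS value to the example guideline categories."""
    if score < 0:
        raise ValueError("DMOS is expected to be non-negative")
    if score >= 3.5:
        return "Unwatchable"
    if score >= 3.0:
        return "Annoying"
    if score >= 0.4:
        return "Broadcast Quality"
    if score > 0:
        return "Production Quality"
    return "No Defects"


def rejected_frames(frame_scores, threshold=3.0):
    """Return the indices of frames whose DMOS is at or above the threshold."""
    return [i for i, score in enumerate(frame_scores) if score >= threshold]


# Hypothetical per-frame DMOS values.
frame_scores = [0.0, 0.2, 1.1, 3.2, 3.7, 0.5]
for index, score in enumerate(frame_scores):
    print(index, score, classify_dmos(score))
print(rejected_frames(frame_scores))  # prints: [3, 4]
```

An organization targeting production quality would lower the threshold, for example to 0.4, so that anything worse than that band is rejected.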

You can see that DMOS declared many of the frames unwatchable. If you view these frames, you may agree.