After a 3-month absence, the monthly Video Clarity newsletter is back. We have been busy preparing our 6.0 release, which includes audio and video DMOS perceptual metrics along with native 3G and HDMI support.

During my recent travels, I have been asked numerous times to explain the following terms:

  • Full Reference versus No Reference metrics
  • Perceptual versus Objective metrics
  • JND versus DMOS

The June newsletter will define these terms.

As always, if you have any questions, feel free to contact me.

Bill Reckwerdt
CTO, Video Clarity
bill@videoclarity.com

Definitions

  • DMOS – Differential Mean Opinion Score
  • JND – Just Noticeable Difference
  • MOS – Mean Opinion Score
  • MS-SSIM – University of Texas’ Multi-Scale Structural SIMilarity algorithm
  • PEAQ – Perceptual Evaluation of Audio Quality
  • PQR – Picture Quality Rating. Another name for the Sarnoff JND algorithm
  • Processed – the source video after it has been re-compressed, deflickered, etc. (any type of processing)
  • PSNR – Peak Signal to Noise Ratio. An objective metric that compares pixel for pixel one video to another
  • PEVQ – Perceptual Evaluation of Video Quality
  • Source – also known as reference. This refers to the original video, which may be compressed or uncompressed.
  • University of Texas – http://live.ece.utexas.edu/
  • Video – a sequence that includes both audio and video
  • VQEG – Video Quality Experts Group – http://www.vqeg.org
  • VQM – Video Quality Metric

Background

We have witnessed tremendous growth in the use of digital video for communicating information. Everywhere you look you see video - YouTube, Television, Surveillance Cameras, Movies, Podcasts, etc. How do we know if the digital video quality is good or not? Frankly, it’s hard. Videos are subject to distortions during acquisition, compression, transmission, processing, and reproduction. To maintain control, it is important to identify and quantify video quality at every step.

In order to assess the quality of various algorithms, a group of human testers must be gathered. The testers are shown a series of videos and asked for their opinion of the quality. The opinions are statistically averaged and a number is produced called the Mean Opinion Score (MOS). This process is called subjective testing and must be done to set a baseline. Two main organizations have done extensive subjective testing – VQEG (Video Quality Experts Group) and the University of Texas (Live Database).
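As a minimal sketch of the averaging step, assume each tester rates a clip on a 1-5 scale. The ratings below are invented for illustration and do not come from any study; the confidence interval uses a simple normal approximation, which is an assumption rather than a prescribed procedure.

```python
from math import sqrt
from statistics import mean, stdev

def mean_opinion_score(ratings):
    """Return (MOS, 95% confidence half-width) for one clip's ratings."""
    mos = mean(ratings)
    # Normal-approximation interval; assumes independent testers.
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width

# Hypothetical ratings from 8 testers on a 1-5 scale (5 = best).
ratings = [4, 5, 3, 4, 4, 5, 3, 4]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

The half-width shrinks as more testers are added, which is one reason subjective studies need a reasonably large panel.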

Full Reference versus No Reference

Evaluating video quality falls into 3 categories:

  • Full Reference
  • No Reference
  • Reduced Reference

Full Reference presents 2 videos – the source and the processed – to a group of people and asks which one is better and by how much. Small differences in video quality can be easily judged as a comparison video is always available. What happens when it is impractical to have 2 videos present?

No Reference presents just the processed video and asks for the quality. This type of testing can only detect gross impairments as there is nothing with which to compare. What happens if the source video is bad? No reference needs some knowledge about the source video.

Reduced Reference sends data about the source video without sending the actual video. This is a hybrid approach. It sounds like the ideal method, but it suffers from the same limitation as full reference: it may be impractical to send the additional information.

Most of the practical work has centered on defining algorithms to judge the video quality using full reference while research is looking into no reference.

Video Clarity includes full reference algorithms for determining quality and is working with several organizations and universities to define no reference metrics. Currently, the correlation of the no reference metrics to the full reference subjective data is not adequate, but rest assured that when the algorithms improve, we will provide them.

Perceptual versus Objective Metrics

In the analog days, it was pretty easy to assess the video quality by measuring frequency response or looking at color bars. Now, in the digital age, the evaluation has gotten much harder. Videos are compressed. Compression techniques use a psychophysical approach based on the human auditory/visual system to remove aspects deemed to be irrelevant to quality. For instance, color fidelity, contrast sensitivity and high frequency sounds are often reduced or removed.

For this paper, we will define two terms that some authors use interchangeably – Objective and Perceptual Metrics.

Objective metrics use mathematical models to calculate the differences between the source and the processed. They then report the differences in absolute terms. For instance, PSNR (Peak Signal to Noise Ratio) is the most widely used objective metric. It is derived from the mean squared difference between the pixel values of the two videos. What does a given PSNR value mean? Is the difference perceived? For 8-bit video, a PSNR of about 48 dB could mean that every pixel is off by 1 level, or it could mean that a small section of the video is very bad while the rest is perfect. Most people would not be concerned about a difference of 1 pixel level, but a bad section is easily perceived. Why are Objective Metrics used then? They are excellent when you know your video characteristics. You can measure the absolute difference from what the video should be, which lends itself to monitoring and QA applications.
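The ambiguity of PSNR can be sketched with a small worked example. It uses the standard definition PSNR = 10·log10(peak²/MSE) with a peak of 255 for 8-bit video. The flat gray frame and both distortion patterns are hypothetical, chosen so that the two cases have the same mean squared error.

```python
import math

def psnr(source, processed, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values."""
    mse = sum((s - p) ** 2 for s, p in zip(source, processed)) / len(source)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)

# Hypothetical flat gray 100x100 frame stored as a flat list of 8-bit values.
source = [128] * 10_000

# Case 1: every pixel is off by one level.
uniform = [v + 1 for v in source]

# Case 2: 100 pixels (1% of the frame) are off by 10 levels; the rest are perfect.
localized = list(source)
for i in range(100):
    localized[i] += 10

print(round(psnr(source, uniform), 2))    # 48.13
print(round(psnr(source, localized), 2))  # 48.13
```

Both cases report the same PSNR, yet a viewer would likely ignore the first and notice the second, which is why PSNR alone does not say whether a difference is perceived.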

Today, people must screen videos in real time, but a human viewer can miss errors. Video Clarity created RTM (http://www.videoclarity.com/cvrtm.html) to automatically detect and report drops in video quality as well as lip-sync errors.

Many algorithms take an “engineering” approach when evaluating video quality. They look for distortions caused by applying the psychophysical approach and count them. The number produced by the algorithm is then correlated to the subjective studies. Four algorithms were found to correlate highly with the subjective studies:

  • PQR also known as Sarnoff JND
  • VQM
  • PEAQ
  • PEVQ

Based on data published by the University of Texas, the above perceptual metrics correlated with their subjective studies between 85% and 90% of the time.

Another option is to take a “top-down” approach when evaluating video quality. Can the user recognize an object regardless of the distortion? At what distance can the user no longer recognize an object? This algorithm – MS-SSIM (Multi-Scale Structural SIMilarity) – correlated with the subjective data around 93% of the time based on the same University of Texas study.
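One common way to express how well a metric tracks subjective data is a correlation coefficient. The sketch below computes the Pearson correlation between a metric's scores and the MOS values for the same clips. All numbers are invented for illustration, and it is an assumption that the percentages quoted above correspond to a coefficient of this kind.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical scores for five clips: metric output vs. subjective MOS.
metric_scores = [0.99, 0.95, 0.90, 0.80, 0.60]
subjective_mos = [4.8, 4.1, 3.9, 2.7, 1.5]

r = pearson(metric_scores, subjective_mos)
print(f"correlation = {r:.3f}")
```

A value near 1 means the metric ranks and spaces the clips much as the human testers did.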

Video Clarity provides 3 of the 5 perceptual metrics mentioned above.

  • PQR – JND
  • PEAQ – Audio DMOS
  • MS-SSIM – Video DMOS

We chose these because they were the most widely used metrics in each category.

We also provide the following 3 objective metrics:

  • PSNR
  • Temporal – Frame-to-Frame differences mapped
  • Spatial – Activity for each Frame
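The temporal and spatial measures can be illustrated with a short sketch. The definitions below are assumptions chosen for illustration (mean absolute frame-to-frame difference for temporal, standard deviation of the pixel values within a frame for spatial); they are not necessarily the formulas ClearView uses, and the tiny four-pixel frames are invented.

```python
from statistics import pstdev

def temporal_difference(prev_frame, frame):
    """Mean absolute pixel difference between two consecutive frames."""
    return sum(abs(a - b) for a, b in zip(prev_frame, frame)) / len(frame)

def spatial_activity(frame):
    """Population standard deviation of the pixel values in one frame."""
    return pstdev(frame)

# Hypothetical 4-pixel frames: two identical flat frames, then a cut to detail.
frames = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 200, 10, 200],
]

temporal = [temporal_difference(a, b) for a, b in zip(frames, frames[1:])]
spatial = [spatial_activity(f) for f in frames]
print(temporal)
print(spatial)
```

Plotted per frame, a spike in the temporal value suggests a scene cut or a dropped or repeated frame, while the spatial value indicates how much detail each frame carries.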

JND versus DMOS

JND (Just Noticeable Difference) is a scale where JND=1 means that there is a 75% probability that an observer can see the differences between 2 videos after viewing them multiple times. This can be thought of in a different way that might be simpler. The JND value identifies the sample size needed to judge the video quality before someone cannot view the difference. Using this definition, the number of people in the sample equals 2^(JND+1). For example, if the JND=0 (no difference), the sample size would be 2 people. If the JND=10 (very poor quality), the sample size would be 2048 people. Many people prefer this scale due to its larger range.
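The sample-size reading of the JND scale can be tabulated with a few lines of code. The relationship 2^(JND+1) is inferred from the two examples above (2 people at JND=0 and 2048 people at JND=10), so treat it as an assumption; the function name is hypothetical.

```python
def sample_size(jnd):
    """Sample size implied by a JND value, assuming size = 2 ** (JND + 1)."""
    return 2 ** (jnd + 1)

for jnd in (0, 1, 5, 10):
    print(f"JND={jnd:>2}: {sample_size(jnd)} people")
```

Each additional JND doubles the sample size, which gives the scale its wide range.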

DMOS (Differential Mean Opinion Score) is a scale based on the Mean Opinion Score. The scale can be any range, but the most popular is 1-5, where 5 is perfect quality. This tells you the perceived quality on the same scale as the subjective data.

Video Clarity provides both scales and created a website, based on our heuristics, for the correlation between DMOS and JND - http://www.videoclarity.com/WhatIsTheJNDScale.html.

Our ClearView product line (http://www.videoclarity.com/productsMatrix.html) plays out, records, imports files, aligns, measures, and displays videos up to 1080P/60Hz with up to 8 channels of audio.

Conclusions

When estimating video quality, research has been focused on developing novel methods to predict perceived quality at every point along the transmission/creation line. Testing video quality is an ongoing effort.  Have you tested your video quality today? If you have not, give us a call. We will show you how.

Testimonial

"With the emergence of next generation video compression algorithms, we are continuously challenged to ensure that our customers receive the best video quality experience," said Jim DeFilippis P.E., Senior Vice President & Principal Engineer Digital Television Technologies and Standards. "Using Video Clarity's ClearView system, we can quantify both subjective and objective video quality of these new compression algorithms." -FOX Technology Group