Audio Video Sync Monitor

PDF Datasheet of Audio Video Sync Monitor (2 pages, English)

Audio Video Sync Monitor (AVSM) is a software solution that measures and monitors the synchronization between audio and video, based on audio and video fingerprinting.

AVSM measures the skew of audio video sources. The skew (sometimes called lip-sync offset) is the misalignment between audio and video.

In AVSM the skew is expressed in milliseconds. Perfectly synchronized audio video signals have a skew of 0 ms (no delay between audio and video). In AVSM, the skew is positive when audio arrives before video (for example, you hear a voice before you can see someone talking) and negative when video comes before audio (for example, you can see someone starting to talk but it takes a while before you can hear the voice).

AVSM is mainly designed for:

  • broadcast equipment monitoring
  • monitoring and benchmarking of different IPTV providers
  • monitoring and benchmarking of different broadcasters
  • latency measurement of audio video equipment

AVSM runs on Windows (XP, Vista or 7).

AVSM is easy to install (you just have to run the installer and click on "Next..." several times). It can run on virtually any PC. It can even run on a laptop.

However, to run the minimum configuration (two probes and one server), the two probes must be able to capture two different audio video signals, so two capture devices may be required (unless you process IP streams). A fast CPU, such as an Intel Core i7, is also highly recommended. Alternatively, the probes and the server can run on different machines.

General overview

AVSM consists of two applications:

  • the AVSM probe which captures audio and video signals and sends audio and video fingerprints to the AVSM server
  • the AVSM server which receives audio and video fingerprints from several probes and compares them in order to measure the skew.

At least 2 probes must run (on different machines or on the same machine):

  • one probe will process the reference audio and video signals (they are called "reference signals" here, although they can also be real-world signals that are encoded and distorted): the reference probe
  • one or several probe(s) will process the tested audio and video signals: the test probe(s)

Each probe (reference probe or test probe) continuously transforms the audio video signals it receives into audio and video fingerprints. These fingerprints describe the audio signal and each video frame in the form of a quasi-unique signature. Fingerprints are robust to encoding, re-encoding, transcoding and resizing. Each fingerprint also contains the timestamp at which it was computed. This information is the basis for all synchronization computations.

The fingerprints are regularly sent to the AVSM server. The AVSM server compares the audio and video fingerprints from the different probes connected to it.

Comparing fingerprints from two different probes makes it possible to precisely measure the time offsets (for both video frames and audio signals) between these two probes. These time offsets are then used to compute the skew (also called lip-sync offset) between the two probes.
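
As a rough illustration of this step, the sketch below assumes that each time offset is the test probe's timestamp minus the reference probe's timestamp for a matching fingerprint, and that the skew is the video offset minus the audio offset, which agrees with the sign convention given earlier (positive when audio arrives before video). The function names and the exact-equality lookup are hypothetical; AVSM's actual matching algorithm is not described here.

```python
from statistics import median


def time_offset_ms(reference, test):
    """Median timestamp difference (test - reference) over matching fingerprints.

    `reference` and `test` are lists of (fingerprint, timestamp_ms) pairs.
    Exact equality stands in for a robust fingerprint comparison.
    """
    ref_times = {fp: ts for fp, ts in reference}
    diffs = [ts - ref_times[fp] for fp, ts in test if fp in ref_times]
    if not diffs:
        raise ValueError("no matching fingerprints")
    return median(diffs)


def skew_ms(audio_offset_ms, video_offset_ms):
    # Assumed convention: positive when audio is ahead of video.
    return video_offset_ms - audio_offset_ms


# Made-up data: the test probe receives video 500 ms and audio 440 ms
# after the reference probe.
ref_video = [("v1", 0), ("v2", 40), ("v3", 80)]
test_video = [("v1", 500), ("v2", 540), ("v3", 580)]
ref_audio = [("a1", 0), ("a2", 20), ("a3", 40)]
test_audio = [("a1", 440), ("a2", 460), ("a3", 480)]

video_offset = time_offset_ms(ref_video, test_video)
audio_offset = time_offset_ms(ref_audio, test_audio)
print(skew_ms(audio_offset, video_offset))  # 60: audio is 60 ms ahead of video
```

Taking the median rather than a single match is only one way to tolerate an occasional wrong match; it is a choice made for this sketch.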

The skew value between two probes indicates if there is a synchronization problem on a probe:

  • if you can guarantee that the reference probe captures well-synchronized audio and video signals (because they are your master signals, and perhaps someone watches them 24 hours a day), then any skew measured on a test probe means that the signals at this test probe have an audio video synchronization problem
  • if you can't guarantee that the reference probe captures synchronized audio and video (because no signal is reliable enough or subjectively monitored), then any skew measured on a test probe means that the audio and video signals at this test probe or at the reference probe have a synchronization problem. If you use several test probes, then:
    • if all the test probes show a problem, then the signals at your reference probe have a synchronization problem
    • if only one test probe shows a problem, then the signals at that test probe have a synchronization problem
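
The reasoning for an untrusted reference can be sketched as a small decision function. This is only an illustration: the `diagnose` name, the `tolerance_ms` value and the handling of cases the text does not cover (a single test probe, or several but not all probes showing skew) are assumptions.

```python
def diagnose(test_skews_ms, tolerance_ms=20.0):
    """Return the list of suspected sources of a synchronization problem.

    `test_skews_ms` maps each test probe ID to its skew against the reference.
    `tolerance_ms` is a hypothetical threshold, not an AVSM default.
    """
    faulty = [pid for pid, skew in test_skews_ms.items()
              if abs(skew) > tolerance_ms]
    if not faulty:
        return []
    if len(test_skews_ms) == 1:
        # One test probe only: the reference cannot be ruled out.
        return ["reference"] + faulty
    if len(faulty) == len(test_skews_ms):
        # Every test probe disagrees with the reference.
        return ["reference"]
    return faulty


print(diagnose({"A": 80, "B": 85, "C": 78}))  # ['reference']
print(diagnose({"A": 2, "B": 85, "C": -3}))   # ['B']
```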

Each AVSM probe saves the computed fingerprints in buffers. Therefore, if the connection is interrupted between an AVSM probe and the AVSM server, nothing will be lost and all the fingerprints will be sent as soon as the connection is back.
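
A minimal sketch of this store-and-forward behavior is shown below. The `FingerprintSender` and `FakeServer` classes are hypothetical and the buffer here is unbounded for simplicity; how AVSM actually sizes and flushes its buffers is not documented in this text.

```python
from collections import deque


class FingerprintSender:
    """Queue fingerprints and deliver them in order once the link is up."""

    def __init__(self, send):
        self._send = send        # callable raising ConnectionError on failure
        self._pending = deque()  # fingerprints not yet delivered, oldest first

    def submit(self, fingerprint):
        self._pending.append(fingerprint)
        return self.flush()

    def flush(self):
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                return False     # keep everything for the next attempt
            self._pending.popleft()
        return True


class FakeServer:
    """Stand-in for the server, with a switchable connection state."""

    def __init__(self):
        self.online = False
        self.received = []

    def receive(self, fingerprint):
        if not self.online:
            raise ConnectionError("server unreachable")
        self.received.append(fingerprint)


server = FakeServer()
sender = FingerprintSender(server.receive)
sender.submit("fp1")    # connection down: buffered
sender.submit("fp2")    # still buffered
server.online = True
sender.submit("fp3")    # connection back: all three are delivered in order
print(server.received)  # ['fp1', 'fp2', 'fp3']
```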

Screenshot #1 of Audio Video Sync Monitor

AVSM probe

Screenshot #2 of Audio Video Sync Monitor

Each AVSM probe processes one audio signal and one video signal.

An AVSM probe is very flexible since audio and video signals can be captured:

  • from a capture card
  • or from IP streaming (using UDP or RTP)
  • or from a file

Audio can be mono, stereo or multichannel. Audio can have any bit depth and sampling rate, but it is resampled to 16-bit, 48 kHz. Video can have any resolution and any frame rate.

In production, probes will generally use a capture card (such as a Blackmagic Design card) or IP streaming. Using files, however, makes it quicker to evaluate AVSM.

Each probe displays:

  • descriptions of audio and video formats and received data
  • a video preview window
  • an 8-bar spectrum of the processed audio data
  • the audio time offset, the video time offset and the skew (measured by the AVSM server)

A probe can also play captured audio through the speakers (a slider adjusts the volume).

Finally, a probe also lets the user manually save an audio video sample.

Audio video samples are saved in TS format. Since all processed data is encoded and stored in circular buffers, audio video samples represent audio and video saved from several seconds before the click until several seconds after the click.
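
The idea of saving data from before and after the click can be sketched with a fixed-size circular buffer. The `SampleRecorder` class, its chunk-based interface and the buffer sizes are assumptions made for illustration; TS encoding is left out entirely.

```python
from collections import deque


class SampleRecorder:
    """Keep the most recent chunks so a save can include pre-click data."""

    def __init__(self, pre_chunks, post_chunks):
        if post_chunks < 1:
            raise ValueError("post_chunks must be at least 1")
        self._history = deque(maxlen=pre_chunks)  # circular buffer
        self._post_chunks = post_chunks
        self._active = None   # sample currently being completed, if any
        self._remaining = 0
        self.saved = []       # completed samples (lists of chunks)

    def push(self, chunk):
        """Feed one encoded chunk of audio video data."""
        if self._active is not None:
            self._active.append(chunk)
            self._remaining -= 1
            if self._remaining == 0:
                self.saved.append(self._active)
                self._active = None
        self._history.append(chunk)

    def click(self):
        """Start a sample from the buffered history."""
        self._active = list(self._history)
        self._remaining = self._post_chunks


recorder = SampleRecorder(pre_chunks=3, post_chunks=2)
for chunk in range(10):
    recorder.push(chunk)
recorder.click()
for chunk in range(10, 15):
    recorder.push(chunk)
print(recorder.saved)  # [[7, 8, 9, 10, 11]]
```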

AVSM server

The AVSM server receives and compares the fingerprints sent by the different probes: the reference probe and one or several test probe(s).

Each probe belongs to a "system". A system generally represents an audio video service, like a TV channel, that needs to be monitored from different locations and/or from different broadcasters.

A system contains at least one reference probe and one test probe. It can also contain several test probes (but only one reference probe).

In a system, the fingerprints from each test probe are compared to the fingerprints from the reference probe of this system.

A single AVSM server can process fingerprints coming from several systems and therefore an AVSM server can monitor different TV channels.

A system is identified by its "System ID" and a probe is identified by its "Probe ID". Each system and each probe can have various properties (like a name or encryption key to secure communications and prevent hacking).
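
The relationship between systems and probes can be pictured as a small data model. This is a sketch only: the `System` and `Probe` classes and their field names are hypothetical and do not reflect AVSM's actual configuration format.

```python
from dataclasses import dataclass, field


@dataclass
class Probe:
    probe_id: str
    name: str = ""
    is_reference: bool = False


@dataclass
class System:
    system_id: str
    name: str = ""
    probes: list = field(default_factory=list)

    def validate(self):
        """Check the rule: exactly one reference probe, at least one test probe."""
        references = [p for p in self.probes if p.is_reference]
        tests = [p for p in self.probes if not p.is_reference]
        if len(references) != 1:
            raise ValueError("a system needs exactly one reference probe")
        if not tests:
            raise ValueError("a system needs at least one test probe")
        return references[0], tests


channel = System("tv1", "Channel 1", [
    Probe("ref", "Master control room", is_reference=True),
    Probe("p1", "Cable reception"),
    Probe("p2", "IPTV reception"),
])
reference, tests = channel.validate()
print(reference.probe_id, [p.probe_id for p in tests])  # ref ['p1', 'p2']
```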

By comparing the fingerprints sent by the different probes, the AVSM server computes the video time offset, the audio time offset and, finally, the skew of each probe.

The skew is then used to trigger alerts, in order to warn users in real time that a synchronization problem has been detected between the audio and video signals at a given probe.

The measured audio time offset, video time offset and skew values are sent back from the server to the probe, to be displayed in the probe's graphical user interface.

Screenshot #3 of Audio Video Sync Monitor

Configuration (systems, probes, users) and alerts

Screenshot #4 of Audio Video Sync Monitor

The whole configuration is defined in the AVSM server. Several windows make it possible to create, edit and delete systems, probes and users.

As explained before:

  • a system represents a monitored audio video service, like a TV channel
  • a probe is a location where audio and video are captured and transformed into fingerprints (that will be sent to the AVSM server)
  • a user is someone who will be alerted if a synchronization problem is detected, based on detection thresholds defined by this user

The configuration is saved in a human-readable text file, which can also be edited manually.

The measured skew values are used to send alerts by email. There are two types of alerts:

  • warnings
  • errors

An email alert is sent when the skew is greater than a user-defined threshold, for at least a user-defined duration.

Different thresholds and durations can be used for positive and negative skews, as well as for warnings or errors.

For example, you can:

  • send a warning to warnings@yourdomain.com when the skew is above 20 ms for at least 10 seconds
  • send an error to errors@yourdomain.com when the skew is above 100 ms for at least 500 ms

Web server

The AVSM server integrates a web server (HTTP server) which provides a web interface to get the measurement results (for one or several users) using any web browser (like Mozilla Firefox, Google Chrome, Microsoft Internet Explorer or Apple Safari). The results are also available from any smartphone.

The web interface enables a user to:

  • compute statistics, display curves (of skew, audio time offset or video time offset) and download audio video samples from any probe
  • see in real time (with statistics and curves) the measurement being performed for a probe

Access to results can also be password-protected.

Screenshot #5 of Audio Video Sync Monitor

Results and audio video samples

Screenshot #6 of Audio Video Sync Monitor

Using the web interface, the skew curve (but also the video time offset curve and the audio time offset curve) can be drawn, for any probe, between two user-chosen dates and times. Each date and time is chosen on a calendar, with a precision of one minute (displayed results have a precision of one second).

The curve is interactive: by clicking and dragging their mouse, users can zoom (in time) on a given part of the curve.

Above the curve, statistics are computed over the displayed period of time.

Below the curve, a table lists the audio video samples saved during the displayed period of time. A single click on a file downloads it from the AVSM probe to the AVSM server and then plays it in the web browser.

Audio video samples can be played using any multimedia player, like Windows Media Player, Media Player Classic or VLC.

A button also makes it possible to manually save an audio video sample on the selected probe. From a simple web browser, you can thus save an audio video sample at a remote location, from several seconds before the click to several seconds after the click, and then download it.

Finally, the displayed values can be exported to a CSV file.

In real-time monitoring mode, the page is refreshed regularly, and the curve's start and stop times are adjusted to display the last 2 minutes, the last 5 minutes, the last hour, the last day, etc.

Conclusion

With its audio and video fingerprint technology, its flexible probes and its measurement server, AVSM makes it possible to:

  • monitor synchronization between audio and video over various broadcasting networks like Terrestrial, Cable, Satellite and IPTV
  • deploy a large number of probes with "cheap" PCs
  • concentrate the CPU power (and cost) on a single server able to process all the test probes
