An Introduction to Caskeid - Wireless Stream Synchronisation IP

By Imagination Technologies

Going wireless

It’s a trend happening everywhere today: devices are going wireless and connecting to the internet. And it’s not just the ubiquitous smartphones, tablets and smart TVs; add to these a growing list of consumer electronics products: refrigerators, ovens, games consoles, central heating systems, weather stations, radios and home stereo systems.

Indeed the simple home stereo is struggling in the digital world. Little black boxes bristling with audio outputs are proliferating throughout the home, remote controls litter the living room table, docking stations clutter the bedrooms, kitchen and study. Countless inputs and control methods, incompatible interfaces, and way too many wires have created a home entertainment headache.

Fortunately the situation is improving. Modern stereo systems often come equipped with connectivity (Ethernet, Bluetooth or Wi-Fi), enabling them to access your favourite online music streaming services or play content from your smartphone, PC or NAS drive. And now that wireless connectivity is becoming more popular, the next logical step is to incorporate wireless technology into the speakers themselves, creating an untethered audio experience to be enjoyed anywhere within the home. In turn these can build into fantastic multiroom systems enabling the music to follow the listener wherever they might be. Unfortunately this is where many fall short.

Of course, the idea of wireless audio isn’t new. We enjoyed home theatres and TVs with wireless analogue rear surround speakers over a decade ago, and today there are several digital solutions available. Some use the existing home mains wiring to eliminate unnecessary connections; others create private mesh networks to communicate between devices; and many now use Wi-Fi.

Many high-end stereo systems offer a ‘party mode’ where several devices are linked together to play the same audio stream to create a multiroom experience. However, sometimes the network is unreliable: perhaps the environmental conditions are unfavourable, or the location of the transmitter is unsuitable. Networked audio devices rely upon a master clock to keep them all synchronised. If that clock signal is lost or delayed due to poor bandwidth or intermittent network communications, then ‘party mode’ quickly becomes a jumbled cacophony of sound.

Figure 1: Example multiroom audio configuration, with players and speakers connected wirelessly

The problem of synchronicity

One of the main difficulties in synchronising streams over TCP/IP networks is that they employ ‘best effort’ methods to deliver IP packets: there’s no guarantee of delivery, and packets arrive at irregular intervals, albeit within a reasonably predictable timeframe. Even using timestamped multicast streams, clients will drift out of synchronisation and need to be periodically corrected.
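To put a rough number on drift, the sketch below assumes each device free-runs on its own oscillator whose frequency error is quoted in parts per million (ppm). The ±25ppm tolerance and the 20μs budget in the example are illustrative assumptions, not measurements from any product.

```python
def relative_drift_us(elapsed_s, ppm_a, ppm_b):
    """Offset in microseconds accumulated between two free-running clocks.

    1 ppm of frequency error equals 1 microsecond of timing error per second.
    """
    return (ppm_a - ppm_b) * elapsed_s


def seconds_until_exceeds(budget_us, ppm_a, ppm_b):
    """Time for the two clocks to drift apart by budget_us microseconds."""
    rate_us_per_s = abs(ppm_a - ppm_b)
    if rate_us_per_s == 0:
        return float("inf")  # identical rates never drift apart
    return budget_us / rate_us_per_s


# Hypothetical worst case: one clock runs 25 ppm fast, the other 25 ppm slow.
print(relative_drift_us(60, 25.0, -25.0))        # drift after one minute, in microseconds
print(seconds_until_exceeds(20.0, 25.0, -25.0))  # seconds needed to use up a 20 microsecond budget
```

Under these assumptions the two clocks are 3ms apart after one minute, which is why uncorrected clients need periodic resynchronisation.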

In the case of audio-visual stream delivery, the inherent latency within the network is often too large and unpredictable to reliably synchronise streams between several client devices. Whilst it’s true that software timestamping protocols running over TCP/IP can provide some degree of synchronicity between pairs of devices, these methods have to factor in network latency and round-trip packet times and are therefore not fine-grained enough for audio, especially when maintaining the separation between left and right stereo audio channels or recreating perfect 5.1 surround sound.
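The limitation can be illustrated with the standard two-way time-transfer calculation used by protocols such as NTP. This is a generic sketch, not a description of any particular product; the `simulate` helper and all delay values are made up for illustration.

```python
def estimate_offset_and_delay(t1, t2, t3, t4):
    """Two-way time transfer; all times in microseconds.

    t1: client sends request   (client clock)
    t2: server receives it     (server clock)
    t3: server sends reply     (server clock)
    t4: client receives reply  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    round_trip = (t4 - t1) - (t3 - t2)
    return offset, round_trip


def simulate(true_offset, uplink_delay, downlink_delay, server_hold=100):
    """Build the four timestamps for a server clock ahead by true_offset."""
    t1 = 0
    t2 = t1 + uplink_delay + true_offset
    t3 = t2 + server_hold
    t4 = (t3 - true_offset) + downlink_delay
    return estimate_offset_and_delay(t1, t2, t3, t4)


# Symmetric path: the estimate recovers the true offset of 500 microseconds.
print(simulate(true_offset=500, uplink_delay=2000, downlink_delay=2000))
# Asymmetric path: the estimate is off by half the difference in one-way delays.
print(simulate(true_offset=500, uplink_delay=2000, downlink_delay=6000))
```

The calculation assumes the outbound and return delays are equal. When they differ, as in the second case, half of the difference appears as an error in the estimated offset, and that error cannot be detected from the timestamps alone.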

Why synchronisation matters

Humans have evolved a highly accurate perception of the spatial location of sound, created by the brain analysing minute differences in the amplitude and arrival time of a sound at each ear (the interaural time difference). These differences occur because the sound must travel slightly different distances to each ear. Studies have shown that fractions of milliseconds between sound waves arriving via left and right auditory pathways are sufficient for humans to accurately determine the direction of an audio source. In fact the human auditory system is so acutely sensitive that it is possible for us to distinguish sounds from two different locations where the angle between sources is as little as three degrees.

The chart in Figure 2 illustrates how interaural time difference corresponds to angular direction. For example, a sound delayed by 0.64ms between each ear is perceived as coming from 90° immediately left or right of the listener. Smaller delays correspond to narrower angles: a time delay of around 25μs yields the 3° difference in angle mentioned above.

Simple trigonometry and basic physics allow us to derive a model that supports the theory, and from this it can be calculated that delays of the order of microseconds are significant enough to induce a perceived shift of several degrees in sound location. And it’s these subtle nuances that are introduced into an audio recording to recreate the effect of stereo and surround sound.
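One common way to build such a model is Woodworth’s spherical-head approximation, in which a distant source at azimuth θ produces an interaural delay of (r/c)(θ + sin θ). The sketch below uses that formula. The article does not state which model lies behind Figure 2, and the head radius (8.75cm) and speed of sound (343m/s) are assumed typical values.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees C
HEAD_RADIUS_M = 0.0875      # assumption: typical adult head radius


def itd_us(azimuth_deg, r=HEAD_RADIUS_M, c=SPEED_OF_SOUND_M_S):
    """Interaural time difference in microseconds for a source at 0..90 degrees."""
    theta = math.radians(azimuth_deg)
    return 1e6 * (r / c) * (theta + math.sin(theta))


def azimuth_for_delay_deg(delay_us, r=HEAD_RADIUS_M, c=SPEED_OF_SOUND_M_S):
    """Invert itd_us by bisection; delays beyond the 90 degree value clamp to 90."""
    if delay_us >= itd_us(90.0, r, c):
        return 90.0
    lo, hi = 0.0, 90.0
    for _ in range(60):  # itd_us is monotonic on [0, 90], so bisection converges
        mid = (lo + hi) / 2
        if itd_us(mid, r, c) < delay_us:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for angle in (3, 30, 90):
    print(f"{angle:>2} deg -> {itd_us(angle):6.1f} us")
for delay in (20, 160):
    print(f"{delay:>3} us -> {azimuth_for_delay_deg(delay):4.1f} deg")
```

With these assumed constants the model gives roughly 27μs at 3° and roughly 0.66ms at 90°, the same order as the figures quoted above. The inverse function is only a rough guide to how a timing error between two speakers might shift the apparent source, since room acoustics and listener position also play a part.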

For products such as wireless speakers, a very tight synchronisation must be maintained between each device in order to faithfully reproduce the dynamics of the audio environment.

Figure 2: How interaural delay corresponds to spatial location of sound sources

So how does Imagination’s Caskeid IP differ?

Whereas competitive solutions rely on proprietary technologies or software timing methods, Caskeid is unique in that it exploits timing signals intrinsic to the existing Wi-Fi infrastructure in order to guarantee devices are synchronised. The beauty of this approach is that Caskeid is immune to latency within the network. It also minimises clock drift, which would otherwise propagate across clients.

Figure 3 charts the deviation of a stereo audio source between speakers over time, measured in milliseconds. Zero deviation shows audio is perfectly synchronised, and any deviation (plus or minus) from the centreline means the stereo image is biased towards either the left or right speaker.

The green line illustrates Caskeid’s performance in maintaining synchronisation between Wi-Fi connected stereo speakers. Caskeid is capable of guaranteeing a deviation of only 20μs between speakers with absolutely no drift and therefore no resultant shift in stereo image.

Figure 4 compares the results obtained from Caskeid with three leading competitive technologies for wireless multiroom and stereo audio systems.

Competitor A illustrates the problems associated with poor synchronisation: the measured maximum deviation of 1.8ms (=1,800μs) coupled with the high rate of change between left and right channels means that stereo reproduction is impossible. The listener will experience the sound field moving between speakers.

Figure 3: Caskeid synchronisation across Wi-Fi connected stereo speakers, locked at 20μs maximum deviation

Competitor B’s solution attempts to maintain a constant synchronisation offset between speakers, which lessens the perception of the stereo image migrating across the sound stage. However during tests this technology needed an average of 25 minutes to converge in order to create a fully synchronised stereo pair. Contrast this with Caskeid’s ability to lock synchronisation between devices immediately and maintain synchronous operation indefinitely.

And finally, Competitor C provided the closest results to Caskeid. During our testing the technology exhibited 160μs deviation between speakers but required an average of 20 minutes to achieve optimum synchronisation. Again compare this with Caskeid which, under identical conditions, is eight times more accurate in synchronisation from the outset. Overall the competitive technology exhibits slow synchronisation and large drift between devices resulting in a shifting stereo image.

Caskeid is proven to have class-leading performance with guaranteed microsecond accuracy of synchronisation across all wireless speakers to deliver an accurate and static stereo image, and faithful reproduction of the soundstage.

A real world implementation

It all sounds like good theory, but how about a real world example? Fortunately Imagination doesn’t just license the IP; we also build products to prove the concepts and further refine the technology.

Imagination’s consumer electronics division, Pure, use Caskeid IP in their Jongo wireless speaker systems and internet-connected radios. Here’s how it works…

Figure 4: Caskeid versus three leading competitive technology solutions for synchronisation

All Jongo-enabled products are connected via Wi-Fi. Caskeid patented technology immediately synchronises all radios and speakers associated with the access point. One device is elected as master; it acts as both a media server (DMS) and renderer (DMR), and becomes responsible for both acquiring and ‘broadcasting’ the stream, which can be stored locally or sourced from a cloud-based music service. A smartphone, tablet or even a connected (Wi-Fi enabled) radio acts as the control point (DMC). The other devices become clients and act as media renderers only (DMRs).

The master sets its clock from the access point and compensates for network latency by calculating a future timestamp that incorporates sufficient delay to allow the audio stream time to arrive at the client devices over standard TCP/IP protocols. It then stamps the audio stream with the calculated ‘trigger’ timestamp.

The clients buffer the audio stream arriving from the media server, and use the timestamp broadcast from the access point as the regulation mechanism to guarantee microsecond accuracy of synchronisation across all devices. The result is perfectly synchronised audio.
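The trigger-timestamp idea described above can be sketched in a few lines. This is a simplified illustration only, not Imagination’s implementation: the `Master`, `Client` and `AudioChunk` names, the 200ms latency budget and the `shared_clock_us` parameter (standing in for the time reference every device takes from the access point) are all hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class AudioChunk:
    play_at_us: int                      # 'trigger' timestamp on the shared clock
    samples: bytes = field(compare=False)


class Master:
    """Stamps each chunk with a future play-out time."""

    def __init__(self, latency_budget_us):
        # Assumed allowance for the stream to reach every client.
        self.latency_budget_us = latency_budget_us

    def stamp(self, samples, shared_clock_us):
        return AudioChunk(shared_clock_us + self.latency_budget_us, samples)


class Client:
    """Buffers chunks and releases them only when their trigger time arrives."""

    def __init__(self):
        self._buffer = []

    def receive(self, chunk):
        heapq.heappush(self._buffer, chunk)

    def due(self, shared_clock_us):
        ready = []
        while self._buffer and self._buffer[0].play_at_us <= shared_clock_us:
            ready.append(heapq.heappop(self._buffer))
        return ready


master = Master(latency_budget_us=200_000)
chunk = master.stamp(b"\x00\x01", shared_clock_us=1_000_000)

left, right = Client(), Client()
left.receive(chunk)    # network arrival times may differ per client;
right.receive(chunk)   # only the trigger timestamp decides play-out

print(left.due(1_199_999))                              # nothing is due yet
print([c.play_at_us for c in left.due(1_200_000)])      # both clients release
print([c.play_at_us for c in right.due(1_200_000)])     # at the same clock value
```

Because play-out depends only on the shared clock reaching the trigger value, differences in when each client received the chunk do not affect when it is played, provided the chunk arrives within the latency budget.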

Caskeid-enabled products are proven to yield microsecond precision between devices, capable of perfectly reproducing left/right stereo channel separation and timing, or precisely replicating 5.1 and 7.1 surround sound systems wirelessly using standard Wi-Fi access points. They achieve this because Imagination’s Ensigma communications IP uniquely maintains privileged access to the different network layers, affording the fine degree of control necessary to sustain flawless synchronicity.

Add to this Imagination’s FlowCloud platform technology for cloud-based services and device management and you have a fully featured, end-to-end solution for connected, synchronous, wireless audio.

Figure 5: Jongo multiroom audio products, using Caskeid for perfect synchronisation

There’s an app for that...

Imagination also supplies the software framework and services necessary to build applications to control multiroom audio. Examples exist for both Android and iOS devices, using Imagination’s FlowCloud APIs to access cloud-based music services and control wireless speakers and radios that form the multiroom audio solution.

The application provides a full internet radio service, called FlowRadio, with instant access to over 20,000 stations worldwide, in addition to ‘listen again’ services and around 270,000 podcast episodes online. To complement internet radio, FlowMusic technology provides subscribers with a full streaming music service and offers the ability to purchase music tracks directly from any FlowCloud-enabled audio device. The content can be identified using audio fingerprinting technology and a single button press used to confirm the purchase. Tracks are made available online for download through a content portal and may then be played on any compatible audio device.

The entire package, including the APIs to interact with devices and the backend management tools, is available as a white-label service to our customers, providing a comprehensive suite of connected audio services with full back office support for account management, portal hosting and processing of online payments. Overall it’s a perfect complement to Caskeid-powered connected audio products.

Figure 6: Pure Connect, an example of a cloud-based FlowCloud service to control Jongo multiroom audio products

Further information

Want to know more? For further information on Caskeid, FlowCloud and Ensigma IP, please go to our website at www.imgtec.com.