The essentials of AV-over-IP

The term ‘AV-over-IP’, or AVoIP, describes the distribution of audio-visual content across a private network, using IP-like switching and configuration protocols. It’s also commonly referred to as ‘HDMI-over-IP’, because HDMI is the most common input medium. Either way, it’s fast becoming a preferred method for AV distribution in homes and commercial environments.

There are several approaches to AVoIP. The ideal combines three things: low network load (bandwidth), low (or preferably no) latency, and high AV quality. Of these three, however, Justin Kennington, president of the SDVoE Alliance, says “pick two.”

For example, low compression yields the lowest latency and highest AV quality, but needs high network speed. Increasing compression lowers the network load, but may increase latency and reduce image quality. It’s all a balance, and understanding the variables is key to choosing and successfully implementing an AVoIP system for your clients. A good place to start is the infrastructure: the network.

Bandwidth & network speed

1000Base-T Ethernet (1GbE) is ubiquitous, supporting at best 1Gbps of shared traffic to or from any given node. By comparison, the data rate of a 1080p60 video stream is around 3.6Gbps uncompressed (~4.5Gbps over HDMI), and 4K30 steps this up to more than 7Gbps, which is far too high for the average network!
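The figures above can be reproduced with some simple arithmetic. The following is a minimal sketch, assuming 24 bits per pixel (8 bits per colour channel), the standard CTA-861 total raster sizes including blanking (2200 x 1125 for 1080p, 4400 x 2250 for 4K), and HDMI's 10-bits-per-8 TMDS encoding. The function names are illustrative only.

```python
# Assumption: 8 bits per colour channel, i.e. 24 bits per pixel.
BITS_PER_PIXEL = 24


def raster_rate_gbps(h_total, v_total, fps, bpp=BITS_PER_PIXEL):
    """Uncompressed data rate in Gbps for the full raster (blanking included)."""
    return h_total * v_total * fps * bpp / 1e9


def tmds_rate_gbps(rate_gbps):
    """HDMI TMDS puts 10 bits on the wire for every 8 bits of data."""
    return rate_gbps * 10 / 8


# Total raster sizes (active picture plus blanking) for the two formats.
rate_1080p60 = raster_rate_gbps(2200, 1125, 60)
rate_4k30 = raster_rate_gbps(4400, 2250, 30)

print(f"1080p60: {rate_1080p60:.3f} Gbps uncompressed")
print(f"1080p60: {tmds_rate_gbps(rate_1080p60):.3f} Gbps over HDMI")
print(f"4K30:    {rate_4k30:.3f} Gbps uncompressed")
```

Higher bit depths or frame rates scale these numbers up proportionally, which is why 1GbE is out of reach for uncompressed video.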

There are two ways around this: 1) increase the network speed to 10GBase-T (10GbE / 10Gbps), or 2) compress the signal to bring it in under the 1Gbps capacity of 1GbE. “When you're lighter on the wire, you have more options and less hassles”, says Miravue founder, Robert Bishop. Upgrading the network to 10GbE gives you more options again, although in an existing network this may be restricted by the cabling infrastructure.

CAT5e won’t support 10GbE, and CAT6 UTP will only support it up to 55m; that’s the total distance between endpoints, including the runs through the switch, patch panel, etc. For new installations you’d be doing your clients a service by running CAT6a and/or fibre and going straight to 10GbE, where the only downside currently is cost.

What about WiFi, you may ask? Some of the best wireless access points on the market today use 802.11ac, which may support up to 1.3Gbps, subject to noise, building construction, etc. The emerging 802.11ad standard will boost this considerably (~7Gbps), but that is still insufficient for uncompressed 4K. The general consensus is that wireless is not a good idea for AVoIP applications, but check with your vendor.

Compression


Compression is a mechanism used to reduce file size or bit rate. Greg Schlechter, technology marketing strategist at Intel Corporation, says “AV on an IP network won’t require one compression method, but instead by being on IP foundation allows the compression (and bandwidth) to fit the application/need”. It’s simple logic: a single 7Gbps 4K stream may not require any compression over a 10GbE network, but will over 1GbE.

Compression ratio is the ratio between the size of the uncompressed file or stream and that of the compressed output. For example, a compression ratio of 10:1 produces a stream which is 1/10th the size of its uncompressed original. There are very light compression schemes, say up to a 2:1 ratio, that claim to be “mathematically lossless”. Then there are several “visually lossless” solutions, meaning that any degradation in picture quality would not be overly evident to the user, and then there are lossy types in which the compromise is evident. The impact of each depends on several factors, not least of which is the user’s own expectations and standards. It’s all relative.
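To put numbers on this, here is a rough sketch of the ratio arithmetic. The function names are illustrative, and the `usable_fraction` of 0.8 is purely an assumed planning margin (leaving 20% of the link for other traffic), not a figure from any vendor or standard.

```python
def compressed_rate_mbps(uncompressed_mbps, ratio):
    """Bit rate after compressing at ratio:1."""
    return uncompressed_mbps / ratio


def required_ratio(uncompressed_mbps, link_mbps, usable_fraction=0.8):
    """Smallest N (as in N:1) that fits the stream into the usable share
    of the link. A result of 1.0 or less means no compression is needed.
    usable_fraction is an assumed headroom margin, adjust to suit."""
    return uncompressed_mbps / (link_mbps * usable_fraction)


stream_mbps = 7000  # a roughly 7Gbps 4K stream, expressed in Mbps

print(compressed_rate_mbps(stream_mbps, 10))  # rate after 10:1 compression
print(required_ratio(stream_mbps, 1000))      # ratio needed for 1GbE
print(required_ratio(stream_mbps, 10000))     # ratio needed for 10GbE
```

With these assumptions the 1GbE case needs a ratio of a little under 9:1, while the 10GbE case comes out below 1, meaning the stream fits uncompressed.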

The algorithm used to compress and decompress a signal, be it audio, video or both, is called a ‘codec’, short for coder-decoder. As mentioned in the intro, there are several approaches; none are right or wrong, just different. For video, we’ll break these down into two main methods: intra-frame and inter-frame.

Intra-frame compression operates on each video frame, or part thereof, individually. JPEG-based schemes are by far the most common example, but they vary markedly. The most rudimentary encodes every complete frame as a simple JPEG (known as Motion JPEG, or M-JPEG), producing perhaps the lowest quality. At the other end is the flagship intoPIX advanced JPEG2000, which has its roots in broadcast. Such schemes typically work at compression ratios from around 5:1 up to 20:1.

Advances in codec algorithms and chipset speed have seen intra-frame compression solutions work really well, achieving ‘visually lossless’ video with impressively low latency. Joel Mulpeter, fusion architect at Crestron, states that this approach “utilises a 1gb network to ensure a system of any size… remains scalable on standard infrastructure.”

Above: Visual example of uncompressed (left) vs compressed (right) with no change to resolution

Another completely different intra-frame approach is line code compression, which compresses line-by-line, or in groups of lines. Line code compression is by far the fastest, usually adding mere microseconds of latency, but is also the lightest compression. A leading example in AVoIP applications is the SDVoE Alliance’s BlueRiver NT+, which employs either no compression or as little as 1.3:1. This is truly lossless for the very best quality, but requires a 10GbE network. Another example of line code is VESA’s Display Stream Compression (DSC), but that’s not for AVoIP; it’s deployed in DisplayPort 1.4, HDBaseT’s visually lossless compression, and the upcoming HDMI 2.1 specification.

Inter-frame methods factor in data from the frames before and after the current frame when calculating compression. This results in a far more efficient file or stream size, but requires reading ahead, and therefore takes longer to process.

The most common inter-frame example is Advanced Video Coding (AVC), also known as H.264. This is the scheme used on Blu-ray Disc, achieving bit rates off the disc of less than 25Mbps. Remember we said that 1080p60 is around 3.6Gbps uncompressed, so the ratio achieved with H.264 is something like 150:1, combined with arguably higher picture quality than other methods. H.265, as used by Ultra HD Blu-ray, is up to 60% more efficient again than H.264, but requires some serious number crunching.

The outstanding advantage of any H.26x-based solution is its VERY low bitrate, as low as 15Mbps for 1080p60, so sending it even through old 100Base-T equipment or over WiFi is a piece of cake. However, be aware of the higher latency of any inter-frame method. Miravue’s Robert Bishop adds “H.26x provides a better picture using 1/10th the bandwidth compared to others… Consider which technology would you want to install and maintain – the one that runs on anything/anywhere, even wireless?”

Another recently announced contender is HDBaseT-IP™, a hybrid system that joins HDBaseT and AVoIP through a bi-directional bridge. This solution adds HDBaseT’s established 5Play feature set onto a flexible AVoIP platform, capable of adapting compression ratios depending on the input signal, to optimise for network speeds ranging from 1GbE through 2.5, 5 or 10GbE.

Above: HDBaseT & HDBaseT-IP bridged system supports 1-10GbE networks. Source: HDBaseT Alliance

Latency


All of this number crunching takes time, which we measure in fractions of a second. Latency causes user discomfort. More to the point, it’s annoying! Think about using a remote control with 100ms of latency while navigating TV channels or a Netflix menu…

The term ‘low latency’ typically means about 100ms or less. ‘Ultra-low’ is under 30ms. Really light compression such as line code methods can boast under 20 microseconds (µs), low enough to be undetectable by us mere mortals.

Some manufacturers may refer to frames of latency rather than milliseconds. This is somewhat more ambiguous as it depends on the source frame rate. One frame of latency at 30fps is twice as long as it is at 60fps. Just know what you’re dealing with by asking the vendor.
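Converting a frames-based figure into milliseconds is straightforward. The sketch below does the conversion and then sorts the result using the rough thresholds quoted above (under 30ms for ‘ultra-low’, 100ms or less for ‘low’). The function names and the "high" label for anything slower are my own, used for illustration only.

```python
def frames_to_ms(frames, fps):
    """Convert a latency quoted in frames to milliseconds at a given frame rate."""
    return frames * 1000.0 / fps


def latency_class(ms):
    """Rough classification using the article's approximate thresholds.
    The 'high' label is an assumption for anything above 100ms."""
    if ms < 30:
        return "ultra-low"
    if ms <= 100:
        return "low"
    return "high"


for fps in (30, 60):
    ms = frames_to_ms(1, fps)
    print(f"1 frame at {fps}fps = {ms:.1f}ms ({latency_class(ms)})")
```

Note how the same "one frame" claim lands in a different bracket depending on the source frame rate, which is exactly why it pays to ask the vendor.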

So what’s the ideal number? Well that depends on the application and user tolerance. Gamers want ZERO latency, but a sports bar distributing content to all their TVs may not care.

Above: Performance comparison guide between compression approaches. Source: CEDIA

Features & deployment

AVoIP can be set up as point-to-point, in which case it doesn’t even need a network switch, or it can work on standardised IP Multicast. The latter requires a switch that supports the Internet Group Management Protocol (IGMP), with plenty of backplane speed to support the traffic. Implementing Virtual LANs (VLANs) is optional: not recommended by some, mandated by others through configuration tools. If you’re not sure what any of that means, I’d recommend looking into CEDIA’s Networking education pathways, from introductory to advanced level.
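For the curious, this is roughly what joining a multicast group looks like from a receiver's point of view. It is a minimal sketch using Python's standard `socket` module, not how any particular AVoIP product does it. The group address and port in the usage comment are hypothetical. When the join call runs, the operating system sends the IGMP membership report that an IGMP-aware switch uses to decide which ports receive the stream.

```python
import socket


def build_membership_request(group_ip, interface_ip="0.0.0.0"):
    """Pack the group address followed by the local interface address,
    which is the structure IP_ADD_MEMBERSHIP expects."""
    return socket.inet_aton(group_ip) + socket.inet_aton(interface_ip)


def join_multicast_group(group_ip, port, interface_ip="0.0.0.0"):
    """Open a UDP socket and ask the OS to join the multicast group.
    The OS issues the IGMP membership report on our behalf."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        build_membership_request(group_ip, interface_ip),
    )
    return sock


# Hypothetical usage (address and port are made up for illustration):
# receiver = join_multicast_group("239.1.1.1", 5004)
# packet, sender = receiver.recvfrom(2048)
```

Without IGMP snooping on the switch, multicast traffic is typically flooded to every port, which is why the switch choice matters so much.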

One fantastic aspect of AVoIP is the software-defined nature of the tech. A popular example is Multiview, whereby multiple video images can be displayed simultaneously in various configurations. The thing to keep in mind with this is the additive bandwidth of multiple simultaneous images. For example, a 500Mbps stream can fit (with headroom) through a 1GbE network, but try getting five of them to a display at the same time! The maths simply doesn’t work.
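The additive bandwidth check is simple enough to sketch. The function name is illustrative, and the 1000Mbps default simply reflects a 1GbE link with no allowance for other traffic.

```python
def fits_on_link(stream_rates_mbps, link_mbps=1000):
    """Sum the simultaneous streams headed to one endpoint and compare
    the total with the link capacity. Returns (total, fits)."""
    total = sum(stream_rates_mbps)
    return total, total <= link_mbps


print(fits_on_link([500]))      # a single 500Mbps stream
print(fits_on_link([500] * 5))  # five of them in a multiview layout
```

A single stream leaves headroom, while five of them add up to 2,500Mbps, two and a half times what the link can carry.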

Cisco report that by 2020, 82% of global consumer Internet traffic will be video. The rise of AV-over-IP is an annex to this, for the distribution of in-home content. The increase in Internet connection speeds and download demands, along with the increase in domestic IP traffic, makes the progression to faster networking inevitable. In addition to this, video formats are constantly evolving. HDMI is already delivering 4K60 4:4:4 at 18Gbps, and is leaping up to a new peak of 48Gbps! Even with the step up to 10GbE networking, some form of compression is still an inevitability, but that’s not a bad thing.

David is a 23-year veteran of the industry, with experience spanning photographics and imaging, home theatre retail and custom installation, then product design, manufacturing, and distribution as founder of the Kordz Group. He’s worked as a CEDIA volunteer for a decade, establishing a reputation as one of the most prolific Subject Matter Experts, authors and educators on the topics of HDMI, HDBaseT and UHD. He has authored and presented numerous courses on three continents, along with white papers and technical webinars, and is now a full-time member of CEDIA’s professional staff.