Latency from Microphone to Headphone in AoIP Solutions

22-06-2017 at 14:19


Latency has been a hot topic on and off ever since digital audio emerged a few decades ago. The phenomenon, however, is by no means new…

In a digital world, the audio you listen to is delayed by a certain amount. The delay can be caused by a number of factors, but it is often so small that you will never notice it. In fact, ‘latency’ is simply another word for ‘delay’, and audio delay has always existed. Quite simply, sound waves have to travel over a distance, and it takes time to get from A to B. Period.

When we talk about sound reinforcement, for example, individual speakers are often deliberately delayed by a few milliseconds (ms) to align them at the listening position. And even in a strictly analog setup, latency is there, but it has never been perceived as a problem.

On a live music stage, you will often see wedges – or monitors – placed on the floor. They are most likely triangular in shape, designed to play upwards and towards the musician. This is necessary because if the stage is fairly large, the main speakers would be so far away that the artists would experience a very noticeable amount of delay. With monitors nearby, on the other hand, they are perfectly fine. But even the distance from the floor to the musician's ear counts: if it is 2 meters, there is actually a latency of roughly 6ms.
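As a rough check of the figure above, acoustic delay is simply distance divided by the speed of sound. The sketch below assumes 343 m/s (dry air at about 20 °C), a value the article does not state, and the function name is purely illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at roughly 20 °C


def acoustic_delay_ms(distance_m: float) -> float:
    """Return the time in milliseconds sound needs to travel distance_m."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


# Floor wedge about 2 m from the musician's ear
print(f"{acoustic_delay_ms(2.0):.1f} ms")  # prints "5.8 ms"
```

Under this assumption, 2 meters comes out at about 5.8ms, which rounds to the 6ms quoted above.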

‘Traditional’ Points of Latency

Now, the above example was strictly analog, but as mentioned, once we started working with digital audio, latency became an issue. So, even before introducing potential latency on an IP-based audio network, it is important to clarify that there are numerous points where latency occurs, in both the analog and digital domains. Let’s take a closer look at the case of a vocal recording in a typical studio environment:

• Distance from the vocalist's mouth to the microphone – 25cm of distance equals approximately 0.8ms of analog latency.

• AD conversion – Filters typically introduce around 20 samples of digital delay, which translates into 0.4ms of latency.

• ASIO / DAW – Getting in and out of your DAW also introduces digital latency. Typically this is 64 samples, which equals 1.3ms in each direction, or 2.6ms in total.

• Software Plugins – In many cases added plugins will also add to the total latency, but since these are not necessarily added during tracking (latency is less of an issue when mixing, of course), we will not add a specific amount of latency in this example.

• DA conversion – Filters typically introduce between 10-40 samples of digital delay. If we use 30 samples as an example, this equals 0.6ms of digital latency.

• Monitoring – In this case a vocalist would wear headphones and there would be no analog latency worth mentioning, but if for instance a guitarist or bassist were tracking in the control room, there would most likely be at least 1-2ms of analog latency from the monitors to his/her ears.

OK. If we add up those points of latency, you would experience a total of 4.4ms. This is at a sample rate of 48 kHz; if you record at 96 kHz, the latency introduced digitally (AD/DA conversion and ASIO/DAW) is halved, from 3.6ms to 1.8ms. The distance from mouth to mic is of course the same, so in total the latency at 96 kHz would be 2.6ms.
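The sum above can be reproduced with a few lines of Python. This is a minimal sketch using the example figures from the list; all names are illustrative. Because it converts samples to milliseconds without rounding each stage first, the totals land about 0.1ms above the rounded figures quoted in the text.

```python
def samples_to_ms(samples: int, sample_rate_hz: int) -> float:
    """Convert a delay in samples to milliseconds at the given sample rate."""
    return samples / sample_rate_hz * 1000.0


# Example figures from the list above (delay in samples per stage)
DIGITAL_STAGES = {
    "AD conversion": 20,
    "ASIO/DAW in": 64,
    "ASIO/DAW out": 64,
    "DA conversion": 30,
}
MOUTH_TO_MIC_MS = 0.8  # analog, independent of the sample rate


def total_latency_ms(sample_rate_hz: int) -> float:
    """Sum the analog and digital latency for one sample rate."""
    digital_ms = sum(samples_to_ms(n, sample_rate_hz) for n in DIGITAL_STAGES.values())
    return MOUTH_TO_MIC_MS + digital_ms


for rate in (48_000, 96_000):
    print(f"{rate // 1000} kHz: {total_latency_ms(rate):.1f} ms")
# prints "48 kHz: 4.5 ms" and "96 kHz: 2.7 ms"
```

Plugin latency and control-room monitoring are left out here, just as in the example.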

AoIP Latency

Now, let’s find out how much additional latency you could expect if you decide to set up an IP-based audio solution such as Audinate’s Dante network protocol.

There is no fixed amount of latency introduced by a Dante network. It depends on a number of things, including your computer, the number of switches in the network, etc. But if you have fewer than 4 switches in your network, you could likely go as low as 0.25ms of latency on each side of the ASIO/DAW instance.

In that case, the network adds 0.5ms of latency (2 × 0.25ms) to the totals above:

Total latency @48 kHz: 4.9ms

Total latency @96 kHz: 3.1ms

The important thing to keep in mind, though, is that if we look at the percentage of latency introduced by adding IP audio, the difference is very subtle: 

Percentage of latency caused by the network:

48 kHz: approx. 10%

96 kHz: approx. 16%
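These percentages follow directly from the totals. A small sketch, taking the 4.4ms and 2.6ms baselines and the 0.25ms-per-side Dante figure from the text as given (the function name is illustrative):

```python
DANTE_MS_PER_SIDE = 0.25  # example figure from the text, fewer than 4 switches
NETWORK_MS = 2 * DANTE_MS_PER_SIDE  # once into and once out of the ASIO/DAW instance


def network_share_percent(baseline_ms: float, network_ms: float = NETWORK_MS) -> float:
    """Return the network's share of the total latency, in percent."""
    return network_ms / (baseline_ms + network_ms) * 100.0


for label, baseline_ms in (("48 kHz", 4.4), ("96 kHz", 2.6)):
    total_ms = baseline_ms + NETWORK_MS
    share = network_share_percent(baseline_ms)
    print(f"{label}: total {total_ms:.1f} ms, network share {share:.0f}%")
# prints "48 kHz: total 4.9 ms, network share 10%"
# and    "96 kHz: total 3.1 ms, network share 16%"
```

The share is larger at 96 kHz only because the rest of the chain got faster; the network's 0.5ms is the same in both cases.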


We think that it is fair to conclude that adding AoIP is not what is going to make or break your recording in terms of latency. It is a myth that you cannot track live performances that need to be monitored in real time on an audio network. It is indeed perfectly possible, and most artists would be completely fine with a latency below 6ms. Again, just consider the musician performing live on stage, listening to monitor wedges on the floor or an amplifier several meters away.

As a closing statement, we would like to stress that the numbers we used in the above examples are not set in stone. They are examples and may therefore vary slightly depending on the manufacturer of the equipment you use, your computer’s performance, etc. But we hope that you now have a better idea of how your overall latency would be affected if you are considering – or have decided on – a move to the world of networked audio.

Finally, if you would like to learn more about Audinate’s Dante protocol, we have compiled 10 of the most commonly asked questions in this article… 

Even More Details

Our Business Development Manager, Jan Lykke, gave a speech on this very topic at the AES conference and exhibition in Berlin, Germany, in May 2017. If you want even more details, please watch his presentation right here: 


NTP Technology | Nybrovej 99 | 2820 Gentofte | Denmark |
Phone (+45) 45 96 88 80 | Fax (+45) 44 53 11 70 | Email:
