In previous newsletters, we talked about the effect that packet loss and network latency would have on IP video. In both cases, we found that any deterioration of the video depended on whether the transport protocol in use was TCP or UDP. Today, we will study network jitter and see what problems it might cause.
First, we should clarify what the term network jitter means. Suppose a source sends packets into the network intended for a specific destination device and sends them at the uniform rate of x packets/second. They will usually arrive at the destination with an average rate of x packets/second. However, due to delays in network buffers, competing traffic in the network and possible quality of service policies, they will probably not be delivered uniformly. The amount of deviation from the expected arrival time is called jitter.
VoIP not only introduced network administrators to the concept of jitter, it also provides an excellent example of its importance. By default, most IP phones are configured to transmit 50 packets/second of digitized voice. That’s one packet every 20 ms. However, on a timeline, the arrival times might be t = 18, 41, 58, 84, 97, … (ms). To accommodate these somewhat erratic delivery times, the receiver uses a jitter buffer to capture the packets. This allows the receiving phone to play out the packets uniformly at exactly 50 packets/second.
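As a rough illustration of the arrival times above, the deviation from the nominal 20 ms schedule can be computed directly. This is a minimal sketch in Python; it assumes the nth packet is expected at n × 20 ms (the reference schedule is not stated in the example), and the variable names are only illustrative.

```python
# Nominal spacing for 50 packets/second, as described in the text.
PACKET_INTERVAL_MS = 20

# Sample arrival times from the example, in milliseconds.
arrivals_ms = [18, 41, 58, 84, 97]

# Assumption: packet n (counting from 1) is expected at n * 20 ms.
expected_ms = [PACKET_INTERVAL_MS * (n + 1) for n in range(len(arrivals_ms))]

# Deviation of each packet from its expected arrival time (negative = early).
deviations_ms = [a - e for a, e in zip(arrivals_ms, expected_ms)]

# Gaps between consecutive arrivals; with no jitter these would all be 20 ms.
gaps_ms = [later - earlier for earlier, later in zip(arrivals_ms, arrivals_ms[1:])]

# One simple summary figure: the mean absolute deviation.
mean_abs_deviation_ms = sum(abs(d) for d in deviations_ms) / len(deviations_ms)

print("deviations (ms):", deviations_ms)
print("inter-arrival gaps (ms):", gaps_ms)
print("mean absolute deviation (ms):", mean_abs_deviation_ms)
```

Under that assumption, the packets land between 3 ms early and 4 ms late, and the gaps between consecutive packets range from 13 ms to 26 ms rather than a steady 20 ms.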
But what happens if the jitter level becomes too high? The receive buffer might empty, which requires silence to be played. Or the buffer might overfill, which forces packets to be dropped. A little-known fact is that if just one packet is dropped, the receiver can repeat the previous sound and the user will not be aware of the missing sound. Our hearing isn’t capable of detecting corrupted sound if it lasts only 1/50 of a second. We might also ask: why not make the receive buffer very large to prevent loss? Unfortunately, that introduces latency, and if the delay exceeds about 150 ms, the other party on the call notices it as a pause.
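The trade-off between buffer depth, late packets, and overflow can be sketched with a toy simulation. This is a simplified model, not how any particular phone implements its jitter buffer: it assumes packets arrive in order, that packet n is due for playout at a fixed delay after the first arrival plus n × 20 ms, and that the buffer holds a fixed number of packets. The function name and parameters are hypothetical.

```python
def simulate_jitter_buffer(arrivals_ms, interval_ms=20, playout_delay_ms=10, capacity=4):
    """Toy jitter buffer model. Returns (played, late, dropped).

    Assumes arrivals_ms is sorted (packets arrive in order).
    """
    first_arrival = arrivals_ms[0]
    waiting_deadlines = []  # playout deadlines of packets currently buffered
    played = late = dropped = 0

    for n, arrival in enumerate(arrivals_ms):
        # Each packet is scheduled one interval after the previous one.
        deadline = first_arrival + playout_delay_ms + n * interval_ms

        # Packets whose deadline has passed have already left the buffer.
        waiting_deadlines = [d for d in waiting_deadlines if d > arrival]

        if arrival > deadline:
            late += 1      # missed its slot: silence or a repeated sound is played
        elif len(waiting_deadlines) >= capacity:
            dropped += 1   # buffer is full: the packet is discarded
        else:
            waiting_deadlines.append(deadline)
            played += 1

    return played, late, dropped


# Arrival times from the VoIP example, in milliseconds.
voip_arrivals_ms = [18, 41, 58, 84, 97]

for delay_ms in (0, 10):
    result = simulate_jitter_buffer(voip_arrivals_ms, playout_delay_ms=delay_ms)
    print(f"playout delay {delay_ms} ms -> (played, late, dropped) = {result}")
```

In this model, the sample arrival times leave two of the five packets late when there is no playout delay, while a 10 ms delay lets all five play on time. The cost is that every millisecond of playout delay adds directly to the end-to-end latency, which is why the buffer cannot simply be made very large.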
Both VoIP and most video conferencing systems use RTP (Real-time Transport Protocol) carried over UDP, so there are no retransmissions of lost packets. As a result, engineering the proper size of the receiver jitter buffer is critical for both VoIP and video conferencing systems.
But, as we might expect, video carried by TCP behaves quite differently. It can generally tolerate moderate amounts of jitter in the network. The most common form of TCP video is ABR (adaptive bit rate), which is used in streaming video. The TCP algorithm will make slight adjustments to the retransmission timer when jitter is high. The net effect on the playback of the video is minor. What can be a larger issue is that jitter can cause network buffers to become congested, significantly increasing delay. If this happens, requests for the next chunk of video (generally ten seconds of play time) might be delayed, and the user will see a pause in the video.
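The retransmission timer adjustment mentioned above can be illustrated with the standard smoothed round-trip estimator described in RFC 6298. The sketch below is a simplified version: it omits the clock granularity term and the minimum timeout that the RFC also specifies, and the sample round-trip times are invented for illustration, not measurements.

```python
def retransmission_timeout(rtt_samples_ms, alpha=1 / 8, beta=1 / 4, k=4):
    """Simplified RFC 6298 style estimate of the retransmission timeout (ms).

    Omits the clock granularity term and the minimum RTO from the RFC.
    """
    # First measurement: SRTT = R, RTTVAR = R / 2.
    srtt = rtt_samples_ms[0]
    rttvar = srtt / 2

    for rtt in rtt_samples_ms[1:]:
        # Update the variation estimate first, using the old smoothed RTT.
        rttvar = (1 - beta) * rttvar + beta * abs(srtt - rtt)
        srtt = (1 - alpha) * srtt + alpha * rtt

    return srtt + k * rttvar


# Hypothetical round-trip times: both series average 100 ms.
steady_ms = [100] * 20
jittery_ms = [100, 60, 140, 80, 120] * 4

print("RTO with steady RTTs (ms): ", round(retransmission_timeout(steady_ms), 1))
print("RTO with jittery RTTs (ms):", round(retransmission_timeout(jittery_ms), 1))
```

Both series have the same average round-trip time, but the jittery one produces a larger timeout because the variation term grows. TCP therefore waits a little longer before retransmitting, which is consistent with the minor effect on playback described above.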