Never Use Datagrams: Here's Why

By Mark Howell | 23 June 2024 | 8 min read

So you’re reading this blog over the internet. I would wager you do a lot of things over the internet. If you’ve built an application on the internet, you’ve undoubtedly had to decide whether to use TCP or UDP. Maybe you’re trying to make, oh I dunno, a live video protocol or something. There are more choices than just those two but let’s pretend like we’re a networking textbook from the 90s.
Common wisdom: use TCP if you want reliable delivery and UDP if you want unreliable delivery. What the heck does that mean? Who wants unreliability? Nobody wants memory corruption or deadzones or artifacts or cosmic rays. Unreliability is a consequence, not a goal. Unless you’re making some cursed GIF art.

Source Properties

So what do we actually want? If you go low enough level, you can use electrical impulses to do neat stuff. But we don’t want to deal with electrical impulses. We want higher level functionality. Fortunately, software engineering is all about standing on the shoulders of others. There are layers on top of layers on top of layers of abstraction. Each layer provides properties so you don’t have to reinvent the personal computer every time. Our job as developers is to decide which shoulders we want to stand on. But some shoulders are awful, so we have to be selective.
Over-abstraction is bad but so is under-abstraction. What user experience are we trying to build, and how can we leverage the properties of existing layers to achieve that?

“Unreliable”

There was a recent MoQ interim in Denver. For those unaware, it’s basically a meetup of masochistic super nerds who want to design a live video protocol. We spent hours debating the semantic differences between FETCH and SUBSCRIBE among other riveting topics. I’m the one in the back right corner, the one with the stupid grin on my face. A few times, it was stated that SUBSCRIBE should be unreliable. The room cringed, and I cringed hard enough to write this blog post.
What I actually want is timeliness. If the internet can choose between delivering two pieces of data, I want it to deliver the newest one. In the live video scenario, this is the difference between buffering and skipping ahead. If you’re trying to have a conversation with someone on the internet, there can’t be a delay. You don’t want a buffering spinner on top of their face, nor do you want to hear what they said 5 seconds ago. To accomplish timeliness, the live video industry often uses UDP datagrams instead of TCP streams. As does the video game industry, apparently. But why?


Datagrams

A datagram, aka an IP packet, is an envelope of 0s and 1s that gets sent from a source address to a destination address. Each link along the path has its own maximum packet size (the MTU), which is super annoying, but 1200 bytes is generally safe. Of course, datagrams can be silently lost or even arrive out of order. But the physical world doesn’t work in discrete packets; it’s yet another layer of abstraction. I’m not a scientist-man, but the data is converted to analog signals and sent through some medium. It all gets serialized and deserialized and buffered and queued and retransmitted and dropped and corrupted and delayed and reordered and duplicated and lost and all sorts of other things. So why does this abstraction exist?
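To make the size limit, loss, and reordering concrete, here is a minimal sketch of what an application has to do before it can even think about sending datagrams. The 4-byte sequence-number header and the `packetize` / `reassemble` names are assumptions made up for this illustration; they are not part of any real protocol.

```python
import struct
from typing import List, Tuple

MAX_DATAGRAM = 1200           # the "generally safe" size from the text
HEADER = struct.Struct("!I")  # hypothetical header: 4-byte sequence number


def packetize(payload: bytes, max_size: int = MAX_DATAGRAM) -> List[bytes]:
    """Split payload into chunks that fit in max_size, each tagged with a sequence number."""
    chunk = max_size - HEADER.size
    return [
        HEADER.pack(seq) + payload[off:off + chunk]
        for seq, off in enumerate(range(0, len(payload), chunk))
    ]


def reassemble(datagrams: List[bytes]) -> Tuple[bytes, List[int]]:
    """Put whatever arrived back in order and report the gaps.

    Only gaps below the highest sequence number seen are detected; a lost
    final datagram goes unnoticed in this toy version.
    """
    received = {}
    for datagram in datagrams:
        (seq,) = HEADER.unpack_from(datagram)
        received[seq] = datagram[HEADER.size:]  # duplicates simply overwrite
    if not received:
        return b"", []
    missing = [seq for seq in range(max(received) + 1) if seq not in received]
    data = b"".join(received[seq] for seq in sorted(received))
    return data, missing
```

Even this toy version already needs sequence numbers and gap detection, and it still has no answer for what to do about the gaps.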

Internet of Queues

It’s pretty simple actually: something’s got to give.
Let the packets hit the FLOOR. When there’s too much data sent over the network, the network has to decide what to do. In theory, it could drop random bits but oh lord that is a nightmare, as evidenced by over-the-air TV. So instead, a bunch of smart people got together and decided that routers should drop at packet boundaries. But why drop packets again?
Why can’t we just queue and deliver them later? Well yeah, that’s what a lot of routers do these days since RAM is cheap. It’s a phenomenon called bufferbloat, and my coworkers can attest that it’s my favorite thing to talk about. 🐷 But RAM is a finite resource so the packets will eventually get dropped. Then you finally get the unreliability you wanted all along…
Oh no. Oh shit, I forgot, I actually want timeliness and bufferbloat is the worst possible scenario.
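Here is a rough sketch of why a big buffer hurts timeliness. It models a single router queue in one-second ticks; the function name, the tick model, and the numbers in the usage note are all assumptions for illustration, not measurements of any real router.

```python
def simulate_queue(send_rate, link_rate, buffer_size, ticks):
    """Return (queuing delay after each tick in seconds, total packets dropped).

    Rates are in packets per second, buffer_size is in packets,
    and each tick represents one second.
    """
    queued = 0
    dropped = 0
    delays = []
    for _ in range(ticks):
        queued += send_rate                  # packets arriving this tick
        queued -= min(queued, link_rate)     # packets the link manages to forward
        if queued > buffer_size:             # RAM is finite: drop the overflow
            dropped += queued - buffer_size
            queued = buffer_size
        delays.append(queued / link_rate)    # how long a new packet would wait
    return delays, dropped
```

Sending 120 packets per second into a 100 packets per second link with a 50-packet buffer caps the delay at half a second and starts dropping by the third tick. Swap in a 1000-packet buffer and nothing is dropped for a while, but the delay climbs by 0.2 seconds every tick. That second case is bufferbloat.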
Naively, you would expect the internet to deliver packets immediately, with some random packets getting dropped. However, bufferbloat causes all packets to get queued, possibly for seconds, ruling out any hope of timely delivery. How do you avoid this? Basically, the only way to avoid queuing is to detect it, and then send less. The sender uses some feedback from the receiver to determine how long it took a packet to arrive. We can use that signal to infer when routers are queuing packets, and back off to drain any queues. This is called congestion control and it’s a huge, never-ending area of research. I briefly summarized it in the Replacing WebRTC post if you want more CONTENT. But all you need to know is that sending packets at unlimited rate is a recipe for disaster.
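The detect-and-back-off loop described above can be sketched in a few lines. This is a deliberately naive toy, not BBR or any real congestion controller: the class name, the thresholds, and the additive-increase / multiplicative-decrease constants are all assumptions picked for illustration.

```python
class DelayBasedController:
    """Toy controller: grow the rate slowly, cut it when RTT rises above the baseline."""

    def __init__(self, rate_bps=1_000_000, min_rate_bps=100_000,
                 queue_threshold_s=0.025, increase_bps=50_000, backoff=0.85):
        self.rate_bps = rate_bps
        self.min_rate_bps = min_rate_bps
        self.queue_threshold_s = queue_threshold_s  # extra delay we tolerate
        self.increase_bps = increase_bps            # additive probe for more bandwidth
        self.backoff = backoff                      # multiplicative cut on queuing
        self.base_rtt_s = None                      # lowest RTT seen, i.e. empty queues

    def on_feedback(self, rtt_s):
        """Update the send rate from one RTT sample and return the new rate."""
        if self.base_rtt_s is None or rtt_s < self.base_rtt_s:
            self.base_rtt_s = rtt_s
        # Anything above the baseline is assumed to be time spent in router queues.
        queuing_delay = rtt_s - self.base_rtt_s
        if queuing_delay > self.queue_threshold_s:
            self.rate_bps = max(self.min_rate_bps, self.rate_bps * self.backoff)
        else:
            self.rate_bps += self.increase_bps
        return self.rate_bps
```

Feed it RTT samples of 40 ms, 45 ms, and then 120 ms and the rate creeps up twice before getting cut. Real controllers also have to cope with noisy samples, route changes, and competing flows, which is where the never-ending research comes in.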

You, The Application Developer

Speaking of a recipe for disaster. Let’s say you made the mistake of using UDP directly because you want them datagrams. You’re bound to mess up, and you won’t even realize why. If you want to build your own transport protocol on top of UDP, you “need” to implement:

  • Congestion Control

  • Reliability

  • Ordering
    And if you want a great protocol, you also need:

  • Encryption

  • Multiplexing

  • Prioritization
    And if you want an AMAZING protocol, you also need:

  • Flow control

  • Stateless Resets

  • Packet Pacing
    Let’s be honest, you don’t even know what half of those are, nor why they are worth implementing. Just use a QUIC library instead. But if you still insist on UDP, you’re actually in good company with a lot of the video industry. Building a live video protocol on top of UDP is all the rage; for example, WebRTC, SRT, Sye, RIST, etc. With the exception of Google, it’s very easy to make a terrible protocol on top of UDP. Look forward to the upcoming Replacing RTMP but please not with SRT blog post!
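If “packet pacing” is one of the items that drew a blank: the idea is to spread packets out over time at the target rate instead of blasting them in a burst that fills a router queue. A minimal sketch, with a made-up function name and no claim to match how any QUIC library actually does it:

```python
def pacing_schedule(packet_sizes, rate_bps, start_s=0.0):
    """Return a send time in seconds for each packet so the average rate is rate_bps."""
    times = []
    t = start_s
    for size_bytes in packet_sizes:
        times.append(t)
        # Each packet "occupies" the link for size / rate seconds at the target rate.
        t += size_bytes * 8 / rate_bps
    return times
```

For example, three 1200-byte packets at 9.6 Mbit/s come out spaced one millisecond apart. That is one item out of nine, which is roughly the point of the list.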

Timeliness

But remember, I ultimately want to achieve timeliness. How can we do that with QUIC?

  1. Avoid bloating the buffers 🐷: Use a delay-based congestion controller like BBR that will detect queueing and back off. There are better ways of doing this, like how WebRTC uses transport-wide-cc (Transport-wide Congestion Control), which I’ll personally make sure gets added to QUIC.

  2. Split data into streams: The bytes within each stream are ordered, reliable, and can be any size; it’s nice and convenient. Each stream could be a video frame, or a game update, or a chat message, or a JSON blob, or really any atomic unit.

  3. Prioritize the streams: Streams are independent and can arrive in any order. But you can tell the QUIC stack to focus on delivering important streams first. The low priority streams will be starved, and can be closed to avoid wasting bandwidth.

That’s it. That’s the secret behind Media over QUIC. Now all that’s left is to bikeshed the details. And guess what? This approach works with higher latency targets too. It turns out that the fire-and-forget nature of datagrams only works when you need real-time latency. For everything else, there’s QUIC streams. You don’t need datagrams.
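The prioritize-and-starve idea can be sketched without touching a real QUIC stack. Everything below is hypothetical: the `Stream` class, the “higher number wins” priority convention, and the age-based cancel rule are assumptions for illustration, and an actual QUIC library will expose this differently.

```python
class Stream:
    """One atomic unit of data, e.g. a single video frame."""

    def __init__(self, stream_id, priority, pending, age_s=0.0):
        self.stream_id = stream_id
        self.priority = priority  # higher value = more important (assumed convention)
        self.pending = pending    # bytes still waiting to be sent
        self.age_s = age_s        # how long the stream has been waiting


def schedule(streams, budget_bytes):
    """Spend the congestion-controlled byte budget on the most important streams first."""
    sent = {}
    for stream in sorted(streams, key=lambda s: s.priority, reverse=True):
        if budget_bytes == 0:
            break  # everything below this priority is starved for this round
        n = min(stream.pending, budget_bytes)
        stream.pending -= n
        budget_bytes -= n
        if n:
            sent[stream.stream_id] = n
    return sent


def cancel_stale(streams, max_age_s):
    """Split streams into (kept, cancelled): starved streams past max_age_s get closed."""
    kept, cancelled = [], []
    for stream in streams:
        stale = stream.pending > 0 and stream.age_s > max_age_s
        (cancelled if stale else kept).append(stream)
    return kept, cancelled
```

With a 3000-byte budget and three streams, the two highest priorities get served and the lowest gets nothing. If that stream has been waiting longer than the latency target, it is cancelled instead of retransmitted, which for live video means skipping ahead rather than buffering. Giving newer frames a higher priority is one possible policy, not the only one.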

In Defense of Datagrams

“Never use Datagrams” got you to click, but the direction of QUIC and MoQ seems to tell another story: both have added datagram support. Like all things designed by committee, there’s going to be some compromise. There are some folks who think datagram support is important. And frankly, it’s trivial to support and allow people to experiment. For example, Opus has FEC support built-in, which is why MoQ supports the ability to send each audio “frame” as a datagram. But it’s a trap, designed to lure in developers who don’t know any better and who wouldn’t give up their precious UDP datagrams otherwise.
There is no conclusion. This is a rant. Please don’t design your application on top of datagrams. Old protocols like DNS get a pass, but be like DNS over HTTPS instead. And please, please don’t make yet another video protocol on top of UDP.

Remember these 3 key ideas for your startup:

  1. Timeliness Over Reliability: When developing live video protocols or similar applications, prioritize timeliness rather than just aiming for unreliability. In real-time communication, timely delivery of the latest data trumps ensuring every single packet is delivered.

  2. Leverage Existing Solutions: Instead of reinventing the wheel by developing your own transport protocol on top of UDP, consider leveraging existing libraries like QUIC which already have optimized features for congestion control, encryption, and data prioritization.

  3. Avoid Bufferbloat: Detect and mitigate network queuing to avoid bufferbloat, which can significantly delay data delivery. Implement delay-based congestion controllers to manage network traffic efficiently and ensure timely data transmission.
For more details, see the original source.

