Somewhat ironically I worked for several years at a company that's in the online video business (Brightcove), but didn't feel like I've really starting digging into the details of video until several years after leaving. My time there was instead spent on the data that surrounds and enables accessing the Video, and while I acquired a fair amount of familiarity with some of those proximal concerns, dealing with the binary content of the videos themselves were handled by other teams and not of particular interest to me (and I had more than enough to keep me busy).

It is likely because I was at a company surrounded by people with relevant expertise that I was able to focus elsewhere, and am now left to venture down more of those alleys myself while tackling a video project. Hopefully the project itself will be able to produce some shareable artifacts.

I'm currently experimenting a bite with (Low-Latency) LL-HLS. I was drawn to HLS due to my recollection of it being a wonderfully simple and widely supported approach (which is somewhat of a given for an Apple Web standard). LL-HLS is a more recent extension and so far it seems as though support outside of the official Apple tools remains fairly unpolished and so I'll be working through some of the plumbing as I go.

WebRTC is likely the cool new kid on the block (was Donny the cool one?), and while there's a fair amount of considerations that I've weighed suffice it to say in the short term that peer-to-peer is not a clear match for the problem I'm exploring nor is there any identified need for ultra low-latency such as in support of interactivity.

The LL HLS in-depth blog post provides a decent summary of the overall solution (in-depth is certainly a relative term). In particular this page provides a good background on the relationship between (I) key frames and independent/regular segments which leads in to wider considerations of being able to send less bits through batching which could not only yield throughput benefits but also offset transfer time.

This AWS blog post digs deeper into the different types of frames and how they compose into Groups of Pictures (GOPs).

Apple provides nicely consumable documentation for HLS and its range of variations. Some of the HLS support tools that it provides are behind the Apple Developer login for whatever (so I had to go through my periodic search for valid credentials).

A key discovery while chasing down some of the LL-HLS details which is touched on very briefly in the Apple docs but often glossed over elsewhere is that it will typically rely on less widely supported HTTP features such as Chunked Transfer Encoding). LL-HLS in particular (compared to LL-DASH/CMAF-CTE or HESP) offers a solution that does not require this support through supporting distinct files for partial segments and their aggregation into complete segments, but this either requires additional server-side logic to handle that aggregation without redundant transfer or what amounts to transferring all of the content twice. None of these considerations are particularly hairy problems but represent a significant shift compared to how readily non-LL counterparts can be delivered using commodity protocols with typical configurations.