A Special Session on ‘Video Coding for Large Scale HTTP Adaptive Streaming Deployments’ was organized by Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria), Mohammad Ghanbari (University of Essex, UK), and Alex Giladi (Comcast, USA) on July 2 at the 35th Picture Coding Symposium (PCS) 2021.
The following four papers were presented during this session:
1. VMAF-based Bitrate Ladder Estimation for Adaptive Streaming
Authors: Angeliki Katsenou (University of Bristol); Fan Zhang (University of Bristol); Kyle Swanson (Netflix); Mariana Afonso (Netflix); Joel Sole (Netflix); David Bull (University of Bristol)
Abstract: In HTTP Adaptive Streaming, video content is conventionally encoded by adapting its spatial resolution and quantization level to best match the prevailing network state and display characteristics. It is well known that the traditional solution, of using a fixed bitrate ladder, does not result in the highest quality of experience for the user. Hence, in this paper, we introduce a content-driven approach for estimating the bitrate ladder, based on spatio-temporal features extracted from the uncompressed content. The method implements a content-driven interpolation. It uses the extracted features to train a machine learning model to infer the curvature points of the Rate-VMAF curves in order to guide a set of initial encodings. We employ the VMAF quality metric as a means of perceptually conditioning the estimation. When compared to the generation of a reference ladder using exhaustive encoding, 76.63% of the estimated ladder’s Rate-VMAF points are identical to those of the reference ladder. The proposed method benefits from a significant (77.4%) reduction in the number of encodes required with only a small (1.04%) average Bjøntegaard Delta Rate increase.
2. Efficient Multi-Encoding Algorithms for HTTP Adaptive Bitrate Streaming
Authors: Vignesh V Menon (Alpen-Adria-Universität Klagenfurt); Hadi Amirpour (Alpen-Adria-Universität Klagenfurt); Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria); Mohammad Ghanbari (University of Essex, UK)
Abstract: Since video accounts for the majority of today’s internet traffic, the popularity of HTTP Adaptive Streaming (HAS) is increasing steadily. In HAS, each video is encoded at multiple bitrates and spatial resolutions (i.e., representations) to adapt to a heterogeneity of network conditions, device characteristics, and end-user preferences. Most of the streaming services utilize cloud-based encoding techniques which enable a fully parallel encoding process to speed up the encoding and consequently to reduce the overall time complexity. State-of-the-art approaches further improve the encoding process by utilizing encoder analysis information from already encoded representation(s) to improve the encoding time complexity of the remaining representations. In this paper, we investigate various multi-encoding algorithms (i.e., multi-rate and multi-resolution) and propose novel multi-encoding algorithms for large-scale HTTP Adaptive Streaming deployments. Experimental results demonstrate that the proposed multi-encoding algorithm optimized for the highest compression efficiency reduces the overall encoding time by 39% with a 1.5% bitrate increase compared to standalone encodings. Its optimized version for the highest time savings reduces the overall encoding time by 50% with a 2.6% bitrate increase compared to standalone encodings.
More details on x265 can be found here.
3. Open GOP Resolution Switching in HTTP Adaptive Streaming with VVC
Authors: Robert Skupin (Fraunhofer HHI); Christian Bartnik (Fraunhofer HHI); Adam Wieckowski (Fraunhofer HHI); Yago Sanchez de la Fuente (Fraunhofer HHI); Benjamin Bross (Fraunhofer HHI); Cornelius Hellge (Fraunhofer HHI); Thomas Schierl (Fraunhofer HHI)
Abstract: The user experience in adaptive HTTP streaming relies on offering bitrate ladders with suitable operation points for all users and typically involves multiple resolutions. While open GOP coding structures are generally known to provide substantial coding efficiency benefit, their use in HTTP streaming has been precluded through lacking support of reference picture resampling (RPR) in AVC and HEVC. The newly emerging Versatile Video Coding (VVC) standard supports RPR, but only conversational scenarios were primarily investigated during the design of VVC. This paper aims at enabling usage of RPR in HTTP streaming scenarios through analysing the drift potential of VVC coding tools and presenting a constrained encoding method that avoids severe drift artefacts in resolution switching with open GOP coding in VVC. In typical live streaming configurations, the presented method achieves -8.7% BD-rate reduction compared to closed GOP coding while in a typical Video on Demand configuration, -1.89% BD-rate reduction is reported. The constraints penalty compared to regular open GOP coding is 0.65% BD-rate in the worst case. The presented method was integrated into the publicly available open source VVC encoder VVenC v0.3.
The source code of VVenC can be accessed here. The source code of the VVC reference software (VTM) can be accessed here.
4. Towards Understanding of the Behavior of Web Streaming
Authors: Yuriy Reznik (Brightcove, Inc.); Karl Lillevold (Brightcove, Inc.); Abhijith Jagannath (Brightcove, Inc.); Xiangbo Li (Brightcove, Inc.)
Abstract: We study the behavior of a modern-era adaptive streaming system delivering videos embedded in web-pages. In such an application, the size of videos rendered on the screen may depend on user preferences, such as the position and size of a browser window. Moreover, the stream selection logic in such a system appears to be influenced not only by the available network bandwidth but also by the output video size, which, in many cases, limits the selection of higher quality streams. To explain this behavior, in this paper we introduce a simple analytical model of a client adapting to both bandwidth and player size. Using this model, we then compute stream selection probabilities and show that they are sufficiently close to respective statistics observed in practical experiments. Possible uses of this proposed client model are also suggested. Specifically, we show how it can be used to derive formulae for the average performance parameters of the system and also for posing related optimization problems.
The dataset mentioned during the presentation can be accessed here.
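As a rough illustration of the client behavior described in the fourth abstract, the following Python sketch selects a stream subject to both a bandwidth limit and a player-size cap. It is not the analytical model from the paper: the ladder values, the `Rendition` type, the `select_rendition` function, and the capping rule are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rendition:
    """One entry of a hypothetical bitrate ladder."""
    bitrate_kbps: int
    height: int  # encoded frame height in pixels


# Made-up ladder, not taken from any of the papers.
LADDER = [
    Rendition(400, 240),
    Rendition(1000, 360),
    Rendition(2500, 540),
    Rendition(4500, 720),
    Rendition(7000, 1080),
]


def select_rendition(ladder: List[Rendition],
                     bandwidth_kbps: float,
                     player_height: int) -> Rendition:
    """Pick the highest-bitrate rendition allowed by bandwidth and player size.

    Assumed rule (an illustration, not the paper's model): the client never
    selects a rendition taller than the smallest height that still covers
    the player window.
    """
    # Size cap: smallest encoded height >= player height, else the tallest one.
    covering = [r.height for r in ladder if r.height >= player_height]
    cap_height = min(covering) if covering else max(r.height for r in ladder)

    # Keep renditions that fit the bandwidth and respect the size cap.
    candidates = [
        r for r in ladder
        if r.bitrate_kbps <= bandwidth_kbps and r.height <= cap_height
    ]
    if not candidates:
        # Nothing fits: fall back to the lowest-bitrate rendition.
        return min(ladder, key=lambda r: r.bitrate_kbps)
    return max(candidates, key=lambda r: r.bitrate_kbps)
```

With this made-up ladder, a client with 6000 kbps of bandwidth but a 400-pixel-tall player would settle on the 540p stream rather than 720p, which mirrors the abstract's observation that output video size can limit the selection of higher quality streams.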
We thank the organizers of PCS’21, the authors and reviewers of the presented papers, and the attendees whose probing questions made the session a success.