E-WISH: An Energy-aware ABR Algorithm For Green HTTP Adaptive Video Streaming

Posted on January 22, 2024 by

ACM Mile-High Video 2024

February 11-14, 2024, Marriott DTC, Denver, US

Daniele Lorenzi (AAU, Austria), Minh Nguyen (AAU, Austria), Farzad Tashtarian (AAU, Austria), and Christian Timmerer (AAU, Austria)

Abstract:

HTTP Adaptive Streaming (HAS) is the de facto solution for delivering video content over the Internet. The climate crisis has highlighted the environmental impact of information and communication technology (ICT) solutions and the need for green solutions to reduce ICT’s carbon footprint. As video streaming dominates Internet traffic, research in this direction is more vital now than ever. HAS relies on Adaptive BitRate (ABR) algorithms, which dynamically choose suitable video representations to accommodate device characteristics and network conditions. ABR algorithms typically prioritize video quality, ignoring the energy impact of their decisions. Consequently, they often select the video representation with the highest bitrate under good network conditions, thereby increasing energy consumption. This is problematic, especially for energy-limited devices, because it affects the device’s battery life and the user experience. To address these issues, we propose E-WISH, a novel energy-aware ABR algorithm, which extends the existing WISH algorithm to consider energy consumption while selecting the quality for the next video segment. According to the experimental findings, E-WISH improves Quality of Experience (QoE) by up to 52% according to the ITU-T P.1203 model (mode 0) while simultaneously reducing energy consumption by up to 12% with respect to state-of-the-art approaches.

Keywords: HTTP adaptive streaming, Energy, Adaptive Bitrate (ABR), DASH
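
The idea of trading quality against energy when picking the next segment can be illustrated with a minimal sketch. The abstract does not give E-WISH's cost function, so everything below is an assumption made for illustration: the `Representation` class, the per-segment energy estimates, the two cost terms, and the weights `w_quality` and `w_energy` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Representation:
    bitrate_kbps: float  # encoded bitrate of the representation
    energy_j: float      # hypothetical per-segment energy estimate (joules)


def select_representation(reps, throughput_kbps, w_quality=1.0, w_energy=1.0):
    """Pick the representation with the lowest weighted cost.

    The cost terms and weights are illustrative assumptions, not E-WISH's.
    """
    feasible = [r for r in reps if r.bitrate_kbps <= throughput_kbps]
    if not feasible:
        # Nothing fits the throughput: fall back to the lowest bitrate.
        return min(reps, key=lambda r: r.bitrate_kbps)

    max_bitrate = max(r.bitrate_kbps for r in reps)
    max_energy = max(r.energy_j for r in reps)

    def cost(r):
        quality_cost = 1.0 - r.bitrate_kbps / max_bitrate  # lower bitrate costs more
        energy_cost = r.energy_j / max_energy              # more energy costs more
        return w_quality * quality_cost + w_energy * energy_cost

    return min(feasible, key=cost)


# Hypothetical three-rung ladder with made-up energy figures.
ladder = [
    Representation(1000, 2.0),
    Representation(3000, 3.0),
    Representation(6000, 8.0),
]
quality_only = select_representation(ladder, 8000, w_energy=0.0)
energy_aware = select_representation(ladder, 8000, w_energy=1.0)
```

With the energy weight set to zero, the sketch behaves like a quality-first ABR and takes the highest feasible bitrate; with a non-zero energy weight, it can settle on a lower rung when the energy cost of the top rung outweighs its quality gain.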

Posted in ATHENA | Comments Off

Optimal Quality and Efficiency in Adaptive Live Streaming with JND-Aware Low Latency Encoding

Posted on January 22, 2024 by

MHV 2024: ACM Mile High Video

11 – 14 Feb 2024 | Denver, United States

Conference Website

[PDF][Slides]

Vignesh V Menon (Fraunhofer HHI), Jingwen Zhu (École Centrale Nantes), Prajit T Rajendran (Université Paris-Saclay), Samira Afzal (Alpen-Adria-Universität Klagenfurt), Klaus Schoeffmann (Alpen-Adria-Universität Klagenfurt), Patrick Le Callet (École Centrale Nantes), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)

Abstract: In HTTP adaptive live streaming applications, video segments are encoded at a fixed set of bitrate-resolution pairs known as a bitrate ladder. Live encoders use the fastest available encoding configuration, referred to as a preset, to ensure the minimum possible latency in video encoding. However, an optimized preset and an optimized number of CPU threads for each encoding instance may result in (i) increased quality and (ii) efficient CPU utilization while encoding. For low-latency live encoders, the encoding speed is expected to be greater than or equal to the video framerate. In this light, this paper introduces a Just Noticeable Difference (JND)-Aware Low latency Encoding Scheme (JALE), which uses random forest-based models to jointly determine the optimized encoder preset and thread count for each representation, based on video complexity features, the target encoding speed, the total number of available CPU threads, and the target encoder. Experimental results show that, on average, JALE yields a quality improvement of 1.32 dB PSNR and 5.38 VMAF points at the same bitrate, compared to the fastest preset encoding of the HTTP Live Streaming (HLS) bitrate ladder using the x265 open-source HEVC encoder with eight CPU threads used for each representation. These enhancements are achieved while maintaining the desired encoding speed. Furthermore, on average, JALE results in an overall storage reduction of 72.70%, a reduction in the total number of CPU threads used by 63.83%, and a 37.87% reduction in the overall encoding time, considering a JND of six VMAF points.

Keywords: Live streaming, low latency, encoder preset, CPU threads, HEVC.
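
The selection step implied by the abstract, choosing a preset and thread count whose encoding speed is at least the target, can be sketched as follows. This is not the JALE implementation: the paper predicts speeds with random forest models, whereas this sketch assumes the predictions are already available in a plain dictionary. The function name, the preference order (slowest preset first, then fewest threads), and the sample numbers are assumptions.

```python
# x265 preset names, ordered from fastest to slowest.
PRESETS = ["ultrafast", "superfast", "veryfast", "faster", "fast", "medium"]


def choose_preset_and_threads(predicted_fps, target_fps, max_threads):
    """Return the (preset, threads) pair to use for one representation.

    predicted_fps maps (preset, threads) -> predicted encoding speed in fps.
    Preference: the slowest preset that meets target_fps, then the fewest threads.
    Returns None if no candidate meets the target.
    """
    candidates = [
        (PRESETS.index(preset), threads)
        for (preset, threads), fps in predicted_fps.items()
        if threads <= max_threads and fps >= target_fps
    ]
    if not candidates:
        return None
    # Slowest preset first (highest index), then fewest threads.
    idx, threads = min(candidates, key=lambda c: (-c[0], c[1]))
    return PRESETS[idx], threads


# Hypothetical speed predictions for a single representation.
predictions = {
    ("ultrafast", 2): 70.0,
    ("ultrafast", 8): 160.0,
    ("veryfast", 4): 45.0,
    ("veryfast", 8): 75.0,
    ("fast", 8): 35.0,
}
choice = choose_preset_and_threads(predictions, target_fps=60.0, max_threads=8)
```

Preferring the slowest feasible preset reflects the abstract's point that a slower preset can raise quality as long as the encoder still keeps up with the video framerate.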

Posted in GAIA | Comments Off

Content-adaptive Video Coding for HTTP Adaptive Streaming

Posted on January 17, 2024 by

Klagenfurt, January 15, 2024

Congratulations to Dr. Vignesh V Menon for successfully defending his dissertation on “Content-adaptive Video Coding for HTTP Adaptive Streaming” at Universität Klagenfurt in the context of the Christian Doppler Laboratory ATHENA.

Abstract

In today’s dynamic streaming landscape, where viewers access content on various devices and encounter fluctuating network conditions, optimizing video delivery for each unique scenario is imperative. Video content complexity analysis, content-adaptive video coding, and multi-encoding methods are fundamental for the success of adaptive video streaming, as they serve crucial roles in delivering high-quality video experiences to a diverse audience. Video content complexity analysis allows us to comprehend the video content’s intricacies, such as motion, texture, and detail, providing valuable insights to enhance encoding decisions. By understanding the content’s characteristics, we can efficiently allocate bandwidth and encoding resources, thereby improving compression efficiency without compromising quality. Content-adaptive video coding techniques built upon this analysis involve dynamically adjusting encoding parameters based on the content complexity. This adaptability ensures that the video stream remains visually appealing and artifacts are minimized, even under challenging network conditions. Multi-encoding methods further bolster adaptive streaming by offering faster encoding of multiple representations of the same video at different bitrates. This versatility reduces computational overhead and enables efficient resource allocation on the server side. Collectively, these technologies empower adaptive video streaming to deliver optimal visual quality and uninterrupted viewing experiences, catering to viewers’ diverse needs and preferences across a wide range of devices and network conditions. Embracing video content complexity analysis, content-adaptive video coding, and multi-encoding methods is essential to meet modern video streaming platforms’ evolving demands and create immersive experiences that captivate and engage audiences. In this light, this dissertation proposes contributions categorized into four classes:

Video complexity analysis: For the online analysis of video content complexity, selecting low-complexity features is critical to ensure low-latency video streaming without disruptions. The spatial information (SI) and temporal information (TI) are state-of-the-art spatial and temporal complexity features. However, these features are not optimized for online analysis in live-streaming applications. Moreover, the correlation of the features to the video coding parameters like bitrate and encoding time is not significant. This thesis proposes discrete cosine transform (DCT)-energy-based spatial and temporal complexity features to overcome these limitations and provide an efficient video complexity analysis regarding accuracy and speed for every video (segment). The proposed features are determined at an average rate of 370 frames per second for ultra high definition (UHD) video content and used in estimating encoding parameters online.
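
To make the notion of a DCT-energy-based spatial feature concrete, here is a minimal sketch that measures how much of a luma block's energy sits outside the DC coefficient. It is a simplification under stated assumptions: the text above does not give the thesis's exact feature definition or coefficient weighting, so the unweighted sum of absolute AC coefficients and the function names `dct2` and `ac_energy` are illustrative only. The DCT is written out in plain Python for clarity rather than speed.

```python
import math


def dct2(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # basis[k][x] = cos(pi * (2x + 1) * k / (2n))
    basis = [
        [math.cos(math.pi * (2 * x + 1) * k / (2 * n)) for x in range(n)]
        for k in range(n)
    ]
    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += block[x][y] * basis[u][x] * basis[v][y]
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def ac_energy(block):
    """Sum of absolute AC coefficients (everything except the DC term)."""
    coeffs = dct2(block)
    total = sum(abs(c) for row in coeffs for c in row)
    return total - abs(coeffs[0][0])


# A flat block carries no texture; a checkerboard is highly textured.
flat = [[128] * 8 for _ in range(8)]
checker = [[255 if (x + y) % 2 == 0 else 0 for y in range(8)] for x in range(8)]
flat_energy = ac_energy(flat)
checker_energy = ac_energy(checker)
```

A flat block yields an AC energy of essentially zero, while the checkerboard yields a large value, which is the kind of separation a texture-complexity feature needs.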

Content-adaptive encoding optimizations: Content-adaptive encoding algorithms enable better control of codec-specific parameters and mode decisions inside the encoder to achieve higher bitrate savings and/or save encoding time. The contributions of this class are listed as follows:

A scene detection algorithm is proposed using the video complexity analysis features. The proposed algorithm yields a true positive rate of 78.26% and a false positive rate of 0.01%, compared to the state-of-the-art algorithm’s true positive rate of 53.62% and false positive rate of 0.03%.
An intra coding unit depth prediction (INCEPT) algorithm is proposed, which limits rate-distortion optimization for each coding tree unit (CTU) in high efficiency video coding (HEVC) by utilizing the spatial correlation with the neighboring CTUs, which is computed using the luma texture complexity feature introduced in the first contribution class. Experimental results show that INCEPT achieves a 23.24% reduction in the overall encoding time with a negligible loss in compression efficiency.

Online per-title encoding optimizations: Per-title encoding has gained traction over recent years in adaptive streaming applications. Each video is segmented into multiple scenes, and optimal encoding parameters are selected. The contributions in this category are listed as follows:

Online resolution prediction scheme (ORPS), which predicts optimized resolution using the video content complexity of the video segment and the predefined set of target bitrates, is proposed. ORPS yields an average bitrate reduction of 17.28% and 22.79% for the same PSNR and VMAF, respectively, compared to the standard HTTP live streaming (HLS) bitrate ladder using x265 constant bitrate (CBR) encoding.
Online framerate prediction scheme (OFPS) is proposed to predict optimized framerate using the video content complexity of the video segment and the predefined set of target bitrates. OFPS yields an average bitrate reduction of 15.87% and 18.20% for the same PSNR and VMAF, respectively, compared to the original framerate CBR encoding of UHD 120fps sequences using x265, accompanied by an overall encoding time reduction of 21.82%.
Just noticeable difference (JND)-aware bitrate ladder prediction scheme (JBLS) is proposed, which predicts optimized bitrate-resolution pairs such that there is a perceptual quality difference of one JND between representations. An average bitrate reduction of 12.94% and 17.94% for the same PSNR and VMAF, respectively, is observed, compared to the HLS CBR bitrate ladder encoding using x265. For a target JND of 6 VMAF points, JBLS achieves a storage reduction of 42.48% and a 25.35% reduction in encoding time.
Online encoding preset prediction scheme (OEPS) is proposed, which predicts the optimized encoder preset based on the target bitrate, resolution, and video framerate for every video segment. OEPS yields consistent encoding speed across various representations with an overall quality improvement of 0.83 dB PSNR and 5.81 VMAF points with the same bitrate, compared to the fastest preset encoding of the HLS CBR bitrate ladder using x265.
A JND-aware two-pass per-title encoding scheme, named live variable bitrate encoding (LiveVBR), is proposed, which predicts perceptually-aware bitrate-resolution-framerate-rate factor tuples for the bitrate ladder of each video segment. LiveVBR yields an average bitrate reduction of 18.80% and 32.59% for the same PSNR and VMAF, respectively, compared to the HLS CBR bitrate ladder encoding using x265. For a target JND of six VMAF points, LiveVBR also results in a 68.96% reduction in storage space and an 18.58% reduction in encoding time, with a negligible impact on streaming latency.
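
The JND constraint used by the schemes above, that neighboring representations should differ by at least one JND in perceptual quality, can be illustrated with a simple pruning pass. This sketch is not JBLS or LiveVBR, which predict ladder entries from content complexity; it assumes VMAF scores for a candidate ladder are already known and only shows why a JND spacing shrinks the ladder. The function name and the sample scores are hypothetical.

```python
def prune_ladder(candidates, jnd=6.0):
    """Keep representations whose VMAF is at least one JND above the last kept one.

    candidates: list of (bitrate_kbps, vmaf) tuples; scores are hypothetical inputs.
    """
    kept = []
    for bitrate, vmaf in sorted(candidates):
        if not kept or vmaf - kept[-1][1] >= jnd:
            kept.append((bitrate, vmaf))
    return kept


# Hypothetical candidate ladder with made-up VMAF scores.
candidates = [
    (145, 40.0), (300, 52.0), (600, 61.0), (900, 64.0),
    (1600, 72.0), (2400, 76.0), (3400, 80.0), (4500, 85.0),
]
ladder = prune_ladder(candidates)
```

Representations that fall less than one JND above their kept neighbor are perceptually redundant, so dropping them saves storage and encoding time without a visible quality step being lost.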

Multi-encoding optimizations: Presently, most streaming services utilize cloud-based encoding techniques, enabling a fully parallel encoding process to reduce the overall encoding time. This dissertation proposes various multi-rate and multi-encoding schemes for serial and parallel encoding scenarios. Furthermore, it introduces novel heuristics to limit the rate-distortion optimization (RDO) process across multiple representations. Based on these heuristics, three multi-encoding schemes are proposed, which rely on encoder analysis sharing across different representations: (i) optimized for the highest compression efficiency, (ii) optimized for the best trade-off between compression efficiency and encoding time savings, and (iii) optimized for the best encoding time savings. Experimental results demonstrate that the proposed multi-encoding schemes (i), (ii), and (iii) reduce the overall serial encoding time by 34.71%, 45.27%, and 68.76% with a 2.3%, 3.1%, and 4.5% bitrate increase to maintain the same VMAF, respectively, compared to stand-alone encodings. The overall parallel encoding time is reduced by 22.03%, 20.72%, and 76.82% compared to stand-alone encodings for schemes (i), (ii), and (iii), respectively.

Slides available here: https://www.slideshare.net/slideshows/contentadaptive-video-coding-for-http-adaptive-streaming/265462304

Posted in ATHENA | Comments Off

Exploring Bitrate Costs for Enhanced User Satisfaction: A Just Noticeable Difference (JND) Perspective

Posted on January 8, 2024 by

Data Compression Conference (DCC)

19-22 March 2024, Snowbird, Utah, USA

[PDF]

Hadi Amirpour (AAU, Austria), Jingwen Zhu (University of Nantes, France), Raimund Schatz (AIT, Austria), Patrick Le Callet (University of Nantes, France), and Christian Timmerer (AAU, Austria)

Abstract: The evolving landscape of the delivery of multimedia content requires a deep understanding of how the design of the bitrate ladder (for HTTP adaptive streaming) influences cost and quality. This paper explores the use of Just Noticeable Differences (JND) to select bitrate-resolution pairs for constructing a bitrate ladder with respect to the satisfied user ratio (SUR). To expand the investigation to various codecs, first, a method is explained that transfers the JND points obtained through subjective testing from one codec (e.g., AVC) to other codecs (e.g., HEVC, VVC). This approach helps avoid the additional costs associated with conducting subjective tests to obtain JND points for a wide range of different codecs. To achieve this objective, we investigate the codec-agnostic nature of various video quality metrics, followed by the transfer of JND between two codecs, taking into account the most suitable codec-agnostic video quality metric. Secondly, we delve into the analysis of the bitrate cost of a given bitrate ladder from a JND perspective, i.e., as a function of the SUR. Among other findings, our experimental results demonstrate that to increase the SUR from 75% to 90%, it is necessary to double the video bitrate.

Posted in ATHENA | Comments Off

Merry Christmas 2024

Posted on December 20, 2023 by

Posted in News | Comments Off

A Real-time Video Quality Metric for HTTP Adaptive Streaming

Posted on December 14, 2023 by

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

14-19 April 2024, Seoul, Korea

[PDF]

Hadi Amirpour (AAU, Austria), Jingwen Zhu (University of Nantes, France), Patrick Le Callet (University of Nantes, France), and Christian Timmerer (AAU, Austria)

Abstract: In HTTP Adaptive Streaming (HAS), a video is encoded at multiple bitrate-resolution pairs, referred to as representations, which enables users to choose the most suitable representation based on their network connection. To optimize the set of bitrate-resolution pairs and improve the Quality of Experience (QoE) for users, it is of utmost importance to measure the quality of the representations. VMAF is a highly reliable metric used in HAS to assess the quality of representations. However, in practice, using it for optimization can be a very time-consuming process, and it is infeasible for live streaming applications. To tackle its high complexity, our paper introduces a new method called VQM4HAS, which extracts low-complexity features including (i) video complexity features, (ii) bitstream features logged during the encoding process, and (iii) basic video quality metrics. These extracted features are then fed into a regression model to predict VMAF.
Our experimental results demonstrate that VQM4HAS achieves a high Pearson correlation coefficient (PCC) with VMAF, ranging from 0.95 to 0.96 depending on the resolution, while exhibiting significantly lower complexity, making it suitable for live streaming scenarios.
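
The evaluation described above, predicting VMAF with a regression model and scoring the prediction by its PCC against measured VMAF, can be sketched in a few lines. This is a toy stand-in, not VQM4HAS: it assumes a single input feature (PSNR) and a one-variable least-squares line, whereas the paper combines several feature groups in its regression model. The helper names and all data points are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Hypothetical data: one low-complexity feature (PSNR in dB) and measured VMAF.
psnr = [30.0, 33.0, 36.0, 39.0, 42.0]
vmaf = [52.0, 66.0, 78.0, 88.0, 95.0]

slope, intercept = fit_line(psnr, vmaf)
predicted = [slope * p + intercept for p in psnr]
pcc = pearson(predicted, vmaf)
```

A PCC close to 1 means the cheap prediction ranks and spaces the representations much as VMAF would, which is what matters when the score is used to optimize a bitrate ladder.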

Posted in ATHENA | Comments Off

Energy-aware Resolution Selection for Per-title Encoding

Posted on December 14, 2023 by

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

14-19 April 2024, Seoul, Korea

[PDF]

Mohammad Ghasempour (AAU, Austria), Hadi Amirpour (AAU, Austria), Mohammad Ghanbari (University of Essex, UK), and Christian Timmerer (AAU, Austria)

Abstract: With the ubiquity of video streaming, optimizing the delivery of video content while reducing energy consumption has become increasingly critical. Traditional adaptive streaming relies on a fixed set of bitrate-resolution pairs, known as bitrate ladders, for encoding. However, this one-size-fits-all approach is suboptimal for diverse video content. As a result, per-title encoding approaches dynamically select the bitrate ladder for each content. In this paper, we address the pressing issue of increasing energy consumption in video streaming by introducing GreenRes, a novel approach that goes beyond traditional quality-centric resolution selection. Instead, GreenRes considers both video quality and energy consumption to construct an optimal bitrate ladder tailored to the unique characteristics of each video content.
To achieve this, GreenRes, similar to per-title encoding, encodes each video content at various resolutions, each with a set of bitrates. It then establishes a maximum acceptable quality drop threshold and selects resolutions that not only maintain video quality above this threshold, but also minimize energy consumption. Our experimental results demonstrate a 30.82% reduction in energy consumption on average, while ensuring a maximum quality drop of 0.53 Video Multimethod Assessment Fusion (VMAF) points.
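
The selection rule described above can be sketched for a single target bitrate: among the resolutions whose quality stays within the allowed drop from the best one, take the resolution with the lowest energy. This is a minimal sketch rather than the GreenRes implementation; the function name, the dictionary layout, and every measurement below are assumptions, and the abstract does not specify how energy is measured.

```python
def select_resolution(encodings, max_vmaf_drop):
    """Pick the lowest-energy encoding whose VMAF stays within the allowed drop.

    encodings: list of dicts with 'resolution', 'vmaf', and 'energy_j' keys for
    one target bitrate. The reference quality is the best VMAF among them.
    """
    best_vmaf = max(e["vmaf"] for e in encodings)
    acceptable = [e for e in encodings if best_vmaf - e["vmaf"] <= max_vmaf_drop]
    return min(acceptable, key=lambda e: e["energy_j"])


# Hypothetical measurements for one target bitrate.
encodings_at_3mbps = [
    {"resolution": "2160p", "vmaf": 83.0, "energy_j": 950.0},
    {"resolution": "1440p", "vmaf": 85.1, "energy_j": 610.0},
    {"resolution": "1080p", "vmaf": 84.7, "energy_j": 420.0},
    {"resolution": "720p", "vmaf": 80.2, "energy_j": 260.0},
]
choice = select_resolution(encodings_at_3mbps, max_vmaf_drop=1.0)
```

Setting `max_vmaf_drop` to zero reduces the rule to quality-only resolution selection, while larger thresholds trade a bounded quality loss for lower energy.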