LALISA: Adaptive Bitrate Ladder Optimization in HTTP-based Adaptive Live Streaming

Posted on December 25, 2022

IEEE/IFIP Network Operations and Management Symposium (NOMS)

8-12 May 2023 | Miami, FL, USA

Farzad Tashtarian (AAU, Austria), Abdelhak Bentaleb (Concordia University, Canada), Hadi Amirpour (AAU, Austria), Babak Taraghi (AAU, Austria), Christian Timmerer (AAU, Austria), Hermann Hellwagner (AAU, Austria), Roger Zimmermann (National University of Singapore, Singapore)

Video content in Live HTTP Adaptive Streaming (HAS) is typically encoded using a pre-defined, fixed set of bitrate-resolution pairs (termed Bitrate Ladder), allowing playback devices to adapt to changing network conditions using an adaptive bitrate (ABR) algorithm. However, using a fixed one-size-fits-all solution when faced with various content complexities, heterogeneous network conditions, viewer device resolutions and locations, does not result in an overall maximal viewer quality of experience (QoE). Here, we consider these factors and design LALISA, an efficient framework for dynamic bitrate ladder optimization in live HAS. LALISA dynamically changes a live video session’s bitrate ladder, allowing improvements in viewer QoE and savings in encoding, storage, and bandwidth costs. LALISA is independent of ABR algorithms and codecs, and is deployed along the path between viewers and the origin server. In particular, it leverages the latest developments in video analytics to collect statistics from video players, content delivery networks, and video encoders to perform bitrate ladder tuning. We evaluate the performance of LALISA against existing solutions in various video streaming scenarios using a trace-driven testbed. Evaluation results demonstrate significant improvements in encoding computation (24.4%) and bandwidth (18.2%) costs with an acceptable QoE.
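To make the terms concrete, here is a minimal sketch of a fixed bitrate ladder and a simple throughput-based ABR rule. It is not LALISA's optimization; the ladder values, the `select_rung` helper, and the 0.8 safety factor are assumptions chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical fixed ladder: (bitrate in kbps, vertical resolution in pixels).
LADDER: List[Tuple[int, int]] = [
    (400, 360),
    (1200, 540),
    (2500, 720),
    (5000, 1080),
]


def select_rung(ladder: List[Tuple[int, int]],
                throughput_kbps: float,
                safety: float = 0.8) -> Tuple[int, int]:
    """Pick the highest-bitrate rung that fits within a safety margin
    of the measured throughput; fall back to the lowest rung."""
    budget = throughput_kbps * safety
    feasible = [rung for rung in ladder if rung[0] <= budget]
    if feasible:
        return max(feasible, key=lambda rung: rung[0])
    return min(ladder, key=lambda rung: rung[0])


if __name__ == "__main__":
    for measured in (100, 4000, 10000):
        print(measured, "kbps ->", select_rung(LADDER, measured))
```

With a fixed ladder like this, every session gets the same rungs regardless of content or audience, which is the limitation the abstract describes.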

Posted in ATHENA | Comments Off

CD-LwTE: Cost- and Delay-aware Light-weight Transcoding at the Edge

Posted on December 15, 2022

IEEE Transactions on Network and Service Management (TNSM)

[PDF]

Alireza Erfanian (Alpen-Adria-Universität Klagenfurt, Austria), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), Farzad Tashtarian (Alpen-Adria-Universität Klagenfurt, Austria), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria), and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt, Austria)

Abstract—The edge computing paradigm brings cloud capabilities close to the clients. Leveraging the edge’s capabilities can improve video streaming services by employing the storage capacity and processing power at the edge for caching and transcoding tasks, respectively, resulting in video streaming services with higher quality and lower latency. In this paper, we propose CD-LwTE, a Cost- and Delay-aware Light-weight Transcoding approach at the Edge, in the context of HTTP Adaptive Streaming (HAS). The encoding of a video segment requires computationally intensive search processes. The main idea of CD-LwTE is to store the optimal search results as metadata for each bitrate of video segments and reuse it at the edge servers to reduce the required time and computational resources for transcoding. Aiming at minimizing the cost and delay of Video-on-Demand (VoD) services, we formulate the problem of selecting an optimal policy for serving segment requests at the edge server, including (i) storing at the edge server, (ii) transcoding from a higher bitrate at the edge server, and (iii) fetching from the origin or a CDN server, as a Binary Linear Programming (BLP) model. As a result, CD-LwTE stores the popular video segments at the edge and serves the unpopular ones by transcoding using metadata or fetching from the origin/CDN server. In this way, in addition to the significant reduction in bandwidth and storage costs, the transcoding time of a requested segment is remarkably decreased by utilizing its corresponding metadata. Moreover, we prove the proposed BLP model is an NP-hard problem and propose two heuristic algorithms to mitigate the time complexity of CD-LwTE. We investigate the performance of CD-LwTE in comprehensive scenarios with various video contents, encoding software, encoding settings, and available resources at the edge. 
The experimental results show that our approach (i) reduces the transcoding time by up to 97%, (ii) decreases the streaming cost, including storage, computation, and bandwidth costs, by up to 75%, and (iii) reduces delay by up to 48% compared to state-of-the-art approaches.
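The three serving options can be illustrated with a toy per-segment cost comparison. This is only a sketch under strong assumptions: it ignores the edge capacity limits and delay terms that the BLP model handles jointly, and the function name `cheapest_policy` and all cost values are hypothetical.

```python
def cheapest_policy(requests: int,
                    store_cost: float,
                    transcode_cost: float,
                    fetch_cost: float) -> str:
    """Return the cheapest way to serve one segment over a period.

    Storing is modeled as a fixed cost per period, while transcoding and
    fetching are paid once per request (an illustrative assumption).
    """
    totals = {
        "store": store_cost,
        "transcode": requests * transcode_cost,
        "fetch": requests * fetch_cost,
    }
    return min(totals, key=totals.get)


if __name__ == "__main__":
    # A popular segment amortizes its storage cost over many requests.
    print(cheapest_policy(requests=100, store_cost=5.0,
                          transcode_cost=0.2, fetch_cost=0.5))
    # An unpopular segment is cheaper to transcode on demand.
    print(cheapest_policy(requests=3, store_cost=5.0,
                          transcode_cost=0.2, fetch_cost=0.5))
```

Even this toy version reproduces the qualitative outcome described in the abstract: popular segments end up stored, unpopular ones are transcoded or fetched.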

Posted in ATHENA | Comments Off

Special Session on “Optimized Media Delivery” at ICME’23

Posted on December 12, 2022

ICME 2023 Special Session on

“Optimized Media Delivery”

July 2023, Brisbane, Australia

Link

Organizers:

Hadi Amirpour, University of Klagenfurt
Angeliki Katsenou, Trinity College Dublin, IE and University of Bristol, UK

Abstracts

Video streaming in the context of HTTP Adaptive Streaming (HAS) is replacing legacy media platforms, and its market share is growing rapidly due to its simplicity, reliability, and standard support (e.g., MPEG-DASH). This results in an increasing amount of video content; video now accounts for the vast majority of internet traffic, either in the form of user-generated content (UGC) or pristine cinematic content. For HAS, the video is usually encoded in multiple versions (i.e., representations) of different resolutions, bitrates, codecs, etc., and each representation is divided into chunks (i.e., segments) of equal length (e.g., 2-10 seconds) to enable dynamic, adaptive switching during streaming based on the user’s context conditions (e.g., network conditions, device characteristics, user preferences).

Optimized media delivery requires optimizing streaming end to end, including content provisioning and content consumption. In content provisioning, the quality of the video to be streamed is vital; for example, cinematic content is pristine, while UGC is already distorted. Thus, video coding/transcoding is crucial for both efficient distribution to the end-user (real-time or on demand) and a high quality of experience. There is a plethora of different techniques for a smooth visual experience. Many researchers focus on improving the compression efficiency of the standardised video codecs (e.g., HEVC, VVC, VP9, AV1, AVS3, etc.). Other researchers focus on driving the video codecs using perceptual models to improve the delivery. At the HAS level, video service providers focus on constructing optimized per-content bitrate ladders that can also reduce the streaming cost. New immersive media formats add to the complexity of the optimization required for an end-to-end quality of experience.

The goal of this special session is to provide a forum for sharing and discussing cutting-edge research in Media Streaming and Quality Assessment. Possible topics that would be a good fit for this session include but are not limited to:

video coding parameter selection for optimized streaming;
transcoding techniques for the improved delivery of media;
perceptual evaluation of immersive media;
end-to-end video adaptive streaming methods;
pre- and post-processing for improved compression and delivery.

Posted in ATHENA | Comments Off

DeepStream: Video Streaming Enhancements using Compressed Deep Neural Networks

Posted on December 11, 2022

IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT)

[PDF]

Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), Mohammad Ghanbari (University of Essex, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)

Abstract: In HTTP Adaptive Streaming (HAS), each video is divided into smaller segments, and each segment is encoded at multiple pre-defined bitrates to construct a bitrate ladder. To optimize bitrate ladders, per-title encoding approaches encode each segment at various bitrates and resolutions to determine the convex hull. From the convex hull, an optimized bitrate ladder is constructed, resulting in an increased Quality of Experience (QoE) for end-users. With the ever-increasing efficiency of deep learning-based video enhancement approaches, they are more and more employed at the client-side to increase the QoE, specifically when GPU capabilities are available. Therefore, scalable approaches are needed to support end-user devices with both CPU and GPU capabilities (denoted as CPU-only and GPU-available end-users, respectively) as a new dimension of a bitrate ladder.
To address this need, we propose DeepStream, a scalable content-aware per-title encoding approach to support both CPU-only and GPU-available end-users. (i) To support backward compatibility, DeepStream constructs a bitrate ladder based on any existing per-title encoding approach. Therefore, the video content will be provided for legacy end-user devices with CPU-only capabilities as a base layer (BL). (ii) For high-end end-user devices with GPU capabilities, an enhancement layer (EL) is added on top of the base layer comprising lightweight video super-resolution deep neural networks (DNNs) for each bitrate-resolution pair of the bitrate ladder. A content-aware video super-resolution approach leads to higher video quality, however, at the cost of bitrate overhead. To reduce the bitrate overhead for streaming content-aware video super-resolution DNNs, DeepCABAC, context-adaptive binary arithmetic coding for DNN compression, is used. Furthermore, the similarities among (i) segments within a scene and (ii) frames within a segment are used to reduce the training costs of DNNs.
Experimental results show bitrate savings of 34% and 36% to maintain the same PSNR and VMAF, respectively, for GPU-available end-users, while the CPU-only users get the desired video content as usual.

Keywords—HTTP adaptive streaming, per-title encoding, video streaming, video super-resolution.
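The convex-hull step of per-title encoding mentioned in the abstract can be sketched as follows. This is a generic illustration, not DeepStream's implementation: the function name `upper_convex_hull` and the sample (bitrate, quality) points are assumptions, and the hull is computed with a standard monotone-chain scan.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (bitrate in kbps, quality score)


def _cross(o: Point, a: Point, b: Point) -> float:
    """2D cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_convex_hull(points: List[Point]) -> List[Point]:
    """Return the rate-quality points on the upper convex hull,
    sorted by bitrate and trimmed at the highest quality."""
    # Keep only the best quality seen at each bitrate
    # (e.g., the best resolution for that bitrate).
    best: Dict[float, float] = {}
    for bitrate, quality in points:
        if bitrate not in best or quality > best[bitrate]:
            best[bitrate] = quality
    pts = sorted(best.items())

    hull: List[Point] = []
    for p in pts:
        # Drop the previous point while it lies on or below the
        # line from the point before it to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)

    # Spending more bitrate for lower quality is never useful.
    peak = max(range(len(hull)), key=lambda i: hull[i][1])
    return hull[:peak + 1]


if __name__ == "__main__":
    # Hypothetical test encodes across several resolutions.
    encodes = [(500, 30), (1000, 32), (1000, 31), (2000, 38),
               (2000, 36), (4000, 42), (8000, 44), (8000, 41)]
    print(upper_convex_hull(encodes))
```

A ladder built from the returned points uses, at each bitrate, the encode that gives the best quality, which is the idea behind constructing the ladder from the convex hull.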

Posted in ATHENA | Comments Off

MPEC2: Multilayer and Pipeline Video Encoding on the Computing Continuum

Posted on November 29, 2022

conference website: IEEE NCA 2022

Samira Afzal (Alpen-Adria-Universität Klagenfurt), Zahra Najafabadi Samani (Alpen-Adria-Universität Klagenfurt), Narges Mehran (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt), and Radu Prodan (Alpen-Adria-Universität Klagenfurt)

Abstract:

Video streaming dominates traffic in today’s data-sharing world. Media service providers stream video content for their viewers, while worldwide users create and distribute videos using mobile or video system applications that significantly increase the traffic share. We propose a multilayer and pipeline encoding on the computing continuum (MPEC2) method that addresses the key technical challenge of the high cost and computational complexity of video encoding. MPEC2 splits the video encoding into several tasks scheduled on appropriately selected Cloud and Fog computing instance types that satisfy the media service provider and user priorities in terms of time and cost.
In the first phase, MPEC2 uses a multilayer resource partitioning method to explore the instance types for encoding a video segment. In the second phase, it distributes the independent segment encoding tasks in a pipeline model on the underlying instances.
We evaluate MPEC2 on a federated computing continuum encompassing Amazon Web Services (AWS) EC2 Cloud and Exoscale Fog instances distributed across seven geographical locations. Experimental results show that MPEC2 achieves 24% faster completion time and 60% lower cost for video encoding compared to resource-allocation-related methods. When compared with baseline methods, MPEC2 yields 40%-50% lower completion time and 5%-60% lower total cost.

Posted in APOLLO, ATHENA, GAIA | Comments Off

Elsevier Signal Processing: Reversible Data Hiding for Color Images Based on Pixel Value Order of Overall Process Channel Correlation

Posted on November 18, 2022

Elsevier Signal Processing

[PDF]

Ningxiong Mao (Southwest Jiaotong University), Hongjie He (Southwest Jiaotong University), Fan Chen (Southwest Jiaotong University), Lingfeng Qu (Southwest Jiaotong University), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)

Abstract:

Color image Reversible Data Hiding (RDH) is becoming increasingly important as the number of its applications steadily grows. This paper proposes an efficient color image RDH scheme based on pixel value ordering (PVO), in which the channel correlation is fully utilized to improve the embedding performance. In the proposed method, the channel correlation is used throughout the data embedding process, including the prediction stage, block selection, and capacity allocation. In the prediction stage, since the pixel values in the co-located blocks in different channels are monotonically consistent, the large pixel values are collected preferentially by pre-sorting the intra-block pixels. This can effectively improve the embedding capacity of RDH based on PVO. In the block selection stage, the description accuracy of the block complexity value is improved by exploiting the texture similarity between the channels. The smoother blocks are then used preferentially to reduce invalid shifts. To achieve low complexity and high accuracy in capacity allocation, the proportion of the expanded prediction error to the total expanded prediction error in each channel is calculated during the capacity allocation process. The experimental results show that the proposed scheme achieves significantly better fidelity than a series of state-of-the-art schemes. For example, the PSNR of the Lena image reaches 62.43 dB, which is a 0.16 dB gain compared to the best results in the literature at an embedding capacity of 20,000 bits.

Keywords—Reversible data hiding, color image, pixel value ordering, channel correlation
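For readers unfamiliar with pixel value ordering, the following is a minimal sketch of classic single-channel PVO embedding on the largest pixel of one block. It is not the color-image scheme proposed in the paper: it uses no channel correlation, block selection, or capacity allocation, it skips overflow handling for pixels at 255, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple


def _two_largest(block: List[int]) -> Tuple[int, int]:
    """Indices of the largest and second-largest pixels (ties by index)."""
    order = sorted(range(len(block)), key=lambda i: (block[i], i))
    return order[-1], order[-2]


def pvo_embed_max(block: List[int], bit: int) -> Tuple[List[int], bool]:
    """Try to embed one bit into the largest pixel of a block.

    Returns the marked block and whether the bit was actually embedded.
    """
    i_max, i_second = _two_largest(block)
    error = block[i_max] - block[i_second]
    marked = list(block)
    if error == 1:        # expandable: carries the payload bit
        marked[i_max] += bit
        return marked, True
    if error > 1:         # shifted so the decoder can tell cases apart
        marked[i_max] += 1
    return marked, False  # error == 0 is left unchanged


def pvo_extract_max(marked: List[int]) -> Tuple[List[int], Optional[int]]:
    """Recover the original block and the embedded bit (None if no bit)."""
    i_max, i_second = _two_largest(marked)
    error = marked[i_max] - marked[i_second]
    restored = list(marked)
    if error in (1, 2):
        bit = error - 1
        restored[i_max] -= bit
        return restored, bit
    if error > 2:
        restored[i_max] -= 1
    return restored, None


if __name__ == "__main__":
    original = [52, 55, 54, 53]
    marked, used = pvo_embed_max(original, 1)
    restored, bit = pvo_extract_max(marked)
    print(marked, used, restored, bit)
```

Because only the largest pixel is ever increased, the ordering within the block is preserved, which is what lets the decoder restore the original pixels exactly.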

Posted in ATHENA | Comments Off

Advanced Scalability for Light Field Image Coding

Posted on November 9, 2022

IEEE Transactions on Image Processing (TIP)

[PDF]

Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), Christine Guillemot (INRIA, France), Mohammad Ghanbari (University of Essex, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)

Abstract: Light field imaging, which captures both spatial and angular information, improves user immersion by enabling post-capture actions, such as refocusing and changing view perspective. However, light fields represent very large volumes of data with a lot of redundancy that coding methods try to remove. State-of-the-art coding methods indeed usually focus on improving compression efficiency and overlook other important features in light field compression such as scalability. In this paper, we propose a novel light field image compression method that enables (i) viewport scalability, (ii) quality scalability, (iii) spatial scalability, (iv) random access, and (v) uniform quality distribution among viewports, while keeping compression efficiency high. To this end, light fields in each spatial resolution are divided into sequential viewport layers, and viewports in each layer are encoded using the previously encoded viewports. In each viewport layer, the available viewports are used to synthesize intermediate viewports using a video interpolation deep learning network. The synthesized views are used as virtual reference images to enhance the quality of intermediate views. An image super-resolution method is applied to improve the quality of the lower spatial resolution layer. The super-resolved images are also used as virtual reference images to improve the quality of the higher spatial resolution layer.
The proposed structure also improves the flexibility of light field streaming, provides random access to the viewports, and increases error resiliency. The experimental results demonstrate that the proposed method achieves a high compression efficiency and it can adapt to the display type, transmission channel, network condition, processing power, and user needs.

Keywords—Light field, compression, scalability, random access, deep learning.

Posted in ATHENA | Comments Off
