Patent Approval for “Fast Multi-rate Encoding for Adaptive Streaming using Machine Learning”

Fast Multi-rate Encoding for Adaptive Streaming using Machine Learning

US Patent

[PDF]

Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), Ekrem Çetinkaya (Alpen-Adria-Universität Klagenfurt, Austria), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)


Abstract: According to embodiments of the disclosure, fast multi-rate encoding may be performed using machine learning by encoding a lowest quality representation to determine encoding parameters, processing raw data of the video using a neural network to obtain an intermediate output comprising encoding features, augmenting the intermediate output with additional encoding features to form a final tensor, and processing the final tensor with another neural network to obtain a classification output comprising a split or not split decision for an image data block. The classification output may be used to encode a highest quality representation, and then other representations of the video.
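To make the pipeline concrete, below is a minimal PyTorch sketch of the two-stage classification the abstract describes: a first network turns raw block pixels into intermediate features, those features are augmented with encoding features obtained from the lowest-quality encode, and a second network outputs a split/no-split decision. The network sizes, feature dimensions, and names (CTU_SIZE, N_ENC_FEATURES) are illustrative assumptions, not the patented architecture.

```python
# A minimal sketch of the two-stage split/no-split classification described above.
# Layer sizes and feature counts are assumptions for illustration only.
import torch
import torch.nn as nn

CTU_SIZE = 64          # assumed luma block size fed to the feature extractor
N_ENC_FEATURES = 16    # assumed number of encoding features from the lowest-quality encode

class FeatureExtractor(nn.Module):
    """First network: raw pixel data -> intermediate feature tensor."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # -> (batch, 32, 1, 1)
        )

    def forward(self, x):
        return self.conv(x).flatten(1)        # -> (batch, 32)

class SplitClassifier(nn.Module):
    """Second network: intermediate features + encoding features -> split / no-split."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(32 + N_ENC_FEATURES, 64), nn.ReLU(),
            nn.Linear(64, 2),                 # logits for {no-split, split}
        )

    def forward(self, intermediate, enc_features):
        final_tensor = torch.cat([intermediate, enc_features], dim=1)  # augmentation step
        return self.mlp(final_tensor)

# Usage: classify a batch of blocks; the decisions can then guide the encoding of
# the highest-quality representation and, from it, the remaining representations.
extractor, classifier = FeatureExtractor(), SplitClassifier()
raw_blocks = torch.randn(8, 1, CTU_SIZE, CTU_SIZE)     # raw luma blocks
enc_features = torch.randn(8, N_ENC_FEATURES)           # features from the lowest-quality encode
decisions = classifier(extractor(raw_blocks), enc_features).argmax(dim=1)  # 0 = no split, 1 = split
```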

Posted in ATHENA

MEDUSA: A Dynamic Codec Switching Approach in HTTP Adaptive Streaming

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)

Journal website

[PDF]

Daniele Lorenzi (AAU, Austria), Farzad Tashtarian (AAU, Austria), Hermann Hellwagner (AAU, Austria), and Christian Timmerer (AAU, Austria)

HTTP Adaptive Streaming (HAS) solutions utilize various Adaptive BitRate (ABR) algorithms to dynamically select appropriate video representations, aiming to adapt to fluctuations in network bandwidth. However, current ABR implementations have a limitation in that they are designed to operate on a single set of video representations, i.e., the bitrate ladder, whose entries differ in bitrate and resolution but are encoded with the same video codec. When multiple codecs are available, current ABR algorithms select one of them prior to the streaming session and stick with it for the entire session. Although newer codecs are generally preferred over older ones, their compression efficiencies differ depending on the content’s complexity, which varies over time. Therefore, it is necessary to select the appropriate codec for each video segment to reduce the requested data while delivering the highest possible quality. In this paper, we first provide a practical example comparing the compression efficiencies of different codecs on a set of video sequences. Based on this analysis, we formulate the optimization problem of selecting the appropriate codec for each user and video segment (in the most fine-grained case, on a per-segment basis), refining the selection of the ABR algorithms by exploiting key metrics such as the perceived segment quality and size. Subsequently, to address the scalability issues of this centralized model, we introduce MEDUSA, a novel distributed plug-in ABR algorithm for Video on Demand (VoD) applications deployed on top of existing ABR algorithms. MEDUSA enhances the user’s Quality of Experience (QoE) by utilizing a multi-objective function that considers the quality and size of video segments when selecting the next representation. Using quality information and segment sizes from the modified Media Presentation Description (MPD), MEDUSA uses buffer occupancy to prioritize quality or size by assigning specific weights in the objective function. To show the impact of MEDUSA, we compare the proposed plug-in approach on top of state-of-the-art techniques with their original implementations and analyze the results for different network traces, video content, and buffer capacities. According to the experimental findings, MEDUSA improves QoE for various test videos and scenarios, with QoE score gains of up to 42% according to the ITU-T P.1203 model (mode 0). Additionally, MEDUSA can reduce the transmitted data volume by more than 40% while achieving QoE similar to the compared techniques, reducing delivery costs for streaming service providers.
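As a rough illustration of the buffer-aware, multi-objective selection described above, the following Python sketch weighs segment quality against segment size using buffer occupancy to set the weights. The linear score, the weighting scheme, and the field names are illustrative assumptions, not the paper’s exact formulation.

```python
# A minimal sketch of buffer-aware, multi-objective codec/representation selection,
# inspired by (but not identical to) the MEDUSA approach described above.
from dataclasses import dataclass

@dataclass
class Candidate:
    codec: str        # e.g. "AVC", "HEVC", "AV1"
    quality: float    # perceived segment quality from the modified MPD (assumed 0-100 scale)
    size_mb: float    # segment size from the modified MPD

def select_representation(candidates, buffer_s, buffer_capacity_s, budget_mb):
    """Pick the candidate that balances quality and size for the next segment."""
    # With a full buffer we can afford quality; as it drains we prioritise small segments.
    w_quality = buffer_s / buffer_capacity_s            # in [0, 1]
    w_size = 1.0 - w_quality

    feasible = [c for c in candidates if c.size_mb <= budget_mb]
    if not feasible:                                     # fall back to the smallest segment
        return min(candidates, key=lambda c: c.size_mb)

    max_size = max(c.size_mb for c in feasible)
    # Higher score = better: reward quality, penalise (normalised) size.
    return max(feasible,
               key=lambda c: w_quality * (c.quality / 100.0) - w_size * (c.size_mb / max_size))

# Example: the same segment offered in two codecs at a similar quality level.
options = [Candidate("HEVC", quality=92.0, size_mb=3.1),
           Candidate("AV1",  quality=91.5, size_mb=2.2)]
best = select_representation(options, buffer_s=8.0, buffer_capacity_s=30.0, budget_mb=4.0)
print(best.codec)  # with a low buffer, the smaller AV1 segment wins
```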


Posted in ATHENA

Generative AI for Adaptive Video Streaming

The 15th ACM Multimedia Systems Conference (Doctoral Symposium)

15-18 April 2024 | Bari, Italy

Conference website

[PDF]

Emanuele Artioli (AAU, Austria)

Video streaming stands as the cornerstone of telecommunication networks, constituting over 60% of mobile data traffic as of June 2023. The paramount challenge faced by video streaming service providers is ensuring high Quality of Experience (QoE) for users. In HTTP Adaptive Streaming (HAS), including DASH and HLS, video content is encoded at multiple quality versions, with an Adaptive Bitrate (ABR) algorithm dynamically selecting versions based on network conditions. Concurrently, Artificial Intelligence (AI) is revolutionizing the industry, particularly in content recommendation and personalization. Leveraging user data and advanced algorithms, AI enhances user engagement, satisfaction, and video quality through super-resolution and denoising techniques. However, challenges persist, such as real-time processing on resource-constrained devices, the need for diverse training datasets, privacy concerns, and model interpretability. Despite these hurdles, the promise of Generative Artificial Intelligence emerges as a transformative force. Generative AI, capable of synthesizing new data based on learned patterns, holds vast potential in the video streaming landscape. In the context of video streaming, it can create realistic and immersive content, adapt in real time to individual preferences, and optimize video compression for seamless streaming in low-bandwidth conditions. This research proposal outlines a comprehensive exploration at the intersection of advanced AI algorithms and digital entertainment, focusing on the potential of generative AI to elevate video quality, user interactivity, and the overall streaming experience. The objective is to integrate generative models into video streaming pipelines, unraveling novel avenues that promise a future of dynamic, personalized, and visually captivating streaming experiences for viewers.

Posted in ATHENA

SPIRIT Project: Open Call 1 is live!

SPIRIT – Open Call 1

Scalable Platform for Innovations on Real-time Immersive Telepresence

https://www.spirit-project.eu/open-call-1/

SPIRIT’s first Open Call (SPIRIT-OC1) is now open. It provides up to €200,000 to financially support the involvement of third parties in developing and testing a wide variety of collaborative telepresence applications on the first release of the SPIRIT platform. OC1 aims to engage different organisations to test, further develop, and validate their specific use cases (applications) or to contribute components that enhance/extend the SPIRIT platform.

Applications are open until 27 May 2024, 17:00 CET. Ten third-party projects will be selected, each with an expected total duration of 9 months. For further information about OC1 and the technical results of the SPIRIT project, please refer to the OC1 webpage: https://www.spirit-project.eu/open-call-1/.

Index Terms: Telepresence, Point Clouds, Augmented Reality

Posted in SPIRIT

Generative AI for HTTP Adaptive Streaming

15th ACM Multimedia Systems Conference (MMSys)
15 – 18 April 2024 | Bari, Italy.
[PDF][Slides][Poster]

Emanuele Artioli (Alpen-Adria-Universität Klagenfurt)

Abstract:

Video streaming stands as the cornerstone of telecommunication networks, constituting over 60% of mobile data traffic as of June 2023. The paramount challenge faced by video streaming service providers is ensuring high Quality of Experience (QoE) for users. In HTTP Adaptive Streaming (HAS), including DASH and HLS, video content is encoded at multiple quality versions, with an Adaptive Bitrate (ABR) algorithm dynamically selecting versions based on network conditions. Concurrently, Artificial Intelligence (AI) is revolutionizing the industry, particularly in content recommendation and personalization. Leveraging user data and advanced algorithms, AI enhances user engagement, satisfaction, and video quality through super-resolution and denoising techniques.
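For context, a minimal throughput-based ABR rule of the kind HAS clients apply per segment might look like the sketch below. The bitrate ladder and the safety margin are illustrative assumptions, not a specific DASH or HLS player implementation.

```python
# A minimal sketch of a per-segment, throughput-based ABR decision in HAS.
# The ladder and the 0.8 safety margin are assumptions for illustration only.
BITRATE_LADDER_KBPS = [400, 1200, 2500, 5000, 8000]     # one entry per quality version

def pick_representation(throughput_kbps: float, safety_margin: float = 0.8) -> int:
    """Return the index of the highest bitrate sustainable at the estimated throughput."""
    budget = throughput_kbps * safety_margin
    affordable = [i for i, b in enumerate(BITRATE_LADDER_KBPS) if b <= budget]
    return affordable[-1] if affordable else 0           # fall back to the lowest version

# The client re-evaluates this for every segment as network conditions fluctuate.
for measured in (6500, 2100, 900):                       # kbps, e.g. from the last download
    print(measured, "->", BITRATE_LADDER_KBPS[pick_representation(measured)], "kbps")
```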

However, challenges persist, such as real-time processing on resource-constrained devices, the need for diverse training datasets, privacy concerns, and model interpretability. Despite these hurdles, the promise of Generative Artificial Intelligence emerges as a transformative force. Generative AI, capable of synthesizing new data based on learned patterns, holds vast potential in the video streaming landscape. In the context of video streaming, it can create realistic and immersive content, adapt in real time to individual preferences, and optimize video compression for seamless streaming in low-bandwidth conditions.

This research proposal outlines a comprehensive exploration at the intersection of advanced AI algorithms and digital entertainment, focusing on the potential of generative AI to elevate video quality, user interactivity, and the overall streaming experience. The objective is to integrate generative models into video streaming pipelines, unraveling novel avenues that promise a future of dynamic, personalized, and visually captivating streaming experiences for viewers.

Posted in ATHENA

MMSys ’24: ComPEQ–MR: Compressed Point Cloud Dataset with Eye Tracking and Quality Assessment in Mixed Reality

15th ACM Multimedia Systems Conference

April 15-18, 2024 – Bari, Italy

https://2024.acmmmsys.org/

[PDF], [Dataset]

Minh Nguyen (Fraunhofer Fokus, Germany), Shivi Vats (Alpen-Adria-Universität Klagenfurt, Austria), Xuemei Zhou (Centrum Wiskunde & Informatica and TU Delft, Netherlands), Irene Viola (Centrum Wiskunde & Informatica, Netherlands), Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria), and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt, Austria)

Abstract: Point clouds (PCs) have attracted researchers and developers due to their ability to provide immersive experiences with six degrees of freedom (6DoF). However, there are still several open issues in understanding the Quality of Experience (QoE) and visual attention of end users while experiencing 6DoF volumetric videos. First, encoding and decoding point clouds require a significant amount of both time and computational resources. Second, QoE prediction models for dynamic point clouds in 6DoF have not yet been developed due to the lack of visual quality databases. Third, visual attention in 6DoF is hardly explored, which impedes research into more sophisticated approaches for adaptive streaming of dynamic point clouds. In this work, we provide an open-source Compressed Point cloud dataset with Eye-tracking and Quality assessment in Mixed Reality (ComPEQ–MR). The dataset comprises four compressed dynamic point clouds processed by Moving Picture Experts Group (MPEG) reference tools (i.e., V-PCC and G-PCC), each with 12 distortion levels. We also conducted subjective tests to assess the quality of the compressed point clouds with different levels of distortion. The rating scores are attached to ComPEQ–MR so that they can be used to develop QoE prediction models in the context of MR environments. Additionally, the dataset includes eye-tracking data for visual saliency, which is necessary to predict where people look when watching 3D videos in MR experiences. We collected opinion scores and eye-tracking data from 41 participants, resulting in 2132 responses and 164 visual attention maps in total. The dataset is available at https://ftp.itec.aau.at/datasets/ComPEQ-MR/.
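As an example of how the attached rating scores might be consumed, the sketch below aggregates individual opinion scores into mean opinion scores (MOS) per content and distortion level. The CSV file name, column layout, and content name are hypothetical; the dataset’s own documentation defines the real format.

```python
# A minimal sketch for aggregating per-participant ratings into MOS values.
# File name and columns ("content", "level", "score") are hypothetical assumptions.
import csv
from collections import defaultdict
from statistics import mean

def mos_per_condition(path="compeq_mr_ratings.csv"):
    """Aggregate individual opinion scores into MOS per (content, distortion level)."""
    scores = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            scores[(row["content"], int(row["level"]))].append(float(row["score"]))
    return {cond: mean(vals) for cond, vals in scores.items()}

# Example usage (commented out because the file layout above is assumed):
# mos = mos_per_condition()
# for (content, level), value in sorted(mos.items()):
#     if content == "longdress":                # hypothetical content name
#         print(f"{content} level {level}: MOS = {value:.2f}")
```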

Index Terms: Point Clouds, Quality of Experience, Subjective Tests, Augmented Reality

Posted in SPIRIT

MHV ’24: No-Reference Quality of Experience Model for Dynamic Point Clouds in Augmented Reality

ACM Mile High Video (MHV) 2024

February 11-14, 2024 – Denver, USA

https://www.mile-high.video/

[PDF], [GitHub]

Minh Nguyen (Alpen-Adria-Universität Klagenfurt, Austria), Shivi Vats (Alpen-Adria-Universität Klagenfurt, Austria), Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt, Austria)

Abstract: Point cloud streaming is becoming increasingly popular due to its ability to provide six degrees of freedom (6DoF) for immersive media. Measuring the Quality of Experience (QoE) is essential to evaluate the performance of point cloud applications. However, most existing QoE models for point cloud streaming are complicated and/or not open source. Therefore, it is desirable to provide an open-source QoE model for point cloud streaming.

(…)

In this work, we provide a fine-tuned ITU-T P.1203 model for dynamic point clouds in Augmented Reality (AR) environments. We re-train the P.1203 model on our dataset to obtain the coefficients that achieve the lowest root mean square error (RMSE). The dataset was collected in a subjective test in which participants watched dynamic point clouds from the 8i lab database with Microsoft’s HoloLens 2 AR glasses. The dynamic point clouds have static qualities or a quality switch in the middle of the sequence. We split this dataset into a training set and a validation set, train the coefficients of the P.1203 model on the former, and validate its performance on the latter.
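The coefficient fitting can be pictured as a small optimization loop: minimize the RMSE between predicted and subjective scores on the training split, then check generalization on the validation split. The sketch below uses a generic placeholder model rather than the actual ITU-T P.1203 formula; see the linked repository for the real implementation.

```python
# A minimal sketch of RMSE-driven coefficient fitting with a train/validation split.
# The predict() function is a generic placeholder, not the ITU-T P.1203 model.
import numpy as np
from scipy.optimize import minimize

def predict(features, coeffs):
    """Placeholder parametric quality model: an offset plus a linear mapping."""
    return coeffs[0] + features @ coeffs[1:]

def rmse(coeffs, features, mos):
    return np.sqrt(np.mean((predict(features, coeffs) - mos) ** 2))

def fit(train_features, train_mos):
    """Find the coefficients that minimise RMSE on the training set."""
    x0 = np.zeros(train_features.shape[1] + 1)
    return minimize(rmse, x0, args=(train_features, train_mos), method="Nelder-Mead").x

# Example with synthetic data standing in for the AR subjective test results.
rng = np.random.default_rng(0)
features = rng.random((40, 3))                        # e.g. per-sequence quality indicators
mos = 1.0 + 4.0 * features[:, 0] + rng.normal(0, 0.2, 40)
train, val = slice(0, 30), slice(30, 40)              # simple train/validation split
coeffs = fit(features[train], mos[train])
print("validation RMSE:", rmse(coeffs, features[val], mos[val]))
```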

The trained model is available on GitHub: https://github.com/minhkstn/itu-p1203-point-clouds.

Index Terms: Point Clouds, Quality of Experience, Subjective Tests, Augmented Reality

Posted in SPIRIT