ATHENA, GAIA, and SPIRIT contributions to ACM MMSys 2024

15th ACM Multimedia Systems Conference (MMSys)
15 – 18 April 2024 | Bari, Italy

Posted in ATHENA, GAIA, SPIRIT | Comments Off on ATHENA, GAIA, and SPIRIT contributions to ACM MMSys 2024

IEEE IoT: IoT Privacy Protection: JPEG-TPE with Lower File Size Expansion and Lossless Decryption

IoT Privacy Protection: JPEG-TPE with Lower File Size Expansion and Lossless Decryption

IEEE Internet of Things Journal (IEEE IoT)

Journal Website

[PDF]

Hongjie He (Southwest Jiaotong University, China), Yuan Yuan (Southwest Jiaotong University, China), Hadi Amirpour (AAU, Klagenfurt, Austria),  Lingfeng Qu (Southwest Jiaotong University, China), Christian Timmerer (AAU, Klagenfurt, Austria), Fan Chen (Southwest Jiaotong University, China)

 

Abstract: With the development of the Internet of Things (IoT) and cloud services, many images generated by IoT devices are stored in the cloud, calling for efficient data encryption methods. To balance security and usability, thumbnail-preserving encryption (TPE) has emerged. However, existing JPEG image-based TPE (JPEG-TPE) schemes face challenges in achieving low file size expansion, lossless decryption, and strong privacy protection of detailed information. To address these challenges, we propose a novel JPEG-TPE scheme. First, to achieve a smaller file size expansion while preserving the thumbnail, we reallocate values so as to maintain the sum of the DC differences rather than of the DC coefficients. To ensure that the coefficients do not overflow, the valid range of the reallocated differences is constrained not only by the sum but also by the neighborhood differences. Second, to preserve the file size during AC encryption while improving the security of detailed information, AC coefficient groups with undivided RSV are permuted adaptively. In addition, intra-TPE-block swapping of DC differences, quantization table modification, non-zero AC coefficient mapping, and block permutation are used to further encrypt the image. The experimental results show that the proposed JPEG-TPE scheme achieves lossless decryption and reduces the file size expansion of encrypted images from 15.41% to 0.64% compared with the state-of-the-art scheme. Additionally, the proposed method effectively resists various attacks, including deep-learning-based super-resolution attacks.
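The core idea of reallocating DC differences while keeping their sum fixed (so the block's mean, and hence the thumbnail, survives encryption) can be illustrated with a minimal sketch. The function below is a hypothetical illustration, not the authors' scheme: it performs random pairwise transfers within a block, clamped to a valid coefficient range, whereas the actual JPEG-TPE scheme additionally constrains each value by neighborhood differences.

```python
import random

def reallocate_dc_differences(diffs, valid_range=(-1023, 1023), seed=None):
    """Redistribute DC differences inside a thumbnail block while
    keeping their sum (and thus the block mean / thumbnail) intact.

    Hypothetical sketch of sum-preserving reallocation: each step moves
    a random amount from one value to another, so the total never changes,
    and the clamp keeps every value inside the valid coefficient range.
    """
    rng = random.Random(seed)
    out = list(diffs)
    lo, hi = valid_range
    for _ in range(len(out)):
        i, j = rng.randrange(len(out)), rng.randrange(len(out))
        if i == j:
            continue
        # Largest transfer that keeps out[i] >= lo and out[j] <= hi.
        headroom = min(out[i] - lo, hi - out[j])
        if headroom <= 0:
            continue
        delta = rng.randint(0, headroom)
        out[i] -= delta   # pairwise transfer: the sum is unchanged
        out[j] += delta
    return out

block = [5, -3, 12, 0, -7, 9]
encrypted = reallocate_dc_differences(block, seed=42)
assert sum(encrypted) == sum(block)  # thumbnail mean preserved
```

Because only the sum is an invariant, the individual values can be scrambled freely within the range constraint, which is what hides the detailed information while the downsampled view stays recognizable.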

Posted in ATHENA | Comments Off on IEEE IoT: IoT Privacy Protection: JPEG-TPE with Lower File Size Expansion and Lossless Decryption

IEEE TCSVT: DeepVCA: Deep Video Complexity Analyzer

DeepVCA: Deep Video Complexity Analyzer

IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

Journal Website

[PDF]

 Hadi Amirpour (AAU, Klagenfurt, Austria), Klaus Schoeffmann (AAU, Klagenfurt, Austria), Mohammad Ghanbari (University of Essex, UK), Christian Timmerer (AAU, Klagenfurt, Austria)

 

Abstract: Video streaming and its applications are growing rapidly, making video optimization a primary target for content providers looking to enhance their services. Enhancing the quality of videos requires the adjustment of different encoding parameters such as bitrate, resolution, and frame rate. To avoid brute force approaches for predicting optimal encoding parameters, video complexity features are typically extracted and utilized. To predict optimal encoding parameters effectively, content providers traditionally use unsupervised feature extraction methods, such as ITU-T’s Spatial Information ( SI ) and Temporal Information ( TI ) to represent the spatial and temporal complexity of video sequences. Recently, Video Complexity Analyzer (VCA) was introduced to extract DCT-based features to represent the complexity of a video sequence (or parts thereof). These unsupervised features, however, cannot accurately predict video encoding parameters. To address this issue, this paper introduces a novel supervised feature extraction method named DeepVCA, which extracts the spatial and temporal complexity of video sequences using deep neural networks. In this approach, the encoding bits required to encode each frame in intra-mode and inter-mode are used as labels for spatial and temporal complexity, respectively. Initially, we benchmark various deep neural network structures to predict spatial complexity. We then leverage the similarity of features used to predict the spatial complexity of the current frame and its previous frame to rapidly predict temporal complexity. This approach is particularly useful as the temporal complexity may depend not only on the differences between two consecutive frames but also on their spatial complexity. Our proposed approach demonstrates significant improvement over unsupervised methods, especially for temporal complexity. 
As an example application, we verify the effectiveness of these features in predicting the encoding bitrate and encoding time of video sequences, which are crucial tasks in video streaming. The source code and dataset are available at https://github.com/cd-athena/ DeepVCA.
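The unsupervised SI/TI baseline that DeepVCA is compared against can be sketched in a few lines of NumPy. This is a sketch of the ITU-T P.910-style definition (per-frame standard deviations of the Sobel magnitude and of the frame difference, maximized over the sequence), not of DeepVCA itself; the frame sizes in the usage example are arbitrary.

```python
import numpy as np

def sobel_magnitude(frame):
    """3x3 Sobel gradient magnitude on the valid interior region,
    computed with shifted slices instead of an explicit convolution."""
    f = np.asarray(frame, np.float64)
    gx = (f[:-2, 2:] + 2 * f[1:-1, 2:] + f[2:, 2:]
          - f[:-2, :-2] - 2 * f[1:-1, :-2] - f[2:, :-2])
    gy = (f[2:, :-2] + 2 * f[2:, 1:-1] + f[2:, 2:]
          - f[:-2, :-2] - 2 * f[:-2, 1:-1] - f[:-2, 2:])
    return np.hypot(gx, gy)

def si_ti(frames):
    """Spatial Information / Temporal Information of a luma sequence:
    SI is the max per-frame std of the Sobel magnitude, TI the max
    std of consecutive frame differences."""
    fs = [np.asarray(f, np.float64) for f in frames]
    si = max(float(np.std(sobel_magnitude(f))) for f in fs)
    ti = max(float(np.std(fs[i] - fs[i - 1])) for i in range(1, len(fs)))
    return si, ti

clip = [np.random.default_rng(k).integers(0, 256, (64, 64)) for k in range(5)]
si, ti = si_ti(clip)
```

For an all-black clip both values are zero; fine spatial texture raises SI and motion raises TI, which is exactly the kind of hand-crafted signal the supervised DeepVCA features are meant to improve upon.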

Posted in ATHENA | Comments Off on IEEE TCSVT: DeepVCA: Deep Video Complexity Analyzer

Patent Approval for “Variable Framerate Encoding using Content-aware Framerate Prediction for High Framerate Videos”

Variable Framerate Encoding using Content-aware Framerate Prediction for High Framerate Videos

US Patent

[PDF]

Vignesh Menon (Alpen-Adria-Universität Klagenfurt, Austria), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)

 

Abstract: The technology described herein relates to variable framerate encoding. A method for variable framerate encoding includes receiving shots, as segmented from a video input; extracting features for each of the shots, the features including at least a spatial energy feature and an average temporal energy; predicting a frame dropping factor for each of the shots based on the spatial energy feature and the average temporal energy; predicting an optimized framerate for each of the shots based on the frame dropping factor; and downscaling and encoding each of the shots using the optimized framerate. The encoded shots may then be decoded and upscaled back to their original framerates.
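The per-shot decision step can be sketched as follows. This is a toy stand-in, not the patented method: the 0.1 activity threshold, the halving rule, and the energy ratio are all invented for illustration, and the real predictor is content-aware rather than a fixed rule.

```python
def predict_framerate(spatial_energy, avg_temporal_energy,
                      original_fps=120, max_drop=3):
    """Map a shot's spatial/temporal energy features to a frame
    dropping factor and an output framerate.

    Hypothetical rule: the less temporal activity a shot has relative
    to its spatial detail, the more often its framerate is halved
    (each halving is one increment of the frame dropping factor).
    """
    ratio = avg_temporal_energy / max(spatial_energy, 1e-9)
    drop = 0
    while drop < max_drop and ratio < 0.1 / (2 ** drop):
        drop += 1
    return original_fps / (2 ** drop)
```

A high-motion shot keeps the full 120 fps, while a near-static shot is encoded at a fraction of it and upscaled back to the original framerate after decoding.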

 

Posted in ATHENA | Comments Off on Patent Approval for “Variable Framerate Encoding using Content-aware Framerate Prediction for High Framerate Videos”

Patent Approval for “Fast Multi-rate Encoding for Adaptive Streaming using Machine Learning”

Fast Multi-rate Encoding for Adaptive Streaming using Machine Learning

US Patent

[PDF]

Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), Ekrem Çetinkaya (Alpen-Adria-Universität Klagenfurt, Austria), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)

 

Abstract: According to embodiments of the disclosure, fast multi-rate encoding may be performed using machine learning by encoding a lowest quality representation to determine encoding parameters, processing raw data of the video using a neural network to obtain an intermediate output comprising encoding features, augmenting the intermediate output with additional encoding features to form a final tensor, and processing the final tensor with another neural network to obtain a classification output comprising a split or not split decision for an image data block. The classification output may be used to encode a highest quality representation, and then other representations of the video.
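The classification pipeline described above can be sketched as below. Everything here is a placeholder standing in for the patented components: `pixel_network` is a hand-crafted substitute for the first neural network, the linear weights are fixed random values rather than trained parameters, and the feature names are invented for illustration.

```python
import numpy as np

def pixel_network(block):
    """Stand-in for the first neural network: reduces a raw pixel block
    to a short intermediate feature vector (mean, std, gradient energy)."""
    b = np.asarray(block, np.float64)
    grad_energy = float(np.abs(np.diff(b, axis=1)).mean())
    return np.array([b.mean(), b.std(), grad_energy])

def split_decision(block, encoder_features, threshold=0.5):
    """Augment the intermediate output with encoding features taken from
    the lowest-quality encode, forming the final tensor, then classify it.

    The weights are seeded random placeholders, not a trained network.
    """
    final_tensor = np.concatenate([pixel_network(block),
                                   np.asarray(encoder_features, np.float64)])
    w = np.random.default_rng(0).standard_normal(final_tensor.size)
    score = 1.0 / (1.0 + np.exp(-w @ final_tensor))  # sigmoid
    return bool(score > threshold)  # True -> split this block

decision = split_decision(np.zeros((64, 64)), encoder_features=[0.1, 0.2])
```

The point of the design is that the expensive split/not-split search is replaced by one cheap forward pass whose result is reused when encoding the highest-quality representation and, from there, the remaining representations.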

Posted in ATHENA | Comments Off on Patent Approval for “Fast Multi-rate Encoding for Adaptive Streaming using Machine Learning”

MEDUSA: A Dynamic Codec Switching Approach in HTTP Adaptive Streaming

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)

Journal Website

[PDF]

Daniele Lorenzi (AAU, Austria), Farzad Tashtarian (AAU, Austria), Hermann Hellwagner (AAU, Austria), and Christian Timmerer (AAU, Austria)

Abstract: HTTP Adaptive Streaming (HAS) solutions utilize various Adaptive BitRate (ABR) algorithms to dynamically select appropriate video representations, aiming to adapt to fluctuations in network bandwidth. However, current ABR implementations have a limitation in that they are designed to function with one set of video representations, i.e., the bitrate ladder, whose representations differ in bitrate and resolution but are encoded with the same video codec. When multiple codecs are available, current ABR algorithms select one of them prior to the streaming session and stick to it for its entire duration. Although newer codecs are generally preferred over older ones, their compression efficiencies differ depending on the content’s complexity, which varies over time. Therefore, it is necessary to select the appropriate codec for each video segment to reduce the requested data while delivering the highest possible quality. In this paper, we first provide a practical example in which we compare the compression efficiencies of different codecs on a set of video sequences. Based on this analysis, we formulate the optimization problem of selecting the appropriate codec for each user and video segment (down to a per-segment basis in the extreme case), refining the selection of the ABR algorithms by exploiting key metrics, such as the perceived segment quality and size. Subsequently, to address the scalability issues of this centralized model, we introduce MEDUSA, a novel distributed plug-in ABR algorithm for Video on Demand (VoD) applications to be deployed on top of existing ABR algorithms. MEDUSA enhances the user’s Quality of Experience (QoE) by utilizing a multi-objective function that considers the quality and size of video segments when selecting the next representation.
Using quality information and segment sizes from the modified Media Presentation Description (MPD), MEDUSA utilizes buffer occupancy to prioritize quality or size by assigning specific weights in the objective function. To show the impact of MEDUSA, we compare the proposed plug-in approach on top of state-of-the-art techniques with their original implementations and analyze the results for different network traces, video content, and buffer capacities. According to the experimental findings, MEDUSA improves QoE across various test videos and scenarios. The results reveal an impressive improvement in the QoE score of up to 42% according to the ITU-T P.1203 model (mode 0). Additionally, MEDUSA can reduce the transmitted data volume by more than 40% while achieving a QoE similar to that of the compared techniques, reducing delivery costs for streaming service providers.
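A toy version of such a buffer-weighted quality/size objective might look like the sketch below. The linear weighting rule, the normalization, and the `quality`/`size` fields are illustrative assumptions, not MEDUSA's actual formulation.

```python
def select_representation(representations, buffer_level, buffer_capacity):
    """Pick the next segment's representation via a weighted
    quality-vs-size objective.

    Hypothetical weighting: a full buffer shifts all weight onto
    quality, a draining buffer shifts it onto small segment sizes.
    Each representation is a dict with 'quality' and 'size' fields.
    """
    w_q = buffer_level / buffer_capacity  # full buffer -> chase quality
    w_s = 1.0 - w_q                       # low buffer  -> chase small size
    max_q = max(r["quality"] for r in representations)
    max_size = max(r["size"] for r in representations)

    def score(rep):
        # Normalize both terms so quality and size are comparable.
        return w_q * rep["quality"] / max_q - w_s * rep["size"] / max_size

    return max(representations, key=score)

ladder = [{"quality": 60, "size": 1.0},
          {"quality": 80, "size": 2.0},
          {"quality": 95, "size": 4.0}]
choice = select_representation(ladder, buffer_level=25.0, buffer_capacity=30.0)
```

Because the weights move continuously with buffer occupancy, the same objective reproduces both behaviors described above: aggressive quality selection when the buffer is healthy and data-saving selection when rebuffering threatens.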

 

Posted in ATHENA | Comments Off on MEDUSA: A Dynamic Codec Switching Approach in HTTP Adaptive Streaming

Generative AI for Adaptive Video Streaming

The 15th ACM Multimedia Systems Conference (Doctoral Symposium)

15-18 April, 2024 | Bari, Italy

Conference website

[PDF]

Emanuele Artioli (AAU, Austria)

Abstract: Video streaming stands as the cornerstone of telecommunication networks, constituting over 60% of mobile data traffic as of June 2023. The paramount challenge faced by video streaming service providers is ensuring high Quality of Experience (QoE) for users. In HTTP Adaptive Streaming (HAS), including DASH and HLS, video content is encoded at multiple quality versions, with an Adaptive Bitrate (ABR) algorithm dynamically selecting versions based on network conditions. Concurrently, Artificial Intelligence (AI) is revolutionizing the industry, particularly in content recommendation and personalization. Leveraging user data and advanced algorithms, AI enhances user engagement, satisfaction, and video quality through super-resolution and denoising techniques. However, challenges persist, such as real-time processing on resource-constrained devices, the need for diverse training datasets, privacy concerns, and model interpretability. Despite these hurdles, Generative Artificial Intelligence emerges as a transformative force. Generative AI, capable of synthesizing new data based on learned patterns, holds vast potential in the video streaming landscape: it can create realistic and immersive content, adapt in real time to individual preferences, and optimize video compression for seamless streaming in low-bandwidth conditions. This research proposal outlines a comprehensive exploration at the intersection of advanced AI algorithms and digital entertainment, focusing on the potential of generative AI to elevate video quality, user interactivity, and the overall streaming experience. The objective is to integrate generative models into video streaming pipelines, opening novel avenues that promise a future of dynamic, personalized, and visually captivating streaming experiences for viewers.

Posted in ATHENA | Comments Off on Generative AI for Adaptive Video Streaming