Patent Approval for “Content-adaptive encoder preset prediction for adaptive live streaming”

Content-adaptive encoder preset prediction for adaptive live streaming

US Patent

[PDF]

Vignesh Menon (Alpen-Adria-Universität Klagenfurt, Austria), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)


Abstract: Techniques for content-adaptive encoder preset prediction for adaptive live streaming are described herein. A method for content-adaptive encoder preset prediction for adaptive live streaming includes performing video complexity feature extraction on a video segment to extract complexity features such as an average texture energy, an average temporal energy, and an average luminance. These inputs may be provided to an encoding time prediction model, along with a bitrate ladder, a resolution set, a target video encoding speed, and a number of CPU threads for the video segment, to predict an encoding time, and an optimized encoding preset may be selected for the video segment by a preset selection function using the predicted encoding time. The video segment may be encoded according to the optimized encoding preset.
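The selection flow described in the abstract can be sketched as follows. This is an illustrative toy sketch, not the patented implementation: the feature values, the linear timing model, and the x264-style preset ladder are all assumptions made for the example.

```python
# Hypothetical sketch of content-adaptive preset selection: predict the
# encoding time per preset and pick the slowest preset that still meets
# the live deadline. All models and constants here are illustrative.

PRESETS = ["veryslow", "slow", "medium", "fast", "veryfast", "ultrafast"]

def extract_features(segment):
    """Stand-in for video complexity feature extraction
    (average texture energy, temporal energy, luminance)."""
    return {"avg_texture_energy": 42.0,
            "avg_temporal_energy": 7.5,
            "avg_luminance": 118.0}

def predict_encoding_time(features, resolution, preset, threads):
    """Toy stand-in for the learned encoding-time prediction model."""
    complexity = features["avg_texture_energy"] + features["avg_temporal_energy"]
    width, height = resolution
    slowdown = {"ultrafast": 1, "veryfast": 2, "fast": 4,
                "medium": 8, "slow": 16, "veryslow": 32}[preset]
    return complexity * width * height * slowdown / (threads * 1e8)

def select_preset(segment, resolution, target_speed, threads, seg_duration=4.0):
    """Choose the slowest (highest-quality) preset whose predicted
    encoding time still meets the live-streaming deadline."""
    features = extract_features(segment)
    deadline = seg_duration / target_speed
    for preset in PRESETS:  # ordered slowest to fastest
        if predict_encoding_time(features, resolution, preset, threads) <= deadline:
            return preset
    return PRESETS[-1]  # fall back to the fastest preset
```

For a 4-second 1080p segment encoded at real-time speed on 8 threads, this toy model rules out the slowest preset and settles on the next one that fits the deadline.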

Posted in ATHENA | Comments Off on Patent Approval for “Content-adaptive encoder preset prediction for adaptive live streaming”

ICCV VQualA’25: VQualA 2025 Challenge on Image Super-Resolution Generated Content Quality Assessment: Methods and Results

VQualA 2025 Challenge on Image Super-Resolution Generated Content Quality Assessment: Methods and Results

ICCV VQualA 2025

October 19 – October 23, 2025

Hawai’i, USA

[PDF]

Hadi Amirpour (AAU, Austria), et al.

Abstract: This paper presents the ISRGC-Q Challenge, built upon the Image Super-Resolution Generated Content Quality Assessment (ISRGen-QA) dataset, and organized as part of the Visual Quality Assessment (VQualA) Competition at the ICCV 2025 Workshops. Unlike existing Super-Resolution Image Quality Assessment (SR-IQA) datasets, ISRGen-QA places greater emphasis on SR images generated by the latest generative approaches, including Generative Adversarial Networks (GANs) and diffusion models. The primary goal of this challenge is to analyze the unique artifacts introduced by modern super-resolution techniques and to evaluate their perceptual quality effectively. A total of 108 participants registered for the challenge, with 4 teams submitting valid solutions and fact sheets for the final testing phase. These submissions demonstrated state-of-the-art (SOTA) performance on the ISRGen-QA dataset. The project is publicly available at: https://github.com/Lighting-YXLI/ISRGen-QA.

Posted in ATHENA | Comments Off on ICCV VQualA’25: VQualA 2025 Challenge on Image Super-Resolution Generated Content Quality Assessment: Methods and Results

ICCV VQualA’25: VQualA 2025 Challenge on Face Image Quality Assessment: Methods and Results

VQualA 2025 Challenge on Face Image Quality Assessment: Methods and Results

ICCV VQualA 2025

October 19 – October 23, 2025

Hawai’i, USA

[PDF]

MohammadAli Hamidi (University of Cagliari, Italy), Hadi Amirpour (AAU, Austria), et al.

Abstract: Face images have become integral to various applications, but real-world capture conditions often lead to degradations such as noise, blur, compression artifacts, and poor lighting. These degradations negatively impact image quality and downstream tasks. To promote advancements in face image quality assessment (FIQA), we introduce the VQualA 2025 Challenge on Face Image Quality Assessment, part of the ICCV 2025 Workshops. Participants developed efficient models (≤0.5 GFLOPs, ≤5M parameters) predicting Mean Opinion Scores (MOS) under realistic degradations. Submissions were rigorously evaluated using objective metrics and human perceptual judgments. The challenge attracted 127 participants, resulting in 1519 valid final submissions. Detailed methodologies and results are presented, contributing to practical FIQA solutions.


Posted in ATHENA | Comments Off on ICCV VQualA’25: VQualA 2025 Challenge on Face Image Quality Assessment: Methods and Results

ICCV VQualA’25: A Lightweight Ensemble-Based Face Image Quality Assessment Method with Correlation-Aware Loss

A Lightweight Ensemble-Based Face Image Quality Assessment Method with Correlation-Aware Loss

ICCV VQualA 2025

October 19 – October 23, 2025

Hawai’i, USA

[PDF]

MohammadAli Hamidi (University of Cagliari, Italy), Hadi Amirpour (AAU, Austria), Luigi Atzori (University of Cagliari, Italy), Christian Timmerer (AAU, Austria)

Abstract: Face image quality assessment (FIQA) plays a critical role in face recognition and verification systems, especially in uncontrolled, real-world environments. Although several methods have been proposed, general-purpose no-reference image quality assessment techniques often fail to capture face-specific degradations. Meanwhile, state-of-the-art FIQA models tend to be computationally intensive, limiting their practical applicability. We propose a lightweight and efficient method for FIQA, designed for the perceptual evaluation of face images in the wild. Our approach integrates an ensemble of two compact convolutional neural networks, MobileNetV3-Small and ShuffleNetV2, with prediction-level fusion via simple averaging. To enhance alignment with human perceptual judgments, we employ a correlation-aware loss (MSECorrLoss), combining mean squared error (MSE) with a Pearson correlation regularizer. Our method achieves a strong balance between accuracy and computational cost, making it suitable for real-world deployment. Experiments on the VQualA FIQA benchmark demonstrate that our model achieves a Spearman rank correlation coefficient (SRCC) of 0.9829 and a Pearson linear correlation coefficient (PLCC) of 0.9894, remaining within competition efficiency constraints.
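A correlation-aware loss of the kind described in the abstract can be sketched in a few lines: MSE keeps predictions close to the MOS labels, while a Pearson term rewards preserving their rank ordering. This is a minimal illustrative sketch; the exact formulation and weighting (lambda) used in the paper may differ.

```python
# Illustrative MSE + Pearson-correlation loss (a sketch of an
# MSECorrLoss-style objective; lambda and the epsilon are assumptions).
import math

def mse_corr_loss(pred, target, lam=1.0):
    n = len(pred)
    # Mean squared error between predicted and ground-truth MOS
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    # Pearson linear correlation between the two score vectors
    mp, mt = sum(pred) / n, sum(target) / n
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, target))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    st = math.sqrt(sum((t - mt) ** 2 for t in target))
    r = cov / (sp * st + 1e-8)  # in [-1, 1]
    # Low correlation with human judgments is penalized via (1 - r)
    return mse + lam * (1.0 - r)
```

Perfectly correlated, error-free predictions drive the loss to zero, while anti-correlated predictions are penalized by both terms.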

Posted in ATHENA | Comments Off on ICCV VQualA’25: A Lightweight Ensemble-Based Face Image Quality Assessment Method with Correlation-Aware Loss

ACM MM’25 Demo: SDART: Spatial Dart AR Simulation with Hand-Tracked Input

SDART: Spatial Dart AR Simulation with Hand-Tracked Input

ACM Multimedia 2025

October 27 – October 31, 2025

Dublin, Ireland

[PDF]

Milad Ghanbari (AAU, Austria), Wei Zhou (Cardiff, UK), Cosmin Stejerean (Meta, US), Christian Timmerer (AAU, Austria), Hadi Amirpour (AAU, Austria)

Abstract: We present a physics-driven 3D dart-throwing interaction system for Apple Vision Pro (AVP), developed using Unity 6 engine and running in augmented reality (AR) mode on the device. The system utilizes the PolySpatial and Apple’s ARKit software development kits (SDKs) to ensure hand input and tracking in order to intuitively spawn, grab, and throw virtual darts similar to real darts. The application benefits from physics simulations alongside the innovative no-controller input system of AVP to manipulate objects realistically in an unbounded spatial volume. By implementing spatial distance measurement, scoring logic, and recording user performance, this project enables user studies on quality of experience in interactive experiences. To evaluate the perceived quality and realism of the interaction, we conducted a subjective study with 10 participants using a structured questionnaire. The study measured various aspects of the user experience, including visual and spatial realism, control fidelity, depth perception, immersiveness, and enjoyment. Results indicate high mean opinion scores (MOS) across key dimensions. Link to video: Link

Posted in ATHENA | Comments Off on ACM MM’25 Demo: SDART: Spatial Dart AR Simulation with Hand-Tracked Input

ACM MM’25 Demo: Depth-Enabled Inspection of Medical Videos

Depth-Enabled Inspection of Medical Videos

ACM Multimedia 2025

October 27 – October 31, 2025

Dublin, Ireland

[PDF]

Hadi Amirpour (AAU, Austria), Doris Putzgruber-Adamitsch (AAU, Austria), Yosuf El-Shabrawi (Kabeg, Austria), Klaus Schoeffmann (AAU, Austria)

Abstract: Cataract surgery is the most frequently performed surgical procedure worldwide, involving the replacement of a patient’s clouded eye lens with a synthetic intraocular lens to restore visual acuity. Although typically brief, the operation consists of distinct phases that demand precision and extensive training, traditionally constrained by the limitations of real-time observation under a microscope. To enhance learning and procedural accuracy, modern advancements in stereoscopic video capture and head-mounted displays (HMDs) offer a promising solution. This paper demonstrates the application of stereoscopic cataract surgery videos, visualized through Apple Vision Pro (AVP) and Meta Quest 3, to provide immersive 3D perspectives that enhance depth perception and spatial awareness. An expert evaluation study with experienced surgeons indicates that stereoscopic visualization significantly improves comprehension of spatial relationships and procedural maneuvers, suggesting its potential to revolutionize surgical education and real-time guidance in ophthalmic surgery. Demo video: Link

Posted in ATHENA | Comments Off on ACM MM’25 Demo: Depth-Enabled Inspection of Medical Videos

A Tutorial at ACM SIGCOMM 2025

A Tutorial at ACM SIGCOMM 2025

Optimizing Low-Latency Video Streaming: AI-Assisted Codec-Network Coordination

[Link]

Coimbra, Portugal, September 8 – 11, 2025.


Tutorial speakers:

  • Farzad Tashtarian (Alpen-Adria-Universität – AAU)
  • Zili Meng (Hong Kong University of Science and Technology – HKUST)
  • Abdelhak Bentaleb (Concordia University)
  • Mahdi Dolati (Sharif University of Technology)

This tutorial focuses on the emerging need for ultra-low-latency video streaming and how AI-assisted coordination between codecs and network infrastructure can significantly improve performance. Traditional end-to-end streaming pipelines are often disjointed, leading to inefficiencies under tight latency constraints. We present a cross-layer approach that leverages AI for real-time encoding parameter adaptation, network-aware bitrate selection, and joint optimization across codec behavior and transport protocols. The tutorial examines the integration of AI models with programmable network architectures (e.g., SDN, P4) and modern transport technologies such as QUIC and Media over QUIC (MoQ) to minimize startup delay, stall events, and encoding overhead. Practical use cases and experimental insights illustrate how aligning codec dynamics with real-time network conditions enhances both QoE and system efficiency. Designed for both researchers and engineers, this session provides a foundation for developing next-generation intelligent video delivery systems capable of sustaining low-latency performance in dynamic environments.
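One of the building blocks discussed above, network-aware bitrate selection, can be sketched as a throughput-driven ladder lookup. The ladder rungs, the EWMA smoothing factor, and the 0.8 safety margin below are illustrative assumptions, not values from the tutorial.

```python
# Minimal sketch of network-aware bitrate selection: smooth recent
# throughput samples, then pick the highest ladder rung that fits
# within a safety margin of the estimate. Constants are illustrative.

LADDER_KBPS = [500, 1200, 2500, 4500, 8000]

def ewma_throughput(samples_kbps, alpha=0.3):
    """Smooth noisy per-segment throughput samples with an
    exponential weighted moving average."""
    est = samples_kbps[0]
    for s in samples_kbps[1:]:
        est = alpha * s + (1 - alpha) * est
    return est

def select_bitrate(samples_kbps, margin=0.8):
    """Choose the highest rung at or below margin * estimated throughput,
    falling back to the lowest rung under poor conditions."""
    budget = margin * ewma_throughput(samples_kbps)
    eligible = [r for r in LADDER_KBPS if r <= budget]
    return eligible[-1] if eligible else LADDER_KBPS[0]
```

In a cross-layer design like the one the tutorial advocates, the throughput estimate could instead come from in-network telemetry (e.g., via SDN or P4) rather than client-side measurements alone.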

Posted in ATHENA | Comments Off on A Tutorial at ACM SIGCOMM 2025