A Tutorial at ACM SIGCOMM 2025

Posted on July 8, 2025 by

Optimizing Low-Latency Video Streaming: AI-Assisted Codec-Network Coordination

Coimbra, Portugal, September 8 – 11, 2025.

Tutorial speakers:

Farzad Tashtarian (Alpen-Adria-Universität – AAU)
Zili Meng (Hong Kong University of Science and Technology – HKUST)
Abdelhak Bentaleb (Concordia University)
Mahdi Dolati (Sharif University of Technology)

This tutorial focuses on the emerging need for ultra-low-latency video streaming and how AI-assisted coordination between codecs and network infrastructure can significantly improve performance. Traditional end-to-end streaming pipelines are often disjointed, leading to inefficiencies under tight latency constraints. We present a cross-layer approach that leverages AI for real-time encoding parameter adaptation, network-aware bitrate selection, and joint optimization across codec behavior and transport protocols. The tutorial examines the integration of AI models with programmable network architectures (e.g., SDN, P4) and modern transport technologies such as QUIC and Media over QUIC (MoQ) to minimize startup delay, stall events, and encoding overhead. Practical use cases and experimental insights illustrate how aligning codec dynamics with real-time network conditions enhances both QoE and system efficiency. Designed for both researchers and engineers, this session provides a foundation for developing next-generation intelligent video delivery systems capable of sustaining low-latency performance in dynamic environments.

Posted in ATHENA | Comments Off

Real-Time AI-Driven Avatar Generation for Sign Language in HTTP Adaptive Streaming

Posted on June 25, 2025 by

The 3rd ACM SIGCOMM Workshop on Emerging Multimedia Systems (ACM EMS 2025)

https://conferences.sigcomm.org/sigcomm/2025/workshop/ems/

8 September 2025 // Coimbra, Portugal

Daniele Lorenzi (AAU, Austria), Emanuele Artioli (AAU, Austria), Farzad Tashtarian (AAU, Austria), Christian Timmerer (AAU, Austria)

Abstract: As digital media consumption over the Internet surges globally, ensuring accessibility for all users becomes paramount. For people with hearing impairments, this means providing inclusion beyond classic captioning, which does not convey the full emotional and contextual depth of spoken content. This work addresses this accessibility gap by exploring the use of AI-generated avatars capable of translating speech into sign language in real time. After defining the multifaceted challenges in this domain, we propose a novel AI-driven task partition to animate avatars for accurate and expressive sign language interpretations in live streaming.

Posted in ATHENA | Comments Off

Elsevier Displays: Unlocking Implicit Motion for Evaluating Image Complexity

Posted on June 23, 2025 by

Unlocking Implicit Motion for Evaluating Image Complexity

Displays

Yixiao Li (Beihang University, China), Xiaoyuan Yang (Beihang University, China), Yanda Meng (University of Exeter, UK), Hadi Amirpour (AAU, Austria), Jiang Liu (Cardiff University, UK), Yuqing Luo (Cardiff University, UK), Hantao Liu (Cardiff University, UK), and Wei Zhou (Cardiff University, UK)

Abstract: Image complexity (IC) plays a critical role in both cognitive science and multimedia computing, influencing visual aesthetics, emotional responses, and tasks such as image classification and enhancement. However, defining and quantifying IC remains challenging due to its multifaceted nature, which encompasses both objective attributes (e.g., detail, structure) and subjective human perception. While traditional methods rely on entropy-based or multidimensional approaches, and recent advances employ machine learning and shallow neural networks, these techniques often fail to fully capture the subjective aspects of IC. Inspired by the fact that the human visual system inherently perceives implicit motion in static images, we propose a novel approach to address this gap by explicitly incorporating hidden motion into IC assessment. We introduce the motion-inspired image complexity assessment metric (MICM) as a new framework for this purpose. MICM introduces a dual-branch architecture: One branch extracts spatial features from static images, while the other generates short video sequences to analyze latent motion dynamics. To ensure meaningful motion representation, we design a hierarchical loss function that aligns video features with text prompts derived from image-to-text models, refining motion semantics at both local (i.e., frame and word) and global levels. Experiments on three public image complexity assessment (ICA) databases demonstrate that our approach, MICM, significantly outperforms state-of-the-art methods, validating its effectiveness. The code will be publicly available upon acceptance of the paper.

Posted in ATHENA | Comments Off

Up to 4 Predoc Scientist Positions (all genders welcome)

Posted on June 18, 2025 by

The University of Klagenfurt, with approximately 1,700 employees and over 13,000 students, is located in the Alps-Adriatic region and consistently achieves excellent placements in rankings. The motto “per aspera ad astra” underscores our firm commitment to the pursuit of excellence in all activities in research, teaching, and university management. The principles of equality, diversity, health, sustainability, and compatibility of work and family life serve as the foundation for our work at the university.

The University of Klagenfurt is in the process of establishing a Karl Popper Kolleg (graduate school) entitled “FruitScope: A DroneScope for Smart Agriculture”. The following positions are open for applicants at this school with an anticipated starting date of October 1, 2025:

Up to 4 Predoc Scientist Positions (all genders welcome)

Level of employment: 75 % (30 hours per week) each
Minimum salary: € 39,005.40 per annum (gross); classification according to collective bargaining agreement: B1
Limited to: 3 years
Application deadline: August 20, 2025
Reference code: 338/25

Tasks and responsibilities:

Independent research and scientific qualification within the Karl Popper Kolleg FruitScope with the aim of acquiring the Doctoral Degree in Technical Sciences
Peer-reviewed publication of scientific results in journals and at conferences
Teamwork and student mentoring
Active participation in public relations activities

This graduate school seeks to push the bounds of the current state of the art in navigation, coordination, sensing, and communication of multi-agent unmanned aerial vehicles (UAVs). The groups of the involved faculty publish in top international journals and conference proceedings. Successful applicants will be encouraged and supported to publish and present their work in such journals and proceedings and will have the opportunity to cooperate with our world-renowned international partners in science and industry. We currently cooperate with partners worldwide, mainly in the USA/Canada and Europe. We specifically encourage close and open collaboration with our peers both internationally and at the University and support international exchanges with the universities and research institutions affiliated with the graduate school (e.g., ETH Zurich, MIT, CMU, NASA, UofT, U-Mich, UPenn, Georgia Tech). Our young research groups offer a dynamic, close-knit, and friendly atmosphere and thus a collaborative and inspiring work environment with very modern infrastructure (e.g., one of the largest indoor drone halls in Europe), which is continuously updated and upgraded (e.g., soon, with one of the largest outdoor drone test fields in the world).

Prerequisites for the appointment:

Completed Master’s or Diploma degree in electrical engineering, information and communication engineering, mechanical engineering, computer science or related fields. This requirement has an extended deadline and must be fulfilled two weeks before the starting date at the latest; hence, the last possible deadline for meeting this requirement is September 17, 2025.
Proven knowledge and experience in at least one of the following areas: mobile robotics, wireless communications or sensing, multimedia communication, signal processing for communications, or machine learning
Proven programming skills in at least one of the following languages: Matlab, C/C++, Java, Python, ROS or similar
Fluency in English (both written and spoken)

Additional desired qualifications:

Good knowledge of cooperative software development (e.g., with Git)
First scientific publication (apart from Master’s or Diploma thesis) in the area of mobile robotics, wireless sensing, or multimedia communication technology
Relevant international or practical experience
Good scientific communication and presentation skills
German language skills or willingness to acquire German language skills within the first two years of service
Social skills and ability to work independently

Our offer:

The employment contract is concluded for the position as predoc scientist and stipulates a starting salary of € 2,786.10 gross per month (14 times a year; previous experience deemed relevant to the job can be recognized).

The University of Klagenfurt also offers:

Personal and professional advanced training courses, management and career coaching, including bespoke training for women in science
Numerous attractive additional benefits, see also https://jobs.aau.at/en/the-university-as-employer/
Diversity- and family-friendly university culture
The opportunity to live and work in the attractive Alps-Adriatic region with a wide range of leisure activities in the spheres of culture, nature and sports

The application:

If you are interested in this position, please apply in English providing the following documents:

Letter of application explaining the motivation and including a statement of interest in research (indicating an idea for the research for your own doctoral degree)
Curriculum vitae (please do not include a photo)
Copies of degree certificates (Bachelor and Master)
Copies of official transcripts (Bachelor and Master) containing a list of all courses and grades
Master’s thesis. If the thesis is not available, the candidate should provide a draft or an explanation.
If an applicant has not received the Master’s degree by the application deadline, the applicant should provide a declaration, written either by a supervisor or by the candidate themselves, on the feasibility of finishing the Master’s degree before September 17, 2025.

To apply, please select the position with the reference code 338/25 in the category “Scientific Staff” using the link “Apply for this position” in the job portal at https://jobs.aau.at/en/.

Candidates must provide proof that they meet the required qualifications by August 20, 2025, at the latest. However, candidates who fulfil the required qualifications but do not yet possess the required Master’s degree can apply, provided they are able to meet this requirement at least two weeks before the starting date. Therefore, the latest possible deadline for meeting this requirement is September 17, 2025.

General information about the university as an employer can be found at https://jobs.aau.at/en/the-university-as-employer/. At the University of Klagenfurt, recruitment and staff matters are accompanied not only by the authority responsible for the recruitment procedure but also by the Equal Opportunities Working Group and, if applicable, by the Representative for Disabled Persons.

For further information on this specific vacancy, please contact:

Prof. Dr. Stephan Weiss, +43 463 2700 3571, Stephan.Weiss@aau.at
Prof. Dr. Christian Bettstetter, +43 463 2700 3640, Christian.Bettstetter@aau.at
Prof. Dr. Bernhard Rinner, +43 463 2700 3671, Bernhard.Rinner@aau.at
Prof. Dr. Christian Timmerer, +43 463 2700 3621, Christian.Timmerer@aau.at

The University of Klagenfurt aims to increase the proportion of women and therefore specifically invites qualified women to apply for the position. Where the qualification is equivalent, women will be given preferential consideration.

People with disabilities or chronic diseases, who fulfil the requirements, are particularly encouraged to apply. Travel and accommodation costs incurred during the application process will not be refunded. Under exceptional circumstances online hearings may be possible. Translations into other languages serve informational purposes only. Solely the version advertised in the University Bulletin (Mitteilungsblatt) shall be legally binding.

Posted in News | Comments Off

Elsevier SPIC: 360-Degree Video Super Resolution and Quality Enhancement Challenge: Methods and Results

Posted on June 10, 2025 by

Signal Processing: Image Communication

Ahmed Telili (TII, UAE), Wassim Hamidouche (TII, UAE), Brahim Farhat (TII, UAE), Hadi Amirpour (AAU, Austria), Christian Timmerer (AAU, Austria), Ibrahim Khadraoui (TII, UAE), Jiajie Lu (Politecnico di Milano, Italy), The Van Le (IVCL, South Korea), Jeonneung Baek (IVCL, South Korea), Jin Young Lee (IVCL, South Korea), Yiying Wei (AAU, Austria), Xiaopeng Sun (Meituan Inc., China), Yu Gao (Meituan Inc., China), JianCheng Huang (Meituan Inc., China), and Yujie Zhong (Meituan Inc., China)

Abstract: Omnidirectional (360-degree) video is rapidly gaining popularity due to advancements in immersive technologies like virtual reality (VR) and extended reality (XR). However, real-time streaming of such videos, particularly in live mobile scenarios such as unmanned aerial vehicles (UAVs), is hindered by limited bandwidth and strict latency constraints. While traditional methods such as compression and adaptive resolution are helpful, they often compromise video quality and introduce artifacts that diminish the viewer’s experience. Additionally, the unique spherical geometry of 360-degree video, with its wide field of view, presents challenges not encountered in traditional 2D video. To address these challenges, we initiated the 360-degree Video Super Resolution and Quality Enhancement challenge. This competition encourages participants to develop efficient machine learning (ML)-powered solutions to enhance the quality of low-bitrate compressed 360-degree videos, under two tracks focusing on 2× and 4× super-resolution (SR). In this paper, we outline the challenge framework, detailing the two competition tracks and highlighting the SR solutions proposed by the top-performing models. We assess these models within a unified framework, considering (i) quality enhancement, (ii) bitrate gain, and (iii) computational efficiency. Our findings show that lightweight single-frame models can effectively balance visual quality and runtime performance under constrained conditions, setting strong baselines for future research. These insights offer practical guidance for advancing real-time 360-degree video streaming, particularly in bandwidth-limited immersive applications.

Posted in ATHENA | Comments Off

EUVIP’25 Tutorial: From Subjective Ratings to Objective Metrics

Posted on June 5, 2025 by

EUVIP 2025
October 13-16, 2025

Malta

Tutorial speakers:

Wei Zhou (Cardiff University)
Hadi Amirpour (University of Klagenfurt)

Tutorial description:

As multimedia services like video streaming, video conferencing, virtual reality (VR), and online gaming continue to evolve, ensuring high perceptual visual quality is crucial for enhancing user experience and maintaining competitiveness. However, multimedia content inevitably undergoes various distortions during acquisition, compression, transmission, and storage, leading to quality degradation. Therefore, perceptual visual quality assessment, which evaluates multimedia quality from a human perception perspective, plays a vital role in optimizing user experience in modern communication systems. This tutorial provides a comprehensive overview of perceptual visual quality assessment, covering both subjective methods, where human observers directly rate their experience, and objective methods, where computational models predict perceptual quality based on measurable factors such as bitrate, frame rate, and compression levels. The session also explores quality assessment metrics tailored to different types of multimedia content, including images, videos, VR, point clouds, meshes, and AI-generated media. Furthermore, we discuss challenges posed by diverse multimedia characteristics, complex distortion scenarios, and varying viewing conditions. By the end of this tutorial, attendees will gain a deep understanding of the principles, methodologies, and latest advancements in perceptual visual quality assessment for multimedia communication.

Posted in ATHENA | Comments Off

JVCIR Special Issue

Posted on May 28, 2025 by

Journal of Visual Communication and Image Representation Special Issue on

Multimodal Learning for Visual Intelligence: From Emerging Techniques to Real-World Applications

Link to the Special Issue

In recent years, the integration of vision with complementary modalities such as language, audio, and sensor signals has emerged as a key enabler for intelligent systems that operate in unstructured environments. The emergence of foundation models and cross-modal pretraining has brought a paradigm shift to the field, making it timely to revisit the core challenges and innovative techniques in multimodal visual understanding.

This Special Issue aims to collect cutting-edge research and engineering practices that advance the understanding and development of visual intelligence systems through multimodal learning. The focus is on the deep integration of visual information with complementary modalities such as text, audio, and sensor data, enabling more comprehensive perception and reasoning in real-world environments. We encourage contributions from both academia and industry that address current challenges and propose novel methodologies for multimodal visual understanding.

Topics of interest include, but are not limited to:

Multimodal data alignment and fusion strategies with a focus on visual-centric modalities
Foundation models for multimodal visual representation learning
Generation and reconstruction techniques in visually grounded multimodal scenarios
Spatiotemporal modeling and relational reasoning of visual-centric multimodal data
Lightweight multimodal visual models for resource-constrained environments
Key technologies for visual-language retrieval and dialogue systems
Applications of multimodal visual computing in healthcare, transportation, robotics, and surveillance

Guest editors:

Guanghui Yue, PhD
Shenzhen University, Shenzhen, China
Email: yueguanghui@szu.edu.cn

Weide Liu, PhD
Harvard University, Cambridge, Massachusetts, USA
Email: weide001@e.ntu.edu.sg

Ziyang Wang, PhD
The Alan Turing Institute, London, UK
Email: zwang@turing.ac.uk

Hadi Amirpour, PhD
Alpen-Adria University, Klagenfurt, Austria
Email: hadi.amirpour@aau.at

Zhedong Zheng, PhD
University of Macau, Macau, China
Email: zhedongzheng@um.edu.mo

Wei Zhou, PhD
Cardiff University, Cardiff, UK
Email: zhouw26@cardiff.ac.uk

Timeline:

Submission Open Date: 30/05/2025
Final Manuscript Submission Deadline: 30/11/2025
Editorial Acceptance Deadline: 30/05/2026

Keywords:

Multimodal Learning, Visual-Language Models, Cross-Modal Pretraining, Multimodal Fusion and Alignment, Spatiotemporal Reasoning, Lightweight Multimodal Models, Applications in Healthcare and Robotics

Posted in ATHENA | Comments Off
