Real-Time AI-Driven Avatar Generation for Sign Language in HTTP Adaptive Streaming


The 3rd ACM SIGCOMM Workshop on Emerging Multimedia Systems (ACM EMS 2025)

https://conferences.sigcomm.org/sigcomm/2025/workshop/ems/

8 September 2025 // Coimbra, Portugal

[PDF]

Daniele Lorenzi (AAU, Austria), Emanuele Artioli (AAU, Austria), Farzad Tashtarian (AAU, Austria), Christian Timmerer (AAU, Austria)

Abstract: As digital media consumption over the Internet surges globally, ensuring accessibility for all users becomes paramount. For people with hearing impairments, this means providing inclusion beyond classic captioning, which does not convey the full emotional and contextual depth of spoken content. This work addresses this accessibility gap by exploring the use of AI-generated avatars capable of translating speech into sign language in real time. After defining the multifaceted challenges in this domain, we propose a novel AI-driven task partition to animate avatars for accurate and expressive sign language interpretations in live streaming.
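To give a feel for what a task partition in this setting can look like, here is one plausible decomposition of a live speech-to-sign pipeline, written as Python stubs. This is an assumption for illustration only; the paper defines its own partition, and none of the function names below come from it.

```python
# One plausible decomposition of a live speech-to-sign pipeline (an
# illustrative assumption, not the paper's actual task partition).

def transcribe(audio_chunk: bytes) -> str:
    """Stage 1 (stub): streaming speech recognition, audio -> text."""
    return "hello world"

def translate_to_gloss(text: str) -> list[str]:
    """Stage 2 (stub): spoken-language text -> sign-language glosses."""
    return text.upper().split()

def animate(glosses: list[str]) -> list[list[float]]:
    """Stage 3 (stub): glosses -> a skeletal pose sequence for the avatar."""
    return [[0.0] * 17 for _ in glosses]  # one 17-joint keyframe per gloss

def process_segment(audio_chunk: bytes) -> list[list[float]]:
    """Per-segment path; in a live system the stages run concurrently, so
    recognition of segment n+1 overlaps with animation of segment n."""
    return animate(translate_to_gloss(transcribe(audio_chunk)))

print(len(process_segment(b"...")))  # 2 stub keyframes for 2 stub glosses
```

Splitting the work into such stages is what allows each stage to be assigned to a separate model and pipelined to meet live-streaming latency budgets.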


Elsevier Displays: Unlocking Implicit Motion for Evaluating Image Complexity

Unlocking Implicit Motion for Evaluating Image Complexity

Displays

[PDF]

Yixiao Li (Beihang University, China), Xiaoyuan Yang (Beihang University, China), Yanda Meng (University of Exeter, UK), Hadi Amirpour (AAU, Austria), Jiang Liu (Cardiff University, UK), Yuqing Luo (Cardiff University, UK), Hantao Liu (Cardiff University, UK), and Wei Zhou (Cardiff University, UK)

Abstract: Image complexity (IC) plays a critical role in both cognitive science and multimedia computing, influencing visual aesthetics, emotional responses, and tasks such as image classification and enhancement. However, defining and quantifying IC remains challenging due to its multifaceted nature, which encompasses both objective attributes (e.g., detail, structure) and subjective human perception. While traditional methods rely on entropy-based or multidimensional approaches, and recent advances employ machine learning and shallow neural networks, these techniques often fail to fully capture the subjective aspects of IC. Inspired by the fact that the human visual system inherently perceives implicit motion in static images, we propose a novel approach to address this gap by explicitly incorporating hidden motion into IC assessment. We introduce the motion-inspired image complexity assessment metric (MICM) as a new framework for this purpose. MICM introduces a dual-branch architecture: One branch extracts spatial features from static images, while the other generates short video sequences to analyze latent motion dynamics. To ensure meaningful motion representation, we design a hierarchical loss function that aligns video features with text prompts derived from image-to-text models, refining motion semantics at both local (i.e., frame and word) and global levels. Experiments on three public image complexity assessment (ICA) databases demonstrate that our approach, MICM, significantly outperforms state-of-the-art methods, validating its effectiveness. The code will be publicly available upon acceptance of the paper.
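To make the dual-branch idea concrete, here is a minimal, self-contained PyTorch sketch. All module names and layer sizes are illustrative assumptions, not the authors' MICM implementation; in particular, the image-to-video generation and the hierarchical text-alignment loss from the paper are replaced by placeholders.

```python
# Minimal, illustrative sketch of a dual-branch complexity model in PyTorch.
# All module names and sizes are assumptions for exposition; the actual MICM
# architecture, video generation, and text-alignment loss are in the paper.
import torch
import torch.nn as nn

class SpatialBranch(nn.Module):
    """Extracts spatial features from the static image."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim))
    def forward(self, img):          # img: (B, 3, H, W)
        return self.net(img)         # (B, dim)

class MotionBranch(nn.Module):
    """Summarizes latent motion from a short generated video clip."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 32, 3, stride=(1, 2, 2), padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, dim))
    def forward(self, clip):         # clip: (B, 3, T, H, W)
        return self.net(clip)

class DualBranchICA(nn.Module):
    """Fuses spatial and motion features into one complexity score."""
    def __init__(self, dim=128):
        super().__init__()
        self.spatial, self.motion = SpatialBranch(dim), MotionBranch(dim)
        self.head = nn.Linear(2 * dim, 1)
    def forward(self, img, clip):
        feats = torch.cat([self.spatial(img), self.motion(clip)], dim=-1)
        return self.head(feats).squeeze(-1)   # (B,) complexity scores

# Usage: here the "generated" clip is faked by repeating the image T=8 times;
# in the paper it comes from an image-to-video model.
model = DualBranchICA()
img = torch.randn(2, 3, 224, 224)
clip = img.unsqueeze(2).repeat(1, 1, 8, 1, 1)
print(model(img, clip).shape)        # torch.Size([2])
```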


Up to 4 Predoc Scientist Positions (all genders welcome)

The University of Klagenfurt, with approximately 1,700 employees and over 13,000 students, is located in the Alps-Adriatic region and consistently achieves excellent placements in rankings. The motto “per aspera ad astra” underscores our firm commitment to the pursuit of excellence in all activities in research, teaching, and university management. The principles of equality, diversity, health, sustainability, and compatibility of work and family life serve as the foundation for our work at the university.

The University of Klagenfurt is in the process of establishing a Karl Popper Kolleg (graduate school) entitled “FruitScope: A DroneScope for Smart Agriculture”. The following positions are open for applicants at this school with an anticipated starting date of October 1, 2025:

Up to 4 Predoc Scientist Positions (all genders welcome)

  • Level of employment: 75 % (30 hours per week) each
  • Minimum salary: € 39,005.40 per annum (gross); classification according to collective bargaining agreement: B1
  • Limited to: 3 years
  • Application deadline: August 20, 2025
  • Reference code: 338/25

Tasks and responsibilities:

  • Independent research and scientific qualification within the Karl Popper Kolleg FruitScope with the aim of acquiring the Doctoral Degree in Technical Sciences
  • Peer-reviewed publication of scientific results in journals and at conferences
  • Teamwork and student mentoring
  • Active participation in public relations activities

This graduate school seeks to push the bounds of the state of the art in navigation, coordination, sensing, and communication of multi-agent unmanned aerial vehicles (UAVs). The groups of the involved faculty publish in top international journals and conference proceedings. Successful applicants will be encouraged and supported to publish and present their work in such venues and will have the opportunity to cooperate with our world-renowned international partners in science and industry. We currently cooperate with partners worldwide, mainly in the USA/Canada and Europe. We specifically encourage close and open collaboration with our peers, both internationally and at the University, and support international exchanges with the universities and research institutions affiliated with the graduate school (e.g., ETH Zurich, MIT, CMU, NASA, UofT, U-Mich, UPenn, Georgia Tech). Our young research groups offer a dynamic, friendly, and welcoming atmosphere and thus a collaborative and inspiring work environment with very modern infrastructure (e.g., one of the largest indoor drone halls in Europe), which is continuously updated and upgraded (e.g., soon, with one of the largest outdoor drone test fields in the world).

Prerequisites for the appointment:

  • Completed Master’s or Diploma degree in electrical engineering, information and communication engineering, mechanical engineering, computer science, or related fields. This requirement is subject to an extended deadline: it must be fulfilled no later than two weeks before the starting date; hence, the last possible deadline for meeting this requirement is September 17, 2025.
  • Proven knowledge and experience in at least one of the following areas: mobile robotics, wireless communications or sensing, multimedia communication, signal processing for communications, or machine learning
  • Proven programming skills in at least one of the following languages: Matlab, C/C++, Java, Python, ROS or similar
  • Fluency in English (both written and spoken)

Additional desired qualifications:

  • Good knowledge of cooperative software development (e.g., with Git)
  • First scientific publication (apart from Master’s or Diploma thesis) in the area of mobile robotics, wireless sensing, or multimedia communication technology
  • Relevant international or practical experience
  • Good scientific communication and presentation skills
  • German language skills or willingness to acquire German language skills within the first two years of service
  • Social skills and ability to work independently

Our offer:

The employment contract is concluded for the position as predoc scientist and stipulates a starting salary of € 2,786.10 gross per month (14 times a year; previous experience deemed relevant to the job can be recognized).

The University of Klagenfurt also offers:

  • Personal and professional advanced training courses, management and career coaching, including bespoke training for women in science
  • Numerous attractive additional benefits, see also https://jobs.aau.at/en/the-university-as-employer/
  • Diversity- and family-friendly university culture
  • The opportunity to live and work in the attractive Alps-Adriatic region with a wide range of leisure activities in the spheres of culture, nature and sports

The application:

If you are interested in this position, please apply in English providing the following documents:

  • Letter of application explaining the motivation and including a statement of interest in research (indicating an idea for the research for your own doctoral degree)
  • Curriculum vitae (please do not include a photo)
  • Copies of degree certificates (Bachelor and Master)
  • Copies of official transcripts (Bachelor and Master) containing a list of all courses and grades
  • Master’s thesis. If the thesis is not available, the candidate should provide a draft or an explanation.
  • If an applicant has not received the Master’s degree by the application deadline, the applicant should provide a declaration, written either by a supervisor or by the candidate themselves, on the feasibility of finishing the Master’s degree before September 17, 2025.

To apply, please select the position with the reference code 338/25 in the category “Scientific Staff” using the link “Apply for this position” in the job portal at https://jobs.aau.at/en/.

Candidates must provide proof that they meet the required qualifications by August 20, 2025, at the latest. However, candidates who fulfil the required qualifications but do not yet possess the required Master’s degree can apply, provided they are able to meet this requirement at least two weeks before the starting date. Therefore, the latest possible deadline for meeting this requirement is September 17, 2025.

General information about the university as an employer can be found at https://jobs.aau.at/en/the-university-as-employer/. At the University of Klagenfurt, recruitment and staff matters are accompanied not only by the authority responsible for the recruitment procedure but also by the Equal Opportunities Working Group and, if applicable, by the Representative for Disabled Persons.

For further information on this specific vacancy, please contact:

The University of Klagenfurt aims to increase the proportion of women and therefore specifically invites qualified women to apply for the position. Where the qualification is equivalent, women will be given preferential consideration.

People with disabilities or chronic diseases who fulfil the requirements are particularly encouraged to apply. Travel and accommodation costs incurred during the application process will not be refunded. Under exceptional circumstances, online hearings may be possible. Translations into other languages serve informational purposes only. Solely the version advertised in the University Bulletin (Mitteilungsblatt) shall be legally binding.


Elsevier SPIC: 360-Degree Video Super Resolution and Quality Enhancement Challenge: Methods and Results

Signal Processing: Image Communication

[PDF]

Ahmed Telili (TII, UAE), Wassim Hamidouche (TII, UAE), Brahim Farhat (TII, UAE), Hadi Amirpour (AAU, Austria), Christian Timmerer (AAU, Austria), Ibrahim Khadraoui (TII, UAE), Jiajie Lu (Politecnico di Milano, Italy), The Van Le (IVCL, South Korea), Jeonneung Baek (IVCL, South Korea), Jin Young Lee (IVCL, South Korea), Yiying Wei (AAU, Austria), Xiaopeng Sun (Meituan Inc., China), Yu Gao (Meituan Inc., China), JianCheng Huang (Meituan Inc., China), and Yujie Zhong (Meituan Inc., China)

Abstract: Omnidirectional (360-degree) video is rapidly gaining popularity due to advancements in immersive technologies like virtual reality (VR) and extended reality (XR). However, real-time streaming of such videos, particularly in live mobile scenarios such as unmanned aerial vehicles (UAVs), is hindered by limited bandwidth and strict latency constraints. While traditional methods such as compression and adaptive resolution are helpful, they often compromise video quality and introduce artifacts that diminish the viewer’s experience. Additionally, the unique spherical geometry of 360-degree video, with its wide field of view, presents challenges not encountered in traditional 2D video. To address these challenges, we initiated the 360-degree Video Super Resolution and Quality Enhancement challenge. This competition encourages participants to develop efficient machine learning (ML)-powered solutions to enhance the quality of low-bitrate compressed 360-degree videos, under two tracks focusing on 2× and 4× super-resolution (SR). In this paper, we outline the challenge framework, detailing the two competition tracks and highlighting the SR solutions proposed by the top-performing models. We assess these models within a unified framework, considering (i) quality enhancement, (ii) bitrate gain, and (iii) computational efficiency. Our findings show that lightweight single-frame models can effectively balance visual quality and runtime performance under constrained conditions, setting strong baselines for future research. These insights offer practical guidance for advancing real-time 360-degree video streaming, particularly in bandwidth-limited immersive applications.
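As a rough illustration of two of the evaluation axes, the sketch below measures quality (mean PSNR) and throughput (frames per second) for a candidate SR model; bitrate gain (e.g., BD-rate between rate-distortion curves) is omitted for brevity. The functions and the toy baseline are generic stand-ins, not the challenge's actual scoring code.

```python
# Illustrative per-sequence evaluation of an SR model on two of the axes
# the challenge reports: quality (PSNR here) and runtime. Everything below
# is a generic stand-in for exposition, not the official evaluation.
import time
import numpy as np

def psnr(ref: np.ndarray, test: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two uint8 frames."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

def evaluate(model, lr_frames, hr_frames):
    """Returns mean PSNR and throughput (frames/s) for one sequence."""
    start = time.perf_counter()
    sr_frames = [model(f) for f in lr_frames]
    elapsed = time.perf_counter() - start
    scores = [psnr(hr, sr) for hr, sr in zip(hr_frames, sr_frames)]
    return sum(scores) / len(scores), len(lr_frames) / elapsed

# Toy 2x "model": nearest-neighbour upsampling as a trivial baseline.
upscale = lambda f: f.repeat(2, axis=0).repeat(2, axis=1)
lr = [np.random.randint(0, 256, (90, 160, 3), np.uint8) for _ in range(8)]
hr = [upscale(f) for f in lr]   # reference equals output here, so PSNR is inf
print(evaluate(upscale, lr, hr))
```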


EUVIP’25 Tutorial: From Subjective Ratings to Objective Metrics

EUVIP 2025
October 13-16, 2025

Malta

Link

Tutorial speakers:

  • Wei Zhou (Cardiff University)
  • Hadi Amirpour (University of Klagenfurt)

Tutorial description:

As multimedia services like video streaming, video conferencing, virtual reality (VR), and online gaming continue to evolve, ensuring high perceptual visual quality is crucial for enhancing user experience and maintaining competitiveness. However, multimedia content inevitably undergoes various distortions during acquisition, compression, transmission, and storage, leading to quality degradation. Therefore, perceptual visual quality assessment, which evaluates multimedia quality from a human perception perspective, plays a vital role in optimizing user experience in modern communication systems. This tutorial provides a comprehensive overview of perceptual visual quality assessment, covering both subjective methods, where human observers directly rate their experience, and objective methods, where computational models predict perceptual quality based on measurable factors such as bitrate, frame rate, and compression levels. The session also explores quality assessment metrics tailored to different types of multimedia content, including images, videos, VR, point clouds, meshes, and AI-generated media. Furthermore, we discuss challenges posed by diverse multimedia characteristics, complex distortion scenarios, and varying viewing conditions. By the end of this tutorial, attendees will gain a deep understanding of the principles, methodologies, and latest advancements in perceptual visual quality assessment for multimedia communication.
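The bridge from subjective ratings to objective metrics that the tutorial title refers to is conventionally built as follows: raw observer ratings are aggregated into mean opinion scores (MOS), and an objective metric is then validated by how well its predictions correlate with MOS. Below is a minimal sketch with synthetic data; the "objective metric" is a hypothetical stand-in.

```python
# Minimal sketch of the standard subjective-to-objective link: aggregate raw
# ratings into MOS, then check how well an objective metric tracks MOS via
# Pearson (PLCC) and Spearman (SROCC) correlation. Data is synthetic.
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(0)

# 10 stimuli rated by 15 observers on a 1-5 absolute category rating scale.
ratings = rng.integers(1, 6, size=(10, 15))
mos = ratings.mean(axis=1)                     # mean opinion score per stimulus

# A hypothetical objective metric: correlated with MOS plus noise.
metric = 0.8 * mos + rng.normal(0, 0.3, size=mos.shape)

plcc, _ = pearsonr(metric, mos)    # linearity of the prediction
srocc, _ = spearmanr(metric, mos)  # monotonicity (rank agreement)
print(f"PLCC={plcc:.3f}  SROCC={srocc:.3f}")
```

In practice a nonlinear (e.g., logistic) mapping is often fitted between metric scores and MOS before computing PLCC; it is omitted here for brevity.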


JVCIR Special Issue

Journal of Visual Communication and Image Representation Special Issue on

Multimodal Learning for Visual Intelligence: From Emerging Techniques to Real-World Applications

Link to the Special Issue

In recent years, the integration of vision with complementary modalities such as language, audio, and sensor signals has emerged as a key enabler for intelligent systems that operate in unstructured environments. The emergence of foundation models and cross-modal pretraining has brought a paradigm shift to the field, making it timely to revisit the core challenges and innovative techniques in multimodal visual understanding.

This Special Issue aims to collect cutting-edge research and engineering practices that advance the understanding and development of visual intelligence systems through multimodal learning. The focus is on the deep integration of visual information with complementary modalities such as text, audio, and sensor data, enabling more comprehensive perception and reasoning in real-world environments. We encourage contributions from both academia and industry that address current challenges and propose novel methodologies for multimodal visual understanding.

Topics of interest include, but are not limited to:

  • Multimodal data alignment and fusion strategies with a focus on visual-centric modalities
  • Foundation models for multimodal visual representation learning
  • Generation and reconstruction techniques in visually grounded multimodal scenarios
  • Spatiotemporal modeling and relational reasoning of visual-centric multimodal data
  • Lightweight multimodal visual models for resource-constrained environments
  • Key technologies for visual-language retrieval and dialogue systems
  • Applications of multimodal visual computing in healthcare, transportation, robotics, and surveillance

Guest editors:

Guanghui Yue, PhD
Shenzhen University, Shenzhen, China
Email: yueguanghui@szu.edu.cn

Weide Liu, PhD
Harvard University, Cambridge, Massachusetts, USA
Email: weide001@e.ntu.edu.sg

Ziyang Wang, PhD
The Alan Turing Institute, London, UK
Email: zwang@turing.ac.uk

Hadi Amirpour, PhD
Alpen-Adria University, Klagenfurt, Austria
Email: hadi.amirpour@aau.at

Zhedong Zheng, PhD
University of Macau, Macau, China
Email: zhedongzheng@um.edu.mo

Wei Zhou, PhD
Cardiff University, Cardiff, UK
Email: zhouw26@cardiff.ac.uk

Timeline:

  • Submission Open Date: 30/05/2025
  • Final Manuscript Submission Deadline: 30/11/2025
  • Editorial Acceptance Deadline: 30/05/2026

Keywords:

Multimodal Learning, Visual-Language Models, Cross-Modal Pretraining, Multimodal Fusion and Alignment, Spatiotemporal Reasoning, Lightweight Multimodal Models, Applications in Healthcare and Robotics


ICCV 2025 Workshop: Visual Quality Assessment Competition

Visual Quality Assessment Competition

VQualA

co-located with ICCV 2025

https://vquala.github.io/


Visual quality assessment (VQA) plays a crucial role in computer vision, serving as a fundamental step in tasks such as image quality assessment (IQA), image super-resolution, document image enhancement, and video restoration. Traditional visual quality assessment techniques often rely on scalar metrics like Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM), which, while effective in certain contexts, fall short in capturing the perceptual quality experienced by human observers. This gap emphasizes the need for more perceptually aligned and comprehensive evaluation methods that can adapt to the growing demands of applications such as medical imaging, satellite remote sensing, immersive media, and document processing.

In recent years, advancements in deep learning, generative models, and multimodal large language models (MLLMs) have opened up new avenues for visual quality assessment. These models offer capabilities that extend beyond traditional scalar metrics, enabling more nuanced assessments through natural language explanations, open-ended visual comparisons, and enhanced context awareness. With these innovations, VQA is evolving to better reflect human perceptual judgments, making it a critical enabler for next-generation computer vision applications.
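To make concrete what a "scalar metric" means here, the sketch below computes a simplified, single-window SSIM: a full image comparison collapsed into one number. Standard SSIM averages this statistic over local sliding windows, and neither variant models the perceptual factors the workshop targets; the implementation is illustrative only.

```python
# Simplified, single-window (global) SSIM. Standard SSIM averages this
# statistic over local sliding windows; this variant is for illustration.
import numpy as np

def global_ssim(x: np.ndarray, y: np.ndarray, peak: float = 255.0) -> float:
    x, y = x.astype(np.float64), y.astype(np.float64)
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2   # stabilizing constants
    mx, my = x.mean(), y.mean()                        # luminance terms
    vx, vy = x.var(), y.var()                          # contrast terms
    cov = ((x - mx) * (y - my)).mean()                 # structure term
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

ref = np.random.randint(0, 256, (64, 64), np.uint8)
noisy = np.clip(ref + np.random.normal(0, 10, ref.shape), 0, 255).astype(np.uint8)
print(global_ssim(ref, ref), global_ssim(ref, noisy))  # 1.0, then below 1.0
```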

The VQualA Workshop aims to bring together researchers and practitioners from academia and industry to discuss and explore the latest trends, challenges, and innovations in visual quality assessment. We welcome original research contributions addressing, but not limited to, the following topics:

  • Image and video quality assessment
  • Perceptual quality assessment techniques
  • Multi-modal quality evaluation (image, video, text)
  • Visual quality assessment for immersive media (VR/AR)
  • Document image enhancement and quality analysis
  • Quality assessment under adverse conditions (low light, weather distortions, motion blur)
  • Robust quality metrics for medical and satellite imaging
  • Perceptual-driven image and video super-resolution
  • Visual quality in restoration tasks (denoising, deblurring, upsampling)
  • Human-centric visual quality assessment
  • Learning-based quality assessment models (CNNs, Transformers, MLLMs)
  • Cross-domain visual quality adaptation
  • Benchmarking and datasets for perceptual quality evaluation
  • Integration of large language models for quality explanation and assessment
  • Open-ended comparative assessments with natural language reasoning
  • Emerging applications of VQA in autonomous driving, surveillance, and smart cities