Exploring visually-guided acoustic highlighting to transform audio experiences
VisAH is a novel approach that transforms audio to deliver appropriate highlighting effects guided by the accompanying video. This gallery showcases examples of VisAH in action, comparing our results with other methods and demonstrating applications.
Examples from our Muddy Mixed Dataset, showcasing the poorly mixed input videos, LCE highlighting results, VisAH model outputs, and the original movie clips for comparison.
In this example, the speech is not highlighted properly in the input, and our model resolves this issue.
In this video, our model highlights the sound effect properly.
In this video, our model highlights the speech properly.
Our VisAH model can refine video-to-audio generation by rebalancing audio sources in alignment with the video, resulting in improved audio-visual coherence.
Note: Videos are sourced from the MovieGen website. All videos are adjusted to the same loudness level.
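As a reference for how this kind of loudness matching can be done, the sketch below measures ITU-R BS.1770 integrated loudness and rescales a clip to a common target. The choice of the pyloudnorm and soundfile Python packages, and the -23 LUFS target, are illustrative assumptions rather than the exact pipeline used for these videos.

```python
import soundfile as sf
import pyloudnorm as pyln

def match_loudness(in_path: str, out_path: str, target_lufs: float = -23.0) -> None:
    """Measure integrated loudness (ITU-R BS.1770) and rescale to a target level."""
    audio, rate = sf.read(in_path)
    meter = pyln.Meter(rate)                      # BS.1770 loudness meter
    loudness = meter.integrated_loudness(audio)   # current loudness in LUFS
    leveled = pyln.normalize.loudness(audio, loudness, target_lufs)
    sf.write(out_path, leveled, rate)

# e.g. match_loudness("clip_raw.wav", "clip_leveled.wav")
```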
The generated videos are from OpenAI Sora, and the corresponding audio is generated by Seeing-and-Hearing.
Our VisAH model can also be applied to real-world videos, where audio is often recorded with suboptimal quality and may require rebalancing.
The videos are sourced from the AudioCaps dataset.
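To make the idea of rebalancing audio sources (as mentioned for the video-to-audio and real-world examples above) concrete, here is a minimal, purely illustrative sketch of remixing separated stems with per-stem gains. This is not the VisAH model itself; the stem names and gain values are hypothetical placeholders.

```python
import numpy as np

def rebalance(stems: dict[str, np.ndarray], gains: dict[str, float]) -> np.ndarray:
    """Remix separated audio stems with per-stem gains, then guard against clipping."""
    mix = sum(audio * gains.get(name, 1.0) for name, audio in stems.items())
    peak = np.max(np.abs(mix))
    return mix / peak if peak > 1.0 else mix

# Hypothetical usage: boost dialogue, pull back music, leave effects untouched.
# remixed = rebalance(
#     {"speech": speech_wav, "music": music_wav, "effects": fx_wav},
#     {"speech": 1.5, "music": 0.5, "effects": 1.0},
# )
```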