Sparse Global Matching for Video Frame Interpolation with Large Motion

¹Nanjing University, ²SenseTime Research, ³Shanghai AI Lab

Abstract

Large motion poses a critical challenge in Video Frame Interpolation (VFI). Existing methods are often constrained by limited receptive fields, resulting in suboptimal performance in scenarios with large motion. In this paper, we introduce a novel pipeline that effectively integrates global-level information to alleviate the difficulties associated with large motion.

Specifically, we first estimate a pair of initial intermediate flows using a high-resolution feature map to extract local details. Then, we incorporate a sparse global matching branch to compensate for flow estimation, which consists of identifying flaws in the initial flows and generating sparse flow compensation with a global receptive field. Finally, we adaptively merge the initial flow estimation with the global flow compensation, yielding a more accurate intermediate flow.

To evaluate the model's effectiveness in handling large motion, we carefully curate the most challenging subsets from commonly used benchmarks. Our method demonstrates state-of-the-art performance on these demanding large-motion subsets.

1. Difference Map Generation

The difference map helps us locate the flaws in the initially estimated intermediate flow.

Difference Map.
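
As a rough illustration of how such a map could be built, the sketch below warps both input frames to the intermediate time with the initial flows and takes their absolute difference, so that large values mark pixels where the two flows disagree. This is an assumption made for illustration and not necessarily the construction used in the paper; the names `backward_warp` and `difference_map`, the (x, y) flow layout, and the nearest-neighbour sampling are all hypothetical choices.

```python
import numpy as np

def backward_warp(img, flow):
    """Sample img at (x + flow_x, y + flow_y) using nearest-neighbour lookup.

    Assumed layout: flow has shape (H, W, 2) with channel 0 = x, channel 1 = y.
    """
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(ys + flow[..., 1]), 0, h - 1).astype(int)
    return img[src_y, src_x]

def difference_map(frame0, frame1, flow_t0, flow_t1):
    """Absolute difference between the two frames warped to time t.

    Hypothetical helper: a large value suggests that the initial flows
    flow_t0 (t -> 0) and flow_t1 (t -> 1) are flawed at that pixel.
    """
    warped0 = backward_warp(frame0, flow_t0).astype(np.float64)
    warped1 = backward_warp(frame1, flow_t1).astype(np.float64)
    diff = np.abs(warped0 - warped1)
    # Average over colour channels if the frames are not grayscale.
    return diff.mean(axis=-1) if diff.ndim == 3 else diff
```

With correct flows the two warped frames agree and the map is zero; with poor flows the moving object shows up at two different places and the map lights up there.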

2. Sparse Global Matching

We use sparse global matching to introduce global information into our estimated intermediate flows.

Sparse Global Matching.
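
The idea of matching only a sparse set of points against the whole other frame can be sketched as follows. The code picks the `k` locations with the largest difference-map values, correlates their features with every location of the second feature map, and converts the softmax-weighted matching position into a flow vector. It is a minimal sketch under stated assumptions: the top-k selection rule, the dot-product correlation with softmax, and the function name `sparse_global_matching` are illustrative choices, not the paper's confirmed implementation.

```python
import numpy as np

def sparse_global_matching(feat0, feat1, diff_map, k):
    """Estimate frame-0 -> frame-1 flow at the k most suspicious points.

    feat0, feat1: (H, W, C) feature maps (assumed layout).
    diff_map:     (H, W) difference map used to choose the sparse points.
    Returns (points, flow), both of shape (k, 2) in (x, y) order.
    """
    h, w, c = feat0.shape
    # Select the k locations with the largest difference values.
    idx = np.argsort(diff_map.ravel())[::-1][:k]
    ys, xs = np.unravel_index(idx, (h, w))

    queries = feat0[ys, xs]              # (k, C)
    keys = feat1.reshape(h * w, c)       # (H*W, C): global receptive field
    corr = queries @ keys.T / np.sqrt(c)

    # Numerically stable softmax over all candidate positions.
    corr = corr - corr.max(axis=1, keepdims=True)
    prob = np.exp(corr)
    prob /= prob.sum(axis=1, keepdims=True)

    grid_y, grid_x = np.mgrid[0:h, 0:w]
    coords = np.stack([grid_x.ravel(), grid_y.ravel()], axis=1).astype(float)
    matched = prob @ coords              # expected matching position, (k, 2)

    points = np.stack([xs, ys], axis=1).astype(float)
    return points, matched - points
```

Because only `k` query points are correlated against the full feature map, the cost grows with `k * H * W` rather than `(H * W)^2`, which is the practical appeal of keeping the matching sparse.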

3. Flow Shifting

The flow needs to be shifted to the correct coordinate.
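
One way to picture this step: a flow vector matched at a frame-0 location describes a point that, at time t, has already moved part of the way toward frame 1, so the intermediate flow belongs at that moved location. The sketch below assumes linear motion between the two frames and rounds to the nearest pixel; both are assumptions for illustration, and `shift_sparse_flow` is a hypothetical helper rather than the paper's procedure.

```python
import numpy as np

def shift_sparse_flow(points, flow_01, t, height, width):
    """Move sparse frame-0 flows to their time-t coordinates.

    points, flow_01: (k, 2) arrays in (x, y) order; flow_01 maps frame 0 -> 1.
    Under the assumed linear motion, a point x0 sits at x0 + t * f at time t,
    where the intermediate flows are -t * f (t -> 0) and (1 - t) * f (t -> 1).
    Returns dense (H, W, 2) flows plus a boolean mask of filled pixels.
    """
    flow_t0 = np.zeros((height, width, 2))
    flow_t1 = np.zeros((height, width, 2))
    valid = np.zeros((height, width), dtype=bool)

    target = np.rint(points + t * flow_01).astype(int)
    inside = (
        (target[:, 0] >= 0) & (target[:, 0] < width)
        & (target[:, 1] >= 0) & (target[:, 1] < height)
    )
    tx, ty = target[inside, 0], target[inside, 1]

    flow_t0[ty, tx] = -t * flow_01[inside]
    flow_t1[ty, tx] = (1.0 - t) * flow_01[inside]
    valid[ty, tx] = True
    return flow_t0, flow_t1, valid
```

Points whose shifted location falls outside the image are dropped, and the returned mask records which pixels actually received a compensated flow.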

4. Flow Merging

The output of sparse global matching may be incorrect, so we merge it with the initial estimated flow to create a more accurate flow.

Sparse Global Matching.
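
The merging step can be illustrated with a simple per-pixel blend. The abstract describes the merge as adaptive, so the weighting is presumably predicted by the model; the sketch below replaces that with a given `confidence` map in [0, 1], which is an assumption made purely for illustration, as is the helper name `merge_flows`.

```python
import numpy as np

def merge_flows(initial_flow, global_flow, valid, confidence):
    """Blend the initial flow with the sparse global compensation.

    initial_flow, global_flow: (H, W, 2) flows.
    valid:      (H, W) boolean mask of pixels that received a global flow.
    confidence: (H, W) weights in [0, 1] (assumed given, not learned here).
    Pixels without a global flow keep the initial estimate unchanged.
    """
    weight = np.where(valid, confidence, 0.0)[..., None]
    return (1.0 - weight) * initial_flow + weight * global_flow
```

A low confidence lets the initial estimate dominate, which guards against incorrect global matches, while a high confidence lets the global compensation override a flawed initial flow.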

Visualization

Cropped View

Teaser image. The blue frames emphasize large motion, while the green frames highlight the effect on local details.

Full Resolution View