HEIR: Learning Graph-Based Motion Hierarchies

We propose a general hierarchical motion modeling method that learns structured, interpretable motion relationships directly from data. Our method represents observed motions using graph-based hierarchies, explicitly decomposing global absolute motions into parent-inherited patterns and local motion residuals. We formulate hierarchy inference as a differentiable graph learning problem, where vertices represent elemental motions and directed edges capture learned parent-child dependencies through graph neural networks. We evaluate our hierarchical reconstruction approach on three examples: 1D translational motion, 2D rotational motion, and dynamic 3D scene deformation via Gaussian splatting. Experimental results show that our method reconstructs the intrinsic motion hierarchy in 1D and 2D cases, and produces more realistic and interpretable deformations compared to the baseline on dynamic 3D Gaussian splatting scenes.

Cheng Zheng*, William Koch*, Baiang Li, Felix Heide

NeurIPS 2025

Translational Motion Hierarchy – Toy Example

We evaluate the proposed hierarchical learning method on a 1D motion trajectory in which the nodes move hierarchically (see the ground-truth motion hierarchy in the bottom-left inset), with each node adding its own unknown motion to the motion it inherits. Top left to bottom right: (1) raw node positions $X_t$ of the hierarchical trajectories over time, (2) absolute node velocities $\Delta_t$, (3) the hierarchy reconstructed from the inferred relationships, with the ground-truth hierarchy in the inset, and (4) relative velocities $\delta_t$ of each node with respect to its parent, given the reconstructed hierarchy in (3). The method correctly identifies all motions (bottom left), including the two core motions that pass through the orange and green nodes. The reconstruction succeeds even in the presence of noise, as can be seen in the video.
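The relationship between the absolute velocities $\Delta_t$ and the relative velocities $\delta_t$ can be illustrated with a minimal NumPy sketch. It assumes the hierarchy is already known and given as a `parent` array (with -1 marking the root), and it uses a made-up four-node example; the function names, the trajectories, and the simple subtraction rule are illustrative assumptions, not the learned graph model itself.

```python
import numpy as np

def absolute_velocities(X):
    """Finite-difference velocities Delta_t = X_{t+1} - X_t for X of shape (T, N)."""
    return np.diff(X, axis=0)

def relative_velocities(Delta, parent):
    """Subtract each node's parent velocity; roots (parent == -1) keep their absolute velocity."""
    parent = np.asarray(parent)
    delta = Delta.copy()
    has_parent = parent >= 0
    delta[:, has_parent] = Delta[:, has_parent] - Delta[:, parent[has_parent]]
    return delta

# Hypothetical 4-node hierarchy: node 0 is the root, nodes 1 and 2 follow node 0,
# and node 3 follows node 1. Parents are listed before their children.
parent = [-1, 0, 0, 1]
T = 200
t = np.arange(T)

# Each node's own (local) 1D velocity, shape (T, N). These are assumed signals.
own = np.stack(
    [0.05 * np.ones(T), 0.3 * np.sin(0.1 * t), 0.2 * np.cos(0.07 * t), 0.1 * np.sin(0.3 * t)],
    axis=1,
)

# Absolute velocity = own velocity + absolute velocity of the parent.
Delta_true = own.copy()
for i, p in enumerate(parent):
    if p >= 0:
        Delta_true[:, i] += Delta_true[:, p]

# Integrate to positions X_t, then recover velocities as in panels (2) and (4).
X = np.vstack([np.zeros((1, 4)), np.cumsum(Delta_true, axis=0)])
Delta = absolute_velocities(X)
delta = relative_velocities(Delta, parent)

# With the correct hierarchy, the relative velocities match each node's own motion.
print(np.allclose(delta, own))
```

With the correct parents, the residual of every node reduces to its own local motion, which is the property that makes the relative velocities in panel (4) easier to interpret than the absolute ones in panel (2).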

Rotational Motion Hierarchy – Synthetic Planetary Orbit

We evaluate our method on a synthetic planetary system to validate the rotational extension of our approach. The dataset consists of 11 nodes representing a star (root), planets, and moons organized in a hierarchical structure. The visualization shows the training progression: (left) the angular and radial velocity components of each node relative to its potential parent over time; (center) the learned edge-weight matrix, where green-bordered entries indicate correctly reconstructed parent-child relationships; (right) the final reconstructed hierarchy overlaid on the node positions in 2D space. Our method successfully captures the correct rotational dependencies, with “moons” properly inheriting motion from their respective “planets” – even with noise present.
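To give some intuition for the angular and radial velocity components, the following sketch computes them for a node in the frame of a candidate parent. The three-body system, its orbit radii and rates, and the helper `polar_velocities` are assumptions made for illustration; they are not the 11-node dataset or the exact formulation used in our method.

```python
import numpy as np

def polar_velocities(child, parent):
    """Per-step radial and angular change of `child` in the frame centered on `parent`.

    Both inputs have shape (T, 2); both outputs have shape (T - 1,).
    """
    rel = child - parent
    r = np.linalg.norm(rel, axis=1)
    # Unwrap the angle so that crossing +/- pi does not create artificial jumps.
    theta = np.unwrap(np.arctan2(rel[:, 1], rel[:, 0]))
    return np.diff(r), np.diff(theta)

# Hypothetical circular orbits: a planet around a star, and a moon around the planet.
t = np.linspace(0.0, 20.0, 2001)  # time step of 0.01
star = np.zeros((t.size, 2))
planet = star + 5.0 * np.stack([np.cos(0.5 * t), np.sin(0.5 * t)], axis=1)
moon = planet + 1.0 * np.stack([np.cos(3.0 * t), np.sin(3.0 * t)], axis=1)

# Moon relative to its true parent (planet) versus relative to the star.
dr_planet, dtheta_planet = polar_velocities(moon, planet)
dr_star, dtheta_star = polar_velocities(moon, star)

print("moon w.r.t. planet: std(dr) =", np.std(dr_planet), "std(dtheta) =", np.std(dtheta_planet))
print("moon w.r.t. star:   std(dr) =", np.std(dr_star), "std(dtheta) =", np.std(dtheta_star))
```

In this toy setup, the moon's radial velocity is essentially zero and its angular velocity is constant when measured against the planet, whereas both fluctuate when measured against the star. A simple relative motion of this kind is the sort of signal that makes the correct parent stand out.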

Motion Hierarchy Gaussian Splatting Deformation


We apply our hierarchical motion learning to dynamic scenes from the D-NeRF dataset – with thousands of motion elements (Gaussian Splats) – to showcase scene editing capabilities. We compare against SC-GS across four scenes (Excavator, Hook, Jumpingjacks, Warrior) with two different user-specified deformations each. Our method produces more realistic and physically plausible deformations: it maintains structural rigidity in the excavator’s geometry and preserves natural body postures and limb alignment in human figures – limiting unwanted distortions. Quantitative evaluation (PSNR, SSIM, CLIP-I, LPIPS) validates improved perceptual quality and structural fidelity in scene deformation tasks.
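One way to picture how a hierarchy supports such edits is to let a user-specified offset on one motion element carry over to all of its descendants while the rest of the scene stays fixed. The sketch below does this for a tiny hand-written hierarchy; the `parent` array, the helper names, and the pure-translation edit are hypothetical simplifications, and the editing procedure used for the Gaussian splatting results may differ.

```python
import numpy as np

def descendants_mask(parent, node):
    """Boolean mask selecting `node` and every node below it in the hierarchy."""
    parent = np.asarray(parent)
    has_parent = parent >= 0
    safe_parent = np.where(has_parent, parent, 0)  # placeholder index for roots
    mask = np.zeros(parent.size, dtype=bool)
    mask[node] = True
    while True:
        # A node joins the selection once its parent is selected.
        new_mask = mask | (has_parent & mask[safe_parent])
        if np.array_equal(new_mask, mask):
            return mask
        mask = new_mask

def apply_edit(positions, parent, node, offset):
    """Translate `node` and all of its descendants by `offset`; other nodes are untouched."""
    edited = positions.copy()
    edited[descendants_mask(parent, node)] += np.asarray(offset, dtype=positions.dtype)
    return edited

# Hypothetical hierarchy: 0 is the root; 1 and 4 follow 0; 2 and 3 follow 1.
parent = [-1, 0, 1, 1, 0]
positions = np.zeros((5, 3))

edited = apply_edit(positions, parent, node=1, offset=[0.0, 1.0, 0.0])
print(edited[:, 1])  # y-coordinates after the edit
```

Here only node 1 and its children (nodes 2 and 3) move, which mirrors the idea that an edit to one part should not distort unrelated parts of the scene.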

Evaluation of Learned GS Hierarchy on D-NeRF dataset

Our method achieves more realistic and coherent deformations, effectively preserving meaningful structural relationships and maintaining scene integrity. Specifically, in the Excavator example, SC-GS introduces unnatural bending and distortion at the shovel-body connections and on the ground, while our method adjusts only the shovel’s position and thereby preserves the excavator’s rigid geometry. For the Hook and Warrior examples, SC-GS produces exaggerated body distortions, whereas our method maintains a plausible body posture and natural limb alignment. Similarly, in the Jumpingjacks example, SC-GS generates physically unrealistic leg and arm deformations, whereas our method produces smooth, physically plausible limb movements consistent with human body and motion constraints.

Related Publications

[1] Yi-Hua Huang, Yang-Tian Sun, Ziyi Yang, Xiaoyang Lyu, Yan-Pei Cao, Xiaojuan Qi, SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes, CVPR, 2024.
[2] Yiming Liang, Tianhan Xu, Yuta Kikuchi, HiMoR: Monocular Deformable Gaussian Reconstruction with Hierarchical Motion Representation, CVPR, 2025.