Where Is The Ball: 3D Ball Trajectory Estimation From 2D Monocular Tracking

VISTEC, Thailand

11th CVsports @ CVPR 2025

Paper | Code (coming soon) | Simulation (coming soon) | Dataset (coming soon) | Visualizer

* Simulation was created using the Unity engine. Unity is not affiliated with or endorsing this project.

Abstract

We present a method for 3D ball trajectory estimation from a 2D tracking sequence. To overcome the ambiguity in 3D from 2D estimation, we design an LSTM-based pipeline that utilizes a novel canonical 3D representation that is independent of the camera's location to handle arbitrary views and a series of intermediate representations that encourage crucial invariance and reprojection consistency. We evaluated our method on four synthetic and three real datasets and conducted extensive ablation studies on our design choices. Despite training solely on simulated data, our method achieves state-of-the-art performance and can generalize to real-world scenarios with multiple trajectories, opening up a range of applications in sport analysis and virtual replay.


Key distinctions

  1. Canonical plane-point representation enables multi-view training and inference within a single network.
  2. Trained only on simulated data, yet generalizes to the real world, with no real 3D or real multi-view supervision!
  3. Enforces reprojection consistency by predicting height rather than regressing full 3D coordinates.
  4. Relative-absolute input encoding is designed to improve generalization to spatial shifts and achieve location equivariance.
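
The page does not spell out the relative-absolute encoding, so the following is only a minimal sketch under one assumed reading: the first sample is kept as an absolute position and every later sample is stored as a frame-to-frame difference. The function name `relative_absolute_encode` and the toy coordinates are hypothetical. The sketch shows why difference features are unaffected by a spatial shift of the whole trajectory.

```python
def relative_absolute_encode(points):
    """Assumed encoding (not taken from the paper): keep the first point
    as an absolute position and express every later point as the
    difference from its predecessor."""
    first = tuple(points[0])
    deltas = [
        tuple(b - a for a, b in zip(prev, cur))
        for prev, cur in zip(points, points[1:])
    ]
    return [first] + deltas


def shift(points, offset):
    """Translate every point of a trajectory by the same offset."""
    return [tuple(p + o for p, o in zip(pt, offset)) for pt in points]


# Toy 2D trajectory (hypothetical values) and a shifted copy of it.
track = [(0, 0), (2, 1), (5, 3)]
moved = shift(track, (10, -4))

encoded = relative_absolute_encode(track)
encoded_moved = relative_absolute_encode(moved)

# Only the absolute first entry changes; the relative part is identical.
same_relative_part = encoded[1:] == encoded_moved[1:]
```

A sequence model fed the relative part sees the same input wherever the trajectory sits in space, which is one way to encourage the location equivariance mentioned in item 4.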


Our pipeline

Our goal is to recover the 3D trajectory of a ball from a sequence of 2D tracked positions. One naive approach is to directly regress 3D coordinates from the 2D tracking pixels. However, this method does not ensure reprojection consistency with the original 2D inputs. Additionally, using 2D tracking pixels as input implicitly ties the model to camera parameters (e.g., focal length, position, and orientation), which limits generalization across different viewpoints. To overcome these limitations, we transform each 2D input into a plane-point representation that removes dependency on the camera setup, enabling training on multiple cameras and inference within a single network. Rather than predicting full 3D coordinates, we estimate the ball's height over time to maintain reprojection consistency with the original 2D observations. We also introduce a relative-absolute input encoding, which improves generalization to spatial shifts and helps the model achieve location equivariance.
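
The geometric reason height prediction preserves reprojection consistency can be illustrated with a small sketch. It is not the authors' implementation: it assumes a toy pinhole camera whose axes are aligned with the world (y up, looking along +z, pixel v increasing upward), and all constants and function names are hypothetical. Given a pixel and a height, the 3D point is chosen on that pixel's camera ray, so it projects back to the same pixel whatever height is predicted.

```python
import math

# Hypothetical toy camera: pinhole, axes aligned with the world frame.
F = 1000.0                 # focal length in pixels (assumed)
CX, CY = 640.0, 360.0      # principal point in pixels (assumed)
CAM = (0.0, 10.0, -20.0)   # camera center in world coordinates (assumed)


def backproject(u, v):
    """Direction of the camera ray through pixel (u, v)."""
    return ((u - CX) / F, (v - CY) / F, 1.0)


def lift_to_height(u, v, h):
    """Point on the ray through (u, v) whose world height (y) equals h."""
    d = backproject(u, v)
    if abs(d[1]) < 1e-9:
        raise ValueError("ray is horizontal; height does not determine depth")
    t = (h - CAM[1]) / d[1]
    if t <= 0:
        raise ValueError("requested height is not reachable in front of the camera")
    return tuple(c + t * di for c, di in zip(CAM, d))


def project(point):
    """Pinhole projection of a world point back to pixel coordinates."""
    x, y, z = (p - c for p, c in zip(point, CAM))
    return (F * x / z + CX, F * y / z + CY)


def reprojection_error(u, v, h):
    """Pixel distance between (u, v) and the reprojection of the lifted point."""
    ru, rv = project(lift_to_height(u, v, h))
    return math.hypot(ru - u, rv - v)


# The same pixel lifted to two different heights: both land on the same ray.
on_ground = lift_to_height(700.0, 200.0, 0.0)
in_air = lift_to_height(700.0, 200.0, 1.5)
```

Here `on_ground` is the intersection of the pixel's ray with the ground plane y = 0, and `in_air` is the point on the same ray at height 1.5. A network that regresses x, y, z directly has no such constraint and can drift off the ray.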

Our pipeline consists of three main LSTM-based components: 1) the End-of-Trajectory (EoT) Network, which predicts whether the ball is ending its current motion or changing direction (e.g., after a player's hit); 2) the Height Network, which estimates the ball's height over time and is later used to reconstruct the full 3D trajectory; and 3) the Refinement Network, which further adjusts the predicted 3D coordinates for improved accuracy and smoothness.

Pipeline diagram



Results

Interactive 3D Visualizer

Explore the results of our method in an interactive visualizer. You can view the 3D trajectories and ground truth data for each test scenario.

Note: Best viewed in a desktop browser.



Example videos

We present example video results across four test scenarios, covering both real-world and synthetic settings: tennis (TrackNet), MoCap, football (IPL), and a single-launch test set. These results can also be explored using the interactive visualizer.
[Video panels: Ours, Ground Truth, Prior Work]

Acknowledgements

We thank Dr. Konstantinos Rematas for his valuable feedback, guidance, and assistance with revisions and figures. His work, Soccer On Your Tabletop, and earlier explorations greatly inspired us and helped shape our approach.

Citation

(Note: This work has been accepted to CVsports 2025 and will appear in the proceedings. In the meantime, please cite this page.)

@misc{ponglertnapakorn2025whereistheball,
  title={Where Is The Ball: 3D Ball Trajectory Estimation From 2D Monocular Tracking},
  author={Ponglertnapakorn, Puntawat and Suwajanakorn, Supasorn},
  howpublished={\url{https://where-is-the-ball.github.io/}},
  year={2025}
}