Explainable Heuristic Multi-Object Tracking

Overview

Motivation

Multi-Object Tracking (MOT) comprises a wide range of methods that often differ only in subtle design choices. A large fraction of existing approaches fall into the category of heuristic Tracking-by-Detection (TbD). These methods associate object detections across frames using handcrafted similarity functions, followed by a matching procedure. While the general TbD pipeline is largely shared, individual methods mainly differ in (i) the choice of features used for similarity computation (e.g. geometric, motion, or appearance-based features), (ii) how these features are combined, and (iii) how their relative importance is weighted. In practice, these design choices are typically fixed manually, and feature weights are tuned via grid search.
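As an illustration of the pipeline described above, the following is a minimal sketch of one association step, not the method of any particular tracker. It assumes two features (bounding-box IoU and cosine similarity of appearance embeddings), a fixed weighted sum as the combination rule, and Hungarian matching via SciPy. The function names, the weights w_iou and w_app, and the threshold min_sim are hypothetical choices made for this example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine_similarity(u, v):
    """Cosine similarity of two appearance embeddings."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom > 0 else 0.0


def associate(track_boxes, track_embs, det_boxes, det_embs,
              w_iou=0.5, w_app=0.5, min_sim=0.3):
    """Match tracks to detections using a hand-weighted similarity.

    The weights and the threshold are illustrative values of the kind
    that are usually tuned by grid search.
    """
    sim = np.zeros((len(track_boxes), len(det_boxes)))
    for i, (t_box, t_emb) in enumerate(zip(track_boxes, track_embs)):
        for j, (d_box, d_emb) in enumerate(zip(det_boxes, det_embs)):
            sim[i, j] = (w_iou * iou(t_box, d_box)
                         + w_app * cosine_similarity(t_emb, d_emb))
    # Hungarian matching on the similarity matrix (maximize total similarity).
    rows, cols = linear_sum_assignment(sim, maximize=True)
    # Discard matched pairs whose similarity falls below the threshold.
    return [(int(r), int(c)) for r, c in zip(rows, cols) if sim[r, c] >= min_sim]
```

In this sketch, the three design choices listed above correspond to the feature functions, the weighted sum, and the values of w_iou and w_app.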

Project

This project aims to systematically analyze and learn the composition of similarity functions used in heuristic MOT. The central idea is to train a lightweight, interpretable model that selects relevant features from a predefined set and learns their relative weights directly from data. In addition to static features, the project also considers temporal aspects of tracking. Many modern features, such as appearance embeddings, implicitly rely on temporal aggregation (e.g. exponential moving averages over past frames). The proposed approach seeks to learn not only which features are important, but also over which temporal horizon they should be integrated.
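To make the notion of a temporal horizon concrete, here is a small sketch of an exponential moving average (EMA) over appearance embeddings. It is an illustration under stated assumptions rather than the project's implementation: the function names and the smoothing factor alpha are hypothetical, and the re-normalization step is one common convention, not a requirement.

```python
import numpy as np


def ema_update(track_emb, det_emb, alpha=0.9):
    """Blend the stored track embedding with the newly matched detection.

    alpha is the weight on the past: values close to 1 forget slowly.
    """
    updated = alpha * track_emb + (1.0 - alpha) * det_emb
    norm = np.linalg.norm(updated)
    # Re-normalize so that cosine similarities stay comparable over time.
    return updated / norm if norm > 0 else updated


def effective_horizon(alpha):
    """Rule-of-thumb number of past frames that an EMA averages over.

    An observation that is k frames old has weight proportional to
    alpha**k, and these weights sum to 1 / (1 - alpha).
    """
    return 1.0 / (1.0 - alpha)
```

Under this rule of thumb, alpha = 0.9 corresponds to a horizon of roughly 10 frames. Learning the temporal horizon could then amount to treating alpha as a trainable parameter instead of a fixed constant.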

Furthermore, this work provides a unifying framework to analyze and compare existing heuristic MOT approaches in a fair and controlled manner. Current comparisons are often confounded by differences in ReID models, post-processing steps, or auxiliary heuristics, making it difficult to isolate the actual contributions of individual innovations.

Thesis/Internship

Within the context of this project, several research questions can be explored. The focus of the thesis/internship can be adapted to your interests and ideas.

  • Which features are most informative for different MOT datasets?
  • Are certain features consistently superior to others (e.g. HMIoU vs. IoU)?
  • To what extent is temporal information necessary for robust tracking?
  • Can a learned similarity function outperform established handcrafted heuristics?
  • Is it possible to construct novel, more effective features by combining primitive ones in a learned manner?

Prerequisites

The following skills are required:

  • Familiarity with Python (experience with PyTorch, NumPy, and SciPy is helpful)
  • Basic knowledge of machine learning and deep learning

Contact

To apply, please email Jan Frederik Meier, stating your interest in this project and detailing your relevant skills. Part of this project could also be carried out as a lab rotation.

Neural Data Science Group
Institute of Computer Science
University of Goettingen