Key result

Released a synthetic dataset of 365,383 labeled handball and basketball frames covering seven standardized game situations.

Why it matters

  • Public sports datasets with consistent tactical labels remain limited, especially outside top-tier football.
  • Model quality depends on standardized, scalable annotation pipelines.

Approach

  • Combined expert clip-level labeling with automated frame-level label propagation.
  • Produced a standardized set of seven game-situation classes shared by handball and basketball.
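
The two-stage labeling idea above can be illustrated with a minimal sketch. It assumes the simplest possible propagation rule, in which every frame inherits the label of its parent clip. The source states only that frame labels were produced with computer vision and machine learning techniques, so this is a baseline for intuition and not the authors' pipeline. The names Clip, FrameLabel, and propagate_labels are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Clip:
    """An expert-labeled video clip (hypothetical structure)."""
    clip_id: str
    sport: str        # "handball" or "basketball"
    label: str        # one of the seven game-situation classes
    num_frames: int   # number of frames extracted from the clip


@dataclass(frozen=True)
class FrameLabel:
    """A single frame with the label it inherited from its clip."""
    clip_id: str
    frame_index: int
    sport: str
    label: str


def propagate_labels(clips: List[Clip]) -> List[FrameLabel]:
    """Assign each frame the label of its parent clip.

    Assumption: naive inheritance only. A real pipeline would likely
    verify or refine labels per frame with a vision model.
    """
    return [
        FrameLabel(clip.clip_id, index, clip.sport, clip.label)
        for clip in clips
        for index in range(clip.num_frames)
    ]


# Illustrative input only; these clips are not taken from the dataset.
example_clips = [
    Clip("h001", "handball", "left attack", 3),
    Clip("b001", "basketball", "timeout", 2),
]
example_frames = propagate_labels(example_clips)
```

Keeping the clip identifier on every frame record makes it possible to split training and evaluation data by clip, which avoids placing near-identical frames from one clip on both sides of the split.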

Results

  • Delivered a reusable data resource for training and evaluating sports situation classifiers.
  • Enabled stronger reproducibility for future action-spotting and tactical-analysis studies.

This paper presents a synthetic dataset of labeled game situations in recordings of federated handball and basketball matches played in Galicia, Spain. The dataset consists of synthetic data generated from real video frames: 308,805 labeled handball frames and 56,578 labeled basketball frames, extracted from 2,105 handball and 383 basketball video clips of 5 s each. Experts in each sport manually labeled the video clips, while the individual frames were labeled automatically using computer vision and machine learning techniques. The dataset encompasses seven classes of game situations: left attack, left counterattack, left penalty, right attack, right counterattack, right penalty, and timeout. In basketball, the penalty class refers to free throws attempted by a player who has been fouled by an opponent.
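
As a quick consistency check on the figures above, the following sketch encodes the seven classes and the reported counts, then derives the total and the average number of frames per clip. The class names and counts come from the text. The identifiers GameSituation, FRAME_COUNTS, and CLIP_COUNTS are hypothetical and do not reflect the dataset's actual file format.

```python
from enum import Enum


class GameSituation(Enum):
    """The seven game-situation classes named in the text."""
    LEFT_ATTACK = "left attack"
    LEFT_COUNTERATTACK = "left counterattack"
    LEFT_PENALTY = "left penalty"
    RIGHT_ATTACK = "right attack"
    RIGHT_COUNTERATTACK = "right counterattack"
    RIGHT_PENALTY = "right penalty"
    TIMEOUT = "timeout"


# Counts as reported in the text.
FRAME_COUNTS = {"handball": 308_805, "basketball": 56_578}
CLIP_COUNTS = {"handball": 2_105, "basketball": 383}

# Total labeled frames across both sports.
total_frames = sum(FRAME_COUNTS.values())

# Average frames per clip, per sport (a derived figure, not stated in the text).
frames_per_clip = {
    sport: FRAME_COUNTS[sport] / CLIP_COUNTS[sport] for sport in FRAME_COUNTS
}
```

The total matches the 365,383 frames reported, and both sports average roughly 147 frames per clip. Handball accounts for about 85% of the frames, so a classifier trained on the pooled data may need per-sport evaluation to avoid masking weaker basketball performance.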