Vienna 4G/5G Drive-Test Dataset: A New Open Benchmark for Mobile Network AI
Researchers have released the Vienna 4G/5G Drive-Test Dataset, an open, city-scale resource for AI-driven mobile network analysis. It provides synchronized LTE and 5G NR measurements collected across diverse urban and suburban environments in Vienna, Austria. By combining passive network-side scanner data with active user-side handset logs, it offers a real-world view of deployed radio access networks and addresses a shortage of large, annotated datasets that has long constrained machine learning progress in telecoms.
Bridging the Network and User Experience Divide
The dataset's main value lies in its fusion of two complementary perspectives. The passive wideband scanner component provides a network-centric view of radio frequency conditions, while the active handset logs capture the actual user experience, including key performance indicators such as signal strength and throughput. All measurements are georeferenced and time-aligned, so models that predict coverage, capacity, or quality of service can be evaluated consistently and reproducibly.
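To make the idea of time alignment concrete, the sketch below pairs each handset record with the nearest-in-time scanner record within a tolerance. It is a minimal illustration, not the dataset's own tooling: the field names (`t`, `rsrp_dbm`, `throughput_mbps`), the values, and the one-second tolerance are all assumptions made for this example.

```python
from bisect import bisect_left


def align_nearest(scanner, handset, tolerance_s=1.0):
    """Pair each handset record with the nearest-in-time scanner record.

    Both inputs are lists of dicts with a 't' key (seconds), sorted by 't'.
    Field names are hypothetical and do not reflect the dataset's schema.
    """
    times = [row["t"] for row in scanner]
    pairs = []
    for h in handset:
        i = bisect_left(times, h["t"])
        # Candidates: the scanner samples just before and just after h["t"].
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(times[j] - h["t"]))
        # Keep the pair only if the time gap is within the tolerance.
        if abs(times[best] - h["t"]) <= tolerance_s:
            pairs.append((h, scanner[best]))
    return pairs


# Made-up sample records for illustration only.
scanner = [
    {"t": 0.0, "rsrp_dbm": -95.0},
    {"t": 1.0, "rsrp_dbm": -97.5},
    {"t": 2.0, "rsrp_dbm": -101.0},
]
handset = [
    {"t": 0.9, "throughput_mbps": 42.0},
    {"t": 5.0, "throughput_mbps": 10.0},  # no scanner sample within 1 s
]
pairs = align_nearest(scanner, handset)
```

Here the handset sample at 0.9 s is matched to the scanner sample at 1.0 s, while the sample at 5.0 s is dropped because no scanner record lies within the tolerance.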
To further enrich analysis, the release includes inferred deployment descriptors for a representative subset of base stations (BSs). These descriptors provide estimated BS locations, sector azimuths, and antenna heights. Such metadata is often unavailable in public datasets, yet it is essential for accurate radio propagation modeling and for network planning algorithms.
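As an example of how such descriptors might feed a model, the following sketch computes the bearing from an estimated BS location to a measurement point and the angular offset from the sector azimuth, a common input feature for propagation models. The function names and coordinates are illustrative assumptions and are not taken from the dataset.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2.

    Returned in degrees, clockwise from north, in [0, 360).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def off_boresight_deg(azimuth_deg, bearing):
    """Smallest absolute angle between sector azimuth and bearing, in [0, 180]."""
    diff = (bearing - azimuth_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


# Illustrative coordinates: a hypothetical BS and a point due north of it.
bs_lat, bs_lon = 48.2082, 16.3738
ue_lat, ue_lon = 48.2182, 16.3738

b = bearing_deg(bs_lat, bs_lon, ue_lat, ue_lon)
offset = off_boresight_deg(350.0, b)  # hypothetical sector azimuth of 350 degrees
```

With the point due north of the BS, the bearing is 0 degrees, so a sector pointing at 350 degrees sees the point 10 degrees off boresight.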
Enabling Geometry-Aware AI and Model Calibration
The dataset also includes high-resolution 3D building and terrain models of Vienna. This enables geometry-conditioned learning, in which AI models learn how physical obstructions such as buildings affect signal propagation. These models also support the calibration of deterministic approaches such as ray-tracing simulations, whose predictions of radio wave behavior depend on accurate environmental data and need real measurements for validation.
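Calibrating a ray tracer is beyond the scope of a short example, but the underlying idea of fitting model parameters to measurements can be shown with a much simpler stand-in: a least-squares fit of the log-distance path loss model PL(d) = PL0 + 10 n log10(d / d0). This is a swapped-in empirical model, not the ray-tracing workflow the paper refers to, and the sample values below are synthetic.

```python
import math


def fit_log_distance(distances_m, path_loss_db, d0_m=1.0):
    """Least-squares fit of PL(d) = pl0 + 10 * n * log10(d / d0).

    Returns (pl0, n): the path loss at the reference distance d0 in dB
    and the path loss exponent.
    """
    xs = [10.0 * math.log10(d / d0_m) for d in distances_m]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(path_loss_db) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, path_loss_db))
    n = sxy / sxx            # slope: path loss exponent
    pl0 = mean_y - n * mean_x  # intercept: loss at the reference distance
    return pl0, n


# Synthetic samples generated from pl0 = 40 dB and n = 3 (illustration only).
distances = [10.0, 100.0, 1000.0]
losses = [70.0, 100.0, 130.0]
pl0, n = fit_log_distance(distances, losses)
```

Because the synthetic samples lie exactly on the model, the fit recovers an exponent of about 3 and a reference loss of about 40 dB; real drive-test data would scatter around the fitted line.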
The data is organized into four core components: scanner measurements, handset logs, estimated cell information, and the city model. The accompanying documentation describes the available fields and the intended joins between tables, which lowers the barrier to entry for researchers and engineers and makes the data easier to reuse across different analytical workflows.
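A join between tables could look like the sketch below, which attaches estimated cell information to measurement records through a shared key. The key name `cell_id` and all field names are hypothetical; the actual join keys are those defined in the dataset's documentation.

```python
def join_on_cell(measurements, cells, key="cell_id"):
    """Inner-join measurement records with cell descriptors on a shared key.

    Measurements whose cell has no descriptor are dropped, since the
    descriptors cover only a subset of base stations.
    """
    cell_index = {c[key]: c for c in cells}
    joined = []
    for m in measurements:
        cell = cell_index.get(m[key])
        if cell is None:
            continue
        # Measurement fields take precedence if a name appears in both records.
        joined.append({**cell, **m})
    return joined


# Made-up records for illustration only.
cells = [{"cell_id": 101, "azimuth_deg": 120.0, "height_m": 25.0}]
measurements = [
    {"cell_id": 101, "rsrp_dbm": -92.0},
    {"cell_id": 202, "rsrp_dbm": -110.0},  # no descriptor available
]
rows = join_on_cell(measurements, cells)
```

Only the measurement for cell 101 survives the join, and it gains the azimuth and height fields from the descriptor table.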
Why This Dataset Matters for AI and Telecom Research
The Vienna 4G/5G Drive-Test Dataset is positioned to serve as a common benchmark for several research areas at the intersection of AI and wireless communications.
- Reproducible Benchmarking: It provides a common, real-world foundation for comparing algorithms in environment-aware learning, propagation modeling, and coverage analysis, moving the field beyond proprietary or synthetic data.
- Ray-Tracing Validation: The aligned measurement data and high-fidelity city model make it possible to validate and calibrate ray-tracing tools against real measurements, improving their accuracy for network planning.
- Bridging Simulation and Reality: By providing both detailed network measurements and the corresponding ground-truth environment, it allows researchers to train AI models that better generalize from simulation to real-world deployment.
- Open Science in Telecoms: As an open dataset, it democratizes access to high-quality mobile network data, accelerating innovation and fostering collaboration across academia and industry.
This release, detailed in the paper "Vienna 4G/5G Drive-Test Dataset" (arXiv:2603.02638v1), marks a significant step toward data-driven, AI-optimized future networks by providing the essential raw material for next-generation research and development.