A shower of data has started to rain down on academic institutions and others doing research on autonomous vehicles.
In the past few months, some of the largest players in the AV world have decided to take some of the massive amounts of information they've been accumulating, including sophisticated maps and images collected via cameras and lidar sensors, and make it available for public download.
Among the companies releasing datasets are technology supplier Aptiv, self-driving startup Argo AI and, most recently, ride-hailing company Lyft.
This is more than corporate largesse. The idea is to help advance the development of self-driving technology for the industry as a whole by eliminating what has been a major roadblock for academia.
"We released this dataset because one of the biggest things that holds back university research in our area of self-driving and even more broadly, in AI in general, is the availability of really well-labeled, well-curated data and data that's really relevant to a particular problem," Argo CEO Bryan Salesky said in a recent podcast interview with Automotive News.
In June, Argo released Argoverse, which it calls "the first public data release to include high-definition maps for self-driving vehicle research." Argo says such maps are critical to robotic perception.
In March, Aptiv released a dataset called nuScenes by Aptiv.
Aptiv's dataset is organized into 1,000 "scenes" collected from Boston and Singapore. Aptiv says the data is meant to represent some of the most complex driving scenarios in each urban environment.
And this week, Lyft announced the release of its Level 5 Dataset, saying it wants to "level the playing field" for researchers interested in AV technology.
"Self-driving is too big — and too important — an endeavor for any one team to solve alone," Luc Vincent, Lyft's executive vice president for autonomous technology, writes in a blog on Medium.
"Academic research accelerates innovation, but it requires costly data that is out of reach for most academic teams. Sensor hardware must be built and properly calibrated, a localization stack is needed, and an HD semantic map must be created. Only then can you unlock higher-level functionality like 3D perception, prediction, and planning."
Salesky, whose company is working with Ford Motor Co. and Volkswagen Group, notes that sharing data with the academic world has mutual benefits.
"We think what we're building is pretty special at Argo, but we know that there's going to be new techniques that are even better or maybe performed more efficiently at a lower amount of compute power or whatever it might be. And we want to give [universities] that opportunity to test their algorithms, and then we can take a look at their results and benchmark against what we do internally. And that's how we ultimately move research forward in a meaningful way."
— Leslie J. Allen