Researchers at the University of California, Santa Cruz, are using artificial intelligence and GPU-accelerated computing to convert the vast infrared images captured by the James Webb Space Telescope (JWST) into structured galaxy catalogues and simulations, automating work that would be impossible to complete by hand.
Each of the telescope's deep field images captures light that has travelled more than 13 billion years and contains hundreds of thousands of galaxies, generating volumes of data that exceed the capacity of traditional analysis methods.
"There were galaxies everywhere," said Brant Robertson, professor of astronomy and astrophysics at UCSC.
"So many, and so far away, that we were genuinely shocked."
The team runs its AI pipeline on Lux, a campus computing cluster funded by a $1.6 million National Science Foundation grant, and tests models on an on-site Nvidia DGX Station, using GPUs at every stage of the workflow, from data reduction and catalogue generation to anomaly detection and simulation.
A key tool is Morpheus, an AI system adapted from semantic segmentation techniques originally developed for earlier sky surveys, which labels structures at the pixel level to classify galaxies by type, shape and composition.
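The core idea of pixel-level semantic segmentation can be sketched in a few lines: a trained network produces a score for every morphological class at every pixel, and taking the per-pixel argmax yields a label map. The class list and the random scores below are illustrative stand-ins, not Morpheus's actual architecture or output.

```python
import numpy as np

# Illustrative sketch of pixel-level semantic segmentation: a model
# assigns each pixel a score for every morphological class, and the
# argmax over classes gives each pixel a label. The class names here
# are hypothetical stand-ins for a trained network's categories.

CLASSES = ["background", "spheroid", "disc", "irregular", "point_source"]

def segment(scores: np.ndarray) -> np.ndarray:
    """scores: (n_classes, H, W) array of per-pixel class scores.
    Returns an (H, W) integer label map."""
    if scores.shape[0] != len(CLASSES):
        raise ValueError("expected one score plane per class")
    return scores.argmax(axis=0)

# Toy example: a 4x4 "image" with random scores standing in for
# the output of a trained segmentation network.
rng = np.random.default_rng(0)
scores = rng.random((len(CLASSES), 4, 4))
labels = segment(scores)
print(labels.shape)  # (4, 4)
print(CLASSES[labels[0, 0]])
```

In a real pipeline the label map is then aggregated per source to classify each galaxy by type, shape and composition.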
The pipeline has already produced unexpected scientific results, including the discovery of rotating disc galaxies appearing far earlier in cosmic history than theoretical models had predicted, a finding Robertson said has since been independently confirmed by other research groups.
A separate tool called GalaxyFriends, built by a graduate student, clusters just under 90,000 objects into similarity neighbourhoods, allowing researchers to identify patterns and outliers across large populations of galaxies without inspecting each one individually.
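The clustering step can be illustrated with a minimal k-means pass over per-galaxy feature vectors: each object is assigned to its nearest cluster centre, and objects far from their centre become candidate outliers. The features, cluster count and outlier criterion below are assumptions for illustration; the article does not say which algorithm GalaxyFriends actually uses.

```python
import numpy as np

# Hypothetical sketch of grouping galaxies into similarity
# "neighbourhoods": each galaxy is a feature vector (e.g. size,
# colour, concentration) and a simple k-means pass assigns it to
# the nearest of k cluster centres.

def kmeans(features, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centres = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # Distance from every galaxy to every centre, then assign.
        d = np.linalg.norm(features[:, None] - centres[None], axis=2)
        labels = d.argmin(axis=1)
        # Move each centre to the mean of its members.
        for j in range(k):
            members = features[labels == j]
            if len(members):
                centres[j] = members.mean(axis=0)
    return labels, centres

# Two synthetic "populations" of galaxies in a 3-feature space.
rng = np.random.default_rng(1)
pop_a = rng.normal(0.0, 0.3, (50, 3))
pop_b = rng.normal(3.0, 0.3, (50, 3))
features = np.vstack([pop_a, pop_b])

labels, centres = kmeans(features, k=2)
# Galaxies unusually far from their cluster centre are outlier candidates.
dist = np.linalg.norm(features - centres[labels], axis=1)
print(dist.shape)  # (100,)
```

The payoff is the one the article describes: patterns and outliers surface across the whole population without inspecting each object individually.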
The team has published its data openly, releasing catalogues covering nearly 500,000 galaxies spanning the observable history of the universe for use by the broader research community.
The same methods are now being adapted for forthcoming surveys that will generate data on a far larger scale.
The Vera C. Rubin Observatory, expected to begin operations in Chile, will produce roughly 20 terabytes of raw data per night. The UCSC team is also developing AI techniques, inspired by video-game image reconstruction, to correct for atmospheric blur in ground-based observations.
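One classical way to frame blur correction is deconvolution: the blurred frame is modelled as the true scene convolved with a point-spread function (PSF), and an iterative scheme recovers a sharper estimate. The sketch below uses the classic Richardson-Lucy iteration with a synthetic Gaussian PSF purely for illustration; the team's method is described only as inspired by video-game image reconstruction, and is not reproduced here.

```python
import numpy as np

# Hedged sketch: atmospheric blur modelled as convolution with a PSF,
# partially undone by Richardson-Lucy deconvolution. The Gaussian PSF
# and the synthetic star field are assumptions for illustration.

def gaussian_psf(shape, sigma):
    """Symmetric Gaussian PSF centred at the origin (wrap-around),
    so FFT-based circular convolution introduces no shift."""
    y = np.fft.fftfreq(shape[0]) * shape[0]
    x = np.fft.fftfreq(shape[1]) * shape[1]
    yy, xx = np.meshgrid(y, x, indexing="ij")
    psf = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return psf / psf.sum()

def convolve(img, psf):
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(psf)))

def richardson_lucy(blurred, psf, iters=30):
    est = np.full_like(blurred, blurred.mean())
    for _ in range(iters):
        ratio = blurred / np.maximum(convolve(est, psf), 1e-12)
        # The PSF is symmetric, so correlation equals convolution here.
        est = est * convolve(ratio, psf)
    return est

# Synthetic "star": a single bright pixel, blurred by the atmosphere.
truth = np.zeros((32, 32))
truth[16, 16] = 1.0
psf = gaussian_psf(truth.shape, sigma=2.0)
blurred = np.clip(convolve(truth, psf), 0.0, None)
deconv = richardson_lucy(blurred, psf)
# Deconvolution re-concentrates the star's light: the peak sharpens.
print(blurred.max(), deconv.max())
```

Learned approaches replace the fixed PSF and hand-written iteration with a network trained on blurred/sharp pairs, but the underlying goal, recovering detail lost to the atmosphere, is the same.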
Future space-based instruments including NASA's Nancy Grace Roman Space Telescope and the proposed Habitable Worlds Observatory will further increase the need for scalable GPU-accelerated analysis.
The recap
- UCSC applies AI to James Webb Space Telescope data.
- $1.6 million NSF grant funds the on-campus Lux cluster.
- Team readies tools for Vera C. Rubin Observatory data.