Searching for Unmodeled Signals

Machine learning methods for detecting gravitational-wave transients that don't match existing signal templates, targeting unknown astrophysical phenomena and anomalous events.

Research area

Most gravitational-wave searches assume a specific signal model — compact binary coalescence, continuous waves from pulsars, or a stochastic background. But the most exciting discoveries may come from signals we haven’t predicted. Core-collapse supernovae, cosmic string cusps, post-merger remnant oscillations, phase transitions in the early universe, and entirely unknown phenomena could all produce gravitational waves with morphologies that don’t match existing templates.

The challenge

Template-based searches (matched filtering) are optimal when the signal model is known, but they are blind to anything outside their template bank. Unmodeled — or “burst” — searches must detect excess power or coherent patterns in detector data without prior knowledge of the signal shape. This is fundamentally harder: the search must distinguish genuine astrophysical transients from the detector glitches, environmental disturbances, and non-stationary noise artifacts that contaminate real LIGO data at a rate of roughly one per minute.

Existing burst search algorithms illustrate the difficulty. Coherent WaveBurst (cWB) identifies clusters of excess power in the time-frequency plane across multiple detectors, using the constraint that a real signal must be consistent with a single sky location and polarization. BayesWave models transients as a superposition of sine-Gaussian wavelets and uses Bayesian model selection to distinguish signal from glitch. Both approaches achieve detection, but at relatively high signal-to-noise thresholds (typically SNR > 10–15 for confident detection) — far above the matched-filter threshold of ~8 that enables the binary merger catalog. Closing this sensitivity gap is the central goal.
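
To make the wavelet picture concrete, a sine-Gaussian is a sinusoid under a Gaussian envelope, parameterized by a central time, central frequency, quality factor, amplitude, and phase. The snippet below is a generic illustration of such a wavelet, using one common convention for the envelope width; it is not BayesWave's internal implementation.

```python
import numpy as np

def sine_gaussian(t, t0, f0, q, amplitude=1.0, phase=0.0):
    """Generic sine-Gaussian wavelet: a sinusoid at central frequency f0 under
    a Gaussian envelope centered at t0. The envelope width tau = q / (2*pi*f0)
    is one common convention; this is an illustration, not BayesWave's code."""
    tau = q / (2.0 * np.pi * f0)
    envelope = np.exp(-((t - t0) ** 2) / (2.0 * tau ** 2))
    return amplitude * envelope * np.cos(2.0 * np.pi * f0 * (t - t0) + phase)

# Example: a 200 Hz, Q = 9 wavelet sampled at 4096 Hz, centered at t0 = 0.5 s
fs = 4096
t = np.arange(0.0, 1.0, 1.0 / fs)
h = sine_gaussian(t, t0=0.5, f0=200.0, q=9.0)
```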

Astrophysical targets

Several known source types produce gravitational waves with poorly modeled or entirely unpredicted waveforms:

  • Core-collapse supernovae: The gravitational-wave emission depends on the turbulent dynamics of the collapsing stellar core — convection, the standing accretion shock instability (SASI), and neutrino-driven outflows. Numerical simulations produce diverse waveform morphologies, but no template bank can span the physical parameter space. Detection would probe the engine of the explosion, complementing neutrino and electromagnetic observations.
  • Post-merger remnants: After two neutron stars merge, the remnant (if it doesn’t immediately collapse to a black hole) oscillates at several kHz with quasi-periodic structure set by the nuclear equation of state. These signals are short-lived, broadband, and only partially modeled.
  • Cosmic string cusps and kinks: Topological defects from the early universe would produce gravitational-wave bursts with characteristic power-law spectra. The waveform shape is known analytically (a schematic spectral form is given after this list), but the rate, amplitude distribution, and stacking properties are uncertain.
  • The truly unknown: Every new observational window in astronomy has revealed unexpected phenomena. Gravitational-wave astronomy is young enough that genuinely novel source types remain plausible.
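
For orientation, the analytically known cusp and kink waveforms are simple power laws in the frequency domain; schematically (the amplitude normalization and the high-frequency cutoff f_h depend on the string tension, loop size, distance, and viewing angle, none of which are specified here):

```latex
% Schematic cusp and kink spectra; normalization and cutoff are source-dependent.
\tilde{h}_{\mathrm{cusp}}(f) \propto |f|^{-4/3}, \qquad
\tilde{h}_{\mathrm{kink}}(f) \propto |f|^{-5/3}, \qquad |f| \lesssim f_h .
```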

Our approach

We develop machine learning methods that learn to distinguish astrophysical signals from instrumental artifacts without requiring explicit signal templates:

  • Anomaly detection: Neural networks trained on detector noise learn what “normal” looks like; deviations flag potential signals for follow-up. This complements the noise cleaning work — the same understanding of noise couplings that enables subtraction also enables detection of anomalies that don’t fit the noise model. Autoencoders and variational autoencoders trained on spectrograms of detector noise can identify anomalous time-frequency patterns, with reconstruction error serving as a detection statistic (a minimal sketch follows after this list).
  • Signal morphology classification: Convolutional networks that classify transient events by their time-frequency structure, separating known glitch types (blips, scratchy, scattered light, tomte) from potential astrophysical candidates. The Gravity Spy project demonstrated this approach for glitch classification; we extend it toward separating glitches from signals of unknown morphology.
  • Multi-detector consistency: Genuine signals must appear consistently across detectors (accounting for antenna patterns and time delays); instrumental artifacts do not. ML methods that exploit multi-detector correlations — training on the joint time-frequency representation from all detectors simultaneously — can improve detection confidence without assuming a signal model.
  • Learned detection statistics: Rather than hand-crafting a detection statistic (like excess power or coherent energy), a neural network can learn the statistic that maximizes detection probability for a broad class of signal morphologies. This is especially powerful when trained on physically diverse waveform catalogs from numerical simulations. A toy sketch combining this idea with the multi-detector input described above also follows after this list.
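
As a concrete illustration of the anomaly-detection item above, the sketch below trains a small convolutional autoencoder on spectrogram patches of detector noise and uses the per-sample reconstruction error as the anomaly score. Layer sizes, patch shapes, and the training loop are placeholders, not our production pipeline.

```python
import torch
import torch.nn as nn

class SpectrogramAutoencoder(nn.Module):
    """Minimal convolutional autoencoder for (1, 64, 64) spectrogram patches.
    Layer sizes are illustrative placeholders."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 32
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),              # 32 -> 64
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def reconstruction_error(model, batch):
    """Per-sample mean squared reconstruction error, used as the anomaly score:
    patches that resemble the training-set noise reconstruct well (low score),
    anomalous time-frequency structure reconstructs poorly (high score)."""
    with torch.no_grad():
        recon = model(batch)
    return ((recon - batch) ** 2).mean(dim=(1, 2, 3))

# Training sketch: minimize MSE on noise-only spectrogram patches, then flag
# segments whose score exceeds a threshold set from a background distribution.
model = SpectrogramAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
noise_patches = torch.randn(128, 1, 64, 64)  # stand-in for whitened noise spectrograms
for _ in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(noise_patches), noise_patches)
    loss.backward()
    optimizer.step()
scores = reconstruction_error(model, noise_patches)
```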
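
The multi-detector and learned-statistic items above can be combined by stacking time-frequency maps from each detector as input channels to a small classifier and treating its output as the detection statistic. The sketch below is a toy two-detector version with placeholder shapes; training labels come from simulated injections of morphologically diverse waveforms into noise, which is an assumption of this sketch rather than a description of our pipeline.

```python
import torch
import torch.nn as nn

class CoherenceNet(nn.Module):
    """Toy network that ingests stacked spectrograms from two detectors
    (e.g. H1 and L1) as separate channels and outputs a single score.
    Architecture and sizes are illustrative only."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(32 * 16 * 16, 64), nn.ReLU(), nn.Linear(64, 1),
        )

    def forward(self, x):  # x: (batch, 2, 64, 64)
        return self.head(self.features(x)).squeeze(-1)

# Train as a binary classifier: label 1 for segments containing a simulated,
# morphologically diverse injection, 0 for noise-only segments. The network
# output then serves as the learned detection statistic.
model = CoherenceNet()
loss_fn = nn.BCEWithLogitsLoss()
x = torch.randn(8, 2, 64, 64)          # stand-in for joint H1/L1 spectrograms
y = torch.randint(0, 2, (8,)).float()  # stand-in injection labels
loss = loss_fn(model(x), y)
loss.backward()
```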

Connection to noise cleaning

The link to neural network noise cleaning is direct and practical. DeepClean and related noise-subtraction networks reduce the rate of non-Gaussian transient artifacts in the data by identifying and removing noise coupled from auxiliary channels. Every glitch removed from the data is one fewer false alarm in the burst search. In O3 data, glitch subtraction improved the effective duty cycle for burst searches by reducing the fraction of data flagged as contaminated. For O4 and beyond, the combination of improved noise subtraction and ML-based burst detection could push unmodeled search sensitivity significantly closer to matched-filter performance.

Open questions

  • False alarm estimation: Without a signal model, how do you estimate the false alarm probability of a detection? Time slides (shifting one detector’s data relative to another) provide empirical background estimates, but the non-stationarity of real detector noise means the background changes on timescales of hours. ML-based approaches to background estimation — learning the time-varying glitch rate and morphology — could improve on fixed time-slide methods. A minimal time-slide sketch follows after this list.
  • Interpretability: If an ML system flags a candidate, how do we understand why it was flagged? Astrophysical follow-up requires physical interpretation, not just a detection statistic. Saliency maps, attention weights, and prototype-based explanations can help, but the gap between “statistically significant anomaly” and “astrophysical source with physical parameters” remains wide.
  • Training data and domain shift: Anomaly detection systems need training data that spans the full range of non-astrophysical artifacts. Detector glitch populations evolve over time as hardware changes, commissioning progresses, and environmental conditions shift. How do we maintain training sets that reflect the current instrument state? Online learning and continual adaptation may be necessary.
  • Sensitivity vs. generality trade-off: There is an inherent tension between search sensitivity (which improves with signal-specific assumptions) and generality (which requires making fewer assumptions). A hierarchical approach — broad anomaly detection as a first pass, followed by targeted follow-up with signal-specific tools — may navigate this trade-off, but the optimal division of labor between generic and specific stages is an open design problem.
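
To make the time-slide idea in the first open question concrete, the sketch below estimates an empirical background by sliding one detector's trigger times by non-physical offsets (much larger than the ~10 ms inter-site light-travel time) and recomputing coincidences. The trigger lists, the combined statistic, and the function names are placeholders, not an existing pipeline's API.

```python
import numpy as np

def timeslide_background(scores_h1, times_h1, scores_l1, times_l1,
                         shifts, window=0.01):
    """Empirical background for a two-detector coincidence search.

    For each non-physical time shift (>> the ~10 ms light-travel time between
    sites), slide the L1 trigger times, find H1/L1 pairs closer than `window`
    seconds, and record a combined statistic. Real signals cannot survive the
    shift, so the distribution samples accidental coincidences only.
    Trigger lists and the combined statistic here are illustrative placeholders.
    """
    background = []
    for shift in shifts:
        shifted = times_l1 + shift
        for s1, t1 in zip(scores_h1, times_h1):
            close = np.abs(shifted - t1) < window   # naive O(N*M) matching
            if np.any(close):
                background.append(s1 + scores_l1[close].max())  # toy combined statistic
    return np.asarray(background)

def false_alarm_rate(candidate_stat, background, n_shifts, livetime):
    """Rate of accidental coincidences at least as loud as the candidate,
    using the effective background livetime accumulated over all shifts."""
    louder = np.sum(background >= candidate_stat)
    return louder / (n_shifts * livetime)
```

Because the noise is non-stationary, a background estimated this way must be refreshed on timescales comparable to how quickly the glitch rate and morphology drift.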