AstroVision
Deep-learning pipeline for galaxy morphological classification with multi-modal fusion and uncertainty quantification
Overview
AstroVision is an open-source deep-learning project for galaxy morphological classification. The project explores how modern computer-vision models, especially self-supervised visual foundation models, can be adapted to astronomical imaging data.
The current pipeline is built around the Galaxy10 DECaLS dataset, containing 17,736 RGB galaxy images across ten morphology classes derived from the Dark Energy Camera Legacy Survey (DECaLS). The project compares several modelling strategies, including a simple convolutional neural network, EfficientNet-B0, DINOv2 visual features, fine-tuned DINOv2 representations, and late-fusion models combining image features with non-parametric morphology and photometric information.
The broader goal of AstroVision is not only to classify galaxies, but to understand what physical information is encoded in learned visual representations.
Scientific Motivation
Galaxy morphology is closely connected to stellar populations, star-formation history, colour, environment, and dynamical evolution. A central question in AstroVision is whether deep visual representations trained outside astronomy can still capture physically meaningful structure when applied to galaxy images.
The current results suggest that DINOv2 representations encode more than low-level image texture. In particular, principal components of the learned feature space correlate with non-parametric morphology indicators such as Gini and with photometric colour such as (g-r). This suggests that self-supervised visual features may partially recover the classical relationship between morphology and stellar population colour.
Rather than treating the model as a black-box classifier, AstroVision uses dimensionality reduction, morphometric diagnostics, photometric correlations, and uncertainty estimation to interpret what the network is learning.
Pipeline
| Stage | Description |
|---|---|
| Image classification | Baseline CNN, EfficientNet-B0, DINOv2 probing, and DINOv2 fine-tuning |
| Representation learning | Analysis of high-dimensional DINOv2 visual embeddings |
| Morphometrics | CAS, Gini, (M_{20}), and related non-parametric galaxy structure descriptors |
| Photometry | SDSS photometric features used for comparison and late-fusion experiments |
| Fusion model | Combination of visual, morphometric, and photometric features |
| Uncertainty | Monte Carlo Dropout and conformal prediction for reliability analysis |
| Segmentation | Weakly supervised U-Net experiments based on Otsu pseudo-labels |
Current Results
Current experiments show that:
- fine-tuning DINOv2 substantially improves performance over zero-shot visual probing;
- late fusion of visual embeddings, morphometric descriptors, and photometric information gives the strongest classification results;
- some classes, such as smooth and edge-on galaxies, are recovered more reliably than visually ambiguous or disturbed systems;
- the disturbed class remains difficult, suggesting that part of the error is tied to genuine morphological ambiguity rather than only model weakness;
- DINOv2 features show measurable correlations with both morphology and colour, suggesting that the learned visual representation contains physically relevant information.
This makes AstroVision a complementary project to AstroSpectro: both projects ask how machine-learning models organise astrophysical data, and whether their internal representations can be connected back to physical structure.
Technical Stack
Python · PyTorch · DINOv2 · EfficientNet · scikit-learn · UMAP · astroquery · Weights & Biases
Links
Status: active research and development project. The current focus is on improving robustness, uncertainty calibration, physical interpretation of learned representations, and preparation of a reproducible public release.