Integrating hierarchical stochasticity into the StyleGAN3 pipeline: Transforming deterministic latent mapping into an explainable, variational generative process.
Traditional StyleGAN architectures utilize a deterministic 8-layer MLP to map latent codes to styles. This project re-engineers the mapping network into a Variational Mapping Network, introducing stochasticity at every layer of the transformation. By parameterizing each fully connected layer to output a Gaussian distribution, the model learns a probabilistic style space. This ensures that the resulting style vector is not a fixed point but a sample from a learned distribution, providing the foundational diversity required for high-fidelity generation.
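A minimal PyTorch sketch of this design follows. The class names, dimensions, and the log-variance parameterization are illustrative assumptions, not the project's exact implementation:

```python
import torch
import torch.nn as nn

class VariationalMappingLayer(nn.Module):
    """One fully connected layer that outputs a Gaussian instead of a point.

    The forward pass samples from N(mu, sigma^2) via the reparameterization
    trick, so gradients flow through both the mean and the variance path.
    """

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.fc_mu = nn.Linear(in_dim, out_dim)      # mean path
        self.fc_logvar = nn.Linear(in_dim, out_dim)  # log-variance path
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x):
        mu = self.fc_mu(x)
        std = torch.exp(0.5 * self.fc_logvar(x))
        eps = torch.randn_like(std)                  # fresh noise on every call
        return self.act(mu + eps * std)

class VariationalMappingNetwork(nn.Module):
    """Eight stacked variational layers mapping z -> w."""

    def __init__(self, z_dim=512, w_dim=512, num_layers=8):
        super().__init__()
        self.layers = nn.ModuleList(
            [VariationalMappingLayer(z_dim if i == 0 else w_dim, w_dim)
             for i in range(num_layers)]
        )

    def forward(self, z):
        w = z
        for layer in self.layers:
            w = layer(w)  # each layer re-samples, so w itself is stochastic
        return w
```

Calling this network twice with the same z yields two distinct w vectors, which is precisely the property the baseline deterministic MLP lacks.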
The core innovation lies in the Variational Synthesis Layer, which replaces the standard deterministic blocks (L0 to L13). Each block is wrapped in a variational architecture whose two parallel paths compute the mean feature map and its standard deviation. Both paths receive the style vector as a control signal for modulated convolutions, ensuring that stochastic variations, such as hair placement or skin texture, remain intrinsically tied to the subject's local geometry while preserving StyleGAN3's alias-free guarantees.
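The sketch below shows the dual-path structure in simplified form. Modulating input activations with the style vector stands in for StyleGAN3's full weight modulation and demodulation, the alias-free filtering is omitted, and all names are hypothetical:

```python
import torch
import torch.nn as nn

class VariationalSynthesisLayer(nn.Module):
    """Simplified sketch of one variational synthesis block.

    Two parallel style-modulated convolutions predict a mean feature map
    and a log-variance map; the block's output is a reparameterized sample.
    Real StyleGAN3 blocks additionally apply weight demodulation and
    alias-free up/downsampling, omitted here for clarity.
    """

    def __init__(self, in_channels, out_channels, w_dim=512):
        super().__init__()
        self.affine = nn.Linear(w_dim, in_channels)  # style -> per-channel scale
        self.conv_mu = nn.Conv2d(in_channels, out_channels, 3, padding=1)
        self.conv_logvar = nn.Conv2d(in_channels, out_channels, 3, padding=1)
        self.std_scale = 1.0  # hook for Global_STD Scaling (see metrics below)

    def forward(self, x, w):
        # Modulate input activations with the style vector (a simplification
        # of StyleGAN's weight modulation).
        s = self.affine(w).unsqueeze(-1).unsqueeze(-1)
        x = x * s
        mu = self.conv_mu(x)                         # mean path
        std = torch.exp(0.5 * self.conv_logvar(x))   # standard-deviation path
        eps = torch.randn_like(std)
        return mu + eps * std * self.std_scale       # reparameterized sample
```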
Custom Metrics: Quantifying Stochasticity
To validate the model beyond standard visual inspection, two primary metrics were used:
Layer-wise Mean Variance: A custom metric designed to measure the "amount" of stochasticity injected at each spatial resolution. Unlike the baseline StyleGAN3, which produces zero variance for a fixed latent vector, the Variational Neural Network (VNN) maintains a non-zero, tunable variance.
Global_STD Scaling: A control parameter that allows real-time adjustment of the internal noise. By scaling the predicted standard deviations, we can quantitatively observe the transition from deterministic reconstruction to high-variance, diverse sampling. A sketch of both metrics follows this list.
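Both metrics can be expressed against a small, hypothetical generator interface; the `return_features` flag and the `std_scale` attribute are assumptions carried over from the sketches above, not part of the official StyleGAN3 API:

```python
import torch

@torch.no_grad()
def layerwise_mean_variance(generator, z, num_samples=32):
    """Layer-wise Mean Variance: re-run the generator on a fixed latent z and
    measure how strongly each layer's activations vary across samples.
    Assumes a hypothetical generator(z, return_features=True) call that
    returns a list of per-layer feature maps."""
    runs = [generator(z, return_features=True) for _ in range(num_samples)]
    # Regroup the runs by layer, then average the variance over the sample axis.
    return [torch.stack(layer_feats).var(dim=0).mean().item()
            for layer_feats in zip(*runs)]

def set_global_std(generator, scale):
    """Global_STD Scaling: rescale the predicted standard deviation in every
    variational layer. scale=0.0 collapses the model to deterministic
    reconstruction; larger values produce higher-variance, more diverse
    samples. Relies on the std_scale attribute from the synthesis sketch."""
    for module in generator.modules():
        if hasattr(module, "std_scale"):
            module.std_scale = scale
```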
Conclusion
The integration of Variational Neural Networks into StyleGAN3 successfully decouples global identity from local structural uncertainty. By shifting from weight-based Bayesian methods to layer-output distributions, the model achieves superior FID scores while providing an explainable framework for stochasticity. The results demonstrate that internalizing variance within the synthesis hierarchy produces more natural, intrinsic details than traditional externally applied noise, paving the way for more robust and diverse generative models.