Best practices for training equivariant neural network potentials
After a few failed projects, our internal checklist now starts with data balance across local environments rather than model architecture. Equivariant models overfit fast when rare coordination motifs are underrepresented, and force-label noise from unconverged SCF steps can dominate the loss.
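One way to audit that balance is a coordination-number histogram over the training frames. A minimal sketch, assuming frames are available as `(n_atoms, 3)` position arrays; the fixed distance cutoff and the coordination-number motif definition are placeholders for whatever local-environment descriptor you actually use:

```python
import numpy as np

def coordination_numbers(positions, cutoff=3.0):
    """Coordination number of each atom: count of neighbors within `cutoff`."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude self-distance
    return (dist < cutoff).sum(axis=1)

def motif_histogram(frames, cutoff=3.0):
    """Tally coordination motifs across a dataset to spot underrepresented ones."""
    counts = {}
    for pos in frames:
        for cn in coordination_numbers(pos, cutoff):
            counts[int(cn)] = counts.get(int(cn), 0) + 1
    return counts
```

Rare motifs show up as small bins; those are the frames to oversample or augment before touching the architecture.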
We also found that monitoring physically meaningful validation metrics, like energy ranking consistency for polymorphs, is more useful than aggregate MAE alone. Curious what diagnostics others track during training to catch failure modes early.
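For the ranking-consistency metric, one concrete choice is the Spearman rank correlation between reference and predicted polymorph energies. A minimal sketch (plain NumPy, assuming no exact energy ties; `ranking_consistency` is an illustrative name, not a library function):

```python
import numpy as np

def ranking_consistency(e_ref, e_pred):
    """Spearman rank correlation between reference and predicted energies.

    1.0 means the energetic ordering of the polymorphs is perfectly
    reproduced; values can degrade even while aggregate MAE improves.
    Assumes no exact ties among the energies.
    """
    e_ref, e_pred = np.asarray(e_ref), np.asarray(e_pred)
    rank = lambda x: np.argsort(np.argsort(x))
    d = rank(e_ref) - rank(e_pred)
    n = len(e_ref)
    return 1.0 - 6.0 * np.sum(d.astype(float) ** 2) / (n * (n**2 - 1))
```

Tracking this per polymorph family during training makes ordering regressions visible long before they show up in the mean error.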
Posted by Anonymous Researcher
Comments
We track force cosine similarity between checkpoints, not just MAE. Divergence there often predicts unstable MD rollouts.
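A minimal sketch of that diagnostic, assuming the two checkpoints' force predictions on a fixed validation set are stored as `(n_atoms, 3)` arrays; the function name is illustrative:

```python
import numpy as np

def force_cosine(f_a, f_b, eps=1e-12):
    """Mean per-atom cosine similarity between two sets of force predictions.

    Near 1.0 means the checkpoints agree on force directions; drops signal
    directional drift that aggregate MAE can hide.
    """
    f_a, f_b = np.asarray(f_a), np.asarray(f_b)
    num = (f_a * f_b).sum(axis=-1)
    den = np.linalg.norm(f_a, axis=-1) * np.linalg.norm(f_b, axis=-1) + eps
    return float((num / den).mean())
```

Comparing successive checkpoints on the same frames keeps the metric cheap, since no new DFT labels are needed.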