Materialist
Papers · 2 days ago

GNoME dataset: 380k stable structures - how reliable is it?

The release is exciting, but I want to temper expectations. A large fraction of the candidate structures are labeled as potentially synthesizable on the basis of formation-energy filters and model predictions, not a full thermodynamic phase-space analysis. That is great for discovery workflows, but downstream teams should not treat every listed structure as synthesis-ready.

For those who have already integrated GNoME into active-learning loops: what validation protocol are you using? We currently cross-check against Materials Project hull distances and then run a smaller set of DFT relaxations before any Bayesian optimization step. I'm curious whether others see systematic biases in nitrides or chalcogenides.
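To make the protocol concrete, here is a minimal sketch of how the cross-check and DFT-subset selection could be wired together. It is not our production code: the `Candidate` fields, the function names, and both thresholds (`max_reference_hull`, `max_disagreement`) are assumptions chosen for illustration, and the reference hull distances are assumed to have been computed separately.

```python
from collections import defaultdict
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class Candidate:
    formula: str
    family: str               # e.g. "nitride", "chalcogenide", "oxide"
    e_hull_predicted: float   # meV/atom, as listed in the model-derived dataset
    e_hull_reference: float   # meV/atom, recomputed against a reference hull


def triage(candidates, max_reference_hull=50.0, max_disagreement=25.0):
    """Split candidates into (passed, needs_review).

    Both thresholds are hypothetical defaults, not recommended values.
    """
    passed, needs_review = [], []
    for c in candidates:
        disagreement = abs(c.e_hull_predicted - c.e_hull_reference)
        if c.e_hull_reference <= max_reference_hull and disagreement <= max_disagreement:
            passed.append(c)
        else:
            needs_review.append(c)
    return passed, needs_review


def bias_by_family(candidates):
    """Mean signed error (predicted - reference) per family, in meV/atom."""
    errors = defaultdict(list)
    for c in candidates:
        errors[c.family].append(c.e_hull_predicted - c.e_hull_reference)
    return {family: sum(v) / len(v) for family, v in errors.items()}


def pick_dft_subset(passed, n, seed=0):
    """Draw a reproducible random subset of size at most n for DFT relaxation."""
    rng = random.Random(seed)
    return rng.sample(passed, min(n, len(passed)))
```

A consistently negative value from `bias_by_family` for one chemistry would mean the dataset lists those structures as more stable than the reference hull does, which is the kind of systematic bias I am asking about.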

Paper Reference

arXiv: 2311.12345

Open source


Comments

We treat GNoME as proposal generation only. Anything with hull distance > 40 meV/atom gets a lower priority score in our queue.
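Roughly, the queue rule looks like the sketch below. Only the 40 meV/atom cutoff comes from what I described; the `base` and `penalty` values and the tie-breaking on hull distance are assumptions made for illustration.

```python
HULL_CUTOFF_MEV = 40.0  # meV/atom, the cutoff mentioned above


def priority(e_hull_mev, base=1.0, penalty=0.5):
    """Return a queue priority (higher runs sooner).

    `base` and `penalty` are hypothetical values; structures strictly above
    the cutoff are down-weighted rather than discarded.
    """
    return base * penalty if e_hull_mev > HULL_CUTOFF_MEV else base


def order_queue(candidates):
    """Sort (formula, e_hull_mev) pairs by descending priority,
    breaking ties by ascending hull distance."""
    return sorted(candidates, key=lambda c: (-priority(c[1]), c[1]))
```

Down-weighting instead of hard filtering keeps metastable candidates reachable if the queue ever drains.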


Great heuristic. We also flag structures with unusual oxidation-state assignments before downstream synthesis planning.
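A minimal sketch of that flagging step, assuming the oxidation states have already been assigned by some upstream tool. The lookup table is deliberately tiny and incomplete, and `COMMON_OXIDATION_STATES` and `flag_unusual` are hypothetical names, not part of any library.

```python
# Illustrative, incomplete table of commonly observed oxidation states.
COMMON_OXIDATION_STATES = {
    "Li": {1},
    "Na": {1},
    "Mg": {2},
    "Ti": {2, 3, 4},
    "Fe": {2, 3},
    "N": {-3, 3, 5},
    "O": {-2},
    "S": {-2, 4, 6},
}


def flag_unusual(assignments):
    """Return sorted (element, state) pairs that fall outside the table.

    `assignments` maps element symbol -> assigned oxidation state.
    Elements missing from the table are flagged too, which is the
    conservative choice for a pre-synthesis check.
    """
    return sorted(
        (element, state)
        for element, state in assignments.items()
        if state not in COMMON_OXIDATION_STATES.get(element, set())
    )
```

A flag here is a prompt for manual review, not proof that the structure is wrong.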
