MatterGen
MatterGen is a diffusion-based machine learning model developed by Microsoft Research that generates novel crystal structures under property constraints. Rather than screening existing materials to find ones with desired properties, MatterGen performs inverse materials design: it directly generates new structures that meet specified requirements.
The base model was trained on approximately 608,000 stable materials drawn from the Materials Project and Alexandria databases.
Atomic Tessellator provides first-class support for Microsoft MatterGen through its Python SDK, so you can integrate generative materials design into your computational workflows. You can generate novel crystal structures with specific property constraints and immediately validate them with our ab initio simulation capabilities.
1. Creating MatterGen Explorations
Generate novel crystal structures using AI-powered diffusion models.
from atomict.simulation.mattergen import create_mattergen, DEFAULT_BATCH_SIZE, DEFAULT_NUM_BATCHES
exploration = create_mattergen(
    project_id="your_project_id",
    name="Novel Oxide Generation",
    description="Generate novel oxide structures for photovoltaic applications",
    batch_size=32,                 # Generate 32 structures per batch
    num_batches=5,                 # Generate 5 batches (32 × 5 = 160 structures total)
    diffusion_guidance_factor=10,  # Control diversity
    action="DRAFT",
)
1.1 Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| project_id | str | Required | Project ID to create the exploration in |
| name | str | Required | Name for the exploration |
| description | str | "" | Description of the exploration |
| batch_size | int | 16 | Number of structures per batch |
| num_batches | int | 1 | Number of batches to generate |
| diffusion_guidance_factor | int | None | Guidance factor for the diffusion process |
| action | str | "DRAFT" | "DRAFT" or "LAUNCH" |
| extra_kwargs | dict | None | Additional parameters (e.g., cluster config) |
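The table above can be turned into a simple pre-flight check. The sketch below is illustrative only: `validate_mattergen_params` is a hypothetical helper, not part of the SDK, and the rule that `batch_size` and `num_batches` must be positive is an assumption rather than documented behavior.

```python
from typing import Optional

ALLOWED_ACTIONS = ("DRAFT", "LAUNCH")  # the two values listed in the parameter table


def validate_mattergen_params(
    project_id: str,
    name: str,
    batch_size: int = 16,
    num_batches: int = 1,
    diffusion_guidance_factor: Optional[int] = None,
    action: str = "DRAFT",
) -> list:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    if not project_id:
        problems.append("project_id is required")
    if not name:
        problems.append("name is required")
    # Assumption: both counts must be positive integers (bool is excluded explicitly).
    for label, value in (("batch_size", batch_size), ("num_batches", num_batches)):
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            problems.append(f"{label} must be a positive integer")
    if diffusion_guidance_factor is not None and not isinstance(diffusion_guidance_factor, int):
        problems.append("diffusion_guidance_factor must be an int or None")
    if action not in ALLOWED_ACTIONS:
        problems.append(f"action must be one of {ALLOWED_ACTIONS}")
    return problems


print(validate_mattergen_params("your_project_id", "Novel Oxide Generation", action="RUN"))
```

Running a check like this before calling create_mattergen keeps obvious mistakes, such as a misspelled action, from reaching the API.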
1.2 Parameter Details
Batch Size (batch_size)
Controls the number of crystal structures generated in each batch. The default value of 16 provides a good balance between compute duration and diversity.
Number of Batches (num_batches)
Determines how many separate generation batches to run. Each batch is processed independently. Total structures generated = batch_size × num_batches. For exploratory work, start with num_batches=1 and scale up based on initial results.
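The arithmetic above can be sketched as two small helpers. Both `total_structures` and `plan_batches` are hypothetical names introduced for illustration; they are not SDK functions.

```python
import math


def total_structures(batch_size: int, num_batches: int) -> int:
    """Total structures requested = batch_size * num_batches."""
    if batch_size < 1 or num_batches < 1:
        raise ValueError("batch_size and num_batches must be positive integers")
    return batch_size * num_batches


def plan_batches(target: int, batch_size: int = 16) -> int:
    """Smallest num_batches that yields at least `target` structures."""
    if target < 1 or batch_size < 1:
        raise ValueError("target and batch_size must be positive integers")
    return math.ceil(target / batch_size)


# Example: at least 100 structures with 32 structures per batch.
num_batches = plan_batches(target=100, batch_size=32)
print(num_batches, total_structures(32, num_batches))  # prints "4 128"
```

Because the total is always a multiple of batch_size, rounding up means you may get slightly more structures than the target.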
Diffusion Guidance Factor (diffusion_guidance_factor)
Optional parameter that controls the trade-off between diversity and quality in the diffusion generation process:
- Lower values (1-5): More diverse structures, potentially including unstable configurations
- Higher values (10+): More conservative generation, structures closer to training data distribution
- Default (None): Uses the model's built-in guidance, typically optimized for balanced exploration
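One way to compare these settings is to prepare a small sweep of draft configurations, one per guidance value. The sketch below only builds the keyword arguments. `guidance_sweep` is a hypothetical helper, and the chosen factors are arbitrary illustrative values, not recommendations.

```python
from typing import List, Optional


def guidance_sweep(
    base_name: str,
    factors: List[Optional[int]],
    batch_size: int = 16,
) -> List[dict]:
    """Build one set of exploration parameters per guidance factor."""
    configs = []
    for factor in factors:
        label = "default" if factor is None else str(factor)
        configs.append(
            {
                "name": f"{base_name} (guidance={label})",
                "batch_size": batch_size,
                "num_batches": 1,  # keep each run small while exploring
                "diffusion_guidance_factor": factor,
                "action": "DRAFT",
            }
        )
    return configs


configs = guidance_sweep("Oxide sweep", [None, 2, 10])
for cfg in configs:
    print(cfg["name"])
    # With the SDK installed, each config could be passed on as:
    # create_mattergen(project_id="your_project_id", **cfg)
```

Keeping num_batches at 1 for each draft makes it cheap to inspect how the guidance factor changes the output before scaling up.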
1.3 Response Format
{
  "id": "exploration_id",
  "name": "Novel Oxide Generation",
  "description": "Generate novel oxide structures for photovoltaic applications",
  "batch_size": 16,
  "num_batches": 1,
  "diffusion_guidance_factor": null,
  "status": 0,
  "project": "your_project_id",
  "created_at": "2025-01-15T10:30:00Z",
  "updated_at": "2025-01-15T10:30:00Z"
}
The status field is an integer code: 0 = DRAFT, 1 = READY, 3 = COMPLETED.
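Since status is returned as an integer, a small lookup makes responses easier to read. This is a minimal sketch that covers only the three codes documented above; `status_label` and `is_completed` are hypothetical helpers, and any other code is reported as unknown rather than guessed.

```python
# Only the codes documented above; other values are not specified here.
STATUS_LABELS = {0: "DRAFT", 1: "READY", 3: "COMPLETED"}


def status_label(exploration: dict) -> str:
    """Translate the integer status code of a response into a readable label."""
    code = exploration.get("status")
    return STATUS_LABELS.get(code, f"UNKNOWN({code})")


def is_completed(exploration: dict) -> bool:
    """True once the exploration reports the COMPLETED code (3)."""
    return exploration.get("status") == 3


response = {"id": "exploration_id", "status": 0}
print(status_label(response), is_completed(response))  # prints "DRAFT False"
```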
2. Retrieving Explorations
Get details about existing MatterGen explorations.
from atomict.simulation.mattergen import get_mattergen
exploration = get_mattergen("exploration_id")
print(f"Status: {exploration['status']}")
print(f"Requested structures: {exploration['batch_size'] * exploration['num_batches']}")
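A common follow-up is to poll until an exploration finishes. The sketch below takes the fetch function as an argument so it runs without the SDK. `wait_for_completion`, the polling interval, and the timeout are all assumptions for illustration. It relies only on status code 3 meaning COMPLETED, and it does not try to detect failure states, since none are documented here.

```python
import time
from typing import Callable

COMPLETED = 3  # status code for COMPLETED, per the response format


def wait_for_completion(
    fetch: Callable[[], dict],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call `fetch` repeatedly until it reports COMPLETED or the timeout passes."""
    deadline = clock() + timeout_seconds
    while True:
        exploration = fetch()
        if exploration["status"] == COMPLETED:
            return exploration
        if clock() >= deadline:
            raise TimeoutError("exploration did not complete before the timeout")
        sleep(poll_seconds)


# Simulated run: a stand-in fetcher that moves through DRAFT -> READY -> COMPLETED.
_responses = iter([{"status": 0}, {"status": 1}, {"status": 3}])
final = wait_for_completion(lambda: next(_responses), sleep=lambda _s: None)
print(final["status"])
```

With the SDK installed, the fetcher could be `lambda: get_mattergen("exploration_id")`.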
3. Associating Generated Structures
Link generated structures to MatterGen explorations for tracking and analysis.
from atomict.simulation.mattergen import associate_user_upload_with_mattergen
association = associate_user_upload_with_mattergen(
    user_upload_id="structure_id",
    exploration_id="exploration_id",
)
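When an exploration produces many structures, the association call can be repeated in a loop. The sketch below passes the association function in as an argument so it can be exercised without the SDK. `associate_many` is a hypothetical helper, and it assumes that a failed association raises an exception, which is not confirmed by this page.

```python
from typing import Callable, Iterable, List, Tuple


def associate_many(
    associate: Callable[..., dict],
    upload_ids: Iterable[str],
    exploration_id: str,
) -> Tuple[List[dict], List[Tuple[str, Exception]]]:
    """Link each upload to the exploration, collecting failures instead of stopping."""
    linked, failed = [], []
    for upload_id in upload_ids:
        try:
            linked.append(
                associate(user_upload_id=upload_id, exploration_id=exploration_id)
            )
        except Exception as exc:  # assumption: the SDK signals failure by raising
            failed.append((upload_id, exc))
    return linked, failed


# Stand-in for associate_user_upload_with_mattergen, used only for this demo.
def fake_associate(user_upload_id: str, exploration_id: str) -> dict:
    if not user_upload_id:
        raise ValueError("empty upload id")
    return {"user_upload": user_upload_id, "exploration": exploration_id}


linked, failed = associate_many(fake_associate, ["s1", "", "s2"], "exploration_id")
print(len(linked), len(failed))  # prints "2 1"
```

With the SDK installed, `associate_user_upload_with_mattergen` could be passed in place of the stand-in.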