MatterGen | Atomic Tessellator

MatterGen is a diffusion-based machine learning model, developed by Microsoft Research, that generates novel crystal structures subject to property constraints. Rather than screening existing materials to find ones with desired properties, MatterGen takes an inverse-design approach: it directly generates new structures that meet specified requirements.

The base model was trained on 608,000 stable materials from the Materials Project and Alexandria databases.

Atomic Tessellator adds first-class support for Microsoft MatterGen through the Python SDK, so you can integrate generative materials design into your computational workflows. You can generate novel crystal structures with specific property constraints and immediately validate them with our ab initio simulation capabilities.

1. Creating MatterGen Explorations

Generate novel crystal structures using AI-powered diffusion models.

mattergen.py

from atomict.simulation.mattergen import create_mattergen, DEFAULT_BATCH_SIZE, DEFAULT_NUM_BATCHES

exploration = create_mattergen(
    project_id="your_project_id",
    name="Novel Oxide Generation", 
    description="Generate novel oxide structures for photovoltaic applications",
    batch_size=32,                    # Generate 32 structures per batch
    num_batches=5,                    # Generate 5 batches (160 total)
    diffusion_guidance_factor=10,     # Control diversity
    action="DRAFT"
)

1.1 Parameters

Parameter	Type	Default	Description
`project_id`	`str`	Required	Project ID to create the exploration in
`name`	`str`	Required	Name for the exploration
`description`	`str`	`""`	Description of the exploration
`batch_size`	`int`	`16`	Number of structures per batch
`num_batches`	`int`	`1`	Number of batches to generate
`diffusion_guidance_factor`	`int`	`None`	Guidance factor for diffusion process
`action`	`str`	`"DRAFT"`	`"DRAFT"` or `"LAUNCH"`
`extra_kwargs`	`dict`	`None`	Additional parameters (e.g., cluster config)
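
Checking arguments locally before sending a request can catch simple mistakes early. The sketch below uses a hypothetical helper, `validate_mattergen_params`, which is not part of the SDK. It encodes only what the table states (required strings, the listed defaults, the two `action` values) plus one assumption: that the integer counts must be positive.

```python
from typing import Optional

ALLOWED_ACTIONS = {"DRAFT", "LAUNCH"}  # the two values listed in the table


def validate_mattergen_params(
    project_id: str,
    name: str,
    description: str = "",
    batch_size: int = 16,
    num_batches: int = 1,
    diffusion_guidance_factor: Optional[int] = None,
    action: str = "DRAFT",
) -> dict:
    """Return the arguments as a dict, raising ValueError on obvious mistakes.

    Hypothetical helper: the SDK may apply its own, different validation.
    """
    if not project_id or not name:
        raise ValueError("project_id and name are required")
    # Assumption: both counts must be positive integers.
    for label, value in (("batch_size", batch_size), ("num_batches", num_batches)):
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{label} must be a positive integer, got {value!r}")
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"action must be one of {sorted(ALLOWED_ACTIONS)}")
    return {
        "project_id": project_id,
        "name": name,
        "description": description,
        "batch_size": batch_size,
        "num_batches": num_batches,
        "diffusion_guidance_factor": diffusion_guidance_factor,
        "action": action,
    }
```

The returned dict uses the same keyword names as the table, so it could be passed on as `create_mattergen(**params)`.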

1.2 Parameter Details

Batch Size (batch_size)

Controls the number of crystal structures generated in each batch. The default value of 16 provides a good balance between compute duration and diversity.

Number of Batches (num_batches)

Determines how many separate generation batches to run. Each batch is processed independently. Total structures generated = batch_size × num_batches. For exploratory work, start with num_batches=1 and scale up based on initial results.
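
The arithmetic above can be sketched in a few lines. Both function names are illustrative and not part of the SDK. The second one inverts the formula to find the smallest `num_batches` that reaches a target structure count.

```python
def total_structures(batch_size: int, num_batches: int) -> int:
    """Total structures requested = batch_size x num_batches."""
    return batch_size * num_batches


def batches_for_target(target: int, batch_size: int = 16) -> int:
    """Smallest num_batches whose total meets or exceeds `target`."""
    if target < 1 or batch_size < 1:
        raise ValueError("target and batch_size must be positive")
    # Integer ceiling division, avoiding floating-point rounding.
    return (target + batch_size - 1) // batch_size


print(total_structures(32, 5))      # 160
print(batches_for_target(100, 16))  # 7, since 7 * 16 = 112 >= 100
```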

Diffusion Guidance Factor (diffusion_guidance_factor)

Optional parameter that controls the trade-off between diversity and quality in the diffusion generation process:

Lower values (1-5): More diverse structures, potentially including unstable configurations
Higher values (10+): More conservative generation, structures closer to training data distribution
Default (None): Uses model's built-in guidance, typically optimized for balanced exploration

Higher guidance factors may reduce structural diversity but increase the likelihood of generating stable, synthesizable materials.
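
One way to explore this trade-off is to prepare several draft explorations that differ only in guidance factor. The sketch below only builds the argument dicts and does not call the SDK. `guidance_sweep` is a hypothetical helper, and the chosen factors are illustrative values taken from the ranges described above.

```python
from typing import List, Optional, Sequence


def guidance_sweep(
    project_id: str,
    base_name: str,
    factors: Sequence[Optional[int]],
    batch_size: int = 16,
    num_batches: int = 1,
) -> List[dict]:
    """Build one create_mattergen argument dict per guidance factor."""
    configs = []
    for factor in factors:
        # None means "use the model's built-in guidance".
        label = "default" if factor is None else str(factor)
        configs.append({
            "project_id": project_id,
            "name": f"{base_name} (guidance={label})",
            "batch_size": batch_size,
            "num_batches": num_batches,
            "diffusion_guidance_factor": factor,
            "action": "DRAFT",  # keep as drafts until reviewed
        })
    return configs


configs = guidance_sweep("your_project_id", "Oxide sweep", [None, 2, 10])
for cfg in configs:
    print(cfg["name"])
    # create_mattergen(**cfg)  # uncomment to create each draft via the SDK
```

Keeping `batch_size` and `num_batches` fixed across the sweep makes the resulting structure sets easier to compare.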

1.3 Response Format

mattergen_response.json

{
    "id": "exploration_id",
    "name": "Novel Oxide Generation",
    "description": "Generate novel oxide structures for photovoltaic applications",
    "batch_size": 16,
    "num_batches": 1,
    "diffusion_guidance_factor": null,
    "status": 0,
    "project": "your_project_id",
    "created_at": "2025-01-15T10:30:00Z",
    "updated_at": "2025-01-15T10:30:00Z"
}

Status codes: 0 = DRAFT, 1 = READY, 3 = COMPLETED.
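
Because `status` is returned as an integer, a small lookup makes client code easier to read. The sketch below covers only the three codes documented in the response example. `ExplorationStatus` and `status_label` are illustrative names rather than SDK exports, and any other code is reported as unknown instead of guessed.

```python
from enum import IntEnum


class ExplorationStatus(IntEnum):
    """Status codes documented in the response example."""
    DRAFT = 0
    READY = 1
    COMPLETED = 3


def status_label(code: int) -> str:
    """Return a readable label, falling back for undocumented codes."""
    try:
        return ExplorationStatus(code).name
    except ValueError:
        return f"UNKNOWN({code})"


print(status_label(0))  # DRAFT
print(status_label(3))  # COMPLETED
print(status_label(2))  # UNKNOWN(2)
```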

2. Retrieving Explorations

Get details about existing MatterGen explorations.

mattergen.py

from atomict.simulation.mattergen import get_mattergen

exploration = get_mattergen("exploration_id")

print(f"Status: {exploration['status']}")
# batch_size * num_batches is the number of structures requested, regardless of status
print(f"Requested structures: {exploration['batch_size'] * exploration['num_batches']}")
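
A launched exploration takes time to finish, so a common pattern is to poll until the status reaches COMPLETED (code 3 in the response format). The sketch below is one possible approach, not an SDK feature. `wait_for_completion`, the polling interval, and the timeout are assumptions, and the fetch function is passed in so the loop can be exercised without network access. It does not handle failure states, since none are documented here.

```python
import time
from typing import Callable

COMPLETED = 3  # status code documented in the response format


def wait_for_completion(
    fetch: Callable[[str], dict],
    exploration_id: str,
    poll_seconds: float = 30.0,      # assumed interval, tune as needed
    timeout_seconds: float = 3600.0,  # assumed upper bound
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll `fetch` until the exploration reports COMPLETED or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        exploration = fetch(exploration_id)
        if exploration["status"] == COMPLETED:
            return exploration
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"exploration {exploration_id} not completed "
                f"after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)
```

With the SDK available, this would be called as `wait_for_completion(get_mattergen, "exploration_id")`.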

3. Associating Generated Structures

Link generated structures to MatterGen explorations for tracking and analysis.

mattergen_associate.py

from atomict.simulation.mattergen import associate_user_upload_with_mattergen

association = associate_user_upload_with_mattergen(
    user_upload_id="structure_id",
    exploration_id="exploration_id"
)
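
When an exploration yields many structures, the association call can be repeated in a loop. The sketch below is a hypothetical wrapper, `associate_many`, that collects failures instead of stopping at the first one. The association function is passed in as an argument, and it is assumed to accept the same `user_upload_id` and `exploration_id` keywords shown above.

```python
from typing import Callable, Iterable, List, Tuple


def associate_many(
    associate: Callable[..., dict],
    upload_ids: Iterable[str],
    exploration_id: str,
) -> Tuple[List[dict], List[Tuple[str, Exception]]]:
    """Associate each upload with the exploration, recording any failures."""
    succeeded: List[dict] = []
    failed: List[Tuple[str, Exception]] = []
    for upload_id in upload_ids:
        try:
            succeeded.append(
                associate(user_upload_id=upload_id, exploration_id=exploration_id)
            )
        except Exception as exc:  # keep going so one bad ID does not block the rest
            failed.append((upload_id, exc))
    return succeeded, failed
```

With the SDK available, this would be called as `associate_many(associate_user_upload_with_mattergen, ["id_1", "id_2"], "exploration_id")`, where the IDs are placeholders.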