Structures

MatterGen

Generate novel crystal structures using AI-powered generative models.

MatterGen is a diffusion model developed by Microsoft Research that generates novel crystal structures under property constraints. Rather than screening existing materials for ones with the desired properties, MatterGen takes an inverse-design approach: it directly generates new structures that meet the specified requirements.

The base model was trained on 608,000 stable materials from the Materials Project and Alexandria databases.

Atomic Tessellator provides first-class support for Microsoft MatterGen through the Python SDK, so you can integrate generative materials design into your computational workflows. You can generate novel crystal structures with specific property constraints and immediately validate them with our ab initio simulation capabilities.

1. Creating MatterGen Explorations

Generate novel crystal structures using AI-powered diffusion models.

mattergen.py
from atomict.simulation.mattergen import create_mattergen, DEFAULT_BATCH_SIZE, DEFAULT_NUM_BATCHES

exploration = create_mattergen(
    project_id="your_project_id",
    name="Novel Oxide Generation",
    description="Generate novel oxide structures for photovoltaic applications",
    batch_size=32,                    # Generate 32 structures per batch
    num_batches=5,                    # Generate 5 batches (32 x 5 = 160 structures total)
    diffusion_guidance_factor=10,     # Higher values give more conservative generation
    action="DRAFT"                    # "DRAFT" or "LAUNCH"
)

1.1 Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| project_id | str | Required | Project ID to create the exploration in |
| name | str | Required | Name for the exploration |
| description | str | "" | Description of the exploration |
| batch_size | int | 16 | Number of structures per batch |
| num_batches | int | 1 | Number of batches to generate |
| diffusion_guidance_factor | int | None | Guidance factor for the diffusion process |
| action | str | "DRAFT" | "DRAFT" or "LAUNCH" |
| extra_kwargs | dict | None | Additional parameters (e.g., cluster config) |

1.2 Parameter Details

Batch Size (batch_size)

Controls the number of crystal structures generated in each batch. The default value of 16 provides a good balance between compute duration and diversity.

Number of Batches (num_batches)

Determines how many separate generation batches to run. Each batch is processed independently. Total structures generated = batch_size × num_batches. For exploratory work, start with num_batches=1 and scale up based on initial results.
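
The relationship between the two parameters can be checked before submitting a run. The helper below is a minimal sketch: `total_structures` is a hypothetical name and not part of the SDK, and its defaults simply mirror the documented defaults of 16 and 1.

```python
def total_structures(batch_size: int = 16, num_batches: int = 1) -> int:
    """Return how many structures a run requests (batch_size x num_batches).

    Hypothetical helper for planning; it does not call the SDK.
    """
    if batch_size < 1 or num_batches < 1:
        raise ValueError("batch_size and num_batches must be positive integers")
    return batch_size * num_batches


print(total_structures())       # defaults: 16 * 1 = 16
print(total_structures(32, 5))  # the creation example: 32 * 5 = 160
```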

Diffusion Guidance Factor (diffusion_guidance_factor)

Optional parameter that controls the trade-off between diversity and quality in the diffusion generation process:

  • Lower values (1-5): More diverse structures, potentially including unstable configurations
  • Higher values (10+): More conservative generation, with structures closer to the training data distribution
  • Default (None): Uses the model's built-in guidance, typically optimized for balanced exploration

Higher guidance factors may reduce structural diversity but increase the likelihood of generating stable, synthesizable materials.
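
One way to see the effect of the guidance factor is to prepare several small draft explorations that differ only in this parameter. The sketch below only builds the keyword arguments, using the parameter names from the table in 1.1. `guidance_sweep` and the chosen factors are illustrative assumptions, and the SDK call is left as a comment so the snippet runs without the SDK installed.

```python
def guidance_sweep(project_id, factors=(None, 2, 10), batch_size=16):
    """Build one set of create_mattergen keyword arguments per guidance factor.

    Hypothetical helper: the factor values are examples, not recommendations.
    """
    configs = []
    for factor in factors:
        label = "default" if factor is None else str(factor)
        configs.append({
            "project_id": project_id,
            "name": f"Guidance sweep ({label})",
            "batch_size": batch_size,
            "num_batches": 1,                      # keep each trial small
            "diffusion_guidance_factor": factor,   # None = model's built-in guidance
            "action": "DRAFT",
        })
    return configs


configs = guidance_sweep("your_project_id")

# With the SDK installed, each entry could be submitted as:
#   from atomict.simulation.mattergen import create_mattergen
#   explorations = [create_mattergen(**cfg) for cfg in configs]
```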

1.3 Response Format

mattergen_response.json
{
    "id": "exploration_id",
    "name": "Novel Oxide Generation",
    "description": "Generate novel oxide structures for photovoltaic applications",
    "batch_size": 16,
    "num_batches": 1,
    "diffusion_guidance_factor": null,
    "status": 0,
    "project": "your_project_id",
    "created_at": "2025-01-15T10:30:00Z",
    "updated_at": "2025-01-15T10:30:00Z"
}

Status codes: 0 = DRAFT, 1 = READY, 3 = COMPLETED.
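
Because the status field is numeric, it can help to map it to a readable name in client code. The following is a minimal sketch that covers only the three codes documented above. `ExplorationStatus` and `describe_status` are hypothetical names and not part of the SDK, and any other code is reported as unknown rather than guessed.

```python
import json
from enum import IntEnum


class ExplorationStatus(IntEnum):
    """The three status codes listed in the response format."""
    DRAFT = 0
    READY = 1
    COMPLETED = 3


def describe_status(response: dict) -> str:
    """Return the status name for a response, or UNKNOWN(<code>) if undocumented."""
    code = response["status"]
    try:
        return ExplorationStatus(code).name
    except ValueError:
        return f"UNKNOWN({code})"


response = json.loads('{"id": "exploration_id", "status": 0}')
print(describe_status(response))  # DRAFT
```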

2. Retrieving Explorations

Get details about existing MatterGen explorations.

mattergen.py
from atomict.simulation.mattergen import get_mattergen

exploration = get_mattergen("exploration_id")

print(f"Status: {exploration['status']}")
print(f"Requested structures: {exploration['batch_size'] * exploration['num_batches']}")
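
A common follow-up is to poll an exploration until it reports the COMPLETED status (3). The sketch below takes the fetch function as an argument, so `get_mattergen` can be passed in. `wait_for_completion`, the 30-second interval, and the one-hour timeout are assumptions for illustration, not SDK features.

```python
import time

COMPLETED = 3  # status code documented in the response format


def wait_for_completion(fetch, exploration_id, interval=30.0, timeout=3600.0,
                        sleep=time.sleep):
    """Call fetch(exploration_id) until its status is COMPLETED.

    fetch is any callable returning a dict with a "status" key,
    for example get_mattergen. Raises TimeoutError once timeout
    seconds have elapsed without completion.
    """
    deadline = time.monotonic() + timeout
    while True:
        exploration = fetch(exploration_id)
        if exploration["status"] == COMPLETED:
            return exploration
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{exploration_id} not completed after {timeout} s")
        sleep(interval)


# With the SDK installed:
#   from atomict.simulation.mattergen import get_mattergen
#   exploration = wait_for_completion(get_mattergen, "exploration_id")
```

Passing the fetch function in keeps the loop independent of the SDK, which also makes it easy to exercise with a stub.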

3. Associating Generated Structures

Link generated structures to MatterGen explorations for tracking and analysis.

mattergen_associate.py
from atomict.simulation.mattergen import associate_user_upload_with_mattergen

association = associate_user_upload_with_mattergen(
    user_upload_id="structure_id",
    exploration_id="exploration_id"
)
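
When several uploaded structures belong to the same exploration, the call above can be repeated in a loop. The following is a minimal sketch. `associate_many` is a hypothetical helper that takes the association function as an argument, and catching a generic `Exception` is an assumption because the SDK's error types are not described here.

```python
def associate_many(associate, upload_ids, exploration_id):
    """Associate each upload with one exploration.

    associate is a callable accepting user_upload_id and exploration_id
    keyword arguments, for example associate_user_upload_with_mattergen.
    Returns (results, failures), both keyed by upload ID.
    """
    results, failures = {}, {}
    for upload_id in upload_ids:
        try:
            results[upload_id] = associate(
                user_upload_id=upload_id,
                exploration_id=exploration_id,
            )
        except Exception as exc:  # assumption: the SDK's error types are not documented here
            failures[upload_id] = exc
    return results, failures


# With the SDK installed:
#   results, failures = associate_many(
#       associate_user_upload_with_mattergen,
#       ["structure_id_1", "structure_id_2"],
#       "exploration_id",
#   )
```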