Portrait of Zheng MaHome

Introduction

AI for fusion confinement · 4 min read

Introduction

Installation | Datasets and Pretrained Models | Run DeepRTE | Train DeepRTE | Cite | Acknowledgements | Contact

DeepRTE is a neural operator architecture designed to solve the Radiative Transfer Equation (RTE) in phase space. This repository provides code, configuration, and utilities for training, evaluating, and experimenting with DeepRTE models using provided RTE datasets.

Any publication that discloses findings arising from using this source code, the model parameters or outputs produced by those should cite the DeepRTE: Pre-trained Attention-based Neural Network for Radiative Tranfer paper.

DeepRTE learns the solution operator:

A:(I;μt,μs,p)I,\mathcal{A}: (I_-; \mu_t, \mu_s, p) \to I,

for the following steady-state radiative transfer equation:

ΩI(r,Ω)+μt(r)I(r,Ω)=μs(r)Sd1Sd1p(Ω,Ω)I(r,Ω)dΩ,xDRd,  ΩSd1, \Omega \cdot \nabla I(r,\Omega)+\mu_t(r)I(r, \Omega) = \frac{\mu_s(r)}{S^{d-1}}\int_{\mathbb{S}^{d-1}} p(\Omega,\Omega^*)I(r,\Omega^*)\,\mathrm{d}\Omega^*, \quad x\in D\subset\mathbb{R}^d,\; \Omega\in\mathbb{S}^{d-1},

with the in-flow boundary condition:

I(r,Ω)=I(r,Ω),on Γ={(r,Ω)nrΩ<0}. I(\mathbf{r},\mathbf{\Omega}) = I_{-}(\mathbf{r},\mathbf{\Omega}), \quad\text{on } \Gamma_{-} = \big\{(\mathbf{r},\mathbf{\Omega}) \mid \mathbf{n}_{\mathbf{r}}\cdot\mathbf{\Omega}<0 \big\}.

Installation

1. Clone the Repository

git clone https://github.com/mazhengcn/deeprte.git --branch v1.0.1
cd deeprte

or download directly from the Release

2. Install Dependencies

This project uses JAX AI Stack (JAX, Flax, Optax, Orbax, etc.). The recommended way to install dependencies is with uv:

uv sync

This installs all necessary dependencies, including the project itself. For NVIDIA GPU support, use:

uv sync --extra cuda

For development, run:

uv sync --dev --all-extras

to install all development dependencies.

3. Container

A pre-built runtime container is available. To pull the latest image, run:

docker pull ghcr.io/mazhengcn/deeprte

Start the container with:

docker run -it --gpus=all --shm-size=1g ghcr.io/mazhengcn/deeprte /bin/bash

Alternatively, if you prefer to build the image yourself, use the provided Dockerfile:

docker build -t deeprte .

Dev Container

For development, a devcontainer is provided to ensure a reproducible environment. Simply open the repository root in VSCode, and the container will be built automatically with all required dependencies, development tools, and data volume mounts. Python and its dependencies are managed by uv.

The devcontainer configuration files are located in the .devcontainer/ directory and can be customized as needed.

Datasets and Pretrained Models

Download Datasets

Datasets for training and testing DeepRTE are generated using conventional numerical methods in MATLAB and Python. The source code is available in a separate repository: rte-dataset. For more details, refer to that repository.

Inference (test) and pretraining datasets are hosted on Huggingface: https://huggingface.co/datasets/mazhengcn/rte-dataset. Download datasets to DATA_DIR with (ensure huggingface-cli is installed; if you followed the setup above, it is already included):

huggingface-cli download mazhengcn/rte-dataset \
    --exclude=interim/* \
    --repo-type=dataset \
    --local-dir=${DATA_DIR}

The resulting folder structure should be (for inference, only datasets under raw/test are needed):

tfds # Processed TFDS dataset for pretraining.
test # Raw MATLAB dataset for test/inference.
train # Raw MATLAB dataset for pretraining using grain.
README.md

Download Pretrained Models

Pretrained models can be downloaded to MODEL_DIR from Huggingface:

huggingface-cli download mazhengcn/deeprte \
    --repo-type=model \
    --local-dir=${MODELS_DIR}

The folder structure will be:

README.md
v0 # Pre-release version (deprecated).
v1 # Pre-release version (deprecated).
config.json
params
config.json
params
config.json
params

A convenient shell script scripts/download_dataset_and_models.sh is provided to download datasets to ./data and pretrained models to ./models:

uv run ./scripts/download_dataset_and_models.sh

Run DeepRTE

To run DeepRTE inference:

uv run run_deeprte.py --model_dir=${MODEL_DIR} --data_path=${DATA_PATH} --output_dir=${OUTPUT_DIR}

where ${MODEL_DIR} is the pretrained model directory, ${DATA_PATH} is the data path for inference, and ${OUTPUT_DIR} is the directory to store results.

For example:

DATA_PATH=${1:-"./data/raw/test/sin-rv-g0.5-amplitude5-wavenumber10/sin-rv-g0.5-amplitude5-wavenumber10.mat"}
MODEL_DIR=${2:-"./models/v1.0.1/g0.5"}
OUTPUT_DIR=${3:-"./outputs"}

TIMESTAMP="$(date --iso-8601="seconds")"

python run_deeprte.py \
  --model_dir="${MODEL_DIR}" \
  --data_path="${DATA_PATH}" \
  --output_dir="${OUTPUT_DIR}/${TIMESTAMP%+*}"

A shell script ./scripts/run_deeprte.sh containing above contents is also provided for convenience, you can modify it and run:

uv run ./scripts/run_deeprte.sh

You can also run using the runtime container:

docker run -it \
    --volume <DATA_PATH>:/deeprte/data/data_path \
    --volume <MODEL_DIR>:/deeprte/models \
    --volume <OUTPUT_DIR>:/deeprte/output \
    --gpus all \
    ghcr.io/mazhengcn/deeprte \
    python run_deeprte.py \
    --model_dir=/deeprte/models \
    --data_path=/deeprte/data \
    --output_dir=/deeprte/deeprte_output

Train DeepRTE

To train DeepRTE from scratch, run:

uv run run_train.py --config=${CONFIG_PATH} --workdir=${CKPT_DIR}

After training, checkpoints are saved under ${CKPT_DIR}. To generate an inference checkpoint, run:

uv run generate_param_only_checkpoint.py --train_state_dir=${TRAIN_STATE_DIR} --checkpoint_dir=${CKPT_DIR}

You can also modify and run the convenient scripts:

uv run ./scripts/run_train.sh
uv run ./scripts/generate_param_only_checkpoint.sh

Citing This Work

Any publication that discloses findings arising from using this source code, the model parameters or outputs produced by those should cite:

@misc{zhu2025,
      title={DeepRTE: Pre-trained Attention-based Neural Network for Radiative Transfer},
      author={Yekun Zhu and Min Tang and Zheng Ma},
      year={2025},
      eprint={2505.23190},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.23190},
}

Acknowledgements

DeepRTE's release was made possible by the contributions of the following people: Zheng Ma, Yekun Zhu, Min Tang and Jingyi Fu.

DeepRTE uses the following separate libraries and packages:

We thank all their contributors and maintainers!

Get in Touch

If you have any questions not covered in this overview, please open an issue or contact the Zheng Ma at zhengma@sjtu.edu.cn.

Last updated on

On this page