Written by Zheng Ma, Fri May 23 2025
# DeepRTE

**Pre-trained Attention-based Neural Network for Radiative Transfer**

DeepRTE is a neural operator architecture designed to solve the Radiative Transfer Equation (RTE) in phase space. This repository provides code, configuration, and utilities for training, evaluating, and experimenting with DeepRTE models using provided RTE datasets.
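As background, the core building block of attention-based neural operators can be sketched in a few lines of NumPy. The shapes and the query/key/value roles below are illustrative assumptions only (interior phase-space points attending to boundary samples), not DeepRTE's actual architecture:

```python
import numpy as np

# Toy scaled dot-product attention (illustrative sketch, NOT DeepRTE's model).
# Queries: interior phase-space points; keys/values: boundary samples.
def attention(q, k, v):
    scores = q @ k.T / np.sqrt(q.shape[-1])          # similarity scores
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)               # softmax over boundary samples
    return w @ v                                     # weighted average of values

rng = np.random.default_rng(0)
queries = rng.normal(size=(5, 8))    # 5 interior points, feature dim 8
keys    = rng.normal(size=(32, 8))   # 32 boundary samples
values  = rng.normal(size=(32, 1))   # boundary data at those samples
out = attention(queries, keys, values)
print(out.shape)                     # (5, 1)
```

Each interior query aggregates boundary information with learned (here: random) weights; DeepRTE builds its solution operator from attention blocks of this general flavor.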
Any publication that discloses findings arising from use of this source code, the model parameters, or outputs produced by them should cite the DeepRTE: Pre-trained Attention-based Neural Network for Radiative Transfer paper (see Citing This Work below).
DeepRTE learns the solution operator

$$\mathcal{G}: \phi \longmapsto \psi,$$

which maps the in-flow boundary data $\phi$ to the solution $\psi$ of the following steady-state radiative transfer equation:

$$\mathbf{v} \cdot \nabla_{\mathbf{r}} \psi(\mathbf{r}, \mathbf{v}) + \sigma_t(\mathbf{r})\, \psi(\mathbf{r}, \mathbf{v}) = \sigma_s(\mathbf{r}) \int_{V} k(\mathbf{v}, \mathbf{v}')\, \psi(\mathbf{r}, \mathbf{v}')\, \mathrm{d}\mathbf{v}', \qquad (\mathbf{r}, \mathbf{v}) \in \Omega \times V,$$

with the in-flow boundary condition:

$$\psi(\mathbf{r}, \mathbf{v}) = \phi(\mathbf{r}, \mathbf{v}), \qquad \mathbf{r} \in \partial\Omega, \ \mathbf{v} \cdot \mathbf{n}(\mathbf{r}) < 0,$$

where $\sigma_t$ and $\sigma_s$ are the total and scattering cross sections, $k$ is the scattering kernel, and $\mathbf{n}(\mathbf{r})$ is the outward normal at the boundary.
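For intuition about this equation (and the kind of conventional solver used to generate the datasets referenced below), here is a minimal 1D slab discrete-ordinates solver with source iteration. The grid sizes, cross sections, isotropic kernel, and boundary data are illustrative choices, not the settings used in the paper or the dataset repository:

```python
import numpy as np

# Toy 1D slab RTE:  mu * dpsi/dx + sigma_t * psi = (sigma_s / 2) * int psi dmu
# Illustrative sketch only; NOT the DeepRTE data-generation pipeline.
nx, nmu = 200, 16
dx = 1.0 / (nx - 1)
mu, w = np.polynomial.legendre.leggauss(nmu)   # angles and quadrature weights on [-1, 1]
sigma_t, sigma_s = 1.0, 0.5

psi = np.zeros((nmu, nx))
for _ in range(200):                           # source iteration
    phi = 0.5 * w @ psi                        # scalar flux (isotropic kernel k = 1/2)
    src = sigma_s * phi
    new = np.zeros_like(psi)
    for j, m in enumerate(mu):
        if m > 0:                              # upwind sweep left -> right
            new[j, 0] = 1.0                    # in-flow at x = 0
            for i in range(1, nx):
                new[j, i] = (m / dx * new[j, i - 1] + src[i]) / (m / dx + sigma_t)
        else:                                  # upwind sweep right -> left
            new[j, -1] = 0.0                   # in-flow at x = 1
            for i in range(nx - 2, -1, -1):
                new[j, i] = (-m / dx * new[j, i + 1] + src[i]) / (-m / dx + sigma_t)
    if np.max(np.abs(new - psi)) < 1e-10:
        psi = new
        break
    psi = new

phi = 0.5 * w @ psi                            # converged scalar flux
```

Source iteration converges geometrically here because the scattering ratio $\sigma_s/\sigma_t = 0.5 < 1$; solvers like this sweep the angular grid repeatedly, which is exactly the cost that a learned solution operator amortizes.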
## Installation

### 1. Clone the Repository

```shell
git clone https://github.com/mazhengcn/deeprte.git --branch v1.0.1
cd deeprte
```

or download the source directly from the Release page.
### 2. Install Dependencies

This project uses the JAX AI Stack (JAX, Flax, Optax, Orbax, etc.). The recommended way to install dependencies is with uv:

```shell
uv sync
```

This installs all necessary dependencies, including the project itself. For NVIDIA GPU support, use:

```shell
uv sync --extra cuda
```

For development, run:

```shell
uv sync --dev --all-extras
```

to install all development dependencies.
### 3. Container

A pre-built runtime container is available. To pull the latest image, run:

```shell
docker pull ghcr.io/mazhengcn/deeprte
```

Start the container with:

```shell
docker run -it --gpus=all --shm-size=1g ghcr.io/mazhengcn/deeprte /bin/bash
```

Alternatively, if you prefer to build the image yourself, use the provided Dockerfile:

```shell
docker build -t deeprte .
```

### Dev Container
For development, a devcontainer is provided to ensure a reproducible environment. Simply open the repository root in VSCode, and the container will be built automatically with all required dependencies, development tools, and data volume mounts. Python and its dependencies are managed by uv.
The devcontainer configuration files are located in the .devcontainer/ directory and can be customized as needed.
## Datasets and Pretrained Models

### Download Datasets
Datasets for training and testing DeepRTE are generated using conventional numerical methods in MATLAB and Python. The source code is available in a separate repository: rte-dataset. For more details, refer to that repository.
Inference (test) and pretraining datasets are hosted on Hugging Face: https://huggingface.co/datasets/mazhengcn/rte-dataset. Download them to `${DATA_DIR}` with (ensure `huggingface-cli` is installed; if you followed the setup above, it is already included):

```shell
huggingface-cli download mazhengcn/rte-dataset \
  --exclude=interim/* \
  --repo-type=dataset \
  --local-dir=${DATA_DIR}
```

The resulting folder structure should be (for inference, only datasets under `raw/test` are needed):
### Download Pretrained Models

Pretrained models can be downloaded to `${MODELS_DIR}` from Hugging Face:

```shell
huggingface-cli download mazhengcn/deeprte \
  --repo-type=model \
  --local-dir=${MODELS_DIR}
```

The folder structure will be:
A convenient shell script `scripts/download_dataset_and_models.sh` is provided to download datasets to `./data` and pretrained models to `./models`:

```shell
uv run ./scripts/download_dataset_and_models.sh
```

## Run DeepRTE
To run DeepRTE inference:

```shell
uv run run_deeprte.py --model_dir=${MODEL_DIR} --data_path=${DATA_PATH} --output_dir=${OUTPUT_DIR}
```

where `${MODEL_DIR}` is the pretrained model directory, `${DATA_PATH}` is the path to the inference data, and `${OUTPUT_DIR}` is the directory where results are stored.
For example:
DATA_PATH=${1:-"./data/raw/test/sin-rv-g0.5-amplitude5-wavenumber10/sin-rv-g0.5-amplitude5-wavenumber10.mat"}
MODEL_DIR=${2:-"./models/v1.0.1/g0.5"}
OUTPUT_DIR=${3:-"./outputs"}
TIMESTAMP="$(date --iso-8601="seconds")"
python run_deeprte.py \
--model_dir="${MODEL_DIR}" \
--data_path="${DATA_PATH}" \
--output_dir="${OUTPUT_DIR}/${TIMESTAMP%+*}"A shell script ./scripts/run_deeprte.sh containing above contents is also provided for convenience, you can modify it and run:
uv run ./scripts/run_deeprte.shYou can also run using the runtime container:
```shell
docker run -it \
  --volume <DATA_PATH>:/deeprte/data/data_path \
  --volume <MODEL_DIR>:/deeprte/models \
  --volume <OUTPUT_DIR>:/deeprte/output \
  --gpus all \
  ghcr.io/mazhengcn/deeprte \
  python run_deeprte.py \
  --model_dir=/deeprte/models \
  --data_path=/deeprte/data/data_path \
  --output_dir=/deeprte/output
```

## Train DeepRTE
To train DeepRTE from scratch, run:

```shell
uv run run_train.py --config=${CONFIG_PATH} --workdir=${CKPT_DIR}
```

After training, checkpoints are saved under `${CKPT_DIR}`. To generate a parameter-only checkpoint for inference, run:

```shell
uv run generate_param_only_checkpoint.py --train_state_dir=${TRAIN_STATE_DIR} --checkpoint_dir=${CKPT_DIR}
```

You can also modify and run the convenience scripts:

```shell
uv run ./scripts/run_train.sh
uv run ./scripts/generate_param_only_checkpoint.sh
```

## Citing This Work
Any publication that discloses findings arising from use of this source code, the model parameters, or outputs produced by them should cite:
```bibtex
@article{ZHU2026118556,
  title = {DeepRTE: Pre-trained attention-based neural network for radiative transfer},
  journal = {Computer Methods in Applied Mechanics and Engineering},
  volume = {449},
  pages = {118556},
  year = {2026},
  issn = {0045-7825},
  doi = {10.1016/j.cma.2025.118556},
  url = {https://www.sciencedirect.com/science/article/pii/S004578252500828X},
  author = {Yekun Zhu and Min Tang and Zheng Ma},
  keywords = {Neural operator, Radiative transfer equation, Attention, Pre-training},
}
```

## Acknowledgements
DeepRTE's release was made possible by the contributions of the following people: Zheng Ma, Yekun Zhu, Min Tang and Jingyi Fu.
DeepRTE uses the following separate libraries and packages:
We thank all their contributors and maintainers!
## Get in Touch

If you have any questions not covered in this overview, please open an issue or contact Zheng Ma at zhengma@sjtu.edu.cn.