🚀 morphdG - v0.2.0
morphDG is a GPU-accelerated Discontinuous Galerkin (DG) solver for 2D advection-diffusion equations using polygonal elements. Written in C++ using Kokkos and offering a zero-copy Python API via Pybind11, it is designed for rapid prototyping without sacrificing C++ performance.
✨ Core Features
- Performance Portability: Built on
Kokkos, the same code can run on multi-core CPUs (OpenMP) or GPUs (CUDA-nvidia/HIP-amd). - Polygonal Mesh Agglomeration: Integration with METIS graph partitioning allows us to merge standard triangular meshes into arbitrary polygonal elements. It can dynamically apply spatial “weights” to force local geometric refinement.
- $p$-Adaptivity: Elements can have varying polynomial degrees ($p=1, 2,$ or $3$) within the mesh.
- Anisotropic Diffusion: It supports full diffusion ($A_{xx}, A_{yy}, A_{xy}, A_{yx}$) and advection velocity field ($v_{x}, v_{y}$) functions.
- JIT String Compilation: Mathematical functions for boundary conditions, initial conditions, and PDE parameters can be passed as strings (e.g.,
"1.0 + 2.0*Kokkos::exp(x)"). These are computed on the C++ Kokkos backend for maximum speed.
🐳 Environment Setup (multi-core CPUs (OpenMP))
To ensure a reproducible build environment, morphDG provides a Docker Compose setup based on Rocky Linux 10. It comes pre-configured with Kokkos, METIS, and all necessary Python dependencies.
1. compose.yaml
We have a compose.yaml file in the repository root with the following content:
services:
morphdg:
build:
context: .
dockerfile: Dockerfile
image: morphdg
container_name: morphdg_dev
# The ':z' flag handles Podman SELinux securely and is safely ignored by standard Docker.
volumes:
- ./workspace:/workspace:z
working_dir: /workspace
stdin_open: true
tty: true
command: /bin/bash
2. Launch the Development Environment
Whether you are on Linux, Windows, or macOS, run the following command from the project root:
Using Docker:
docker compose run --rm morphdg
Using Podman:
podman-compose run --rm morphdg
3. Running Your Code (The Working Directory)
When you launch the container using the command above, you are automatically dropped into an interactive bash shell inside the container.
By default, the compose.yaml file mounts the folder named workspace into the docker/podman container:
How to run a morphdG script:
- On your host machine, save your Python scripts (e.g.,
test.py) and your mesh files inside the local./workspacefolder. - Launch the container (
docker compose run --rm morphdg) from the project root directory. - From the container’s bash prompt, simply execute your script:
python test.py
🚀 Quick Start
This is a script demonstrating a standard workflow—from mesh loading and METIS agglomeration to solving an evolutionary advection-diffusion problem with mixed-$p$ elements.
# _ _ ____
# _ __ ___ ___ _ __ _ __ | |__ __| |/ ___|
# | '_ ` _ \ / _ \| '__| '_ \| '_ \ / _` | | _
# | | | | | | (_) | | | |_) | | | | (_| | |_| |
# |_| |_| |_|\___/|_| | .__/|_| |_|\__,_|\____|
# |_|
import numpy as np
import morphdg as mdg
import os
import time
# Create an output directory for the solution figures
os.makedirs("frames", exist_ok=True)
# <--- I. MESH LOADING & PREPARATION --->
mesh = mdg.Mesh()
mesh.load("mesh.dat")
# Zero-copy numpy arrays mapping directly to the C++ memory
nodes = mesh.nodes
tri = mesh.triangles
n_tri = len(tri)
# Calculate the centroids of every triangle.
# We will use these coordinates to assign geometric weights for agglomeration.
cx = np.mean(nodes[tri, 0], axis=1)
cy = np.mean(nodes[tri, 1], axis=1)
# <--- II. POLYGONAL AGGLOMERATION (METIS) --->
# We define an array of 'weights'. Higher weights force METIS to group fewer
# triangles into an element, resulting in a higher resolution mesh
# in those specific regions.
weights = np.ones(n_tri, dtype=np.int32) # Baseline weight (=1) for the entire domain
# --- Boundary Layer Refinement ---
weights[cy > 0.75] = 5 # Refinement in the upper 25% of the domain
weights[cy > 0.9] = 10 # Refinement near the top wall
weights[cx < 0.3] = 10 # Refinement in the left 30% of the domain
weights[cx < 0.1] = 15 # Refinement near the left wall
# --- Domain Refinement ---
# Refine a diagonal band across the line y = x
weights[ np.abs(cy - cx) < 0.3 ] = 20
# Merge the triangles into 512 polygonal elements
# based on the spatial weights defined above.
mesh.agglomerate(num_elem=512, weights=weights)
# <--- III. SOLVER INITIALIZATION & MIXED p-ADAPTIVITY --->
solver = mdg.DGSolver(mesh)
# Randomly assign a polynomial degree (p=2 or p=3) to every polygonal element.
# The C++ core automatically elevates quadrature rules at mixed-p interfaces
# to ensure the mass/stiffness matrix integration remains mathematically exact.
np.random.seed(111)
p_array = np.random.choice([3], size=mesh.num_elements)
solver.polynomial_orders = p_array
# For uniform order, you would use: solver.polynomial_orders = 3
# <--- IV. BOUNDARY CONDITIONS & PDE DEFINITION --->
# We define conditions using 'loc' (Python lambdas to locate/filter the boundary faces)
# and 'val' (JIT Kokkos strings that execute at C++/Kokkos speed).
# --- Dirichlet Boundaries ---
# Apply a sinusoidal pulse: 80 * |sin(6*pi*y)| on the left wall (x < 0.01)
solver.dirichlet_bc(loc=lambda x, y: x < 0.01, val="80.0*Kokkos::abs(Kokkos::sin(6*Kokkos::numbers::pi*y))")
# --- Neumann Boundaries ---
solver.neumann_bc(loc=lambda x, y: x > 0.99, val=0.0) # Right Wall
solver.neumann_bc(loc=lambda x, y: y > 0.99, val=0.0) # Top Wall
solver.neumann_bc(loc=lambda x, y: y < 0.01, val=0.0) # Bottom Wall
# --- PDE Parameters ---
solver.source(0.0) # Zero volumetric heat/mass generation
solver.set_parameters(
vx="1.0 + 2.0*Kokkos::sin(x)", vy=1.0, # Advection velocity
Axx=0.05, Ayy=0.05, # Diffusion tensor
Axy=0.00, Ayx=0.00,
r="-2.0 + Kokkos::sin(y)",
alpha=5.0 # SIPG penalty parameter for stability
)
# <--- V. ASSEMBLY & INITIAL CONDITIONS --->
# assemble() builds the CSR graph.
# It MUST be called prior to applying the initial condition.
tic = time.perf_counter()
solver.assemble()
toc = time.perf_counter()
print(f"Assembly Time: {toc - tic:.4f} seconds")
# del solver, mesh
# exit(0)
# Apply a Gaussian blob centered at (x=0.2, y=0.2).
# The solver performs a projection (via Conjugate Gradient) to
# map the analytical function onto the discontinuous polynomial basis space.
solver.initial_condition("80.0*Kokkos::exp(-100.0 * ((x-0.2)*(x-0.2) + (y-0.2)*(y-0.2)))")
# Note: A slower native Python alternative looks like this:
# solver.initial_condition(lambda x,y: 80.0*np.exp(-100.0 * ((x-0.2)**2 + (y-0.5)**2)))
# <--- VI. TIME STEPPING --->
dt = 0.001
num_steps = 1000
math_time = 0.0
for step in range(num_steps):
step_tic = time.perf_counter()
solver.step(dt, method='implicit_euler')
step_toc = time.perf_counter()
math_time += (step_toc - step_tic)
if step % 100 == 0:
current_state = solver.get_state()
print(f" ◈ Time Step {step:03d} completed.")
solver.plot_solution(current_state, f"frames/frame_{step:03d}.png", vlim=(0.0, 100.0))
avg_math_time = math_time / num_steps
print(f"Total Math Time: {math_time:.4f} sec")
print(f"Average Math Time: {avg_math_time:.6f} sec/step")
# <--- VII. MEMORY CLEANUP --->
# Explicitly delete the Python wrapper objects. This signals the C++ backend
# to immediately free the allocated CPU/GPU arrays, preventing memory leaks.
del solver, mesh
📚 API Reference & Core Concepts
1. Mesh Handling & Agglomeration
morphDG relies on a base simplicial mesh which it merges into arbitrary polygons using METIS.
mesh.agglomerate(num_elem, weights): Merges the triangles intonum_elempolygonal elements. Passing aweightsinteger array forces METIS to grouy more triangles into specific elements.
2. Assembly & Initial Conditions (Crucial Order of Operations)
solver.assemble(): This builds the sparse matrix graph and populates the Volume, Face, and Boundary integrals.solver.initial_condition(...): Performs a projection of the mathematical function onto the DG polynomial basis using Conjugate Gradient. This must be called AFTERassemble().
3. Time Stepping
solver.step(dt, method): Advances the physics bydt.method='implicit_euler': Uses a highly stable implicit step (solved via Kokkos BiCGStab).method='rk4': Uses standard explicit Runge-Kutta 4.
⚙️ The Core Library
- Zero-Copy Memory: The Pybind11 wrapper uses a custom
py_c_arraymapping. When we query arrays, Python looks directly at the C++ memory pointers without copying data. - Exact Integration: In mixed-$p$ setups, evaluating the numerical fluxes at the faces requires care. The
AssembleFaceKernelautomatically detects the highest global polynomial order and elevates the quadrature rule to ensure the Mass and Stiffness matrices remain exact. - Linear Solvers: The $L^2$ projection uses a Conjugate Gradient (
solve_cg) solver because the DG mass matrix is block-diagonal. For implicit time-stepping, the asymmetric advection-diffusion operator is solved using a stabilized Bi-Conjugate Gradient solver (solve_bicgstab).