Batch & CUDA
Batch API (CPU and Generic Arrays)
Use solve_batch for input arrays shaped:
(spatial_dim_1, spatial_dim_2, ..., spatial_dim_k, batch)
The last axis is treated as the batch index, and TV operators are applied only across the spatial axes.
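The layout convention can be illustrated with Base Julia alone. The following is a minimal sketch, not the package's implementation: the image sizes and the names images, dx, and dy are assumptions made for illustration, and diff stands in for a spatial difference operator only to show that nothing is computed along the batch axis.

```julia
# Hypothetical data: three 2-D images of identical size.
images = [rand(Float32, 64, 48) for _ in 1:3]

# Stack along a new trailing axis so the batch index comes last.
f_batch = cat(images...; dims = 3)

spatial_size = size(f_batch)[1:end-1]        # sizes of the spatial axes
batch_size = size(f_batch, ndims(f_batch))   # length of the batch axis

# Illustration only: finite differences along the spatial axes (1 and 2),
# never along the batch axis (3). This is not the package's TV operator.
dx = diff(f_batch; dims = 1)
dy = diff(f_batch; dims = 2)
```

Because no difference is taken along the last axis, each batch element is processed independently of the others.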
Example:
u_batch, stats = TotalVariationImageFiltering.solve_batch(
    f_batch,
    TotalVariationImageFiltering.ROFConfig();
    lambda = 0.1,
    tv_mode = TotalVariationImageFiltering.IsotropicTV(),
)
For PDHG batch solves, you can additionally pass:
- constraint = TotalVariationImageFiltering.NonnegativeConstraint(), or
- constraint = TotalVariationImageFiltering.BoxConstraint(lower, upper).
Batch state reuse:
- pass state = [ROFState(slice1), ROFState(slice2), ...] for ROF;
- pass state = [PDHGState(slice1), PDHGState(slice2), ...] for PDHG.
State vector length must match batch size.
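One way to keep the state vector in step with the batch size is to build it directly from the batch slices. The sketch below is an assumption-laden illustration: build_batch_state is a hypothetical helper, not part of the package, and copy is used as a stand-in constructor so the code runs without the package. With the package loaded, one would presumably pass TotalVariationImageFiltering.ROFState or PDHGState instead; whether those constructors accept array views is not confirmed here.

```julia
# Hypothetical helper (not part of the package): build one state per batch
# slice using a caller-supplied constructor, then verify the length.
function build_batch_state(make_state, f_batch::AbstractArray)
    batch_axis = ndims(f_batch)   # the last axis is the batch axis
    states = [make_state(slice) for slice in eachslice(f_batch; dims = batch_axis)]
    length(states) == size(f_batch, batch_axis) ||
        throw(DimensionMismatch("state vector length must match batch size"))
    return states
end

# Stand-in constructor: copy turns each slice view into its own array.
f_batch = rand(Float32, 32, 32, 5)
states = build_batch_state(copy, f_batch)
```

The resulting vector would then be passed as the state keyword described above.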
CUDA Extension
The extension module TotalVariationImageFilteringCUDAExt is loaded automatically when:
- CUDA.jl is installed and loaded,
- a functional CUDA runtime/device is available.
Example:
using CUDA
using TotalVariationImageFiltering
f_gpu = CUDA.rand(Float32, 256, 256)
problem_gpu = TotalVariationImageFiltering.TVProblem(f_gpu; lambda = 0.15f0)
u_gpu, stats_gpu = TotalVariationImageFiltering.solve(problem_gpu, TotalVariationImageFiltering.ROFConfig())
CUDA Coverage
Current behavior based on extension code/tests:
- CUDA kernels are provided for gradient/divergence/projection primitives.
- Single-image ROF and PDHG on CuArray are supported.
- Batched CUDA solve is specialized for ROFConfig and PDHGConfig.
- Batched CUDA path currently requires:
  - L2Fidelity (ROF), or L2Fidelity/PoissonFidelity (PDHG),
  - Neumann boundary.
ROF paths currently support only constraint = NoConstraint().
If CUDA is unavailable, CPU paths continue to work.
Numerical Equivalence Checks
Repository tests compare CPU and CUDA outputs with tolerances for:
- single-image ROF solve;
- batched ROF solve.
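A tolerance-based comparison of this kind might look like the sketch below. It uses stand-in data so that it runs without a GPU; the tolerance values and variable names are assumptions, not the repository's actual test settings. In a real check, u_cpu would come from a CPU solve and u_gpu_host from Array(u_gpu), which copies a CuArray back to host memory.

```julia
# Stand-in data simulating a CPU result and a GPU result copied to the host.
u_cpu = Float32[0.10 0.20; 0.30 0.40]
u_gpu_host = u_cpu .+ 1.0f-6   # simulated floating-point drift

# Largest elementwise deviation, useful to report when a check fails.
max_abs_diff = maximum(abs.(u_cpu .- u_gpu_host))

# Hypothetical tolerances; every element must agree within atol or rtol.
outputs_match = all(isapprox.(u_cpu, u_gpu_host; rtol = 1.0f-4, atol = 1.0f-5))
```

Exact equality is generally not expected between CPU and GPU results, because the order of floating-point operations can differ, which is why such tests compare within tolerances.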