Mixed CPU and GPU computations
If GPU memory is a limiting factor for your computation, it may be preferable to carry out particle operations on the CPU rather than on the GPU. This involves four steps:
- At the top of the script. The JustPIC backend must be set to CPU, unlike the other packages used in the script (e.g. ParallelStencil), which keep their GPU backend:
const backend = JustPIC.CPUBackend
- At memory allocation stage. Copies of the relevant CPU arrays must be allocated in GPU memory. For example, phase ratios on mesh vertices:
phv_GPU = @zeros(nx+1, ny+1, nz+1, celldims=(N_phases))
where N_phases is the number of different material phases and @zeros() allocates on the GPU.
Similarly, CPU copies of the relevant GPU arrays must be allocated, for example the velocity components:
V_CPU = (
x = zeros(nx+1, ny+2, nz+2),
y = zeros(nx+2, ny+1, nz+2),
z = zeros(nx+2, ny+2, nz+1),
)
where zeros() allocates in CPU memory.
- At each time step. The particles are stored in CPU memory, so some information must be transferred from the CPU to the GPU. For example, the phase ratios are transferred with:
phv_GPU.data .= CuArray(phase_ratios.vertex).data
Note: CuArray is written explicitly here, which ties this line to CUDA. A more generic, backend-agnostic GPU array type would be preferable, if such a thing exists.
- At each time step. Once the velocity computations are finalised on the GPU, the velocity components need to be transferred to the CPU:
V_CPU.x .= TA(backend)(V.x)
V_CPU.y .= TA(backend)(V.y)
V_CPU.z .= TA(backend)(V.z)
Advection can then be applied by calling the advection!() function:
advection!(particles, RungeKutta2(), values(V_CPU), (grid_vx, grid_vy, grid_vz), Δt)
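The order of operations within a time step can be summarised with a small, package-free sketch. It is only an illustration of the data flow, not JustPIC code. Plain Arrays stand in for both memory spaces so that it runs without a GPU. The helpers to_gpu, to_cpu and solve_velocity! are hypothetical placeholders, and the particle update is a simple forward-Euler step with a uniform velocity rather than JustPIC's advection!.

```julia
# Stand-ins so the sketch runs without a GPU (assumptions, not JustPIC API):
# with CUDA.jl, `to_gpu` would be `CuArray` and `to_cpu` would be `Array`.
to_gpu(A) = copy(A)
to_cpu(A) = copy(A)

nx, ny, N_phases = 4, 4, 2
Δt, nt = 0.1, 3

# Allocation stage: every buffer is created once, outside the time loop.
phv_CPU = fill(1 / N_phases, nx + 1, ny + 1, N_phases)  # plays the role of phase_ratios.vertex
phv_GPU = to_gpu(zero(phv_CPU))                          # plays the role of the @zeros(...) array
V_GPU = (x = to_gpu(zeros(nx + 1, ny + 2)), y = to_gpu(zeros(nx + 2, ny + 1)))
V_CPU = (x = zeros(nx + 1, ny + 2), y = zeros(nx + 2, ny + 1))

# Toy particle coordinates, kept in CPU memory.
px = [0.25, 0.50, 0.75]
py = [0.50, 0.50, 0.50]

# Hypothetical placeholder for the GPU solver: uniform flow in x.
# `phv` is unused here; it only mimics the solver's dependency on phase ratios.
function solve_velocity!(V, phv)
    fill!(V.x, 1.0)
    fill!(V.y, 0.0)
    return nothing
end

for it in 1:nt
    phv_GPU .= to_gpu(phv_CPU)        # CPU -> GPU: phase ratios
    solve_velocity!(V_GPU, phv_GPU)   # velocity computed "on the GPU"
    V_CPU.x .= to_cpu(V_GPU.x)        # GPU -> CPU: velocity components
    V_CPU.y .= to_cpu(V_GPU.y)
    # Forward-Euler update with the uniform toy velocity (NOT advection!).
    px .+= V_CPU.x[1] * Δt
    py .+= V_CPU.y[1] * Δt
end
```

All buffers are allocated once before the loop, so each time step only performs in-place copies between the two memory spaces.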