`KernelAbstractions` extension for matrix-vector product of `IntegralOperators`

I was looking into ways to accelerate dense matrix-vector products (the naive $\mathcal{O}(N^2)$ approach) for matrices defined through a kernel, and I think it would be very useful to support a matrix-free implementation that can execute the double for loop efficiently across different hardware backends, particularly GPUs.

Libraries such as [KeOps](https://www.kernel-operations.io/keops/index.html) demonstrate that this approach can be highly effective. Some preliminary benchmarks on my Apple M3 GPU using Metal suggest that, for simple kernels such as Laplace, a GPU-accelerated matrix-free implementation can be competitive with hierarchical matrices or FMM up to surprisingly large problem sizes, extending to hundreds of thousands of DOFs.

After some poking around, it seems that implementing this through [KernelAbstractions](https://github.com/JuliaGPU/KernelAbstractions.jl) should not be too difficult. 
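To make the idea concrete, here is a minimal sketch of what such a kernel might look like. It is not the package's implementation: the names `laplace_matvec!` and `laplace_matvec` are hypothetical, points are assumed to be stored as `3 × N` arrays, the kernel is the 3D Laplace Green's function $1/(4\pi\|x-y\|)$, and quadrature weights are assumed to be already folded into the input vector. Each work-item handles one target point and loops over all sources, so the matrix is never assembled.

```julia
using KernelAbstractions

# Hypothetical kernel: one work-item per target (row), inner loop over sources.
@kernel function laplace_matvec!(y, @Const(x), @Const(targets), @Const(sources))
    i = @index(Global, Linear)
    T = eltype(y)
    acc = zero(T)
    for j in 1:size(sources, 2)
        # squared distance between target i and source j
        d2 = zero(T)
        for k in 1:3
            d = targets[k, i] - sources[k, j]
            d2 += d * d
        end
        # 3D Laplace kernel; coincident points contribute zero (assumption)
        acc += d2 > 0 ? x[j] / (4 * T(π) * sqrt(d2)) : zero(T)
    end
    y[i] = acc
end

# Hypothetical wrapper: picks the backend from where `x` lives.
function laplace_matvec(x, targets, sources)
    backend = KernelAbstractions.get_backend(x)
    y = similar(x, size(targets, 2))
    laplace_matvec!(backend)(y, x, targets, sources; ndrange = length(y))
    KernelAbstractions.synchronize(backend)
    return y
end

# CPU usage with plain arrays
targets = rand(Float32, 3, 1000)
sources = rand(Float32, 3, 800)
x = rand(Float32, 800)
y = laplace_matvec(x, targets, sources)
```

With plain `Array` inputs this runs on the CPU backend. Passing device arrays instead (for example `MtlArray`s from Metal.jl, which I am assuming here and which would require `Float32`) should dispatch the same kernel to the GPU without changes. A tuned version would likely tile the sources through local memory, but that is beyond this sketch.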

