as @catweasel mentioned already, one of the key challenges is the depth sort of the splats, which potentially has to be done on millions of splats each frame. efficient GPU sorters are available in CUDA or for DX12/Vulkan, but are hardly possible in DX11 (i did some research into this while @texone and @tonfilm worked on the FUSE implementation of 3DGS).
however, if your constraints are more relaxed you could theoretically swap the GPU sort for a CPU sort that runs asynchronously and updates the sort indices only every few frames. this is the strategy i’ve seen in several web-based implementations of 3DGS, and it is fine as long as you don’t have very fast camera movements.
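to make the idea concrete, here is a minimal sketch in python (not vvvv code, and not how any particular web viewer does it). the class name `AsyncDepthSorter`, the `resort_every` parameter and the back-to-front order are all assumptions for illustration: a worker thread sorts the splat indices by depth along the view direction, while the render loop keeps drawing with the last finished order.

```python
import threading


class AsyncDepthSorter:
    """Hypothetical helper: sorts splat indices on a worker thread."""

    def __init__(self, positions, resort_every=4):
        self.positions = positions            # list of (x, y, z) splat centers
        self.resort_every = resort_every      # assumed: request a sort every N frames
        self.indices = list(range(len(positions)))  # last completed ordering
        self._lock = threading.Lock()
        self._worker = None
        self._frame = 0

    def _sort(self, cam_pos, cam_forward):
        cx, cy, cz = cam_pos
        fx, fy, fz = cam_forward

        def depth(i):
            # Distance of splat i along the camera's view direction.
            x, y, z = self.positions[i]
            return (x - cx) * fx + (y - cy) * fy + (z - cz) * fz

        # Back-to-front (farthest first), assuming standard alpha blending.
        order = sorted(range(len(self.positions)), key=depth, reverse=True)
        with self._lock:
            self.indices = order

    def update(self, cam_pos, cam_forward):
        """Call once per frame; returns the most recent finished ordering."""
        busy = self._worker is not None and self._worker.is_alive()
        if self._frame % self.resort_every == 0 and not busy:
            # Start a new sort, but never block the render loop on it.
            self._worker = threading.Thread(
                target=self._sort, args=(cam_pos, cam_forward), daemon=True
            )
            self._worker.start()
        self._frame += 1
        with self._lock:
            return self.indices

    def wait(self):
        """Block until a pending sort has finished (useful for testing)."""
        if self._worker is not None:
            self._worker.join()
```

the order returned by `update` can lag a few frames behind the camera, which is exactly where the artifacts with fast camera movements come from. in a real patch the returned indices would be uploaded to an index buffer whenever a new order arrives.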
still, a full implementation of 3DGS in DX11 is a lot of work, and considering the shrinking userbase of beta it is probably hard to convince someone to do it.
It is ironic that the simple things from beta have become harder in gamma. A quad and a renderer were the starting points for so many patches I made; I could sketch form and motion just using them and spreading transforms, then I’d end up remaking them on the GPU. That flow is still possible, but there are more nodes involved and you can’t just set it up in a couple of clicks. What gamma does do well is constructing more complex systems that have to be thought through a bit more logically, so it’s good for solving problems and achieving aims where you already know what you want to do before you start.