


Here you go: https://github.com/yanx27/Pointnet_Pointnet2_pytorch - no need for any custom CUDA code.

Note that the CUDA kernels in the original repo were added in August 2017. They may have been necessary at the time, but if you need to do something like that today, you're probably an outlier. Modern DL libraries have a vast assortment of ops. There have been a few cases in the last couple of years when I thought I'd need to write a custom op in CUDA (e.g. np.unpackbits), but every time I found a way to implement it with native PyTorch ops that was fast enough for my purposes.
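For the np.unpackbits case specifically, here's a rough sketch of how it can be done with native PyTorch ops (broadcasting a bitmask instead of writing a kernel) - the function name and signature are just illustrative, not from any library:

```python
import torch

def unpackbits(x: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Unpack a uint8 tensor into its bits, MSB first, roughly
    matching np.unpackbits semantics. Output shape: (*x.shape, bits)."""
    # Masks [128, 64, 32, 16, 8, 4, 2, 1] for bits=8.
    masks = 2 ** torch.arange(bits - 1, -1, -1)
    # Broadcast: (..., 1) AND (bits,) -> (..., bits), then map nonzero -> 1.
    return x.unsqueeze(-1).bitwise_and(masks).ne(0).to(torch.uint8)

x = torch.tensor([160, 5], dtype=torch.uint8)
print(unpackbits(x).tolist())  # [[1, 0, 1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0, 1]]
```

Everything here runs on the GPU and fuses reasonably well under torch.compile, which is usually "fast enough" in practice.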

If you're doing DL/CV research, can you give an example from your own work where you really need to run custom CUDA code today?



