We will use Segment Anything (SAM) from Meta. I also used it in Visual-Language Models for Object Detection and Segmentation. Again, I can see the similarity between image processing and point cloud segmentation.
Projection from 3D to 2D
- Orthographic (normal) projection, i.e. choosing a plane orientation and hence an orthogonal projection direction.
  - Mostly for parallel views. We ignore Z (the coordinate along the projection direction).
- Spherical/cylindrical projection
  - Set a center location.
  - Generate an artificial sphere/cylinder around it.
  - Project the point cloud onto the sphere/cylinder (each point lands where the ray from the center through that point intersects the surface).
- Perspective projection using a camera’s intrinsics and extrinsics
  - Not based on features, look into it.
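A minimal NumPy sketch of the three projections listed above, to make the geometry concrete. The function names, the equirectangular pixel layout for the spherical case, and the `x_cam = R @ x_world + t` convention for the extrinsics are my own assumptions, not something fixed by a particular library.

```python
import numpy as np

def orthographic_uv(points, resolution):
    """Top-down orthographic projection: drop Z, bin X/Y into pixel indices."""
    xy = points[:, :2]
    # Shift so the minimum corner is pixel (0, 0), then divide by the pixel size.
    return np.floor((xy - xy.min(axis=0)) / resolution).astype(int)

def spherical_uv(points, center, width, height):
    """Spherical projection around `center`, unrolled as an equirectangular image.

    Assumes no point coincides exactly with `center` (radius would be zero).
    """
    d = points - center
    r = np.linalg.norm(d, axis=1)
    azimuth = np.arctan2(d[:, 1], d[:, 0])   # angle in the XY plane, [-pi, pi]
    elevation = np.arcsin(d[:, 2] / r)       # angle above the XY plane, [-pi/2, pi/2]
    u = (azimuth + np.pi) / (2 * np.pi) * (width - 1)
    v = (np.pi / 2 - elevation) / np.pi * (height - 1)  # row 0 = straight up
    return np.stack([u, v], axis=1).round().astype(int)

def perspective_uv(points, K, R, t):
    """Pinhole projection with intrinsics K (3x3) and extrinsics R (3x3), t (3,).

    Returns pixel coordinates of the points in front of the camera and the
    boolean mask that selects them.
    """
    cam = points @ R.T + t            # world frame -> camera frame
    in_front = cam[:, 2] > 0          # discard points behind the camera
    pix = cam[in_front] @ K.T         # apply intrinsics
    return pix[:, :2] / pix[:, 2:3], in_front
```

For the cylindrical variant, the same azimuth can be kept for `u` while `v` is taken from the Z offset instead of the elevation angle.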
What kind of information do I project?
- Color
- Height
- Normal
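Whichever attribute is projected, it ends up as the pixel value of the 2D image. Below is a rough sketch for the top-down orthographic case, assuming that when several points fall into the same pixel the highest one should win. The function name `rasterize_top_down` and the NaN convention for empty pixels are my own choices.

```python
import numpy as np

def rasterize_top_down(points, values, resolution):
    """Write one per-point attribute (color, height, normal) into a top-down image.

    points: (N, 3) array of XYZ coordinates.
    values: (N,) or (N, C) array, e.g. Z for height or RGB for color.
    Returns an (rows, cols, C) float image; empty pixels are NaN.
    Row 0 corresponds to the minimum Y (the image is not flipped vertically).
    """
    xy = points[:, :2]
    uv = np.floor((xy - xy.min(axis=0)) / resolution).astype(int)
    n_cols, n_rows = uv.max(axis=0) + 1

    values = np.asarray(values, dtype=float).reshape(len(points), -1)
    image = np.full((n_rows, n_cols, values.shape[1]), np.nan)

    # Write points in ascending Z: with repeated indices the last assignment
    # remains, so the highest point in each pixel is the one that is kept.
    order = np.argsort(points[:, 2])
    image[uv[order, 1], uv[order, 0]] = values[order]
    return image
```

Usage: `rasterize_top_down(points, points[:, 2], 0.05)` gives a height map, and passing an `(N, 3)` color or normal array instead gives a three-channel image.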
SAM is applied to the image ⇒ automatic segmentation based on a given number of input prompts (in our case, a regular grid of points of interest).
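To illustrate what a regular grid of point prompts looks like, here is a small NumPy sketch. The helper `grid_prompts` is hypothetical and only builds the pixel coordinates; as far as I know, the `segment_anything` package's `SamAutomaticMaskGenerator` builds a comparable grid internally from its `points_per_side` argument, so this is for understanding rather than something to pass in by hand.

```python
import numpy as np

def grid_prompts(width, height, points_per_side):
    """Return (points_per_side**2, 2) pixel coordinates (x, y) on a regular grid.

    Points sit at the centers of equally sized cells, so none lies on the border.
    """
    offset = 1.0 / (2 * points_per_side)
    ticks = np.linspace(offset, 1.0 - offset, points_per_side)  # normalized [0, 1]
    xs, ys = np.meshgrid(ticks * width, ticks * height)
    return np.stack([xs.ravel(), ys.ravel()], axis=1)
```

A denser grid gives more prompts and tends to produce more (and smaller) segments, at the cost of a longer run time.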