.png)

For scene image capture, the Insta360 camera was mounted on a tripod and placed at a fixed location to take a single shot. It was then repositioned to another location within the scene with minimal movement and rotation (less than 10 cm between consecutive camera positions and less than 60 degrees of rotation to either side on the lateral axis of the camera), and another image was taken. This process was repeated systematically until the entire scene was covered, with each lens consistently covering the same side of the scene. This consistency, combined with the minimal movement and rotation between captures, ensured sufficient overlap between the images, which played a key role in the accuracy of the subsequent SfM-based sparse point cloud generation step.
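The capture constraints above can be expressed as a simple check over a logged sequence of tripod poses. The following is a minimal sketch and not part of the capture pipeline described here: the pose format `(x, y, z, rotation_deg)`, the function names, and the reading of the 60 degree limit as a bound on the rotation change between consecutive captures are all assumptions made for illustration.

```python
import math

# Limits taken from the text; how they are applied here is an assumption.
MAX_STEP_M = 0.10        # less than 10 cm between consecutive positions
MAX_ROTATION_DEG = 60.0  # less than 60 degrees of rotation to either side


def rotation_difference_deg(a, b):
    """Smallest signed difference between two angles, in degrees."""
    return (b - a + 180.0) % 360.0 - 180.0


def check_capture_sequence(poses):
    """Return the indices of captures that break the movement limits.

    poses is a list of hypothetical (x, y, z, rotation_deg) tuples, with
    positions in metres. Index i is reported when the move from capture
    i - 1 to capture i is too large in translation or rotation.
    """
    violations = []
    for i in range(1, len(poses)):
        x0, y0, z0, r0 = poses[i - 1]
        x1, y1, z1, r1 = poses[i]
        step = math.dist((x0, y0, z0), (x1, y1, z1))
        turn = abs(rotation_difference_deg(r0, r1))
        if step >= MAX_STEP_M or turn >= MAX_ROTATION_DEG:
            violations.append(i)
    return violations


if __name__ == "__main__":
    example = [
        (0.00, 0.0, 0.0, 0.0),
        (0.05, 0.0, 0.0, 30.0),   # within both limits
        (0.25, 0.0, 0.0, 30.0),   # 20 cm step: too far
        (0.30, 0.0, 0.0, 350.0),  # 40 degree turn across the 0/360 wrap
    ]
    print(check_capture_sequence(example))
```

The angle difference is wrapped into the range from -180 to 180 degrees so that a turn from 30 to 350 degrees counts as 40 degrees rather than 320.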
To avoid disruptions from moving objects, such as people in indoor environments or cars in outdoor settings, images were captured at times and locations with minimal activity. The photographer stayed out of the frame either by standing in occluded areas or by capturing the two fisheye images sequentially from the same fixed position: first taking a shot while remaining outside the field of view (FoV) of one lens, then repositioning to avoid the FoV of the second lens before capturing the next image.
A Faro Focus 3D LiDAR scanner, fixed on a tripod, was also used to capture high-resolution XYZRGB dense point clouds, which serve as geometric ground truth for the captured scenes. This ground truth can be used for point cloud alignment and for benchmarking and validating the accuracy of 3D scene reconstruction and novel view synthesis models. Scene reconstruction methods that incorporate depth information may also find these dense point clouds useful.
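As an illustration of how such ground truth might be used for accuracy validation, the sketch below computes the nearest-neighbor distance from each reconstructed point to the LiDAR cloud. It is an assumed example rather than the evaluation protocol of this work: the function name, the 5 cm threshold, and the row layout `(x, y, z, r, g, b)` are hypothetical, and the two clouds are assumed to be already aligned in the same metric coordinate frame.

```python
import math


def accuracy_stats(reconstructed, ground_truth, threshold_m=0.05):
    """Compare a reconstructed cloud against a ground-truth cloud.

    Both inputs are sequences of rows whose first three values are
    x, y, z in metres; any trailing values (for example RGB) are ignored.
    Returns (mean nearest-neighbor distance, fraction of reconstructed
    points within threshold_m of some ground-truth point).
    """
    if not reconstructed or not ground_truth:
        raise ValueError("both point clouds must be non-empty")

    gt_xyz = [tuple(row[:3]) for row in ground_truth]
    distances = []
    for row in reconstructed:
        point = tuple(row[:3])
        # Brute-force search: fine for a small sample, too slow for full scans.
        distances.append(min(math.dist(point, q) for q in gt_xyz))

    mean_distance = sum(distances) / len(distances)
    inlier_ratio = sum(d <= threshold_m for d in distances) / len(distances)
    return mean_distance, inlier_ratio


if __name__ == "__main__":
    lidar = [(0.0, 0.0, 0.0, 255, 0, 0), (1.0, 0.0, 0.0, 0, 255, 0)]
    estimate = [(0.0, 0.0, 0.03), (1.0, 0.2, 0.0)]
    print(accuracy_stats(estimate, lidar))
```

The brute-force search is quadratic in the number of points, so a dense scan would in practice need a spatial index such as a k-d tree, or subsampling, before a comparison of this kind.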