Photogrammetry isn't really about stereoscopy; it's about building a dense point-cloud representation from many images taken from different angles, then converting that into a mesh (and often mapping textures from the photos onto it).
Isn't that called stereoscopy? Matching points across images taken from different positions, together with their image coordinates, lets you recover 3D point coordinates via triangulation. If you manage to match everything present in all images, that would net you a heightmap, which can then be turned into a mesh.
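The triangulation step being described can be sketched in a few lines: given two camera projection matrices and a matched point seen in both images, the classic DLT (direct linear transform) method recovers the 3D point as the null vector of a small linear system. This is a minimal sketch with made-up toy cameras, not any particular library's API:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    # DLT: each view contributes two rows of the homogeneous
    # system A @ X = 0, where X is the 3D point in homogeneous coords.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector for the smallest
    # singular value of A.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

def project(P, X):
    # Pinhole projection of a 3D point to 2D image coordinates.
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Toy setup: one camera at the origin, a second shifted 1 unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])

x1, x2 = project(P1, X_true), project(P2, X_true)
print(triangulate(P1, P2, x1, x2))  # recovers ~[0.5, 0.2, 4.0]
```

In practice the noisy, many-view version of this (plus feature matching and bundle adjustment) is what structure-from-motion pipelines like COLMAP do to produce the point cloud.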
But sure, just having a good camera is probably a lot cheaper than using any form of active projection, and probably significantly faster too.
You can also build point clouds with lidar from drones; it just tends to be more expensive and heavier - https://enterprise.dji.com/news/detail/how-lidar-is-revoluti...