The 3D digitization and documentation of ancient Chinese architecture is a challenging task for the image-based modeling community because of its architectural complexity and structural delicacy. Currently, an effective approach to the 3D reconstruction of ancient Chinese architecture is to merge two point clouds obtained separately from ground and aerial images by the structure-from-motion (SfM) technique. Two outstanding issues need to be specifically addressed: (1) it is difficult to find point matches between images from different sources because of their large variations in viewpoint and scale; (2) because of the inevitable drift in any SfM reconstruction process, the two resulting point clouds are no longer strictly related by a single similarity transformation, as they should be in theory. To address these two issues, a new point cloud merging method is proposed in this work. Our method has the following characteristics: (1) the images are matched by leveraging sparse-mesh-based image synthesis; (2) the putative point matches are filtered by a geometric consistency check and geometric model verification; and (3) the two point clouds are merged via bundle adjustment by linking the ground-to-aerial tracks. Extensive experiments show that our method outperforms many state-of-the-art approaches in terms of ground-to-aerial image matching and point cloud merging.
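
As a brief illustration of issue (2), the ideal relation between the two reconstructions can be written out explicitly. The notation is an assumption introduced here for illustration and does not come from the text above: $X_i^{g}$ and $X_i^{a}$ denote the same 3D point in the ground and aerial point clouds, $s$ is a scale factor, $R$ a rotation, and $t$ a translation.

```latex
% Ideal, drift-free case: one similarity transformation maps every
% ground point onto its aerial counterpart.
\[
  X_i^{a} = s\,R\,X_i^{g} + t,
  \qquad s > 0,\; R \in SO(3),\; t \in \mathbb{R}^{3}.
\]
% With drift, no single (s, R, t) satisfies this for all i, so the
% alignment residual is nonzero for at least some points.
\[
  e_i = X_i^{a} - \bigl(s\,R\,X_i^{g} + t\bigr) \neq 0
  \quad \text{for some } i.
\]
```

Read this way, a single estimate of $(s, R, t)$ can serve at most as an initial alignment, which is consistent with the final merge being carried out by bundle adjustment over the linked ground-to-aerial tracks.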