A multiscale latent Dirichlet allocation model for object-oriented clustering of VHR panchromatic satellite images

Hong Tang1, Li Shen1, Yinfeng Qi1, Yunhao Chen1, Yang Shu1, Jing Li1, David A. Clausi2
1 State Key Laboratory of Earth Surface Processes and Resource Ecology and Key Laboratory of Environment Change and Natural Disaster, Beijing Normal University, Beijing 100875, China;
2 Department of Systems Design Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada.
Abstract: A novel model is presented to address the problem of semantic clustering of geo-objects in very high resolution panchromatic satellite images. The proposed model combines a probabilistic topic model with a multiscale image representation in an automatic framework by embedding both document and scale selections. The probabilistic topic model is used to characterize the statistical distributions of both intra-class appearance and inter-class coherence of geo-objects within documents, i.e., square sub-images. Because the bag-of-words assumption underlying probabilistic topic models does not consider the spatial coherence between topic labels, the multiscale image representation is designed to provide a self-adaptive spatial regularization for various geo-object categories. By introducing scale and document selections, the automatic framework integrates the probabilistic topic model and the multiscale image representation to ensure that words at a given site are allocated the same topic label no matter which documents they reside in. Consequently, unlike the traditional method of applying topic models to satellite images, it is unnecessary to explicitly generate a set of documents before modeling and then combine the multiple labels assigned to a word at a given site. Gibbs sampling is adopted for parameter estimation and image clustering. Extensive experimental evaluations are designed to first analyze the effect of parameters in the proposed model and then compare the results of our model with those of some state-of-the-art methods for three different types of images. The results indicate that the proposed algorithm outperforms these existing state-of-the-art methods in all of the experiments.
Keywords: Latent Dirichlet allocation (LDA); object-oriented clustering; probabilistic topic models; scale space theory.
Published in IEEE Transactions on Geoscience and Remote Sensing. 2013, 51(3): 1680-1692.
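For readers unfamiliar with the inference step named in the abstract, the following is a minimal sketch of collapsed Gibbs sampling for standard LDA only. It is not the multiscale model proposed in the paper and omits the document and scale selections. The function name `lda_gibbs`, the hyperparameter values, and the toy input are illustrative assumptions; in the image setting, a document would correspond to a square sub-image and a word to a quantized local feature.

```python
import numpy as np

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=50, seed=0):
    """Collapsed Gibbs sampler for plain LDA (illustrative sketch, not the paper's model).

    docs: list of lists of integer word ids in [0, vocab_size).
    Returns per-token topic labels plus doc-topic and topic-word count matrices.
    """
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics))   # tokens in document d assigned to topic k
    n_kw = np.zeros((n_topics, vocab_size))  # times word w is assigned to topic k
    n_k = np.zeros(n_topics)                 # total tokens assigned to topic k

    # Random initialization of topic labels.
    z = []
    for d, doc in enumerate(docs):
        z_d = rng.integers(n_topics, size=len(doc))
        z.append(z_d)
        for w, k in zip(doc, z_d):
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[d][i]
                n_dk[d, k] -= 1
                n_kw[k, w] -= 1
                n_k[k] -= 1
                # Full conditional over topics, up to a normalizing constant.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + vocab_size * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                # Record the new assignment.
                z[d][i] = k
                n_dk[d, k] += 1
                n_kw[k, w] += 1
                n_k[k] += 1
    return z, n_dk, n_kw

# Toy usage: three "documents" over a five-word vocabulary (hypothetical data).
docs = [[0, 1, 2, 1], [3, 4, 3], [0, 4, 2, 2, 1]]
z, n_dk, n_kw = lda_gibbs(docs, n_topics=2, vocab_size=5, n_iter=5)
```

Because each token here is sampled independently given the counts, nothing forces neighboring sites to share a label; this is the lack of spatial coherence that the abstract says the multiscale representation is meant to regularize.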