Abstract
Finding the maximum a posteriori (MAP) assignment for general-structure Markov random fields (MRFs) is computationally intractable. In this paper, we exploit tree-based methods to address this problem efficiently. Our novel method, named Tree-based Iterated Local Search (T-ILS), takes advantage of the tractability of tree structures embedded within MRFs to derive a strong local search in an ILS framework. The method efficiently explores exponentially large neighborhoods using limited memory and without any requirement on the cost functions. We evaluate T-ILS on a simulated Ising model and two real-world vision problems: stereo matching and image denoising. Experimental results demonstrate that our method is competitive against state-of-the-art rivals, with significant computational gain.
Notes
The term posterior comes from the early practice in computer vision in which \(P(\mathbf {y}\mid \mathbf {x})\) is first defined and then linked to \(P(\mathbf {x}\mid \mathbf {y})\) through Bayes' rule:
$$\begin{aligned} P(\mathbf {x}\mid \mathbf {y})=\frac{P(\mathbf {x})P(\mathbf {y}\mid \mathbf {x})}{P(\mathbf {y})}, \end{aligned}$$
where \(P(\mathbf {x})\) is called the prior. However, in this paper we work directly with \(P(\mathbf {x}\mid \mathbf {y})\) for simplicity. In machine learning, this posterior has more recently been called a conditional random field (Lafferty et al. 2001; Tran 2008).
Available at: http://vision.middlebury.edu/stereo/.
The C++ code is available at http://vision.middlebury.edu/MRF/.
Available at: http://vision.middlebury.edu/MRF/.
References
Ahuja, R.K., Ergun, Ö., Orlin, J.B., Punnen, A.P.: A survey of very large-scale neighborhood search techniques. Discret. Appl. Math. 123(1), 75–102 (2002)
Besag, J.: Spatial interaction and the statistical analysis of lattice systems (with discussions). J. Royal Statist. Soc. Ser. B 36, 192–236 (1974)
Besag, J.: On the statistical analysis of dirty pictures. J. Royal Statist. Soc. Ser. B 48(3), 259–302 (1986)
Biba, M., Ferilli, S., Esposito, F.: Discriminative structure learning of Markov logic networks. In: Inductive Logic Programming, pp. 59–76. Springer, Berlin (2008)
Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 23(11), 1222–1239 (2001)
Brown, D.F., Garmendia-Doval, A.B., McCall, J.A.W.: Markov random field modelling of royal road genetic algorithms. In: Artificial Evolution, pp. 35–56. Springer, Berlin (2002)
Chow, C., Liu, C.: Approximating discrete probability distributions with dependence trees. IEEE Trans. Inform. Theory 14(3), 462–467 (1968)
Cordón, O., Damas, S.: Image registration with iterated local search. J. Heuristics 12(1–2), 73–94 (2006)
Duchi, J., Tarlow, D., Elidan, G., Koller, D.: Using combinatorial optimization within max-product belief propagation. In: Schölkopf, B., Platt, J., Hoffman, T. (eds.) Advances in Neural Information Processing Systems, pp. 369–376. MIT Press, Cambridge, MA (2007)
Felzenszwalb, P.F., Huttenlocher, D.P.: Efficient belief propagation for early vision. Int. J. Comput. Vis. 70(1), 41–54 (2006)
Geman, S., Geman, D.: Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6(6), 721–742 (1984)
Hammersley, J.M., Clifford, P.: Markov fields on finite graphs and lattices. Unpublished manuscript (1971). http://www.statslab.cam.ac.uk/~grg/books/hammfest/hamm-cliff.pdf
Hazan, T., Shashua, A.: Norm-product belief propagation: primal-dual message-passing for approximate inference. IEEE Trans. Inform. Theory 56(12), 6294–6316 (2010)
Johnson, J.K., Malioutov, D., Willsky, A.S.: Lagrangian relaxation for MAP estimation in graphical models. In: Proceedings of the 45th Annual Allerton Conference on Communication, Control and Computing, September (2007)
Kappes, J.H., Andres, B., Hamprecht, F.A., Schnörr, C., Nowozin, S., Batra, D., Kim, S., Kausler, B.X., Kröger, T., Lellmann, J., et al.: A comparative study of modern inference techniques for structured discrete energy minimization problems. arXiv:1404.0533 (2014)
Kim, W., Lee, K.M.: Markov chain Monte Carlo combined with deterministic methods for Markov random field optimization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1406–1413. IEEE (2009)
Kim, H.J., Kim, E.Y., Kim, J.W., Park, S.H.: MRF model based image segmentation using hierarchical distributed genetic algorithm. Electron. Lett. 34(25), 2394–2395 (1998)
Kirkpatrick, S., Gelatt Jr, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220(4598), 671–680 (1983)
Kolmogorov, V., Zabih, R.: What energy functions can be minimized via graph cuts? IEEE Trans. Pattern Anal. Mach. Intell. 26(2), 147–159 (2004)
Kolmogorov, V.: Convergent tree-reweighted message passing for energy minimization. IEEE Trans. Pattern Anal. Mach. Intell. 28(10), 1568–1583 (2006)
Kumar, M.P., Kolmogorov, V., Torr, P.H.S.: An analysis of convex relaxations for MAP estimation of discrete MRFs. J. Mach. Learn. Res. 10, 71–106 (2009)
Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: Proceedings of the International Conference on Machine learning, pp. 282–289 (2001)
Lauritzen, S.L.: Graphical Models. Oxford Science Publications, Oxford (1996)
Li, S.Z.: Markov Random Field Modeling in Computer Vision. Springer, New York (1995)
Lourenço, H.R., Martin, O.C., Stützle, T.: Iterated local search: framework and applications. In: Handbook of Metaheuristics, pp. 363–397. Springer, Berlin (2010)
Lourenço, H.R., Martin, O.C., Stützle, T.: Iterated local search. Int. Ser. Oper. Res. Manag. Sci. 57, 321–354 (2003)
Maulik, U.: Medical image segmentation using genetic algorithms. IEEE Trans. Inform. Technol. Biomed. 13(2), 166–173 (2009)
McCoy, B.M., Wu, T.T.: The Two-Dimensional Ising Model, vol. 22. Harvard University Press, Cambridge (1973)
Meltzer, T., Globerson, A., Weiss, Y.: Convergent message passing algorithms: a unifying view. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pp. 393–401. AUAI Press, Corvallis (2009)
Murphy, K.P., Weiss, Y., Jordan, M.I.: Loopy belief propagation for approximate inference: an empirical study. In: Laskey, K.B., Prade, H. (eds.) Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence, Stockholm, pp. 467–475 (1999)
Ouadfel, S., Batouche, M.: MRF-based image segmentation using ant colony system. Electron. Lett. Comput. Vis. Image Anal. 2(2), 12–24 (2003)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Francisco, CA (1988)
Ravikumar, P., Lafferty, J.: Quadratic programming relaxations for metric labeling and Markov random field MAP estimation. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 737–744. ACM Press, New York (2006)
Scharstein, D., Szeliski, R.: A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int. J. Comput. Vis. 47(1–3), 7–42 (2002)
Shimony, S.E.: Finding MAPs for belief networks is NP-hard. Artif. Intell. 68(2), 399–410 (1994)
Sontag, D., Jaakkola, T.: Tree block coordinate descent for MAP in graphical models. In: Proceedings of the International Conference on Artificial Intelligence and Statistics, pp. 544–551 (2009)
Szeliski, R., Zabih, R., Scharstein, D., Veksler, O., Kolmogorov, V., Agarwala, A., Tappen, M., Rother, C.: A comparative study of energy minimization methods for Markov random fields with smoothness-based priors. IEEE Trans. Pattern Anal. Mach. Intell. 30(6), 1068–1080 (2008)
Tarlow, D., Givoni, I.E., Zemel, R.S., Frey, B.J.: Graph cuts is a max-product algorithm. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence, pp. 671–680 (2011)
Tran, T., Nguyen, T.T., Nguyen, H.L.: Global optimization using Levy flight. In: Proceedings of 2nd National Symposium on Research, Development and Application of Information and Communication Technology (ICT.RDA) (2004)
Tran, T.T.: On Conditional Random Fields: Applications, Feature Selection, Parameter Estimation and Hierarchical Modelling. PhD Thesis, Curtin University of Technology (2008)
Tseng, D.C., Lai, C.C.: A genetic algorithm for MRF-based segmentation of multi-spectral textured images. Pattern Recogn. Lett. 20(14), 1499–1510 (1999)
Wainwright, M.J., Jaakkola, T.S., Willsky, A.S.: MAP estimation via agreement on (hyper)trees: message-passing and linear-programming approaches. IEEE Trans. Inform. Theory 51(11), 3697–3717 (2005)
Wales, D.J., Doye, J.P.K.: Global optimization by basin-hopping and the lowest energy structures of Lennard–Jones clusters containing up to 110 atoms. J. Phys. Chem. A 101(28), 5111–5116 (1997)
Weiss, Y., Freeman, W.T.: On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs. IEEE Trans. Inform. Theory 47(2), 736–744 (2001)
Werner, T.: A linear programming approach to max-sum problem: a review. IEEE Trans. Pattern Anal. Mach. Intell. 29(7), 1165–1179 (2007)
Willsky, A.S.: Multiresolution Markov models for signal and image processing. Proc. IEEE 90(8), 1396–1458 (2002)
Wu, C., Doerschuk, P.C.: Tree approximations to Markov random fields. IEEE Trans. Pattern Anal. Mach. Intell. 17(4), 391–402 (1995)
Yanover, C., Meltzer, T., Weiss, Y.: Linear programming relaxations and belief propagation-an empirical study. J. Mach. Learn. Res. 7, 1887–1907 (2006)
Yousefi, S., Azmi, R., Zahedi, M.: Brain tissue segmentation in MR images based on a hybrid of MRF and social algorithms. Med. Image Anal. 16(4), 840–848 (2012)
Appendix: Distribution over conditional trees
We provide the derivation of Eq. (7) from a probabilistic argument. Recall that \(\mathcal {N}(\tau )\) is the Markov blanket of the tree \(\tau \), that is, the set of sites connecting to \(\tau \). Due to the Markov property
where
Equation (7) can be derived from the energy above by letting:
Thus finding the most probable labeling of the tree \(\tau \) conditioned on its neighborhood is equivalent to minimizing the conditional energy in Eq. (9):
The equivalence can also be seen intuitively by considering the tree \(\tau \) as a mega-site, so the update in Eq. (9) is analogous to that in Eq. (4).
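The mega-site view can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the tree is taken to be a chain (the simplest tree), all edges share one pairwise energy table, and the names `min_sum_chain` and `conditional_energy` are hypothetical. The interaction with the fixed labels on the Markov blanket is folded into the singleton energies, after which exact min-sum dynamic programming applies.

```python
import numpy as np

def conditional_energy(labels, unary, pairwise, boundary):
    """Energy of a chain labeling given fixed labels on its Markov blanket."""
    n = len(labels)
    e = sum(unary[i, labels[i]] for i in range(n))                       # data terms
    e += sum(pairwise[labels[i], labels[i + 1]] for i in range(n - 1))   # in-chain edges
    e += sum(pairwise[labels[i], b] for i in range(n) for b in boundary[i])  # boundary edges
    return float(e)

def min_sum_chain(unary, pairwise, boundary):
    """Exactly minimize the conditional energy of a chain.

    unary:    (n, K) array, unary[i, k] = E_i(x_i = k, y)
    pairwise: (K, K) array, pairwise[a, b] = E_ij(a, b) (assumed shared by all edges)
    boundary: list of length n; boundary[i] lists the fixed labels of the
              neighbours of site i that lie outside the chain
    Returns (labels, energy).
    """
    unary = np.asarray(unary, dtype=float).copy()
    pairwise = np.asarray(pairwise, dtype=float)
    n, K = unary.shape

    # Fold the chain-boundary interactions into the singleton energies.
    for i, fixed in enumerate(boundary):
        for b in fixed:
            unary[i] += pairwise[:, b]

    # Forward pass: cost[b] is the best energy of a prefix ending in label b.
    cost = unary[0].copy()
    back = np.zeros((n, K), dtype=int)
    for i in range(1, n):
        total = cost[:, None] + pairwise          # total[a, b]: previous a, current b
        back[i] = np.argmin(total, axis=0)
        cost = total[back[i], np.arange(K)] + unary[i]

    # Backward pass: recover the minimizing labels.
    labels = np.zeros(n, dtype=int)
    labels[-1] = int(np.argmin(cost))
    for i in range(n - 1, 0, -1):
        labels[i - 1] = back[i, labels[i]]
    return labels, float(cost.min())
```

For a chain of n sites with K labels, this costs O(nK^2) time, whereas enumerating all K^n labelings of the chain would be exponential; a general tree would need the same recursion organized from the leaves to a root.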
1.1 Proof of Proposition 1
Recall that the energy can be decomposed into singleton and pairwise local energies (see Eq. 1)
where:
- \(\sum _{i\in \tau }E_{i}(x_{i},\mathbf {y})\) is the data energy belonging to the tree \(\tau \),
- \(\sum _{i\notin \tau }E_{i}(x_{i},\mathbf {y})\) is the data energy outside \(\tau \),
- \(\sum _{(i,j)\in \mathcal {E}|i,j\in \tau }E_{ij}(x_{i},x_{j})\) is the interaction energy within the tree,
- \(\sum _{j\in \mathcal {N}(\tau )|(i,j)\in \mathcal {E}}E_{ij}(x_{i},x_{j})\) is the interaction energy between the tree and its boundary, and
- \(\sum _{(i,j)\in \mathcal {E}|i,j\notin \tau }E_{ij}(x_{i},x_{j})\) is the interaction energy outside the tree.
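Read together, the five terms listed above would assemble into a decomposition of the following shape. This is a reconstruction from the term descriptions only, offered as a reading aid rather than as the authors' exact typesetting; in particular, the outer sum over \(i\in \tau \) in the boundary term is an assumption, made because \(i\) would otherwise be a free index.

$$\begin{aligned} E(\mathbf {x},\mathbf {y})={}&\sum _{i\in \tau }E_{i}(x_{i},\mathbf {y})+\sum _{i\notin \tau }E_{i}(x_{i},\mathbf {y})+\sum _{(i,j)\in \mathcal {E}|i,j\in \tau }E_{ij}(x_{i},x_{j})\\ &+\sum _{i\in \tau }\sum _{j\in \mathcal {N}(\tau )|(i,j)\in \mathcal {E}}E_{ij}(x_{i},x_{j})+\sum _{(i,j)\in \mathcal {E}|i,j\notin \tau }E_{ij}(x_{i},x_{j}). \end{aligned}$$

Under this reading, the first, third, and fourth terms are the only ones that depend on \(\mathbf {x}_{\tau }\); the second and fifth are constant when the labels outside the tree are held fixed.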
By grouping energies related to the tree and the rest, we have
where \(E_{\tau }\left( \mathbf {x}_{\tau },\mathbf {x}_{\mathcal {N}(\tau )},\mathbf {y}\right) \) is given in Eq. (11) for all \(i\in \tau \). This leads to:
This completes the proof. \(\square \)
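The grouping argument can be checked numerically on a toy problem. The sketch below is a hypothetical setup, not taken from the paper: a 2×3 grid with random energies, whose top row plays the role of the tree \(\tau \). It confirms by brute force that, with the labels outside the tree held fixed, minimizing the full energy over the tree labels selects the same labeling as minimizing only the tree-related terms. All function and variable names are illustrative assumptions.

```python
import itertools
import numpy as np

def full_energy(x, unary, pairwise, edges):
    """Sum of all singleton and pairwise energies."""
    e = sum(unary[i, x[i]] for i in range(len(x)))
    e += sum(pairwise[x[i], x[j]] for i, j in edges)
    return float(e)

def tree_energy(x, tree, unary, pairwise, edges):
    """Only the terms that involve at least one site of the tree."""
    e = sum(unary[i, x[i]] for i in tree)
    e += sum(pairwise[x[i], x[j]] for i, j in edges if i in tree or j in tree)
    return float(e)

def argmin_over_tree(energy_fn, x, tree, n_labels):
    """Brute-force the best tree labels with all other sites held fixed."""
    sites = sorted(tree)
    best_e, best_x = None, None
    for combo in itertools.product(range(n_labels), repeat=len(sites)):
        cand = list(x)
        for i, k in zip(sites, combo):
            cand[i] = k
        e = energy_fn(cand)
        if best_e is None or e < best_e:
            best_e, best_x = e, cand
    return best_x

# Toy 2x3 grid: sites 0-2 form the top row, sites 3-5 the bottom row.
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]
tree = {0, 1, 2}            # the top row induces a chain, i.e. a tree
n_labels = 3
rng = np.random.default_rng(1)
unary = rng.random((6, n_labels))
pairwise = rng.random((n_labels, n_labels))
x0 = [0, 0, 0, 2, 1, 0]     # arbitrary starting labeling

best_full = argmin_over_tree(
    lambda z: full_energy(z, unary, pairwise, edges), x0, tree, n_labels)
best_tree = argmin_over_tree(
    lambda z: tree_energy(z, tree, unary, pairwise, edges), x0, tree, n_labels)
print(best_full == best_tree)
```

The two minimizers agree because the difference between the full energy and the tree-related terms does not depend on the tree labels, which is the content of the grouping step in the proof.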
Cite this article
Tran, T., Phung, D. & Venkatesh, S. Tree-based iterated local search for Markov random fields with applications in image analysis. J Heuristics 21, 25–45 (2015). https://doi.org/10.1007/s10732-014-9270-1
DOI: https://doi.org/10.1007/s10732-014-9270-1