
Tree-based iterated local search for Markov random fields with applications in image analysis

Published in: Journal of Heuristics

Abstract

Computing the maximum a posteriori (MAP) assignment for general-structure Markov random fields (MRFs) is computationally intractable. In this paper, we exploit tree-based methods to address this problem efficiently. Our novel method, named Tree-based Iterated Local Search (T-ILS), takes advantage of the tractability of tree structures embedded within MRFs to derive a strong local search procedure in an ILS framework. The method efficiently explores exponentially large neighborhoods using limited memory and places no restrictions on the cost functions. We evaluate T-ILS on a simulated Ising model and two real-world vision problems: stereo matching and image denoising. Experimental results demonstrate that our method is competitive with state-of-the-art rivals while offering significant computational gains.
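As background, the following is a minimal sketch of the generic iterated local search loop that T-ILS instantiates; the names (`local_search`, `perturb`) are illustrative placeholders for the paper's tree-based components, not the authors' implementation.

```python
def iterated_local_search(x0, local_search, perturb, energy, n_iters=100):
    """Generic ILS: descend to a local optimum, kick, re-descend, keep the best."""
    x = local_search(x0)                  # initial descent
    best, best_e = x, energy(x)
    for _ in range(n_iters):
        x_new = local_search(perturb(x))  # perturb, then descend again
        if energy(x_new) <= energy(x):    # acceptance: keep if not worse
            x = x_new
        if energy(x) < best_e:            # track the best optimum seen
            best, best_e = x, energy(x)
    return best
```

In T-ILS, `local_search` would repeatedly re-label tractable trees embedded in the MRF by exact conditional inference, and `perturb` would randomly re-label some sites to escape the current basin of attraction.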


Notes

  1. The term posterior comes from the early practice in computer vision in which \(P(\mathbf {y}\mid \mathbf {x})\) is first defined and then linked to \(P(\mathbf {x}\mid \mathbf {y})\) through Bayes' rule:

    $$\begin{aligned} P(\mathbf {x}\mid \mathbf {y})=\frac{P(\mathbf {x})P(\mathbf {y}\mid \mathbf {x})}{P(\mathbf {y})}, \end{aligned}$$

    where \(P(\mathbf {x})\) is called the prior. However, in this paper we work directly with \(P(\mathbf {x}\mid \mathbf {y})\) for simplicity. In machine learning this posterior is now known as a conditional random field (Lafferty et al. 2001; Tran 2008).

  2. Available at: http://vision.middlebury.edu/stereo/.

  3. The C++ code is available at http://vision.middlebury.edu/MRF/.

  4. Available at: http://vision.middlebury.edu/MRF/.

References

  • Ahuja, R.K., Ergun, Ö., Orlin, J.B., Punnen, A.P.: A survey of very large-scale neighborhood search techniques. Discret. Appl. Math. 123(1), 75–102 (2002)

  • Besag, J.: Spatial interaction and the statistical analysis of lattice systems (with discussions). J. Royal Statist. Soc. Ser. B 36, 192–236 (1974)

  • Besag, J.: On the statistical analysis of dirty pictures. J. Royal Statist. Soc. Ser. B 48(3), 259–302 (1986)

  • Biba, M., Ferilli, S., Esposito, F.: Discriminative structure learning of Markov logic networks. In: Inductive Logic Programming, pp. 59–76. Springer, Berlin (2008)

  • Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 23(11), 1222–1239 (2001)

  • Brown, D.F., Garmendia-Doval, A.B., McCall, J.A.W.: Markov random field modelling of royal road genetic algorithms. In: Artificial Evolution, pp. 35–56. Springer, Berlin (2002)

  • Chow, C., Liu, C.: Approximating discrete probability distributions with dependence trees. IEEE Trans. Inform. Theory 14(3), 462–467 (1968)

  • Cordón, O., Damas, S.: Image registration with iterated local search. J. Heuristics 12(1–2), 73–94 (2006)

  • Duchi, J., Tarlow, D., Elidan, G., Koller, D.: Using combinatorial optimization within max-product belief propagation. In: Schölkopf, B., Platt, J., Hoffman, T. (eds.) Advances in Neural Information Processing Systems, pp. 369–376. MIT Press, Cambridge, MA (2007)

  • Felzenszwalb, P.F., Huttenlocher, D.P.: Efficient belief propagation for early vision. Int. J. Comput. Vis. 70(1), 41–54 (2006)

  • Geman, S., Geman, D.: Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6(6), 721–742 (1984)

  • Hammersley, J.M., Clifford, P.: Markov fields on finite graphs and lattices. Unpublished manuscript (1971). http://www.statslab.cam.ac.uk/~grg/books/hammfest/hamm-cliff.pdf

  • Hazan, T., Shashua, A.: Norm-product belief propagation: primal-dual message-passing for approximate inference. IEEE Trans. Inform. Theory 56(12), 6294–6316 (2010)

  • Johnson, J.K., Malioutov, D., Willsky, A.S.: Lagrangian relaxation for MAP estimation in graphical models. In: Proceedings of the 45th Annual Allerton Conference on Communication, Control and Computing, September (2007)

  • Kappes, J.H., Andres, B., Hamprecht, F.A., Schnörr, C., Nowozin, S., Batra, D., Kim, S., Kausler, B.X., Kröger, T., Lellmann, J., et al.: A comparative study of modern inference techniques for structured discrete energy minimization problems. arXiv:1404.0533 (2014)

  • Kim, W., Lee, K.M.: Markov chain Monte Carlo combined with deterministic methods for Markov random field optimization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1406–1413. IEEE (2009)

  • Kim, H.J., Kim, E.Y., Kim, J.W., Park, S.H.: MRF model based image segmentation using hierarchical distributed genetic algorithm. Electron. Lett. 34(25), 2394–2395 (1998)

  • Kirkpatrick, S., Gelatt Jr, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220(4598), 671–680 (1983)

  • Kolmogorov, V., Zabih, R.: What energy functions can be minimized via graph cuts? IEEE Trans. Pattern Anal. Mach. Intell. 26(2), 147–159 (2004)

  • Kolmogorov, V.: Convergent tree-reweighted message passing for energy minimization. IEEE Trans. Pattern Anal. Mach. Intell. 28(10), 1568–1583 (2006)

  • Kumar, M.P., Kolmogorov, V., Torr, P.H.S.: An analysis of convex relaxations for MAP estimation of discrete MRFs. J. Mach. Learn. Res. 10, 71–106 (2009)

  • Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the International Conference on Machine Learning, pp. 282–289 (2001)

  • Lauritzen, S.L.: Graphical Models. Oxford Science Publications, Oxford (1996)

  • Li, S.Z.: Markov Random Field Modeling in Computer Vision. Springer, New York (1995)

  • Lourenço, H.R., Martin, O.C., Stützle, T.: Iterated local search: framework and applications. In: Handbook of Metaheuristics, pp. 363–397. Springer, Berlin (2010)

  • Lourenço, H.R., Martin, O.C., Stützle, T.: Iterated local search. Int. Ser. Oper. Res. Manag. Sci. 57, 321–354 (2003)

  • Maulik, U.: Medical image segmentation using genetic algorithms. IEEE Trans. Inform. Technol. Biomed. 13(2), 166–173 (2009)

  • McCoy, B.M., Wu, T.T.: The Two-Dimensional Ising Model, vol. 22. Harvard University Press, Cambridge (1973)

  • Meltzer, T., Globerson, A., Weiss, Y.: Convergent message passing algorithms: a unifying view. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pp. 393–401. AUAI Press, Corvallis (2009)

  • Murphy, K.P., Weiss, Y., Jordan, M.I.: Loopy belief propagation for approximate inference: an empirical study. In: Laskey, K.B., Prade, H. (eds.) Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence, Stockholm, pp. 467–475 (1999)

  • Ouadfel, S., Batouche, M.: MRF-based image segmentation using ant colony system. Electron. Lett. Comput. Vis. Image Anal. 2(2), 12–24 (2003)

  • Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Francisco, CA (1988)

  • Ravikumar, P., Lafferty, J.: Quadratic programming relaxations for metric labeling and Markov random field MAP estimation. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 737–744. ACM Press, New York (2006)

  • Scharstein, D., Szeliski, R.: A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int. J. Comput. Vis. 47(1–3), 7–42 (2002)

  • Shimony, S.E.: Finding MAPs for belief networks is NP-hard. Artif. Intell. 68(2), 399–410 (1994)

  • Sontag, D., Jaakkola, T.: Tree block coordinate descent for MAP in graphical models. In: Proceedings of the International Conference on Artificial Intelligence and Statistics, pp. 544–551 (2009)

  • Szeliski, R., Zabih, R., Scharstein, D., Veksler, O., Kolmogorov, V., Agarwala, A., Tappen, M., Rother, C.: A comparative study of energy minimization methods for Markov random fields with smoothness-based priors. IEEE Trans. Pattern Anal. Mach. Intell. 30(6), 1068–1080 (2008)

  • Tarlow, D., Givoni, I.E., Zemel, R.S., Frey, B.J.: Graph cuts is a max-product algorithm. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence, pp. 671–680 (2011)

  • Tran, T., Nguyen, T.T., Nguyen, H.L.: Global optimization using Levy flight. In: Proceedings of the 2nd National Symposium on Research, Development and Application of Information and Communication Technology (ICT.RDA) (2004)

  • Tran, T.T.: On Conditional Random Fields: Applications, Feature Selection, Parameter Estimation and Hierarchical Modelling. PhD thesis, Curtin University of Technology (2008)

  • Tseng, D.C., Lai, C.C.: A genetic algorithm for MRF-based segmentation of multi-spectral textured images. Pattern Recogn. Lett. 20(14), 1499–1510 (1999)

  • Wainwright, M.J., Jaakkola, T.S., Willsky, A.S.: MAP estimation via agreement on (hyper)trees: message-passing and linear-programming approaches. IEEE Trans. Inform. Theory 51(11), 3697–3717 (2005)

  • Wales, D.J., Doye, J.P.K.: Global optimization by basin-hopping and the lowest energy structures of Lennard–Jones clusters containing up to 110 atoms. J. Phys. Chem. A 101(28), 5111–5116 (1997)

  • Weiss, Y., Freeman, W.T.: On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs. IEEE Trans. Inform. Theory 47(2), 736–744 (2001)

  • Werner, T.: A linear programming approach to the max-sum problem: a review. IEEE Trans. Pattern Anal. Mach. Intell. 29(7), 1165–1179 (2007)

  • Willsky, A.S.: Multiresolution Markov models for signal and image processing. Proc. IEEE 90(8), 1396–1458 (2002)

  • Wu, C., Doerschuk, P.C.: Tree approximations to Markov random fields. IEEE Trans. Pattern Anal. Mach. Intell. 17(4), 391–402 (1995)

  • Yanover, C., Meltzer, T., Weiss, Y.: Linear programming relaxations and belief propagation: an empirical study. J. Mach. Learn. Res. 7, 1887–1907 (2006)

  • Yousefi, S., Azmi, R., Zahedi, M.: Brain tissue segmentation in MR images based on a hybrid of MRF and social algorithms. Med. Image Anal. 16(4), 840–848 (2012)

Author information

Corresponding author

Correspondence to Truyen Tran.

Appendix: Distribution over conditional trees

We provide the derivation of Eq. (7) from a probabilistic argument. Recall that \(\mathcal {N}(\tau )\) is the Markov blanket of the tree \(\tau \), that is, the set of sites connected to \(\tau \). By the Markov property,

$$\begin{aligned} P\left( \mathbf {x}_{\tau }\mid \mathbf {x}_{\mathcal {N}(\tau )},\mathbf {y}\right)&= P\left( \mathbf {x}_{\tau }\mid \mathbf {x}_{\lnot \tau },\mathbf {y}\right) \\&\propto \exp \left\{ -E_{\tau }\left( \mathbf {x}_{\tau },\mathbf {x}_{\mathcal {N}(\tau )},\mathbf {y}\right) \right\} \end{aligned}$$

where

$$\begin{aligned} E_{\tau }\left( \mathbf {x}_{\tau },\mathbf {x}_{\mathcal {N}(\tau )},\mathbf {y}\right)&= \sum _{i\in \tau }E_{i}(x_{i},\mathbf {y})+\sum _{(i,j)\in \mathcal {E}\mid i,j\in \tau }E_{ij}(x_{i},x_{j})\nonumber \\&+\sum _{(i,j)\in \mathcal {E}\mid i\in \tau ,j\in \mathcal {N}(\tau )}E_{ij}(x_{i},x_{j}) \end{aligned}$$
(11)

Equation (7) can be derived from the energy above by letting:

$$\begin{aligned} E_{i}^{*}(x_{i},\mathbf {y})=E_{i}(x_{i},\mathbf {y})+\sum _{(i,j)\in \mathcal {E}\mid j\in \mathcal {N}(\tau )}E_{ij}(x_{i},x_{j}) \end{aligned}$$
(12)

Thus finding the most probable labeling of the tree \(\tau \) conditioned on its neighborhood,

$$\begin{aligned} \hat{\mathbf {x}}_{\tau }=\arg \max _{\mathbf {x}_{\tau }}P\left( \mathbf {x}_{\tau }\mid \mathbf {x}_{\mathcal {N}(\tau )},\mathbf {y}\right) \end{aligned}$$
(13)

is equivalent to minimizing the conditional energy in Eq. (9).

The equivalence can also be seen intuitively by considering the tree \(\tau \) as a mega-site, so the update in Eq. (9) is analogous to that in Eq. (4).
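To make the update concrete, here is a minimal sketch (with hypothetical names; not the authors' code) of Eqs. (11)-(13) specialised to a chain-shaped tree: boundary interactions with the fixed Markov blanket are folded into the unary energies as in Eq. (12), and exact min-sum dynamic programming then recovers the minimum-energy labeling of the tree.

```python
import numpy as np

def absorb_boundary(E_unary, E_pair, boundary):
    """Fold pairwise energies to fixed blanket labels into the unaries (Eq. 12).

    E_unary: (n, K) singleton energies for the n tree sites.
    E_pair:  (K, K) shared pairwise energy, E_pair[a, b] = E_ij(a, b).
    boundary: list of (i, x_j) pairs: tree site i touches a blanket site
              whose label is fixed at x_j.
    """
    E_star = E_unary.copy()
    for i, x_j in boundary:
        E_star[i, :] += E_pair[:, x_j]
    return E_star

def chain_map(E_star, E_pair):
    """Exact MAP labeling of a chain by min-sum dynamic programming."""
    n, K = E_star.shape
    msg = np.empty((n, K))              # msg[i, x]: best energy of sites 0..i ending in x
    back = np.zeros((n, K), dtype=int)  # argmin pointers for backtracking
    msg[0] = E_star[0]
    for i in range(1, n):
        total = msg[i - 1][:, None] + E_pair + E_star[i][None, :]
        back[i] = total.argmin(axis=0)
        msg[i] = total.min(axis=0)
    x = np.zeros(n, dtype=int)
    x[-1] = msg[-1].argmin()
    for i in range(n - 1, 0, -1):       # trace back the optimal labeling
        x[i - 1] = back[i, x[i]]
    return x

# Toy usage: a 5-site chain tree with 3 labels, Potts interactions, and two
# blanket sites fixed at labels 2 and 1 touching the chain's endpoints.
rng = np.random.default_rng(0)
E_unary = rng.normal(size=(5, 3))
E_pair = 0.5 * (1 - np.eye(3))          # Potts: penalty 0.5 for disagreement
E_star = absorb_boundary(E_unary, E_pair, boundary=[(0, 2), (4, 1)])
print(chain_map(E_star, E_pair))        # MAP labeling of the conditioned tree
```

The same min-sum recursion runs on any tree by passing messages from the leaves to a root, which is what makes this local-search step exact and cheap.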

1.1 Proof of Proposition 1

Recall that the energy can be decomposed into singleton and pairwise local energies (see Eq. 1)

$$\begin{aligned} E(\mathbf {x}_{\tau },\mathbf {x}_{\lnot \tau },\mathbf {y})&= \sum _{i\in \tau }E_{i}(x_{i},\mathbf {y})+\sum _{i\notin \tau }E_{i}(x_{i},\mathbf {y})+\sum _{(i,j)\in \mathcal {E}|i,j\in \tau }E_{ij}(x_{i},x_{j})\\&\quad +\sum _{(i,j)\in \mathcal {E}|i\in \tau ,j\in \mathcal {N}(\tau )}E_{ij}(x_{i},x_{j})+\sum _{(i,j)\in \mathcal {E}|i,j\notin \tau }E_{ij}(x_{i},x_{j}) \end{aligned}$$

where:

  • \(\sum _{i\in \tau }E_{i}(x_{i},\mathbf {y})\) is the data energy belonging to the tree \(\tau \),

  • \(\sum _{i\notin \tau }E_{i}(x_{i},\mathbf {y})\) is the data energy outside \(\tau \),

  • \(\sum _{(i,j)\in \mathcal {E}|i,j\in \tau }E_{ij}(x_{i},x_{j})\) is the interaction energy within the tree,

  • \(\sum _{(i,j)\in \mathcal {E}|i\in \tau ,j\in \mathcal {N}(\tau )}E_{ij}(x_{i},x_{j})\) is the interaction energy between the tree and its boundary, and

  • \(\sum _{(i,j)\in \mathcal {E}|i,j\notin \tau }E_{ij}(x_{i},x_{j})\) is the interaction energy outside the tree.

By grouping energies related to the tree and the rest, we have

$$\begin{aligned} E(\mathbf {x}_{\tau },\mathbf {x}_{\lnot \tau },\mathbf {y})&= E_{\tau }\left( \mathbf {x}_{\tau },\mathbf {x}_{\mathcal {N}(\tau )},\mathbf {y}\right) +\sum _{i\notin \tau }E_{i}(x_{i},\mathbf {y})+\sum _{(i,j)\in \mathcal {E}|i,j\notin \tau }E_{ij}(x_{i},x_{j}) \end{aligned}$$

where \(E_{\tau }\left( \mathbf {x}_{\tau },\mathbf {x}_{\mathcal {N}(\tau )},\mathbf {y}\right) \) is given in Eq. (11) and the remaining terms do not depend on \(\mathbf {x}_{\tau }\). Since \(\hat{\mathbf {x}}_{\tau }\) minimizes \(E_{\tau }\) by construction, this leads to:

$$\begin{aligned} E(\hat{\mathbf {x}}_{\tau },\mathbf {x}_{\lnot \tau },\mathbf {y})&= E_{\tau }\left( \hat{\mathbf {x}}_{\tau },\mathbf {x}_{\mathcal {N}(\tau )},\mathbf {y}\right) +\sum _{i\notin \tau }E_{i}(x_{i},\mathbf {y})+\sum _{(i,j)\in \mathcal {E}|i,j\notin \tau }E_{ij}(x_{i},x_{j})\\&\le E_{\tau }\left( \mathbf {x}_{\tau },\mathbf {x}_{\mathcal {N}(\tau )},\mathbf {y}\right) +\sum _{i\notin \tau }E_{i}(x_{i},\mathbf {y})+\sum _{(i,j)\in \mathcal {E}|i,j\notin \tau }E_{ij}(x_{i},x_{j})\\&= E(\mathbf {x}_{\tau },\mathbf {x}_{\lnot \tau },\mathbf {y}) \end{aligned}$$

This completes the proof. \(\square \)
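As an illustrative sanity check of Proposition 1 (an assumed toy setup, not from the paper), the following brute-force script verifies on a random 4-site chain MRF that exactly re-labeling the sub-tree \(\tau =\{1,2\}\) with its boundary held fixed never increases the global energy, whatever the starting labeling.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, K = 4, 3                                   # 4 sites in a chain, 3 labels
E_unary = rng.normal(size=(n, K))             # singleton energies E_i
E_pair = [rng.normal(size=(K, K)) for _ in range(n - 1)]  # energy of edge (i, i+1)

def energy(x):
    """Global energy: sum of singleton and pairwise terms along the chain."""
    return (sum(E_unary[i, x[i]] for i in range(n))
            + sum(E_pair[i][x[i], x[i + 1]] for i in range(n - 1)))

for x in itertools.product(range(K), repeat=n):
    # Exact conditional MAP over tau = {1, 2}, with x[0] and x[3] held fixed.
    t = min(itertools.product(range(K), repeat=2),
            key=lambda t: energy((x[0], t[0], t[1], x[3])))
    x_new = (x[0], t[0], t[1], x[3])
    assert energy(x_new) <= energy(x) + 1e-12  # Proposition 1
print("tree-conditional update never increased the energy on", K**n, "labelings")
```

The inequality holds because the original tree labeling is itself one of the candidates considered by the conditional minimisation.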


Cite this article

Tran, T., Phung, D. & Venkatesh, S. Tree-based iterated local search for Markov random fields with applications in image analysis. J Heuristics 21, 25–45 (2015). https://doi.org/10.1007/s10732-014-9270-1
