Matching Top-k Answers of Twig Patterns in Probabilistic XML

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5981)

Abstract

The flexibility of the XML data model allows a more natural representation of uncertain data than the relational model does. Top-k matching of a twig pattern against probabilistic XML data is therefore an essential operation. Some classical twig pattern algorithms can be adapted to process probabilistic XML. However, when the goal is to find the answers with the top-k probabilities, the existing algorithms perform poorly, because they must process many unnecessary intermediate path results that have small probabilities. To cope with this problem, we propose a new encoding scheme for probabilistic XML called PEDewey. Based on this encoding scheme, we then design two algorithms for finding the answers with the top-k probabilities for twig queries. The first, ProTJFast, processes probabilistic XML data using element streams in document order; the second, PTopKTwig, uses element streams ordered by path probability value. Experiments have been conducted to study the performance of these algorithms.
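To make the notion of ranking by path probability concrete, the following is a minimal sketch and not the paper's method: it does not implement PEDewey, ProTJFast, or PTopKTwig, and it matches a single tag rather than a full twig pattern. It assumes, for illustration only, that each node carries the probability that it exists given that its parent exists, and that a node's path probability is the product of these values along the root-to-node path. The names `Node`, `path_probabilities`, and `top_k_by_tag` are hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    # Assumption: probability that this node exists given that its parent exists.
    prob: float = 1.0
    children: list = field(default_factory=list)


def path_probabilities(root):
    """Yield (path_probability, tag_path) for every node, in document order."""
    stack = [(root, 1.0, ())]
    while stack:
        node, parent_prob, prefix = stack.pop()
        p = parent_prob * node.prob  # product along the root-to-node path
        path = prefix + (node.tag,)
        yield p, "/".join(path)
        # Push children in reverse so they are popped in document order.
        for child in reversed(node.children):
            stack.append((child, p, path))


def top_k_by_tag(root, tag, k):
    """Return the k nodes with the given tag that have the highest path probability."""
    matches = (
        (p, path)
        for p, path in path_probabilities(root)
        if path.rsplit("/", 1)[-1] == tag
    )
    return heapq.nlargest(k, matches)


# Hypothetical document: two uncertain books with uncertain titles.
doc = Node("lib", 1.0, [
    Node("book", 0.9, [Node("title", 0.8), Node("title", 0.5)]),
    Node("book", 0.4, [Node("title", 1.0)]),
])
top2 = top_k_by_tag(doc, "title", 2)
```

This sketch enumerates every matching node before ranking, which is the kind of work on low-probability intermediate results that the abstract identifies as the performance problem; an approach that reads streams ordered by path probability could, in principle, stop early instead.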



Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ning, B., Liu, C., Yu, J.X., Wang, G., Li, J. (2010). Matching Top-k Answers of Twig Patterns in Probabilistic XML. In: Kitagawa, H., Ishikawa, Y., Li, Q., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 5981. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12026-8_12

  • DOI: https://doi.org/10.1007/978-3-642-12026-8_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12025-1

  • Online ISBN: 978-3-642-12026-8

  • eBook Packages: Computer Science, Computer Science (R0)
