ELCA evaluation for keyword search on probabilistic XML data

World Wide Web

Abstract

As probabilistic data management becomes a major research focus and keyword search becomes an increasingly popular way to query data, it is natural to ask how keyword queries can be supported on probabilistic XML data. For keyword queries on deterministic XML documents, ELCA (Exclusive Lowest Common Ancestor) semantics allows more relevant fragments, rooted at the ELCAs, to appear as results, and it is more popular than other keyword query result semantics such as SLCA. In this paper, we investigate how to evaluate ELCA results for keyword queries on probabilistic XML documents. After defining probabilistic ELCA semantics in terms of possible world semantics, we propose an approach that computes ELCA probabilities without generating possible worlds. We then develop an efficient stack-based algorithm that finds all probabilistic ELCA results, together with their ELCA probabilities, for a given keyword query on a probabilistic XML document. Finally, we experimentally evaluate the proposed ELCA algorithm and compare it with its SLCA counterpart in terms of result probability, time and space efficiency, and scalability.
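To make the possible world definition concrete, the following Python sketch computes ELCA probabilities by brute-force enumeration of worlds. It is only an illustration of the semantics, not the stack-based algorithm of the paper, which avoids generating possible worlds. Everything in it is an assumption made for the example: the toy document `DOC`, the simplified model in which each node exists independently with a given probability whenever its parent exists, and the helper names `elcas` and `elca_probabilities`.

```python
from itertools import product

# Hypothetical toy document: node -> (parent, probability that the node
# exists given that its parent exists, keywords stored at the node).
DOC = {
    "r":  (None, 1.0, set()),
    "a":  ("r", 0.8, set()),
    "a1": ("a", 0.5, {"xml"}),
    "a2": ("a", 0.5, {"search"}),
    "b":  ("r", 0.6, {"xml"}),
    "c":  ("r", 0.5, {"search"}),
}


def elcas(doc, present, query):
    """Return the ELCA nodes of one deterministic world (the `present` nodes)."""
    children = {n: [] for n in present}
    for n in present:
        parent = doc[n][0]
        if parent is not None:
            children[parent].append(n)
    result = set()

    def visit(node):
        # full: query keywords found anywhere in the subtree of `node`.
        # exclusive: keywords still visible at `node` after discarding every
        # child subtree that already contains the whole query.
        full = doc[node][2] & query
        exclusive = set(full)
        for child in children[node]:
            child_full = visit(child)
            full |= child_full
            if child_full != query:
                exclusive |= child_full
        if exclusive == query:
            result.add(node)
        return full

    root = next(n for n in present if doc[n][0] is None)
    visit(root)
    return result


def elca_probabilities(doc, query):
    """Sum, per node, the probability of the worlds in which it is an ELCA."""
    query = set(query)
    names = list(doc)
    totals = {}
    # One independent coin per node; a node is present in the world only if
    # its own coin and the coins of all its ancestors come up True.
    for coins in product([True, False], repeat=len(names)):
        flip = dict(zip(names, coins))
        weight = 1.0
        for n in names:
            weight *= doc[n][1] if flip[n] else 1.0 - doc[n][1]
        if weight == 0.0:
            continue
        present = set()
        for n in names:
            cur = n
            while cur is not None and flip[cur]:
                cur = doc[cur][0]
            if cur is None:
                present.add(n)
        if not present:
            continue
        for n in elcas(doc, present, query):
            totals[n] = totals.get(n, 0.0) + weight
    return totals


probs = elca_probabilities(DOC, {"xml", "search"})
```

For this toy document and the query {xml, search}, node `a` is an ELCA only when both of its children exist, which gives a probability of about 0.8 × 0.5 × 0.5 = 0.2, while the root `r` is an ELCA with probability of about 0.4. The enumeration is exponential in the number of nodes, which is why an approach that avoids materialising possible worlds matters in practice.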

Author information

Corresponding author

Correspondence to Rui Zhou.

About this article

Cite this article

Zhou, R., Liu, C., Li, J. et al. ELCA evaluation for keyword search on probabilistic XML data. World Wide Web 16, 171–193 (2013). https://doi.org/10.1007/s11280-012-0166-4

  • DOI: https://doi.org/10.1007/s11280-012-0166-4
