
XML keyword search with promising result type recommendations

World Wide Web

Abstract

Keyword search lets inexperienced users query XML databases without knowing complex structured query languages or the underlying XML schemas. Existing work has addressed the problem of selecting data nodes that match the keywords and connecting them in a meaningful way, e.g., SLCA and ELCA. However, returning all the connected subtrees to users is time-consuming and unnecessary, because users are generally interested in only part of the relevant results. In this paper, we propose a new keyword search approach that uses statistics of the underlying XML data to decide the promising result types and then quickly retrieves the corresponding results with the help of the selected types. To guarantee the quality of the selected promising result types, we measure the correlation between result types and a keyword query by analyzing the distribution of the relevant keywords and their structures within the XML data to be searched. In addition, relevant result types can be computed efficiently without evaluating the keyword query and without any schema information. To directly return the top-k keyword search results that conform to the suggested promising result types, we design two new algorithms that adapt to the structural sensitivity of the keyword nodes over the keyword search results. Lastly, we implement all the proposed approaches and present experimental results that show the effectiveness of our approach.
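For readers unfamiliar with the SLCA (smallest lowest common ancestor) semantics mentioned in the abstract, the following is a minimal illustrative sketch. It is not one of the algorithms proposed in this paper. It assumes that each XML node is identified by a Dewey label (a tuple of child positions from the root) and that the matches for each keyword are already known. The function name `slca_naive` and the sample labels are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Assumption: a node is identified by its Dewey label, e.g. (1, 2, 1).
Dewey = Tuple[int, ...]


def ancestors_or_self(node: Dewey) -> Set[Dewey]:
    """Return the node together with all of its ancestors."""
    return {node[:i] for i in range(1, len(node) + 1)}


def slca_naive(matches: Dict[str, List[Dewey]]) -> List[Dewey]:
    """Naive SLCA: nodes whose subtree contains every keyword,
    and that have no proper descendant with the same property."""
    if not matches or any(not nodes for nodes in matches.values()):
        return []

    # Nodes whose subtree contains at least one match of every keyword.
    covering: Optional[Set[Dewey]] = None
    for nodes in matches.values():
        covers: Set[Dewey] = set()
        for node in nodes:
            covers |= ancestors_or_self(node)
        covering = covers if covering is None else covering & covers

    assert covering is not None
    # Keep only the nodes that have no proper descendant in the covering set.
    result = [
        v for v in covering
        if not any(len(u) > len(v) and u[:len(v)] == v for u in covering)
    ]
    return sorted(result)


# Hypothetical sample: two keywords with their matching nodes.
sample = {
    "xml": [(1, 1, 1), (1, 2, 1)],
    "li": [(1, 1, 2), (1, 3)],
}
print(slca_naive(sample))
```

In this sample, node `(1, 1)` is the only node that contains both keywords and has no descendant that also does, so it is the single SLCA. The root `(1,)` also contains both keywords but is excluded because `(1, 1)` lies below it. Published SLCA algorithms avoid materializing all ancestors as this sketch does.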



Author information


Corresponding author

Correspondence to Jianxin Li.


About this article

Cite this article

Li, J., Liu, C., Zhou, R. et al. XML keyword search with promising result type recommendations. World Wide Web 17, 127–159 (2014). https://doi.org/10.1007/s11280-012-0198-9


  • DOI: https://doi.org/10.1007/s11280-012-0198-9
