
Interactive Query Expansion using Concept-Based Directions Finder Based on Wikipedia
Despite advances in information retrieval, search engines still return imprecise or poor results, mainly because of the quality of the submitted query. Formulating a query that expresses an information need has always been challenging for users. In this paper, we propose an interactive query expansion methodology using a Concept-Based Directions Finder (CBDF). For a given query, the approach uses Explicit Semantic Analysis (ESA) to determine the directions in which the user can continue the search. The CBDF identifies the relevant terms, with a corresponding label, for each of the directions found, based on the content and link structure of Wikipedia. The relevant terms and their labels are suggested to the user for query expansion through a new visual interface. The visual interface, named terms mapper, accepts the query and displays the potential directions, together with a group of relevant terms and the label for the direction chosen by the user. We evaluated the results of the proposed approach and the visual interface on the identified queries. The experimental results show that the approach produces a good Mean Average Precision (MAP) for the queries chosen.
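For readers unfamiliar with the evaluation measure, the following is a minimal sketch of how Mean Average Precision is conventionally computed from ranked result lists and relevance judgments. It follows the standard textbook definition and is not the evaluation code used in this work. The function names, the document identifiers, and the assumption that each ranked list contains no duplicate documents are illustrative only.

```python
from typing import Hashable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision of one ranked list against a set of relevant documents.

    Assumes `ranked` has no duplicate entries (an assumption of this sketch).
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            # Precision measured at the rank of each relevant document.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Sequence[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of the per-query average precision values."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Two hypothetical queries with made-up document identifiers.
    runs = [
        (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
        (["d2", "d1"], {"d1"}),
    ]
    print(mean_average_precision(runs))
```

In this sketch, a query whose relevant documents appear near the top of the ranking receives an average precision close to 1, so a higher MAP over the chosen queries indicates that the expanded queries tend to rank relevant documents earlier.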
The International Arab Journal of Information Technology, Vol. 10, No. 6, November 2013