The International Arab Journal of Information Technology (IAJIT)



Logical Schema-Based Mapping Technique to Reduce Search Space in the Data Warehouse for ...

Data warehouse systems are used for decision-making purposes. Online Analytical Processing (OLAP) tools are commonly used to query such systems and analyse the results. Querying a data warehouse with an OLAP tool requires knowledge of its schema, which makes it a complex task for non-technical users (executives, managers, etc.). For such users, a natural language interface is a viable solution that transparently accesses the data to fulfil their requirements. Because a data warehouse contains several times more data than the operational systems, and this volume grows with each incremental refresh, keyword-based searching cannot be performed in the same way as in natural language interfaces to databases. Existing natural language interfaces to data warehouses commonly search for keywords directly in the data instances, which takes excessive time to generate results. This paper proposes a Logical Schema-based Mapping (LSM) technique to reduce the search space in the data warehouse data instances. It maps the natural language query keywords to the logical schema of the data warehouse to identify the relevant schema elements before searching the data instances. The matches retrieved for a keyword are ranked on six criteria proposed in this paper, and an algorithm built on these criteria is presented. Once the schema elements have been identified, a targeted search of the data instances is performed efficiently. In-depth experiments have been carried out on a real dataset to evaluate the system with respect to completeness, accuracy and performance. The results show that the LSM technique outperforms the existing systems.
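The idea of matching query keywords against the logical schema before touching the data instances can be illustrated with a minimal Python sketch. It is not the authors' algorithm. The schema, the names `SCHEMA`, `similarity` and `map_keywords`, and the 0.8 threshold are all hypothetical, and a single string-similarity score stands in for the six ranking criteria, which this abstract does not enumerate.

```python
from difflib import SequenceMatcher

# Hypothetical logical schema (table -> columns); not taken from the paper.
SCHEMA = {
    "sales_fact": ["sale_amount", "quantity", "date_key", "product_key", "store_key"],
    "product_dim": ["product_key", "product_name", "category"],
    "store_dim": ["store_key", "store_name", "city"],
}


def similarity(keyword: str, name: str) -> float:
    """Best string similarity between a keyword and a schema name or its tokens."""
    name = name.lower()
    candidates = name.split("_") + [name]
    return max(SequenceMatcher(None, keyword.lower(), c).ratio() for c in candidates)


def map_keywords(keywords, schema, threshold=0.8):
    """Map each keyword to schema elements, ranked by similarity (highest first).

    Each match is (score, table, column); column is None for a table-level match.
    The single score is a placeholder for the paper's six ranking criteria.
    """
    matches = {}
    for kw in keywords:
        scored = []
        for table, columns in schema.items():
            score = similarity(kw, table)
            if score >= threshold:
                scored.append((score, table, None))
            for col in columns:
                score = similarity(kw, col)
                if score >= threshold:
                    scored.append((score, table, col))
        matches[kw] = sorted(scored, key=lambda m: -m[0])
    return matches


if __name__ == "__main__":
    result = map_keywords(["category", "city", "revenue"], SCHEMA)
    for keyword, found in result.items():
        print(keyword, found)
```

In this sketch, a keyword that maps to a schema element restricts any later lookup to that table or column, while a keyword with no schema match (here `revenue`) would still have to be searched in the data instances.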


The International Arab Journal of Information Technology, Vol. 14, No. 1, January 2017

Fiaz Majeed is a Lecturer in the Faculty of Computing and Information Technology at the University of Gujrat (UOG), Pakistan. He received his MS in Computer Science from COMSATS Institute of Information Technology (CIIT), Lahore, Pakistan, in 2009. He is currently a PhD scholar at the University of Engineering and Technology, Lahore, Pakistan, under the supervision of Prof. Dr. Muhammad Shoaib. His research interests include data warehousing, natural language processing and information retrieval. He has published 15 research papers in refereed journals and international conference proceedings in these areas.

Muhammad Shoaib is a Professor in the Computer Science and Engineering Department at the University of Engineering and Technology (UET), Lahore, Pakistan. He completed his Ph.D. at the University of Engineering and Technology, Lahore, Pakistan, in 2006 and his postdoctoral work at Florida Atlantic University, USA, in 2009. His current research interests include Information Retrieval (IR) systems, information systems, software engineering and the Semantic Web. He has published more than 40 papers in refereed journals and international conference proceedings in these areas.