Ontology Based Information Retrieval System for Digital Library

Applying Semantic Web technology to the Information Retrieval (IR) process is becoming an effective way to improve search accuracy and retrieve more relevant results in web-based systems, especially digital libraries. In the digital library field, an ontology can be used to organize bibliographic descriptions, represent and expose the contents of documents, and share knowledge between users. This research therefore proposes an IR model for digital libraries that adapts the Vector Space Model (VSM) and combines it with two Semantic Web technologies: the Web Ontology Language (OWL) and SPARQL. The main idea of the proposed IR model is that resource metadata are stored in Resource Description Framework (RDF) format and retrieved not only by the keywords contained in the user query but also by the contexts defined in the domain ontology. The model consists of three steps: preprocessing, context matching, and similarity calculation. An algorithm for forming SPARQL queries is developed as part of the context matching step. Based on the proposed IR model, an ontology-based IR system for a digital library is implemented in a Service-Oriented Architecture (SOA) using XML Web Service technology and ASP.NET. The architecture of the proposed system consists of file storage for documents, one ontology dataset, and two programming components: the Digital Library Web Service and the Web Application. The ontology for the digital library is designed in OWL using the Protégé v3.5 tool. Functions for publishing and retrieving documents are implemented as a web service in the C# programming language. The user interface is designed and implemented as a web application on the ASP.NET platform, which consumes the functions of the web service.
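The similarity calculation step of a VSM-style model can be illustrated with a minimal sketch. The abstract does not specify the term weighting or the exact adaptation used, so the code below assumes plain term-frequency vectors and cosine similarity; the function names (`tokenize`, `cosine_similarity`, `rank`) and the whitespace tokenizer are hypothetical and are not taken from the described system.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace only.
    return text.lower().split()


def cosine_similarity(query, document):
    # Build term-frequency vectors for the query and the document.
    q = Counter(tokenize(query))
    d = Counter(tokenize(document))
    # Dot product over the query terms (Counter returns 0 for missing terms).
    dot = sum(q[t] * d[t] for t in q)
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)


def rank(query, documents):
    # documents: mapping of document id -> text; returns ids by descending score.
    scored = [(doc_id, cosine_similarity(query, text))
              for doc_id, text in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In the model described above, candidate documents would first be narrowed by context matching against the ontology before being scored; this sketch covers only the scoring part.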
To evaluate the performance of the proposed IR system, it was tested on 415 training documents of various file types (.doc, .pdf, .txt) using 33 queries targeting different document properties. Precision, recall, and F-values were measured and compared. According to the comparison results, the ontology-based IR system is more accurate when searching on properties of the ObjectProperty type. As a result, the proposed system provides user-friendly, high-performance, and scalable semantic search of information in the digital library.
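The evaluation measures named above can be computed per query as in the following sketch. It assumes set-based precision and recall and the balanced F-measure (F1); the abstract does not state which F-value variant was used, and the function name `precision_recall_f1` and the document ids in the usage example are hypothetical.

```python
def precision_recall_f1(retrieved, relevant):
    # retrieved: ids returned by the system; relevant: ids judged relevant.
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Usage with made-up ids: 2 of 4 retrieved documents are relevant,
# and 2 of the 3 relevant documents were found.
p, r, f = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
```

Averaging these values over all test queries would give system-level figures of the kind compared in the evaluation.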
