Full-fledged semantic indexing and querying model designed for seamless integration in legacy RDBMS
Authors
Tekli, Joe
Chbeir, Richard
Traina, Agma J.M.
Traina, Caetano
Yetongnon, Kokou
Ibanez, Carlos Raymundo
Al Assad, Marc
Kallas, Christian
Issue Date
2018-09
Metadata
Publisher
Elsevier B.V.
Journal
Data and Knowledge Engineering
DOI
10.1016/j.datak.2018.07.007
Additional Links
https://linkinghub.elsevier.com/retrieve/pii/S0169023X16301835
Abstract
In the past decade, there has been an increasing need for semantic-aware data search and indexing in textual (structured and NoSQL) databases, as full-text search systems became available to non-experts. Such users have no knowledge of the data being searched and often formulate query keywords that differ from those used by the authors when indexing relevant documents, which produces noisy and sometimes irrelevant results. In this paper, we address the problem of semantic-aware querying and provide a general framework for modeling and processing semantic-based keyword queries in textual databases, i.e., considering the lexical and semantic similarities/disparities when matching user query terms and data index terms. To do so, we design and construct a semantic-aware inverted index structure called SemIndex, which extends the standard inverted index by constructing a tightly coupled inverted index graph that combines two main resources: a semantic network and a standard inverted index on a collection of textual data. We then provide a general keyword query model with specially tailored query processing algorithms built on top of SemIndex, in order to produce semantic-aware results, allowing the user to choose the results' semantic coverage and expressiveness based on her needs. To investigate the practicality and effectiveness of SemIndex, we discuss its physical design within a standard commercial RDBMS, which allows creating, storing, and querying its graph structure, thus enabling the system to easily scale up and handle large volumes of data. We have conducted a battery of experiments to test the performance of SemIndex, evaluating its construction time, storage size, query processing time, and result quality, in comparison with a legacy inverted index. Results highlight both the effectiveness and scalability of our approach.
Type
info:eu-repo/semantics/article
Rights
info:eu-repo/semantics/restrictedAccess
Description
The full text of this work is not available in the UPC Academic Repository due to restrictions imposed by the publisher.
ISSN
0169023X
Sponsors
This study is partly funded by the National Council for Scientific Research - Lebanon (CNRS-L), by the Lebanese American University (LAU), and the Research Support Foundation of the State of São Paulo (FAPESP).
Appendix: SemIndex Weighting Scheme
We propose a set of weighting functions to assign weight scores to SemIndex entries, including: index nodes, index edges, data nodes, and data edges. The weighting functions are used to select and rank semantically relevant results w.r.t. the user's query (cf. SemIndex query processing in Section 5). Other weight functions could be later added to cater to the index designer's needs.
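The general idea of coupling an inverted index with a semantic network, and of ranking results by weight, can be illustrated with a minimal Python sketch. This is not the SemIndex implementation or its weighting scheme: the toy documents, the semantic links, the function names, the `max_hops` parameter (standing in for a user-chosen semantic coverage), and the `decay ** hops` score are all assumptions made for illustration only.

```python
from collections import defaultdict, deque

# Hypothetical toy collection of textual data objects.
DOCS = {
    1: "fast car for sale",
    2: "used automobile in good condition",
    3: "bicycle repair shop",
}

# Hypothetical semantic network: undirected links between related terms.
SEMANTIC_LINKS = [
    ("car", "automobile"),
    ("automobile", "vehicle"),
    ("bicycle", "vehicle"),
]


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def build_semantic_graph(links):
    """Build an adjacency map from a list of undirected term links."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def semantic_query(term, index, graph, max_hops=1, decay=0.5):
    """Breadth-first expansion of the query term through the semantic graph.

    Returns {doc_id: score}. A document reached through a term that is
    `hops` links away from the query term scores decay ** hops (an assumed,
    illustrative weight); the best score per document is kept.
    """
    scores = {}
    seen = {term}
    queue = deque([(term, 0)])
    while queue:
        current, hops = queue.popleft()
        weight = decay ** hops
        for doc_id in index.get(current, ()):
            scores[doc_id] = max(scores.get(doc_id, 0.0), weight)
        if hops < max_hops:
            for neighbor in graph.get(current, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append((neighbor, hops + 1))
    return scores


INDEX = build_inverted_index(DOCS)
GRAPH = build_semantic_graph(SEMANTIC_LINKS)

# max_hops=0 behaves like a plain inverted index lookup;
# larger values widen the semantic coverage of the results.
print(semantic_query("car", INDEX, GRAPH, max_hops=0))
print(semantic_query("car", INDEX, GRAPH, max_hops=1))
```

With `max_hops=0` only the document containing the literal keyword is returned, while `max_hops=1` also returns the document indexed under the linked term "automobile", at a lower score. Separate weights for index nodes, index edges, data nodes, and data edges, as described in the appendix, would replace the single decay factor used here.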

