M4-V Querying Images via Knowledge Graphs

PI: Praveen Rao

The objective of this project is to develop QIK, a fast graph-based query processing system for searching images via their metadata using knowledge graphs. QIK will also recommend interesting questions to users to make search easier. Our goal is to combine the scalability and performance of graph databases, the expressiveness of graph queries for complex retrieval tasks, and natural language processing techniques to provide users with high-quality image matches at low latency. We will develop similarity metrics by constructing parse trees and dependency trees of the image captions and filtering candidates using locality-sensitive hashing and tree edit distance computation. These metrics will be used to develop new ranking schemes so that the most relevant image matches are shown to users. The rich metadata will be modeled as a knowledge graph using RDF quads and OWL. We will develop new parallel algorithms and distributed indexing structures for fast RDF query processing on billions of RDF quads using Apache Spark. We will generate RDF graph embeddings by taking star-shaped triples in the knowledge graph and constructing their embeddings in a vector space based on the “company” they keep. These embeddings will be used for better indexing as well as for automatic question generation to recommend interesting queries to the user. Similar star-shaped triples will be mapped to factoid questions. We will also compose new questions from the basic factoid questions.
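
To make the idea of star-shaped triples and factoid questions more concrete, the following is a minimal sketch in plain Python and not QIK's actual implementation. It assumes a star is simply the set of quads sharing one subject, and it uses a sorted list of predicates as a crude stand-in for similarity between stars; it does not compute embeddings. The `ex:` identifiers, the `PHRASES` templates, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Quad(NamedTuple):
    """An RDF quad: a (subject, predicate, object) triple plus its named graph."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Hypothetical image metadata; every identifier below is made up.
QUADS = [
    Quad("ex:img1", "ex:depicts", "ex:dog", "ex:captions"),
    Quad("ex:img1", "ex:takenIn", "ex:park", "ex:captions"),
    Quad("ex:img2", "ex:depicts", "ex:cat", "ex:captions"),
    Quad("ex:img2", "ex:takenIn", "ex:kitchen", "ex:captions"),
]

# Hypothetical verb-phrase templates, one per predicate.
PHRASES = {
    "ex:depicts": "depict a {obj}",
    "ex:takenIn": "were taken in a {obj}",
}


def star_subgraphs(quads):
    """Group quads by subject; each group is a star centered on that subject."""
    stars = defaultdict(list)
    for quad in quads:
        stars[quad.subject].append(quad)
    return dict(stars)


def star_signature(star):
    """Sorted predicates of a star (an assumed, simplistic notion of similarity)."""
    return tuple(sorted(quad.predicate for quad in star))


def phrases_for(star):
    """Fill in the verb-phrase template for each edge that has one."""
    return [
        PHRASES[quad.predicate].format(obj=quad.obj.split(":", 1)[1])
        for quad in star
        if quad.predicate in PHRASES
    ]


def factoid_questions(star):
    """One basic question per edge of the star."""
    return [f"Which images {phrase}?" for phrase in phrases_for(star)]


def composed_question(star):
    """Compose a new question by conjoining the basic ones."""
    return "Which images " + " and ".join(phrases_for(star)) + "?"


if __name__ == "__main__":
    for subject, star in star_subgraphs(QUADS).items():
        print(subject, star_signature(star))
        for question in factoid_questions(star):
            print("  ", question)
        print("  ", composed_question(star))
```

In this sketch the two example images have the same predicate signature, so they would share the same question templates. A real system would presumably replace the signature comparison with a learned embedding and similarity search.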