We propose comprehensive resources and models for understanding automatically transcribed videos. In particular, we pursue a deep learning model for identifying the important points and questions mentioned in a video transcript. To achieve this objective, we employ two specific deep learning models.
F2-T DeepSLAM: Object Detection, Re-identification and Prediction with Implicit Mapping
The project goals and objectives are:
1. Communication privacy and security of multi-agent systems:
Develop a privacy-enhanced multi-agent system that uses shared knowledge for both (i) Vision and (ii) Communication tasks.
2. Ego-motion prediction under intermittent feedback:
This goal removes the assumption that a GPS signal is always available and considers GPS-denied areas. We design a hybrid system in which a CNN-based localization method helps a traditional error-based control method maintain an error bound on its state information.
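The idea of keeping a state estimate within an error bound under intermittent feedback can be illustrated with a minimal one-dimensional sketch. Everything here is an assumption made for illustration and is not taken from the project: the function name `dead_reckon_with_fixes`, the linear growth of the worst-case error (`drift_per_step`), and the `localize` callback, which merely stands in for a CNN-based localizer.

```python
from typing import Callable, List, Tuple


def dead_reckon_with_fixes(
    velocities: List[float],
    localize: Callable[[int], float],
    drift_per_step: float,
    error_bound: float,
    fix_error: float = 0.0,
    x0: float = 0.0,
    dt: float = 1.0,
) -> Tuple[List[float], List[int]]:
    """Integrate velocity open loop and request a fix only when needed.

    `localize` is a hypothetical stand-in for a CNN-based localizer.
    The worst-case error is assumed to grow by `drift_per_step` per step.
    """
    estimate = x0
    bound = 0.0  # assumed worst-case error of the current estimate
    estimates: List[float] = []
    fix_steps: List[int] = []
    for step, v in enumerate(velocities):
        estimate += v * dt       # open-loop prediction without GPS
        bound += drift_per_step  # uncertainty grows while feedback is absent
        if bound > error_bound:
            # Bound violated: replace the prediction with a localization fix.
            estimate = localize(step)
            bound = fix_error
            fix_steps.append(step)
        estimates.append(estimate)
    return estimates, fix_steps
```

In this sketch, feedback is requested only at the steps where the assumed error bound would otherwise be exceeded, so the number of localization calls shrinks as the tolerated bound grows.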
F6-M Adaptive Manifold Learning for Multi-Sensor Translation and Fusion given Missing Data
The goal of this work is to translate streams of data from individual sensors into a shared-manifold space for joint understanding and processing.
F12-A Explainable Commonsense Question Answering
Current question answering systems are incapable of providing human-interpretable explanations or proofs to support their decisions. In this project, we propose general methods for answering commonsense questions while offering natural language explanations or supporting facts. In particular, we propose Copy-explainer, which generates natural language explanations that later help answer commonsense questions by leveraging structured and unstructured commonsense knowledge from external knowledge graphs and pre-trained language models. Furthermore, we propose Encyclopedia Net, a fact-level causal knowledge graph that facilitates commonsense reasoning for question answering.
M1-V Deep Learning in Video Compression
We propose end-to-end video compression with motion field prediction. In video-based point cloud compression (V-PCC), a dynamic point cloud is projected onto geometry and attribute videos patch by patch for compression. We propose a CNN-based occupancy map recovery method to improve the quality of the reconstructed occupancy map video. To the best of our knowledge, this is the first deep-learning-based work on accurate occupancy map recovery for improving V-PCC coding efficiency.
M2-T Point Cloud Denoising
The goal of this project is to advance point cloud post-processing by using deep learning methods to understand the global and local manifolds of a 3D object.
M3-V Video De-Duplication
The goal of this project is to develop novel deep learning algorithms for video segment hashing and identification to support efficient and accurate identification and removal of duplicates from phones and cloud storage.
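To make the hashing-and-identification pipeline concrete, here is a minimal sketch that uses a classical average hash in place of a learned one. It is not the project's deep learning method: the functions `average_hash`, `hamming`, and `deduplicate`, the representation of a frame as rows of grayscale intensities, and the `max_distance` threshold are all assumptions chosen for illustration.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # grayscale frame as rows of pixel intensities


def average_hash(frame: Frame) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def deduplicate(frames: List[Frame], max_distance: int = 1) -> List[int]:
    """Return indices of frames kept after dropping near-duplicates.

    A frame is dropped when its hash lies within `max_distance` bits of
    the hash of a frame that was already kept.
    """
    kept: List[int] = []
    kept_hashes: List[int] = []
    for i, frame in enumerate(frames):
        h = average_hash(frame)
        if all(hamming(h, k) > max_distance for k in kept_hashes):
            kept.append(i)
            kept_hashes.append(h)
    return kept
```

A learned hash would replace `average_hash` while the Hamming-distance lookup stays the same, which is why compact binary codes are attractive for matching at scale.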
M7 P2PDL: Peer-to-Peer Deep Learning using Blockchain for Effective Domain Adaptation & Privacy Preserving
This project explores domain adaptation and online learning for model customization, with attention to privacy preservation and secure communication. We propose blockchain-based peer-to-peer federated learning (P2PDL) using federated learning applications.
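As a rough illustration of how peer model updates might be combined and recorded in a tamper-evident way, the sketch below pairs generic weighted federated averaging with a simple hash-chained ledger. It is not the P2PDL protocol, which the summary does not specify: the function names, the block layout, and the use of SHA-256 over a JSON payload are all assumptions.

```python
import hashlib
import json
from typing import Dict, List


def federated_average(updates: List[List[float]], counts: List[int]) -> List[float]:
    """Average parameter vectors, weighting each peer by its sample count."""
    total = sum(counts)
    dim = len(updates[0])
    return [sum(u[j] * n for u, n in zip(updates, counts)) / total for j in range(dim)]


def block_hash(prev_hash: str, payload: Dict) -> str:
    """Hash a block's payload together with the previous block's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain: List[Dict], payload: Dict) -> None:
    """Append a payload to the ledger, linking it to the last block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    chain.append({"prev": prev_hash, "payload": payload,
                  "hash": block_hash(prev_hash, payload)})


def verify_chain(chain: List[Dict]) -> bool:
    """Recompute every hash and link; any edited block breaks verification."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True
```

For example, each peer could append a record of its update with `append_block`, and any peer could call `verify_chain` before aggregating with `federated_average`. A real blockchain would additionally need a consensus mechanism, which this sketch omits.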
M8-M/C Secure Inner Product for Privacy-Preserving Pattern Matching
The goal of the project is to secure the authentication of a template, especially a biometric query, without compromising the template, the database, or the query in the event of a database attack or a corrupted communication channel.
M10 Grid Congestion Price Forecasting using Deep Learning
The objective of this project is to provide an accurate day-ahead electricity price forecasting system in the presence of congestion. The system applies modern deep learning techniques to data comprising power generation from various energy plants, weather conditions, and past nodal prices.