This project aims to advance the state of the art in image and video compression by adopting deep learning methods for prediction, transform, entropy coding, and post-processing. It will develop new deep-learning-based coding tools for post-processing and reconstruction enhancement, and investigate new deep learning pipelines for end-to-end image and video compression. The goals are to achieve significant coding improvements at practical computational complexity and to deliver insights into deep-learning-based video compression for machine consumption, e.g., tracking, segmentation, and recognition.
Tag: Computer Vision
F1-M Bidirectional Deep Learning Architecture for Scene Understanding
This project aims to create deep architectures, inspired by the cognitive sciences, that understand visual scenes in images or videos. The defining characteristic of the proposed architecture is that it simplifies inference using biologically plausible marginals (object type and spatial location), which can be learned in an unsupervised way directly from data (i.e., without labels).
F5-T DeepSLAM: Visual Intelligence for Navigation and Planning
The goals of this project include developing the ability to combine offline, transferred, and real-time learning from various data sources (intermittent state feedback, cloud) to enable SLAM (simultaneous localization and mapping) and related image-based estimation methods.
F6-V Machine-Learning-Enabled Video Coding Strategy for Object Detection
The goal of this project is to develop a machine-learning-enabled video coding strategy for object detection. Most existing video encoders minimize distortion under a rate constraint. For surveillance video, however, it is more desirable for the encoder to maximize detection probability under a rate constraint. To address this, we will design a new video coding strategy with that objective. We will locate the information that is important to the object detector, develop a rate-detection-optimized framework for mode selection, and design an optimized bit-rate allocation method.
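As a rough illustration of the difference between the two objectives, the sketch below compares conventional rate-distortion mode selection with a rate-detection variant. It is not the project's method: the `ModeCandidate` fields, the per-mode `miss_prob` estimate, the Lagrange multiplier values, and the numbers are all hypothetical, chosen only to show how the selected mode can change when the cost term changes.

```python
from dataclasses import dataclass


@dataclass
class ModeCandidate:
    """Hypothetical per-block coding mode with illustrative cost estimates."""
    name: str
    rate_bits: float   # bits needed to code the block in this mode
    distortion: float  # pixel-domain distortion, e.g. sum of squared errors
    miss_prob: float   # assumed estimate of the detector missing the object


def select_mode_rd(candidates, lam):
    """Conventional choice: minimize distortion + lam * rate."""
    return min(candidates, key=lambda m: m.distortion + lam * m.rate_bits)


def select_mode_rdet(candidates, lam):
    """Detection-oriented choice: minimize miss probability + lam * rate."""
    return min(candidates, key=lambda m: m.miss_prob + lam * m.rate_bits)


# Made-up numbers for a single block; not measured data.
candidates = [
    ModeCandidate("skip", rate_bits=2, distortion=40.0, miss_prob=0.60),
    ModeCandidate("inter", rate_bits=30, distortion=20.0, miss_prob=0.10),
    ModeCandidate("intra", rate_bits=80, distortion=8.0, miss_prob=0.05),
]

rd_choice = select_mode_rd(candidates, lam=1.0)
rdet_choice = select_mode_rdet(candidates, lam=0.01)
print(rd_choice.name, rdet_choice.name)
```

With these assumed values, the distortion-based cost favors the cheap "skip" mode, while the detection-based cost spends more bits on "inter" because doing so sharply lowers the assumed miss probability. The two multipliers are on different scales because the two cost terms have different units.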
M1-V Deep Learning for Future Video Compression
The goal of this project is to develop new deep-learning-based high-dimensional signal models and prediction tools for immersive visual signal coding.
M4-V Querying Images via Knowledge Graphs
The objective of this project is to develop QIK, a fast graph-based query processing system for searching images via metadata using the concept of knowledge graphs. QIK will also recommend interesting questions to users to enable easier search.
M5-M Low Resolution and Quality Image Understanding
The goals of this project are (1) to build a new end-to-end image processing pipeline for denoising and enhancing extremely low-light images and (2) to develop a recognition-friendly super-resolution method for low-resolution image recognition.