Research Areas
My research focuses on improving scene understanding across diverse environments through the following strategies:
- Test-Time Preprocessing: Blind Motion Deblurring, Single Image Deraining, Low-Light Image Enhancement
- Train-Time Preprocessing: Image Warping
- Finetuning: Image Generation, Image-to-Image Translation
|
|
Instance-Level Image Warping for Domain Adaptation
Shen Zheng,
Anurag Ghosh,
Srinivasa Narasimhan
Under Anonymous Review
Motivation:
Object scale bias challenges contemporary visual recognition systems.
Solution:
Propose an instance-level image warping technique that uses dataset-specific size statistics to warp images in place during training, addressing object scale bias.
Integrate image warping and feature unwarping into domain adaptation in a task-agnostic way without warping at test time.
|
|
TPSeNCE: Towards Artifact-Free Realistic Rain Generation for Deraining and Object Detection in Rain
Shen Zheng,
Changjie Lu,
Srinivasa Narasimhan
WACV 2024
Paper |
Supp |
Code |
Slides |
Video |
Poster
Motivation:
Previous image-to-image translation methods produce artifacts and distortions, and lack control over the amount of rain generated.
Solution:
Introduce a Triangular Probability Similarity (TPS) loss to minimize the artifacts and distortions during rain generation.
Propose a Semantic Noise Contrastive Estimation (SeNCE) strategy to optimize the amount of generated rain.
Evaluate rain generation performance using rain removal and object detection.
|
|
Low-Light Image Enhancement: A Comprehensive Survey and Beyond
Shen Zheng,
Yiling Ma,
Jinqian Pan,
Changjie Lu,
Gaurav Gupta
Paper |
Code
Motivation:
Existing LLIE datasets focus on either overexposure or underexposure, not both, and usually feature minimally degraded images captured from static positions.
Solution:
Present a comprehensive survey of low-light image enhancement (LLIE).
Propose the SICE_Grad and SICE_Mix image datasets, which include images with both overexposure and underexposure.
Introduce Night Wenzhou, a large-scale, high-resolution video dataset captured in fast motion with diverse illuminations and degradation.
|
|
PointNorm: Dual Normalization is All You Need for Point Cloud Analysis
Shen Zheng,
Jinqian Pan,
Changjie Lu,
Gaurav Gupta
IJCNN 2023 (Oral Presentation)
Paper |
Code |
Slides
Motivation: Current point cloud analysis methods struggle with irregular (i.e., unevenly distributed) point clouds.
Solution: PointNorm, a point cloud analysis network with a DualNorm module (Point Normalization & Reverse Point Normalization) that leverages local mean and global standard deviation.
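The idea of pairing a local mean with a global standard deviation can be illustrated with a minimal sketch. This is not the PointNorm implementation: the function name `normalize_groups`, the nested-list layout, and the choice to compute one standard deviation over all centered coordinates are assumptions made for illustration, and the Reverse Point Normalization step is not covered.

```python
from statistics import fmean, pstdev


def normalize_groups(groups):
    """Center each local group on its own mean, then divide every
    coordinate by a single standard deviation shared by all groups.

    `groups` is a list of groups; each group is a list of points and
    each point is a list of coordinates (hypothetical layout).
    """
    centered = []
    for group in groups:
        dims = len(group[0])
        # Local statistic: per-dimension mean of this group only.
        local_mean = [fmean(p[d] for p in group) for d in range(dims)]
        centered.append(
            [[p[d] - local_mean[d] for d in range(dims)] for p in group]
        )

    # Global statistic: one standard deviation over all centered values.
    flat = [v for group in centered for p in group for v in p]
    sigma = pstdev(flat) or 1.0  # avoid division by zero for degenerate input

    return [[[v / sigma for v in p] for p in group] for group in centered]


if __name__ == "__main__":
    dense = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]
    sparse = [[10.0, 10.0, 10.0], [14.0, 14.0, 14.0]]
    for group in normalize_groups([dense, sparse]):
        print(group)
```

Because the scale factor is shared, groups with a wider spread stay wider after normalization, while every group is still centered at the origin.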
|
|
Semantic-Guided Zero-Shot Learning for Low-Light Image/Video Enhancement
Shen Zheng,
Gaurav Gupta
WACV 2022
Paper |
Code |
Slides |
Video
Motivation: Current low-light image enhancement methods cannot handle uneven illumination, are computationally inefficient, and fail to preserve semantic information.
Solution: SGZ, a zero-shot low-light image enhancement framework with pixel-wise light deficiency estimation, parameter-free recurrent image enhancement, and unsupervised semantic segmentation.
|
|
AS-IntroVAE: Adversarial Similarity Distance Makes Robust IntroVAE
Changjie Lu,
Shen Zheng,
Zirui Wang,
Omar Dib,
Gaurav Gupta
ACML 2022
Paper |
Code |
Slides
Motivation: Generative models suffer from posterior collapse and vanishing gradients because they lack an effective metric for real-fake image evaluation.
Solution: Propose the Adversarial Similarity Distance Introspective Variational Autoencoder (AS-IntroVAE), which addresses both posterior collapse and the vanishing gradient problem in image generation.
|
|
Deblur-YOLO: Real-Time Object Detection with Efficient Blind Motion Deblurring
Shen Zheng,
Yuxiong Wu,
Shiyu Jiang,
Changjie Lu,
Gaurav Gupta
IJCNN 2021 (Oral Presentation)
Paper |
Slides
Motivation: Object detection algorithms exhibit suboptimal performance on blurry scenes.
Solution: Propose Deblur-YOLO, a generative adversarial network with a dilated feature pyramid generator, double multi-scale discriminators, and a detection discriminator to deal with photographs corrupted by motion blur.
|
|
Efficient Ensemble Sparse Convolutional Neural Networks with Dynamic Batch Size
Shen Zheng,
Liwei Wang,
Gaurav Gupta
CVIP 2020 (Oral Presentation)
Paper |
Slides
Motivation: Existing ConvNets are computationally expensive and consume significant memory.
Solution: Introduce an efficient ConvNet with weighted average stacking, Winograd-ReLU-based network pruning, and an electromagnetic-inspired dynamic batch size algorithm.
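As a rough illustration of weighted average stacking in general, the sketch below combines class probabilities from several models using per-model weights. It is not the paper's code: the function name `weighted_average_stack` and the example weights are hypothetical, and how the weights are chosen (for example, from validation accuracy) is an assumption.

```python
def weighted_average_stack(prob_lists, weights):
    """Combine per-model class probabilities into one prediction.

    prob_lists: one probability vector per model (same length each).
    weights:    one non-negative weight per model (hypothetical values).
    """
    if len(prob_lists) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(prob_lists[0])
    # Weighted mean of each class probability across models.
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]


if __name__ == "__main__":
    model_a = [0.6, 0.4]
    model_b = [0.2, 0.8]
    # The second model is trusted three times as much as the first.
    print(weighted_average_stack([model_a, model_b], [1.0, 3.0]))
```

Dividing by the weight total keeps the output a valid probability vector whenever each input vector sums to one.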
|
|
SAPNet: Segmentation-Aware Progressive Network for Perceptual Contrastive Deraining
Shen Zheng,
Changjie Lu,
Yuxiong Wu,
Gaurav Gupta
WACVW 2022
Paper |
Supp |
Code |
Slides |
Video
Motivation: Previous deraining approaches often remove essential background details along with the rain, hindering downstream tasks such as detection and segmentation.
Solution: SAPNet, an image-deraining network that integrates low-level image deraining and high-level background segmentation using a progressive dilated unit, a perceptual contrastive loss, and unsupervised background segmentation.
|
|
Unsupervised Domain Adaptation for Cardiac Segmentation: Towards Structure Mutual Information Maximization
Changjie Lu,
Shen Zheng,
Gaurav Gupta
CVPRW 2022
Paper |
Supp |
Code |
Slides
Motivation: Previous unsupervised domain adaptation methods for medical imaging falter across varied imaging modalities due to substantial domain differences.
Solution: UDA-VAE++, an unsupervised domain adaptation framework that leverages mutual information maximization and sequential reparameterization for cardiac segmentation.
|
|
Computer Vision Engineer, Perception at Momenta
Director: Dr. Wangjiang Zhu
Responsible for long-tailed data augmentation, training data auto-labeling and cleaning, and model evaluation for traffic light detection algorithms.
Implemented CycleGAN to conduct unsupervised data augmentation, converting traffic light bulbs from left arrow to round & leftUturn arrow.
Constructed a traffic light auto-label model using quantized VoVNet-57, filtering 14,618 incorrect annotations from 1,160,513 labeled frames.
Increased the classification accuracy for leftUturn traffic light from 78.41% to 87.27%, and the mean average precision from 93.01% to 94.80%.
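For scale, the figures above can be turned into relative numbers with a few lines of arithmetic. The sketch only uses the values reported in this entry; the helper name `summarize` is hypothetical.

```python
def summarize(removed, total, acc_before, acc_after, map_before, map_after):
    """Return the share of filtered annotations (in percent) and the
    absolute gains in accuracy and mAP (in percentage points)."""
    share_removed = 100.0 * removed / total
    return share_removed, acc_after - acc_before, map_after - map_before


if __name__ == "__main__":
    share, acc_gain, map_gain = summarize(
        removed=14_618, total=1_160_513,
        acc_before=78.41, acc_after=87.27,
        map_before=93.01, map_after=94.80,
    )
    print(f"filtered annotations: {share:.2f}% of labeled frames")
    print(f"leftUturn accuracy gain: {acc_gain:.2f} points")
    print(f"mAP gain: {map_gain:.2f} points")
```

The filtered annotations amount to roughly 1.26% of the labeled frames, and the gains are 8.86 and 1.79 percentage points respectively.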
|
|
Technical Program Committee:
WCCI 2024
Conference Reviewers:
CVIP 2021,
CVIP 2022,
AAAI 2022,
IJCNN 2023,
WACV 2023,
WACV 2024,
ECCV 2024,
IJCNN 2024
Journal Reviewers:
TNNLS,
IJCV,
TCSVT
Session Chair:
IJCNN 2021
|
|
Co-Instructor at Wenzhou-Kean University
Course: MATH 3291/3292 (Computer Vision)
Slides |
Recordings
|
|
Invited Speaker at Fudan University
Topic: Image Processing with Machine Learning
|
|
Co-founder of WKU AI-LAB
Offered AI tutorials in computer vision and natural language processing for undergraduate students.
|
|
Content Creator:
Made 100+ YouTube video solutions for LeetCode algorithm questions.
|
|
Programming Languages:
Python, R, Java, C++, MATLAB, HTML, Mathematica, Shell, LaTeX, Markdown
Frameworks & Platforms:
PyTorch, TensorFlow, Keras, Ubuntu, Docker, Git, ONNX, CUDA
Libraries:
Scikit-Learn, SciPy, NumPy, OpenCV, Matplotlib, Pandas
|
|
Languages:
Chinese, English, Spanish, Russian, Arabic
Sports:
Basketball, Table Tennis, Swimming, Cycling, Hiking, Weightlifting
Games:
DOTA2, AOE2, Warcraft III
Beliefs:
Yunuo's Three Views &
Haoran's Path &
Liwei's Thought &
Xie Hang's Spirit &
Yongfu's Method &
Liu Yuan's Philosophy
|
|