Shen Zheng

I am currently a Perception Software Engineer at Lucid Motors. I recently completed my Master of Science in Computer Vision (MSCV) degree at Carnegie Mellon University (CMU), where I collaborated with Dr. Srinivasa Narasimhan at the Illumination and Imaging Lab (ILIM). Prior to joining CMU, I earned my bachelor's degree in Mathematics from Wenzhou-Kean University (WKU), where I worked with Dr. Gaurav Gupta.

Email: shenzhen@andrew.cmu.edu

CV / Google Scholar / GitHub / LinkedIn / LeetCode / YouTube

Research Areas

My research focuses on improving scene understanding across diverse environments using the following strategies:

  • Test-Time Preprocessing: Blind Motion Deblurring, Single Image Deraining, Low-Light Image Enhancement
  • Train-Time Preprocessing: Image Warping
  • Finetuning: Image Generation, Image-to-Image Translation
Conference Papers

Instance-Level Image Warping for Domain Adaptation
Shen Zheng, Anurag Ghosh, Srinivasa Narasimhan
Under Anonymous Review

Motivation: Object scale bias challenges contemporary visual recognition systems.

Solution: Propose an instance-level image warping technique that uses dataset-specific size statistics to warp images in-place during training to address object scale bias. Integrate image warping and feature unwarping into domain adaptation in a task-agnostic way without warping at test time.

TPSeNCE: Towards Artifact-Free Realistic Rain Generation for Deraining and Object Detection in Rain
Shen Zheng, Changjie Lu, Srinivasa Narasimhan
WACV 2024
Paper | Supp | Code | Slides | Video | Poster

Motivation: Previous image-to-image translation methods produce artifacts and distortions, and lack control over the amount of rain generated.

Solution: Introduce a Triangular Probability Similarity (TPS) loss to minimize artifacts and distortions during rain generation. Propose a Semantic Noise Contrastive Estimation (SeNCE) strategy to optimize the amount of generated rain. Evaluate rain generation performance using rain removal and object detection.

Low-Light Image Enhancement: A Comprehensive Survey and Beyond
Shen Zheng, Yiling Ma, Jinqian Pan, Changjie Lu, Gaurav Gupta

Paper | Code

Motivation: Existing LLIE datasets focus on either overexposure or underexposure, not both, and usually feature minimally degraded images captured from static positions.

Solution: Present a comprehensive survey of low-light image enhancement (LLIE). Propose the SICE_Grad and SICE_Mix image datasets, which include images with both overexposure and underexposure. Introduce Night Wenzhou, a large-scale, high-resolution video dataset captured in fast motion with diverse illuminations and degradation.

PointNorm: Dual Normalization is All You Need for Point Cloud Analysis
Shen Zheng, Jinqian Pan, Changjie Lu, Gaurav Gupta
IJCNN 2023 (Oral Presentation)
Paper | Code | Slides

Motivation: Current point cloud analysis methods struggle with irregular (i.e., unevenly distributed) point clouds.

Solution: PointNorm, a point cloud analysis network with a DualNorm module (Point Normalization & Reverse Point Normalization) that leverages local mean and global standard deviation.

Semantic-Guided Zero-Shot Learning for Low-Light Image/Video Enhancement
Shen Zheng, Gaurav Gupta
WACV 2022
Paper | Code | Slides | Video

Motivation: Current low-light image enhancement methods cannot handle uneven illumination, are computationally inefficient, and fail to preserve semantic information.

Solution: SGZ, a zero-shot low-light image enhancement framework with pixel-wise light deficiency estimation, parameter-free recurrent image enhancement, and unsupervised semantic segmentation.

AS-IntroVAE: Adversarial Similarity Distance Makes Robust IntroVAE
Changjie Lu, Shen Zheng, Zirui Wang, Omar Dib, Gaurav Gupta
ACML 2022
Paper | Code | Slides

Motivation: Generative models suffer from posterior collapse and vanishing gradients because they lack an effective metric for real-fake image evaluation.

Solution: Propose the Adversarial Similarity Distance Introspective Variational Autoencoder (AS-IntroVAE), which addresses both the posterior collapse and the vanishing gradient problems in image generation.

Deblur-YOLO: Real-Time Object Detection with Efficient Blind Motion Deblurring
Shen Zheng, Yuxiong Wu, Shiyu Jiang, Changjie Lu, Gaurav Gupta
IJCNN 2021 (Oral Presentation)
Paper | Slides

Motivation: Object detection algorithms exhibit suboptimal performance on blurry scenes.

Solution: Propose Deblur-YOLO, a generative adversarial network with a dilated feature pyramid generator, double multi-scale discriminators, and a detection discriminator to deal with photographs corrupted by motion blur.

Efficient Ensemble Sparse Convolutional Neural Networks with Dynamic Batch Size
Shen Zheng, Liwei Wang, Gaurav Gupta
CVIP 2020 (Oral Presentation)
Paper | Slides

Motivation: Existing ConvNets are computationally expensive and consume significant memory.

Solution: Introduce an efficient ConvNet with weighted average stacking, Winograd-ReLU-based network pruning, and an electromagnetic-inspired dynamic batch size algorithm.

Workshop Papers

SAPNet: Segmentation-Aware Progressive Network for Perceptual Contrastive Deraining
Shen Zheng, Changjie Lu, Yuxiong Wu, Gaurav Gupta
WACVW 2022
Paper | Supp | Code | Slides | Video

Motivation: Previous deraining approaches often eliminate essential background details along with the rain, hindering downstream tasks such as detection and segmentation.

Solution: SAPNet, an image-deraining network that integrates low-level image deraining and high-level background segmentation using a progressive dilated unit, a perceptual contrastive loss, and unsupervised background segmentation.

Unsupervised Domain Adaptation for Cardiac Segmentation: Towards Structure Mutual Information Maximization
Changjie Lu, Shen Zheng, Gaurav Gupta
CVPRW 2022
Paper | Supp | Code | Slides

Motivation: Previous unsupervised domain adaptation methods for medical imaging falter across varied imaging modalities due to substantial domain differences.

Solution: UDA-VAE++, an unsupervised domain adaptation framework that leverages mutual information maximization and sequential reparameterization for cardiac segmentation.

Internships
Computer Vision Engineer, Perception at Momenta

Director: Dr. Wangjiang Zhu

Responsible for long-tailed data augmentation, training data auto-labeling and cleaning, and model evaluation for traffic light detection algorithms.

Implemented CycleGAN for unsupervised data augmentation, converting traffic light bulbs from left arrows to round and leftUturn arrows.

Constructed a traffic light auto-label model using quantized VoVNet-57, filtering 14,618 incorrect annotations from 1,160,513 labeled frames.

Increased the classification accuracy for leftUturn traffic lights from 78.41% to 87.27%, and the mean average precision from 93.01% to 94.80%.

Services
Technical Program Committee:
WCCI 2024

Conference Reviewers:
CVIP 2021, CVIP 2022, AAAI 2022, IJCNN 2023, WACV 2023, WACV 2024, ECCV 2024, IJCNN 2024

Journal Reviewers:
TNNLS, IJCV, TCSVT

Session Chair:
IJCNN 2021
Co-Instructor at Wenzhou-Kean University

Course: MATH 3291/3292 (Computer Vision)

Slide | Recordings
Invited Speaker at Fudan University

Topic: Image Processing with Machine Learning
Co-founder of WKU AI-LAB

Offered AI tutorials in Computer Vision and Natural Language Processing for undergraduate students.
Content Creator:
Made 100+ YouTube video solutions to LeetCode algorithm questions.
Skills
Programming Languages:
Python, R, Java, C++, MATLAB, HTML, Mathematica, Shell, LaTeX, Markdown

Frameworks & Platforms:
PyTorch, TensorFlow, Keras, Ubuntu, Docker, Git, ONNX, CUDA

Libraries:
Scikit-Learn, SciPy, NumPy, OpenCV, Matplotlib, Pandas

Fun Facts
Languages:
Chinese, English, Spanish, Russian, Arabic

Sports:
Basketball, Table Tennis, Swimming, Cycling, Hiking, Weightlifting

Games:
DOTA2, AOE2, Warcraft III

Beliefs:
Yunuo's Three Outlooks & Haoran's Path & Liwei's Thought & Xie Hang's Spirit & Yongfu's Method & Liu Yuan's Philosophy