Mona Jalal is a second-year CS Ph.D. fellow in computer vision working with Professor Margrit Betke. She did a summer research internship at NVIDIA, where she built a synthetic 6DOF object pose estimation dataset for deep learning applications using NVIDIA internal tools under the direct guidance of Jonathan Tremblay, Thang To, and Josef Spjut. Prior to joining BU, she was an R&D Engineer 1 at the University of California, Berkeley, working on gesture recognition for augmented reality at the Center for Augmented Cognition with Dr. Allen Yang. Before that, she was a computer vision and machine learning intern at the University of Wisconsin-Madison under the supervision of Professor Vikas Singh, working on object detection and on creating a synthesized dataset for semantic segmentation by playing video games. She obtained a double-degree master’s in EE and CS from the University of Wisconsin-Madison in 2016 and 2014, respectively. Her main interests are vision and language, domain adaptation, human pose estimation, gesture recognition, and facial analysis for affective computing.
Creating Synthetic Data for Deep Learning Applications
In this talk, I will present recent work on creating synthetic datasets that can be combined with real datasets or used on their own to train deep neural networks. The talk focuses on creating synthetic data for object pose estimation of YCB kitchen items using the NVIDIA Deep Learning Dataset Synthesizer, an NVIDIA plugin for Unreal Engine 4, and on visualizing the dataset's 3D cuboids and 2D bounding boxes on the captured items. I will also go over creating datasets from AAA video games such as Grand Theft Auto V using the graphics debugger RenderDoc. At the end of the session, I will showcase the concept of domain randomization, which involves adding flying distractors, extreme lighting, and other randomizations such as random changes in object texture, object rotation and movement, and camera movement. Synthetic datasets with domain randomization are a necessity in applications like 6DOF object pose estimation, where creating ground truth is nearly impossible. Other important applications include semantic segmentation, object detection, and object tracking.