To train a supervised deep network for instance-level object detection, each instance object requires a large number of annotated training images that cover the rich variation in viewpoint, pose, and illumination found in unstructured environments. However, capturing images of an instance object from every viewpoint against complex backgrounds, and annotating its location in each image, is tedious and time-consuming. Moreover, conventional data augmentation can only change properties such as the brightness or saturation of an image; it cannot change the viewpoint or pose of the instance object in three-dimensional space.
Our laboratory introduces generative modeling to address this difficulty. A small number of images of each instance object, captured from different camera viewpoints, serve as annotated training samples for a deep generative deconvolutional network. Once trained, the network can interpolate to generate viewpoint images of the instance object that are absent from the training set, thereby enriching and extending the set of views of the object from arbitrary viewpoints in three-dimensional space.
Figure 1 shows how the limited set of multi-view images of an instance object is acquired. The target instance object is placed on a turntable rotating at constant speed (25 revolutions/s) against a white background. The camera is mounted at a fixed position (0.5 meters from the turntable) and records a video sequence of the object rotating through 360 degrees at each of six camera depression angles. By choosing the rate at which frames are extracted from the video, a limited number of key frame images can be obtained from each video sequence. For example, the attachment True_Trainingdata_Car.rar contains 180 key frame multi-view images of the car model: one key frame every 12 degrees of in-plane rotation between 0 and 360 degrees (30 rotation angles), at each of 6 camera depression angles (0, 15, 30, 45, 60, and 75 degrees).
Fig. 1 Schematic of the training sample acquisition setup
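The viewpoint layout described above can be checked with a short sketch. It enumerates the 30 rotation angles and 6 depression angles from the text and confirms that they give 180 viewpoints. The helper `key_frame_indices` and its `frames_per_turn` argument are hypothetical, introduced only for illustration: the number of video frames in one full turn depends on the recording frame rate and turntable speed, and it is not necessarily how the key frames in the attachment were selected.

```python
from itertools import product

# Values taken from the text above.
ROTATION_STEP_DEG = 12
DEPRESSION_ANGLES_DEG = (0, 15, 30, 45, 60, 75)


def viewpoint_grid(step_deg=ROTATION_STEP_DEG, depressions=DEPRESSION_ANGLES_DEG):
    """Return every (depression, rotation) pair, both in degrees."""
    rotations = range(0, 360, step_deg)  # 0, 12, ..., 348
    return list(product(depressions, rotations))


def key_frame_indices(frames_per_turn, n_key_frames):
    """Hypothetical helper: evenly spaced frame indices within one full turn.

    frames_per_turn is an assumption about the recorded video, not a value
    given in the text.
    """
    return [i * frames_per_turn // n_key_frames for i in range(n_key_frames)]


grid = viewpoint_grid()
print(len(grid))          # 180
print(grid[0], grid[-1])  # (0, 0) (75, 348)
```

For instance, if one full turn happened to span 300 video frames, `key_frame_indices(300, 30)` would select every tenth frame, starting at frame 0.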
In addition, each real instance object image acquired by the camera must be preprocessed before it is fed into the network: a segmentation mask image is produced for it, as shown in Figure 2. The mask production code is in the attachment Handmade_Segmentation_Mask.rar.
Fig. 2 An original image and its segmentation mask image
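As a rough illustration of what a segmentation mask encodes, the sketch below marks every pixel that is not near-white as belonging to the object, relying on the white background described earlier. This is not the code in Handmade_Segmentation_Mask.rar: the function name, the nested-list image representation, and the threshold of 240 are all assumptions made for this example, and a real pipeline would more likely use an image library.

```python
def white_background_mask(image, threshold=240):
    """Build a binary mask from an RGB image on a near-white background.

    image: rows of (r, g, b) tuples with values in 0-255.
    Returns rows of 0/1 values, where 1 marks an object pixel.
    The threshold is an assumed value, not one taken from the source.
    """
    return [
        # A pixel is background only if all three channels are near white.
        [0 if min(pixel) >= threshold else 1 for pixel in row]
        for row in image
    ]


WHITE = (255, 255, 255)
RED = (200, 30, 30)

image = [
    [WHITE, WHITE, WHITE],
    [WHITE, RED, WHITE],
    [WHITE, RED, RED],
]

mask = white_background_mask(image)
for row in mask:
    print(row)
# [0, 0, 0]
# [0, 1, 0]
# [0, 1, 1]
```

A simple threshold like this fails on objects with white or highly reflective regions, which is one reason hand-made masks may be preferred.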
The laboratory provides video sequences of 10 instance objects, each rotating through 360 degrees at six camera depression angles. The ten instance objects are a car model, a CD case, a detergent bottle, a fire extinguisher, a glasses case, an oscilloscope, a pill box, a storage box, a tea caddy, and a vacuum flask. These objects are common in daily life but differ markedly in appearance, as shown in Figure 3.
Fig. 3 Images of the 10 instance objects
[Note]: Because the video sequences are large and upload size is limited, we provide Baidu Cloud links for reference and download.
Truevideo_10objects_1980x1080_1.rar
Link: https://pan.baidu.com/s/17BVxA7AhITWrgmjVjZnqjw Password: gfx4
Truevideo_10objects_1980x1080_2.rar
Link: https://pan.baidu.com/s/1yetHdBFn9Z_SQgJgpr6QHA Password: 9d6t