From 1f4d243704f9bb618c78e1b086d52c13a26523fe Mon Sep 17 00:00:00 2001
From: jinxianwei <81373517+jinxianwei@users.noreply.github.com>
Date: Mon, 11 Dec 2023 22:10:32 +0800
Subject: [PATCH] [MMSIG] [Doc] Update data_preprocessor.md (#2055)

[Doc] Update data_preprocessor.md
---
 docs/en/advanced_guides/data_preprocessor.md  | 46 +++++++++++++++++--
 .../advanced_guides/data_preprocessor.md      | 46 ++++++++++++++++++-
 2 files changed, 88 insertions(+), 4 deletions(-)

diff --git a/docs/en/advanced_guides/data_preprocessor.md b/docs/en/advanced_guides/data_preprocessor.md
index 9a24d865f1..b52a5e0452 100644
--- a/docs/en/advanced_guides/data_preprocessor.md
+++ b/docs/en/advanced_guides/data_preprocessor.md
@@ -1,5 +1,45 @@
-# Data pre-processor \[Coming Soon!\]
+# Data pre-processor
 
-We're improving this documentation. Don't hesitate to join us!
+## The position of the data preprocessor in the training pipeline
 
-[Make a pull request](https://github.com/open-mmlab/mmagic/compare) or [discuss with us](https://github.com/open-mmlab/mmagic/discussions/1429)!
+During model training, image data is first augmented by the transforms provided by mmcv and then served in batches by a dataloader. The data preprocessor then moves each batch from the CPU to the GPU (CUDA), pads it, and normalizes it.
+
+The `train_pipeline` defines the sequence of transforms applied to training images. These transforms live in mmcv so that the downstream algorithm libraries do not each duplicate them. Below is the `train_pipeline` from the complete config of `configs/_base_/datasets/unpaired_imgs_256x256.py`.
+
+```python
+...
+train_pipeline = [
+    dict(color_type='color', key='img_A', type='LoadImageFromFile'),
+    dict(color_type='color', key='img_B', type='LoadImageFromFile'),
+    dict(auto_remap=True, mapping=dict(img=['img_A', 'img_B',]),
+         share_random_params=True,
+         transforms=[dict(interpolation='bicubic', scale=(286, 286,), type='Resize'),
+                     dict(crop_size=(256, 256,), keys=['img',], random_crop=True, type='Crop'),],
+         type='TransformBroadcaster'),
+    dict(direction='horizontal', keys=['img_A', ], type='Flip'),
+    dict(direction='horizontal', keys=['img_B', ], type='Flip'),
+    dict(mapping=dict(img_mask='img_B', img_photo='img_A'),
+         remapping=dict(img_mask='img_mask', img_photo='img_photo'),
+         type='KeyMapper'),
+    dict(data_keys=['img_photo', 'img_mask',],
+         keys=['img_photo', 'img_mask',], type='PackInputs'),
+]
+...
+```
+
+The data preprocessor moves, concatenates, and normalizes the transformed data before it is fed into the network. The `train_step` function in `mmagic/models/editors/cyclegan/cyclegan.py` shows how it is called:
+
+```python
+...
+message_hub = MessageHub.get_current_instance()
+curr_iter = message_hub.get_info('iter')
+data = self.data_preprocessor(data, True)
+disc_optimizer_wrapper = optim_wrapper['discriminators']
+
+inputs_dict = data['inputs']
+outputs, log_vars = dict(), dict()
+...
+```
+
+In mmagic, the data preprocessor is implemented in `mmagic/models/data_preprocessors/data_preprocessor.py`. Its data processing workflow is as follows:
+![image](https://github.com/jinxianwei/CloudImg/assets/81373517/f52a92ab-f86d-486d-86ac-a2f388a83ced)
diff --git a/docs/zh_cn/advanced_guides/data_preprocessor.md b/docs/zh_cn/advanced_guides/data_preprocessor.md
index b944a828f3..cd35bbb772 100644
--- a/docs/zh_cn/advanced_guides/data_preprocessor.md
+++ b/docs/zh_cn/advanced_guides/data_preprocessor.md
@@ -1 +1,45 @@
-# 数据预处理器(待更新)
+# 数据预处理器
+
+## 数据preprocessor在训练流程中的位置
+
+在模型训练过程中,图片数据先通过mmcv中的transform进行数据增强,并加载为dataloader,而后通过preprocessor将数据从cpu搬运到cuda上,并进行padding和归一化
+
+mmcv中的transform来自各下游算法库中transform的迁移,防止各下游算法库中transform的冗余,以`configs/_base_/datasets/unpaired_imgs_256x256.py`为例,其完整config中的`train_pipeline`如下所示
+
+```python
+...
+train_pipeline = [
+    dict(color_type='color', key='img_A', type='LoadImageFromFile'),
+    dict(color_type='color', key='img_B', type='LoadImageFromFile'),
+    dict(auto_remap=True, mapping=dict(img=['img_A', 'img_B',]),
+         share_random_params=True,
+         transforms=[dict(interpolation='bicubic', scale=(286, 286,), type='Resize'),
+                     dict(crop_size=(256, 256,), keys=['img',], random_crop=True, type='Crop'),],
+         type='TransformBroadcaster'),
+    dict(direction='horizontal', keys=['img_A', ], type='Flip'),
+    dict(direction='horizontal', keys=['img_B', ], type='Flip'),
+    dict(mapping=dict(img_mask='img_B', img_photo='img_A'),
+         remapping=dict(img_mask='img_mask', img_photo='img_photo'),
+         type='KeyMapper'),
+    dict(data_keys=['img_photo', 'img_mask',],
+         keys=['img_photo', 'img_mask',], type='PackInputs'),
+]
+...
+```
+
+data_preprocessor会对transform后的数据进行数据搬移,拼接和归一化,而后输入到网络中,以`mmagic/models/editors/cyclegan/cyclegan.py`中的`train_step`函数为例,代码中的引用逻辑如下
+
+```python
+...
+message_hub = MessageHub.get_current_instance()
+curr_iter = message_hub.get_info('iter')
+data = self.data_preprocessor(data, True)
+disc_optimizer_wrapper = optim_wrapper['discriminators']
+
+inputs_dict = data['inputs']
+outputs, log_vars = dict(), dict()
+...
+```
+
+在mmagic中的data_preprocessor,其代码实现路径为`mmagic/models/data_preprocessors/data_preprocessor.py`,其数据处理流程如下图
+![image](https://github.com/jinxianwei/CloudImg/assets/81373517/f52a92ab-f86d-486d-86ac-a2f388a83ced)
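
Reviewer note: the padding and normalization steps that the patched doc attributes to the data preprocessor can be illustrated with a small pure-Python sketch. This is not mmagic's implementation. The names `normalize`, `pad_to`, and `toy_preprocess` are hypothetical, the mean and std of 127.5 are assumed values, images are simplified to single-channel nested lists, and the CPU-to-GPU transfer is left out. The real step order and padding mode may differ.

```python
# Illustrative sketch only: a toy stand-in for a data preprocessor.
# All names below are hypothetical and are not part of mmagic or mmcv.
from typing import List

Image = List[List[float]]  # a single-channel image as rows of pixel values


def normalize(img: Image, mean: float, std: float) -> Image:
    """Shift and scale every pixel: (x - mean) / std."""
    return [[(px - mean) / std for px in row] for row in img]


def pad_to(img: Image, height: int, width: int, pad_value: float = 0.0) -> Image:
    """Pad on the bottom and right so the image reaches (height, width)."""
    padded = [row + [pad_value] * (width - len(row)) for row in img]
    padded += [[pad_value] * width for _ in range(height - len(img))]
    return padded


def toy_preprocess(batch: List[Image], mean: float = 127.5,
                   std: float = 127.5) -> List[Image]:
    """Normalize each image, then pad all images to one common size."""
    normalized = [normalize(img, mean, std) for img in batch]
    max_h = max(len(img) for img in normalized)
    max_w = max(len(img[0]) for img in normalized)
    return [pad_to(img, max_h, max_w) for img in normalized]


batch = [
    [[0.0, 255.0], [127.5, 127.5]],  # 2x2 image
    [[255.0]],                       # 1x1 image
]
out = toy_preprocess(batch)
```

After this call, every image in `out` has the same height and width, which is what allows a batch to be stacked into a single tensor before it reaches the network.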