PyTorch Huawei Cloud Cup garbage classification summary (object detection)
2022-07-20 08:14:00 【Visual feast】
Abstract
Our team name is chongchong. We ranked 8th on leaderboard A, 11th on leaderboard B, and 16th in the finals. There was a problem with the scoring on the finals server, and our model also happened to time out once more than 200 extra images were added, so we were eliminated outright. Below I share some general points. Data download link: https://pan.baidu.com/s/1aCh_fIVsRBjKXQkFIpdvRQ
Extraction code: 1234
baseline
The top 20 teams basically all used mmdetection. It is essential for competitions and contains many algorithms. At present, version 2.0 is mainly used with the Cascade R-CNN series.
The first-place team used a res2net backbone, which we did not use at the time; res2net gives a baseline score of more than 70. We used cascade-rfp with resnet50 as the backbone, and its baseline score was only about 67, which is much lower. With senet154 the baseline score is 75. Scores vary greatly between backbones, so test them before committing to training and find the best baseline. With senet154 the starting score is high, but it is much harder to improve further: the fourth-place contestant said that adding multi-scale training gave no gain. Many strategies stop helping once applied, so pushing the score higher is a little difficult.
So learning to use mmdetection and being able to swap backbones flexibly is the key. Find the right network and you start ahead of the others; adding tricks on top will then not be so hard.
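To make the backbone swap concrete, here is a minimal sketch in the style of an mmdetection 2.x config. The Res2Net settings follow the pattern of the Res2Net configs shipped with mmdetection as far as I know, but the depth, the pretrained key, and the helper `swap_backbone` are assumptions for illustration, not the first-place team's configuration.

```python
import copy

# Baseline model config, reduced to the parts that matter for the swap.
base_model = dict(
    type='CascadeRCNN',
    pretrained='torchvision://resnet50',
    backbone=dict(type='ResNet', depth=50, num_stages=4,
                  out_indices=(0, 1, 2, 3), frozen_stages=1, style='pytorch'),
    neck=dict(type='FPN', in_channels=[256, 512, 1024, 2048],
              out_channels=256, num_outs=5),
)


def swap_backbone(model, pretrained, **backbone_overrides):
    """Return a copy of `model` with overridden backbone keys (hypothetical helper)."""
    new_model = copy.deepcopy(model)
    new_model['pretrained'] = pretrained
    new_model['backbone'].update(backbone_overrides)
    return new_model


# Assumed Res2Net-101 settings; check them against your mmdetection version.
res2net_model = swap_backbone(
    base_model,
    pretrained='open-mmlab://res2net101_v1d_26w_4s',
    type='Res2Net', depth=101, scales=4, base_width=26)

print(res2net_model['backbone'])
```

The neck is left untouched because this backbone outputs the same channel counts as ResNet-50. A backbone with different output channels would also need `in_channels` of the FPN changed.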
trick
The commonly used tricks at present are mixup and mosaic. mmdet uses albu (Albumentations), and its built-in augmentations need to be combined. Different tricks can have different effects: for example, in my training, using mixup and cutout together reduced the result considerably, while each of them tested individually could improve it by about 2 points. There are also class balancing and label smoothing. For class balancing, mmdet has its own method, and you can also do offline augmentation yourself, expanding the classes with few samples through random transformations. The first-place code uses ATSS for feature extraction; it combines ATSS with the team's own code to build a new network. That is only possible if you are very familiar with how mmdet is coded. mmdet writes each network as a set of components, so it takes a lot of effort to learn the source code. The last point is machine configuration. Some tricks make convergence difficult once added, so sometimes you need to pre-train the weights on the COCO dataset (we could not afford it). Alternatively, learning rate annealing generally works better, or use mmdet's built-in distributed training.
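The offline class balancing mentioned above can be planned with a few lines of code. This is a minimal sketch, assuming one label per sample and a target equal to the size of the largest class; the function name `copies_per_class` and the example labels are hypothetical. For detection images that contain several classes you would need a per-image rule instead.

```python
import math
from collections import Counter


def copies_per_class(labels, target=None):
    """Number of augmented copies to generate per sample of each class
    so that every class reaches at least `target` samples."""
    counts = Counter(labels)
    if target is None:
        target = max(counts.values())  # balance against the largest class
    return {cls: max(0, math.ceil(target / n) - 1) for cls, n in counts.items()}


# Hypothetical label list: 8 bottles, 2 batteries, 4 peels.
labels = ['bottle'] * 8 + ['battery'] * 2 + ['peel'] * 4
plan = copies_per_class(labels)
print(plan)
```

Each extra copy would then be produced with a random transformation (flip, small rotation, blur) and written to disk together with its transformed annotations.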
Part of the code
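Model trimming: the uploaded file only needs the parameter information
import torch

# Load the full training checkpoint.
yuan = torch.load('./rs_cut_mix_pafpn_box.pth')
# Keep only the metadata and the model weights; everything else is dropped.
new = {'meta': yuan['meta'], 'state_dict': yuan['state_dict']}
torch.save(new, './rs_cu4_4_min.pth')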
coco_detection
# dataset settings
dataset_type = 'VOCDataset'
data_root = '/home/jmy/hjc/code/rubbish_classification/datasets/VOCdevkit/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
# TODO: augmentation from github
albu_train_transforms = [
    dict(
        type='MotionBlur',
        blur_limit=(3, 7),
        p=0.2),
    # add two albu
    dict(
        type='ShiftScaleRotate',
        shift_limit=0.0625,
        scale_limit=0.0,
        rotate_limit=[-10, 10],
        interpolation=1,
        p=0.5),
]
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    # TODO: augmentation from github
    dict(type='Mixup', prob=0.4, lambd=0.5, mixup=True,
         trainval_path='/home/jmy/hjc/code/rubbish_classification/mmdetection/augmentation_zx/ap_05_id_in_all.txt',
         # trainval_path='/home/jmy/hjc/code/rubbish_classification/mmdetection/augmentation_zx/ap_05_id_in_trainset.txt',
         img_path='/home/jmy/hjc/code/rubbish_classification/datasets/VOCdevkit/VOC2007/JPEGImages',
         annotation_path='/home/jmy/hjc/code/rubbish_classification/datasets/VOCdevkit/VOC2007/Annotations'),
    # dict(type='Resize', img_scale=(800, 480), keep_ratio=True),
    dict(type='Resize', img_scale=[(800, 600), (800, 360)], keep_ratio=True, multiscale_mode='range'),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    # TODO: augmentation from github
    dict(
        type='Albu',
        transforms=albu_train_transforms,
        bbox_params=dict(
            type='BboxParams',
            format='pascal_voc',
            label_fields=['gt_labels'],
            min_visibility=0.0,
            filter_lost_elements=True),
        keymap={
            'img': 'image',
            'gt_masks': 'masks',
            'gt_bboxes': 'bboxes'
        },
        update_pad_shape=False,
        skip_img_without_anno=True),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        # img_scale=(1000, 600),
        img_scale=(800, 480),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    # The default learning rate in config files is for 8 GPUs and 2 img/gpu (batch size = 8*2 = 16)
    # e.g., lr=0.01 for 4 GPUs * 2 img/gpu and lr=0.08 for 16 GPUs * 4 img/gpu.
    # samples_per_gpu=2,
    # workers_per_gpu=2,
    samples_per_gpu=4,
    workers_per_gpu=0,
    train=dict(
        type='RepeatDataset',
        # hjc: The VOC dataset uses 3 times the size of dataset during training
        # times=3,
        times=1,
        dataset=dict(
            type=dataset_type,
            # ann_file=[
            #     data_root + 'VOC2007/ImageSets/Main/trainval.txt',
            #     # data_root + 'VOC2012/ImageSets/Main/trainval.txt'
            # ],
            ann_file=[
                data_root + 'VOC2007/ImageSets/Main/train.txt'
                # '/home/jmy/hjc/code/rubbish_classification/mmdetection/data/lowap2000_grid/VOCdevkit/VOC2007/ImageSets/Main/train.txt'
            ],
            img_prefix=[data_root + 'VOC2007/', data_root + 'VOC2012/'],
            # img_prefix=[data_root + 'VOC2007/', '/home/jmy/hjc/code/rubbish_classification/mmdetection/data/lowap2000_grid/VOCdevkit/VOC2007'],
            pipeline=train_pipeline)),
    val=dict(
        type=dataset_type,
        # ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        ann_file=data_root + 'VOC2007/ImageSets/Main/val.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        # ann_file=data_root + 'VOC2007/ImageSets/Main/trainval.txt',
        ann_file=data_root + 'VOC2007/ImageSets/Main/val.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline))
evaluation = dict(interval=1, metric='mAP')
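The `Mixup` transform in the training pipeline is a custom component whose source is not shown here. As a rough idea of what such a step usually does for detection, here is a minimal NumPy sketch under that assumption: two images are blended with weight `lambd` and the boxes of both are kept. The function name `mixup_detection` and the toy inputs are hypothetical.

```python
import numpy as np


def mixup_detection(img_a, boxes_a, labels_a, img_b, boxes_b, labels_b, lambd=0.5):
    """Blend two HxWx3 images and keep the boxes and labels of both."""
    h = max(img_a.shape[0], img_b.shape[0])
    w = max(img_a.shape[1], img_b.shape[1])
    mixed = np.zeros((h, w, 3), dtype=np.float32)
    # Both images are anchored at the top-left corner, so box coordinates stay valid.
    mixed[:img_a.shape[0], :img_a.shape[1]] += lambd * img_a.astype(np.float32)
    mixed[:img_b.shape[0], :img_b.shape[1]] += (1.0 - lambd) * img_b.astype(np.float32)
    boxes = np.concatenate([boxes_a, boxes_b], axis=0)
    labels = np.concatenate([labels_a, labels_b], axis=0)
    return mixed.astype(np.uint8), boxes, labels


# Toy inputs: a 4x6 image with one box and a 6x4 image with one box.
img_a = np.full((4, 6, 3), 200, dtype=np.uint8)
img_b = np.full((6, 4, 3), 100, dtype=np.uint8)
boxes_a = np.array([[0, 0, 5, 3]], dtype=np.float32)
boxes_b = np.array([[1, 1, 3, 5]], dtype=np.float32)
labels_a = np.array([3])
labels_b = np.array([7])

mixed, boxes, labels = mixup_detection(img_a, boxes_a, labels_a,
                                       img_b, boxes_b, labels_b, lambd=0.5)
print(mixed.shape, boxes.shape, labels)
```

In the config above the transform is applied with `prob=0.4`, so a wrapper would call this only for a random 40% of samples and pass the rest through unchanged.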
cascade_rcnn_r50_fpn
# model settings
model = dict(
    type='CascadeRCNN',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            # scales=[8],
            scales=[7],
            ratios=[0.5, 1.0, 2.0],
            # ratios=[0.2, 0.4, 0.5, 0.6, 0.75, 17/20, 1.0, 20/17, 4/3, 5/3, 2.0, 2.5, 5.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[.0, .0, .0, .0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
    roi_head=dict(
        type='CascadeRoIHead',
        num_stages=3,
        stage_loss_weights=[1, 0.5, 0.25],
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', out_size=7, sample_num=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=[
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                # num_classes=80,
                # V2.0 does not need an N + 1 class count, just N
                num_classes=44,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.1, 0.1, 0.2, 0.2]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                # num_classes=80,
                num_classes=44,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.05, 0.05, 0.1, 0.1]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                # num_classes=80,
                num_classes=44,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.033, 0.033, 0.067, 0.067]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
        ]))
# model training and testing settings
train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            match_low_quality=True,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=0,
        pos_weight=-1,
        debug=False),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=2000,
        max_num=2000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=[
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                match_low_quality=False,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False),
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.6,
                neg_iou_thr=0.6,
                min_pos_iou=0.6,
                match_low_quality=False,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False),
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.7,
                min_pos_iou=0.7,
                match_low_quality=False,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False)
    ])
test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05, nms=dict(type='nms', iou_thr=0.5), max_per_img=100))
        # score_thr=0.001, nms=dict(type='nms', iou_thr=0.5), max_per_img=100))
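The three `rcnn` entries in `train_cfg` raise the positive IoU threshold from 0.5 to 0.6 to 0.7 across the cascade stages. The following NumPy sketch only illustrates what those thresholds mean for a fixed set of proposals; the boxes are made up, and in the real cascade each stage receives the boxes refined by the previous stage rather than the same proposals.

```python
import numpy as np


def iou_matrix(proposals, gts):
    """Pairwise IoU between (N, 4) proposals and (M, 4) ground-truth boxes (x1, y1, x2, y2)."""
    x1 = np.maximum(proposals[:, None, 0], gts[None, :, 0])
    y1 = np.maximum(proposals[:, None, 1], gts[None, :, 1])
    x2 = np.minimum(proposals[:, None, 2], gts[None, :, 2])
    y2 = np.minimum(proposals[:, None, 3], gts[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_p = (proposals[:, 2] - proposals[:, 0]) * (proposals[:, 3] - proposals[:, 1])
    area_g = (gts[:, 2] - gts[:, 0]) * (gts[:, 3] - gts[:, 1])
    return inter / (area_p[:, None] + area_g[None, :] - inter)


# One hypothetical ground-truth box and three proposals of increasing quality.
gt = np.array([[0, 0, 100, 100]], dtype=float)
proposals = np.array([
    [0, 0, 100, 55],  # IoU 0.55
    [0, 0, 100, 65],  # IoU 0.65
    [0, 0, 100, 80],  # IoU 0.80
], dtype=float)

max_iou = iou_matrix(proposals, gt).max(axis=1)
# Count how many proposals each stage would treat as positives.
positives_per_stage = [int((max_iou >= thr).sum()) for thr in (0.5, 0.6, 0.7)]
print(positives_per_stage)
```

Later stages accept fewer but better-localized positives, which is why the `target_stds` of the three bbox heads in the model config also shrink from stage to stage.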
schedule_lr
# optimizer
# optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
# note: lr = 0.00125 * batch_size
optimizer = dict(type='SGD', lr=0.005, momentum=0.9, weight_decay=0.0001)
# optimizer_config = dict(grad_clip=None)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    # warmup_iters=4000,
    warmup_ratio=0.001,
    # step=[8, 11])
    step=[9, 12])
total_epochs = 14
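The note `lr = 0.00125 * batch_size` and the step schedule with linear warmup can be checked with a short calculation. This is a sketch under assumptions: the warmup formula mirrors how I understand mmcv's linear warmup to work, epochs are counted from 0, and the function names `scaled_lr` and `lr_at` are hypothetical.

```python
def scaled_lr(total_batch_size, lr_per_image=0.00125):
    """Linear scaling rule from the config comment: lr grows with the total batch size."""
    return lr_per_image * total_batch_size


def lr_at(iteration, epoch, base_lr, steps=(9, 12), gamma=0.1,
          warmup_iters=500, warmup_ratio=0.001):
    """Learning rate for a 0-indexed epoch and a global iteration count."""
    # Step policy: multiply by gamma once for every step epoch already reached.
    regular_lr = base_lr * gamma ** sum(epoch >= s for s in steps)
    if iteration < warmup_iters:
        # Linear warmup from warmup_ratio * lr up to the regular lr.
        k = (1 - iteration / warmup_iters) * (1 - warmup_ratio)
        return regular_lr * (1 - k)
    return regular_lr


base_lr = scaled_lr(4)  # 1 GPU * samples_per_gpu=4, matching lr=0.005 in the config
for it, ep in [(0, 0), (250, 0), (500, 0), (10000, 9), (20000, 12)]:
    print(it, ep, lr_at(it, ep, base_lr))
```

With 8 GPUs and 2 images per GPU the same rule gives `scaled_lr(16)`, which is the 0.02 of the commented-out default optimizer line.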
default_runtime
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
    # interval=50,
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
dist_params = dict(backend='nccl')
log_level = 'INFO'
# load_from = None
# load_from = '/home/jmy/hjc/code/rubbish_classification/mmdetection/checkpoints/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco_20200316-3dc56deb.pth'
# load_from = '/home/jmy/hjc/code/rubbish_classification/mmdetection/checkpoints/cascade_rcnn/cascade_rcnn_r50_coco_pretrained_weights_classes_45.pth'
# load_from = '/home/jmy/hjc/code/rubbish_classification/mmdetection/checkpoints/cascade_rcnn/cascade_rcnn_r101_coco_pretrained_weights_classes_45.pth'
load_from = '/home/jmy/hjc/code/rubbish_classification/mmdetection/checkpoints/cascade_rcnn/cascade_rcnn_x101_32x4d_coco_pretrained_weights_classes_45.pth'
resume_from = None
workflow = [('train', 1)]
summary
This is the early part of our code, meant mainly as a learning reference for beginners. Further strategies and data augmentation were done by writing our own code inside mmdet. The most important things in the competition are having good enough hardware and mastering most of the mmdet source code. Our team had about 10 GPUs. Some of the top teams had 64 GPUs, or 8 V100s with 32 GB of memory each, which is enough to train COCO pre-trained weights and get a usable model in three days. More importantly, you should learn to change the code yourself, building on the components mmdet provides. You can look through all the files under configs yourself.