Step by step on Win10: training Mask R-CNN on your own dataset (starting from labelme annotation)
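If you are not familiar with setting up the Mask R-CNN environment, see my previous blog post, which explains in detail how to reproduce and run the demo.
But we should not be satisfied with just running a demo. Training and testing a model on your own dataset is the real goal. Most training guides online are much alike: they all adapt the official balloon sample. Those posts are fairly old and some details have changed since, so here I walk through the whole process step by step, starting from data annotation, to show how to train Mask R-CNN on your own dataset.
Before starting, look at the samples folder inside the Mask R-CNN folder. It contains a balloon folder, which is the original training template. To get training running with minimal changes, create a folder named cat next to balloon. Try to name the later folders the same way I do, so that the paths in the code need almost no changes.
There is one more preparation step before annotating with labelme. Inside cat, create a folder named train_data, and inside train_data create four more folders: pic, json, labelme_json and cv2_mask. I also put a few training images into pic.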
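The folder layout described above can also be created with a few lines of Python. This is only a minimal sketch: the four folder names are the ones used later in this post, while `make_train_dirs` and its `base_dir` argument are hypothetical names introduced here for illustration (pass the path of your own samples/cat folder).

```python
import os

# Folder names follow this post; base_dir is a placeholder for samples/cat.
SUBFOLDERS = ("pic", "json", "labelme_json", "cv2_mask")


def make_train_dirs(base_dir):
    """Create base_dir/train_data and its four subfolders; return their paths."""
    train_data = os.path.join(base_dir, "train_data")
    created = []
    for name in SUBFOLDERS:
        folder = os.path.join(train_data, name)
        os.makedirs(folder, exist_ok=True)  # no error if it already exists
        created.append(folder)
    return created
```

Calling `make_train_dirs(r"D:\Pycharm\Project\Mask_RCNN-master\samples\cat")` would create the same tree as doing it by hand in Explorer. The training images still have to be copied into pic yourself.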
With that done, we can start annotating. The tool used here is labelme, not labelimg, which is the one used for YOLO annotation.
1. Annotating the data with labelme
First, install labelme:
conda activate maskrcnn
pip install pyqt5
# Do not install a newer version; the reason is explained later
pip install labelme==3.16.2
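The download often fails with an HTTP error at this point. Don't panic: just rerun the pip command, and a few retries are normally enough. If one particular package keeps failing to download, install that package on its own with pip install and then install labelme again.
Once installation succeeds, start labelme from the command line by running labelme.
After it starts, open the training image folder, click Create Polygons to draw the mask (click points along the outline), and when the outline is done, click Save to write the JSON file into the json folder created earlier.
Once all training images are annotated, the json folder should contain one JSON file for each image in pic. Next, the JSON files need to be processed.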
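Before converting anything, it can be worth checking that no image was skipped during annotation. The sketch below assumes that labelme saved each JSON file under the same base name as its image (for example cat1.jpg and cat1.json). `find_unlabeled` and the `IMAGE_EXTS` set are hypothetical helpers introduced here for illustration.

```python
import os

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed image types


def find_unlabeled(pic_dir, json_dir):
    """Return sorted image stems in pic_dir that have no <stem>.json in json_dir."""
    labeled = {
        os.path.splitext(name)[0]
        for name in os.listdir(json_dir)
        if name.lower().endswith(".json")
    }
    missing = []
    for name in os.listdir(pic_dir):
        stem, ext = os.path.splitext(name)
        if ext.lower() in IMAGE_EXTS and stem not in labeled:
            missing.append(stem)
    return sorted(missing)
```

An empty list from `find_unlabeled("pic", "json")` means every image has a JSON file. Any names returned are images that still need to be annotated.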
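Here we hit the first problem. The conversion method provided by the labelme author, labelme_json_to_dataset followed by a space and filename.json, handles only one JSON file at a time, which is very inconvenient for even a moderately large dataset. Resourceful netizens have therefore come up with a batch method, but it only works with older versions of labelme. That is why I stressed the version when installing labelme: with a newer version, the draw module in utils cannot be found.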
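For comparison, the one-file-at-a-time command can also be driven from a small loop without touching labelme itself. This is a sketch of that alternative, not the method used in the rest of this post, and it only applies while json_to_dataset.py is left unmodified. It relies on the `-o`/`--out` option that the script defines. `build_commands`, `run_all` and `out_root` are hypothetical names, and `out_root` must already exist.

```python
import os
import subprocess


def build_commands(json_dir, out_root):
    """Return one labelme_json_to_dataset command per JSON file in json_dir."""
    commands = []
    for name in sorted(os.listdir(json_dir)):
        if not name.lower().endswith(".json"):
            continue
        # cat1.json -> <out_root>/cat1_json, the same naming the CLI uses
        out_dir = os.path.join(out_root, name.replace(".", "_"))
        commands.append(
            ["labelme_json_to_dataset", os.path.join(json_dir, name), "-o", out_dir]
        )
    return commands


def run_all(json_dir, out_root):
    """Run the commands one by one; raises CalledProcessError if one fails."""
    for cmd in build_commands(json_dir, out_root):
        subprocess.run(cmd, check=True)
```

`run_all("json", "labelme_json")` would then write one output folder per JSON file straight into labelme_json. It starts a new process for every file, so it is slower than the batch script on large datasets.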
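To use the batch conversion, first locate labelme inside your Anaconda installation and find the file json_to_dataset.py.
My path is D:\Anaconda\envs\mask\Lib\site-packages\labelme\cli\json_to_dataset.py
Replace the contents of that file with the following code: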
import argparse
import base64
import json
import os
import os.path as osp
import warnings

import PIL.Image
import yaml

from labelme import utils


def main():
    warnings.warn("This script is aimed to demonstrate how to convert the\n"
                  "JSON file to a single image dataset, and not to handle\n"
                  "multiple JSON files to generate a real-use dataset.")

    parser = argparse.ArgumentParser()
    # In this batch version, json_file is the folder that holds the JSON files
    parser.add_argument('json_file')
    parser.add_argument('-o', '--out', default=None)
    args = parser.parse_args()

    json_file = args.json_file
    if args.out is None:
        out_dir = osp.basename(json_file).replace('.', '_')
        out_dir = osp.join(osp.dirname(json_file), out_dir)
    else:
        out_dir = args.out
    if not osp.exists(out_dir):
        os.mkdir(out_dir)

    # Convert every JSON file found in the folder
    count = os.listdir(json_file)
    for i in range(0, len(count)):
        path = os.path.join(json_file, count[i])
        if os.path.isfile(path):
            data = json.load(open(path))

            # Use the embedded image if present, otherwise read it from disk
            if data['imageData']:
                imageData = data['imageData']
            else:
                imagePath = os.path.join(os.path.dirname(path), data['imagePath'])
                with open(imagePath, 'rb') as f:
                    imageData = f.read()
                    imageData = base64.b64encode(imageData).decode('utf-8')
            img = utils.img_b64_to_arr(imageData)

            # Assign an integer value to each label name (background is 0)
            label_name_to_value = {'_background_': 0}
            for shape in data['shapes']:
                label_name = shape['label']
                if label_name in label_name_to_value:
                    label_value = label_name_to_value[label_name]
                else:
                    label_value = len(label_name_to_value)
                    label_name_to_value[label_name] = label_value

            # label_values must be dense
            label_values, label_names = [], []
            for ln, lv in sorted(label_name_to_value.items(), key=lambda x: x[1]):
                label_values.append(lv)
                label_names.append(ln)
            assert label_values == list(range(len(label_values)))

            lbl = utils.shapes_to_label(img.shape, data['shapes'], label_name_to_value)

            captions = ['{}: {}'.format(lv, ln)
                        for ln, lv in label_name_to_value.items()]
            lbl_viz = utils.draw_label(lbl, img, captions)

            # One output folder per JSON file, e.g. cat1.json -> cat1_json,
            # created relative to the current working directory
            out_dir = osp.basename(count[i]).replace('.', '_')
            out_dir = osp.join(osp.dirname(count[i]), out_dir)
            if not osp.exists(out_dir):
                os.mkdir(out_dir)

            PIL.Image.fromarray(img).save(osp.join(out_dir, 'img.png'))
            # PIL.Image.fromarray(lbl).save(osp.join(out_dir, 'label.png'))
            utils.lblsave(osp.join(out_dir, 'label.png'), lbl)
            PIL.Image.fromarray(lbl_viz).save(osp.join(out_dir, 'label_viz.png'))

            with open(osp.join(out_dir, 'label_names.txt'), 'w') as f:
                for lbl_name in label_names:
                    f.write(lbl_name + '\n')

            warnings.warn('info.yaml is being replaced by label_names.txt')
            info = dict(label_names=label_names)
            with open(osp.join(out_dir, 'info.yaml'), 'w') as f:
                yaml.safe_dump(info, f, default_flow_style=False)

            print('Saved to: %s' % out_dir)


if __name__ == '__main__':
    main()
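After replacing the file, cd to the pic folder and run labelme_json_to_dataset (path to the json folder) on the command line. The script builds each output path relative to the current working directory, so the generated folders appear in whichever directory you run the command from. For example: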
labelme_json_to_dataset D:\Pycharm\Project\Mask_RCNN-master\samples\cat\train_data\json
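When it finishes, each JSON file has a corresponding folder containing five files. Of these, label.png is the one we will move into cv2_mask next. Note that many older blog posts say label.png is saved as 16-bit and must somehow be converted to 8-bit. labelme has been updated since then: if your label.png displays a colored mask rather than a solid black image, it is already saved as 8-bit and needs no further processing, so you can carry on with confidence.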
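If you would rather check the bit depth than judge it by eye, it can be read directly from the PNG header. The sketch below uses only the standard library and relies on the PNG file layout: an 8-byte signature followed by the IHDR chunk, whose bit depth and colour type bytes sit at offsets 24 and 25. `png_bit_depth` is a hypothetical helper introduced here for illustration.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_bit_depth(path):
    """Return (bit_depth, color_type) read from the IHDR chunk of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(26)
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file: %s" % path)
    # Offsets 24 and 25: bit depth and colour type, one unsigned byte each
    bit_depth, color_type = struct.unpack(">BB", header[24:26])
    return bit_depth, color_type
```

A bit depth of 8 or less from `png_bit_depth(r"cat1_json\label.png")` (the path is only an example) means no conversion is needed, while 16 would indicate the older behaviour those posts describe. In the PNG format, colour type 3 denotes a palette-indexed image.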
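Likewise, moving label.png out of each folder into cv2_mask one by one wastes too much time. Here is a script that moves the files and renames each image to match its folder name. It expects the generated folders to be in labelme_json and uses paths relative to train_data, so run it from there: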
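import os
import shutil

path = 'labelme_json'  # folder holding the generated *_json folders
files = os.listdir(path)
for file in files:
    new = file[:-5]  # strip the trailing "_json" from the folder name
    newnames = os.path.join('cv2_mask', new + '.png')
    # Pick label.png by name instead of by its position in os.listdir()
    filename = os.path.join(path, file, 'label.png')
    print(filename)
    print(newnames)
    shutil.move(filename, newnames)  # move and rename in one step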
After running this script, cv2_mask contains the mask files.
With that, the data annotation work is complete!
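Before moving on to training, it can help to tally which labels were actually used during annotation, since typos in label names are easy to miss. The sketch below reads the `shapes` and `label` fields, the same ones the conversion script above uses. `count_labels` is a hypothetical helper introduced here for illustration.

```python
import json
import os
from collections import Counter


def count_labels(json_dir):
    """Count how many shapes carry each label across all labelme JSON files."""
    counts = Counter()
    for name in sorted(os.listdir(json_dir)):
        if not name.lower().endswith(".json"):
            continue
        with open(os.path.join(json_dir, name), encoding="utf-8") as f:
            data = json.load(f)
        for shape in data.get("shapes", []):
            counts[shape["label"]] += 1
    return counts
```

Printing `count_labels("json")` lists every label with its number of polygons. If each distinct label corresponds to one class in your setup, the number of keys is also a useful cross-check against the class count in the training configuration.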
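2. Training steps
In the cat folder, create a new Python file named train.py and copy the following code into it. Afterwards I will explain in detail which parts need to be modified.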
# -*- coding: utf-8 -*-
import os
import sys
import random
import math
import re
import time
import numpy as np
import cv2
import matplotlib
import matplotlib.pyplot as plt
import tensorflow as tf
from mrcnn.config import Config
#import utils
from mrcnn import model as modellib,utils
from mrcnn import visualize
import yaml
from mrcnn.model import log
from PIL import Image
#os.environ["CUDA_VISIBLE_DEVICES"] = "0"
# Root directory of the project
ROOT_DIR = os.getcwd()
#ROOT_DIR = os.path.abspath("../")
# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")
iter_num=0
# Local path to trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
# Download COCO trained weights from Releases if needed
if not os.path.exists(COCO_MODEL_PATH):
    utils.download_trained_weights(COCO_MODEL_PATH)
class ShapesConfig(Config):
    """Configuration for training on the toy shapes dataset.
    Derives from the base Config class and overrides values specific
    to the toy shapes dataset.
    """
    # Give the configuration a recognizable name
    NAME = "shapes"
    # Train on 1 GPU with 1 image per GPU, so the batch size is 1
    # (GPUs * images/GPU).
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1
    # Number of classes (including background)
    NUM_CLASSES = 1 + 6  # background + 6 classes
    # Use small images for faster training. Set the limits of the small side
    # and the large side, and that determines the image shape.
    IMAGE_MIN_DIM = 320
    IMAGE_MAX_DIM = 384
    # Use smaller anchors because our image and objects are small
    RPN_ANCHOR_SCALES = (8 * 6, 16 * 6, 32 * 6, 64 * 6, 128 * 6)  # anchor side in pixels