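1. Preparation
If you would rather skip the analysis, jump to the appendix at the end, copy the code, and set the paths to run the conversion.
Before starting, you need a copy of the COCO dataset.
You can also download it with the following script: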
#!/bin/bash
# COCO 2017 dataset http://cocodataset.org
# Download command: bash ./scripts/get_coco.sh
# Download/unzip labels
d='./' # unzip directory
url=https://github.com/ultralytics/yolov5/releases/download/v1.0/
f='coco2017labels-segments.zip' # or 'coco2017labels.zip', 68 MB
echo 'Downloading' $url$f ' ...'
wget -c $url$f # && unzip -q $f -d $d && rm $f & # download, unzip, remove in background
# Download/unzip images
d='./coco/images' # unzip directory
url=http://images.cocodataset.org/zips/
f1='train2017.zip' # 19G, 118k images
f2='val2017.zip' # 1G, 5k images
f3='test2017.zip' # 7G, 41k images (optional)
for f in $f1 $f2 $f3; do
  echo 'Downloading' $url$f '...'
  wget -c $url$f # && unzip -q $f -d $d && rm $f & # download, unzip, remove in background
done
wait # finish background tasks
After the download finishes, the directory structure looks like the figure below:
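1.1. COCO annotation format
For object detection, the COCO annotations live in two main files:
- instances_train2017.json: annotations for the training set
- instances_val2017.json: annotations for the validation set
The JSON is organized as follows:
- info
- licenses
- images: a list; each entry holds the image ID, file name, width, and height
  - file_name
  - id
  - width
  - height
- annotations: a list of bounding-box annotations
  - image_id: ID of the image the box belongs to
  - category_id: ID of the object category
  - bbox: bounding box in pixels, given as x_tl (top-left x), y_tl (top-left y), w (width), h (height)
- categories: the object categories
  - name: name of the object category
Note that annotations is one flat list covering the whole dataset, and each element describes a single bbox. To collect all boxes for one image you therefore have to go through the list and pick the entries whose image_id matches.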
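Scanning the flat annotations list once per image gets slow on the full training set, so a common approach is to group the annotations by image_id in a single pass. The block below is a minimal sketch of that idea; the function name group_annotations_by_image and the tiny sample dictionary are made up for illustration and only mimic the fields listed above.

```python
from collections import defaultdict


def group_annotations_by_image(coco):
    """Map each image id to the list of its annotation dicts (hypothetical helper)."""
    grouped = defaultdict(list)
    for anno in coco["annotations"]:
        grouped[anno["image_id"]].append(anno)
    return grouped


# Hand-made stand-in for instances_*.json; the values are invented.
sample = {
    "images": [
        {"id": 1, "file_name": "a.jpg", "width": 640, "height": 480},
        {"id": 2, "file_name": "b.jpg", "width": 320, "height": 240},
    ],
    "annotations": [
        {"image_id": 1, "category_id": 1, "bbox": [10, 20, 100, 50]},
        {"image_id": 2, "category_id": 3, "bbox": [0, 0, 32, 24]},
        {"image_id": 1, "category_id": 18, "bbox": [200, 100, 40, 80]},
    ],
}

by_image = group_annotations_by_image(sample)
for image in sample["images"]:
    # Images without annotations simply get an empty list.
    boxes = [a["bbox"] for a in by_image.get(image["id"], [])]
    print(image["file_name"], boxes)
```

With a real annotation file, the sample dictionary would be replaced by the result of json.load on instances_train2017.json.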
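1.2. YOLOv7 annotation format
YOLOv7 expects annotations in the following form:
#filename.txt
classid cx cy w h
Each txt file holds the annotations for one image, and each line describes one bbox as classid cx cy w h, where cx, cy are the box center and w, h its width and height. Note that, unlike COCO, cx, cy, w, and h are all normalized by the image size.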
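To make the line format concrete, here is a small sketch that parses label lines and checks that the four box values are normalized. The helper name parse_yolo_label_line and the two sample lines are assumptions for illustration, not part of YOLOv7 itself.

```python
def parse_yolo_label_line(line):
    """Parse 'classid cx cy w h' into (int, float, float, float, float)."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    cx, cy, w, h = (float(p) for p in parts[1:])
    # Normalized values must lie in [0, 1]; pixel values would fail here.
    for value in (cx, cy, w, h):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"value {value} is not normalized to [0, 1]")
    return class_id, cx, cy, w, h


# Invented content of one label file: two boxes in one image.
label_text = (
    "0 0.500000 0.500000 0.250000 0.400000\n"
    "16 0.100000 0.200000 0.050000 0.100000\n"
)
boxes = [parse_yolo_label_line(l) for l in label_text.splitlines() if l.strip()]
print(boxes)
```

A check like this is a cheap way to catch label files that were accidentally written with pixel coordinates.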
2. Conversion
A COCO bbox is given as x_tl (top-left x), y_tl (top-left y), w (width), h (height) in pixels, while YOLO uses a normalized cx, cy, w, h. With w_{img} and h_{img} denoting the image width and height, the conversion is:
\begin{array}{ll}
x_{yolo} &= (x_{coco} + \frac{w_{coco}}{2}) / w_{img} \\
y_{yolo} &= (y_{coco} + \frac{h_{coco}}{2}) / h_{img} \\
w_{yolo} &= w_{coco} / w_{img} \\
h_{yolo} &= h_{coco} / h_{img}
\end{array}
In Python this can be written as:
def convert_bbox_coco2yolo(img_width, img_height, bbox):
    # COCO bbox: top-left x, top-left y, width, height (pixels)
    x_tl, y_tl, w, h = bbox
    dw = 1.0 / img_width
    dh = 1.0 / img_height
    # Compute the box center (cx, cy) used by YOLOv7
    x_center = x_tl + w / 2.0
    y_center = y_tl + h / 2.0
    # Normalize by the image size
    x = x_center * dw
    y = y_center * dh
    w = w * dw
    h = h * dh
    return [x, y, w, h]
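One way to sanity-check the conversion is to run a box through it and back. The sketch below restates the four formulas as coco_to_yolo and adds an inverse, yolo_to_coco; both names and the sample numbers (a 640x480 image with a box at [100, 120, 200, 240]) are assumptions chosen for illustration.

```python
import math


def coco_to_yolo(img_width, img_height, bbox):
    """Restatement of the four formulas above."""
    x_tl, y_tl, w, h = bbox
    return [
        (x_tl + w / 2.0) / img_width,
        (y_tl + h / 2.0) / img_height,
        w / img_width,
        h / img_height,
    ]


def yolo_to_coco(img_width, img_height, box):
    """Hypothetical inverse: normalized cx, cy, w, h back to pixel x_tl, y_tl, w, h."""
    cx, cy, w, h = box
    w_px = w * img_width
    h_px = h * img_height
    return [cx * img_width - w_px / 2.0, cy * img_height - h_px / 2.0, w_px, h_px]


coco_box = [100, 120, 200, 240]          # pixels, in a 640x480 image
yolo_box = coco_to_yolo(640, 480, coco_box)
print(yolo_box)                          # [0.3125, 0.5, 0.3125, 0.5]

restored = yolo_to_coco(640, 480, yolo_box)
print(all(math.isclose(a, b) for a, b in zip(restored, coco_box)))
```

Here the center is at (100 + 200 / 2) / 640 = 0.3125 horizontally and (120 + 240 / 2) / 480 = 0.5 vertically, which matches the printed values.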
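3. Additional notes
3.1 Images must be copied into images/
The dataset directory should be laid out as:
- images/
- labels/
- train.txt
- train.label
Here images/ holds the images, labels/ holds the annotation files, train.txt lists the image files, and train.label maps each class ID to its class name. Note that the images must be copied into images/; otherwise training fails with a file-not-found error, as shown in the figure below.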
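A quick check before training can catch missing images early. The sketch below assumes that each line of the list file is a path to an image that can be resolved from the current working directory; the helper name find_missing_images and the placeholder files are invented for the demo.

```python
import tempfile
from pathlib import Path


def find_missing_images(list_file):
    """Return the entries of list_file that do not point to an existing file."""
    missing = []
    for line in Path(list_file).read_text().splitlines():
        entry = line.strip()
        if entry and not Path(entry).is_file():
            missing.append(entry)
    return missing


# Build a throwaway dataset layout to demonstrate the check.
with tempfile.TemporaryDirectory() as root:
    root = Path(root)
    (root / "images").mkdir()
    (root / "labels").mkdir()
    present = root / "images" / "000001.jpg"
    present.write_bytes(b"")                  # empty placeholder, not a real image
    absent = root / "images" / "000002.jpg"   # listed but never created
    list_file = root / "train.txt"
    list_file.write_text(f"{present}\n{absent}\n")

    missing = find_missing_images(list_file)
    print(missing)
```

On a real dataset, an empty result means every listed image is where the list file says it is.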
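3.2 Watch out for the COCO categories
The COCO paper describes 91 categories and the category IDs in the annotation files run up to 90, but because some categories have too few instances, only 80 of them actually appear in the dataset. For training, the category IDs should therefore be remapped to the contiguous range 0 to 79. The exact mapping is in the code in the appendix and in the list below, which is indexed by category_id - 1 (None marks an unused ID):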
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, None, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, None, 24, 25, None,
None, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, None, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, None, 60, None, None, 61, None, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72,
None, 73, 74, 75, 76, 77, 78, 79, None]
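Instead of hard-coding the table, the same mapping can be derived from the categories field of the annotation file by sorting the category IDs and numbering them from 0. The sketch below shows the idea on the first twelve COCO categories only; the helper name build_category_index is an assumption, and on a real file the input would be json_data["categories"].

```python
def build_category_index(categories):
    """Map each COCO category id to a contiguous 0-based class index."""
    ids = sorted(c["id"] for c in categories)
    return {cat_id: index for index, cat_id in enumerate(ids)}


# First twelve COCO categories; note that id 12 is not used.
sample_categories = [
    {"id": cat_id, "name": name}
    for cat_id, name in [
        (1, "person"), (2, "bicycle"), (3, "car"), (4, "motorcycle"),
        (5, "airplane"), (6, "bus"), (7, "train"), (8, "truck"),
        (9, "boat"), (10, "traffic light"), (11, "fire hydrant"),
        (13, "stop sign"),
    ]
]

index_of = build_category_index(sample_categories)
print(index_of[1], index_of[11], index_of[13])

# The start of the hard-coded table, indexed by category_id - 1, agrees.
table_prefix = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, None, 11]
print(all(table_prefix[cat_id - 1] == idx for cat_id, idx in index_of.items()))
```

A dictionary lookup also makes unused IDs explicit: index_of.get(12) returns None rather than silently pointing at a neighbouring class.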
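4. Summary
I originally wanted to save effort and reuse code from the internet, but I could not find anything suitable, and what I did find failed with all sorts of errors, so it ended up costing more time. Sometimes you really cannot cut corners.
Appendix: code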
import json
import os
import shutil
from collections import defaultdict

from tqdm import tqdm

# Lookup table indexed by (COCO category_id - 1); None marks unused IDs.
CLASSES_90TO80 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, None, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, None, 24, 25, None,
                  None, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, None, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
                  51, 52, 53, 54, 55, 56, 57, 58, 59, None, 60, None, None, 61, None, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72,
                  None, 73, 74, 75, 76, 77, 78, 79, None]


def convert_bbox_coco2yolo(img_width, img_height, bbox):
    # COCO bbox: top-left x, top-left y, width, height (pixels)
    x_tl, y_tl, w, h = bbox
    dw = 1.0 / img_width
    dh = 1.0 / img_height
    x_center = x_tl + w / 2.0
    y_center = y_tl + h / 2.0
    # Normalize by the image size
    x = x_center * dw
    y = y_center * dh
    w = w * dw
    h = h * dh
    return [x, y, w, h]


def make_folders(path="output"):
    if not os.path.exists(path):
        os.makedirs(path)
    return path


def convert_coco_json_to_yolo_txt(output_path, json_file, origin_image_path):
    # Create the output folders before opening any file inside them
    make_folders(output_path)
    image_path = make_folders(os.path.join(output_path, "images"))  # for images
    label_path = make_folders(os.path.join(output_path, "labels"))  # for labels

    with open(json_file) as f:
        json_data = json.load(f)

    # Category names, one per line
    label_file = os.path.join(output_path, "coco.labels")
    with open(label_file, "w") as f:
        for category in tqdm(json_data["categories"], desc="Categories"):
            category_name = category["name"]
            f.write(f"{category_name}\n")

    # Group annotations by image once instead of rescanning the list per image
    annos_by_image = defaultdict(list)
    for anno in json_data["annotations"]:
        annos_by_image[anno["image_id"]].append(anno)

    # Image list file named after the output folder, e.g. temp/train/train.txt
    list_name = os.path.basename(os.path.normpath(output_path)) + ".txt"
    with open(os.path.join(output_path, list_name), "w") as image_path_writer:
        for image in tqdm(json_data["images"], desc="Annotation txt for each image"):
            img_id = image["id"]
            img_name = image["file_name"]
            img_width = image["width"]
            img_height = image["height"]
            anno_txt = os.path.join(label_path, os.path.splitext(img_name)[0] + ".txt")
            with open(anno_txt, "w") as f:
                for anno in annos_by_image.get(img_id, []):
                    bbox_COCO = anno["bbox"]
                    # category_id starts at 1, the lookup table at index 0
                    category = CLASSES_90TO80[int(anno["category_id"]) - 1]
                    # Compare with None: class 0 is valid but falsy
                    if category is not None:
                        x, y, w, h = convert_bbox_coco2yolo(img_width, img_height, bbox_COCO)
                        f.write(f"{category} {x:.6f} {y:.6f} {w:.6f} {h:.6f}\n")
            shutil.copy2(os.path.join(origin_image_path, img_name), image_path)
            image_path_writer.write(f"{image_path}/{img_name}\n")
    print("Converting COCO Json to YOLO txt finished!")


if __name__ == "__main__":
    # Arguments, in order: output directory for images and labels,
    # path to the original COCO annotation JSON, directory of the original images
    convert_coco_json_to_yolo_txt('temp/train', 'coco/coco-2017/raw/instances_train2017.json', 'coco/coco-2017/train/data')
    convert_coco_json_to_yolo_txt('temp/val', 'coco/coco-2017/raw/instances_val2017.json', 'coco/coco-2017/validation/data')