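[CV Datasets] Smart City: The CCPD License Plate Dataset
Preface
While looking for license plate detection datasets recently, I came across CCPD, a free, open-source dataset of Chinese city license plates built for plate detection and recognition. It is a very good dataset.
Implementation
1. Dataset overview
CCPD2019 contains more than 300,000 images (355,013 by the counts in the table below). Each image is 720x1160x3, and the images fall into 9 types. The table lists each type, its image count, and a short description.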
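| Type | Images | Notes |
| --- | --- | --- |
| ccpd_base | 199996 | Ordinary plates |
| ccpd_blur | 20611 | Blurred plates |
| ccpd_challenge | 50003 | More challenging plates |
| ccpd_db | 10132 | Plates under bright or dark lighting |
| ccpd_fn | 20967 | Plates far from or near the camera |
| ccpd_np | 3036 | New cars without plates |
| ccpd_rotate | 10053 | Horizontal tilt 20° to 50°, vertical tilt -10° to 10° |
| ccpd_tilt | 30216 | Horizontal tilt 15° to 45°, vertical tilt -15° to 45° |
| ccpd_weather | 9999 | Plates in rain, snow, or heavy fog |
| Total | 355013 | |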
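As a quick sanity check on the table, the per-type counts can be summed and compared with the total row. The snippet below only restates the figures listed above; the name `ccpd2019_counts` is made up for this sketch.

```python
# Image counts per CCPD2019 type, copied from the table above.
ccpd2019_counts = {
    "ccpd_base": 199996,
    "ccpd_blur": 20611,
    "ccpd_challenge": 50003,
    "ccpd_db": 10132,
    "ccpd_fn": 20967,
    "ccpd_np": 3036,
    "ccpd_rotate": 10053,
    "ccpd_tilt": 30216,
    "ccpd_weather": 9999,
}

total = sum(ccpd2019_counts.values())
print(total)  # 355013

# Share of each type, largest first.
for name, count in sorted(ccpd2019_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:15s} {count:7d} {count / total:6.1%}")
```

ccpd_base alone accounts for more than half of the images, which is worth keeping in mind when sampling a subset.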
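Annotation format:
CCPD annotations are unusual in that there are no separate label files: the image file name itself is the annotation, and the information is recovered by parsing the name.
Take the image [025-95_113-154&383_386&473-386&473_177&454_154&383_363&402-0_0_22_27_27_33_16-37-15.jpg]. The fields of the file name, separated by "-", mean the following:
025: ratio of the plate area to the whole image area
95_113: horizontal and vertical tilt of the plate, here 95° horizontal and 113° vertical
154&383_386&473: top-left and bottom-right corners of the bounding box, here top-left (154, 383) and bottom-right (386, 473)
386&473_177&454_154&383_363&402: the four vertices of the plate, in the order bottom-right, bottom-left, top-left, top-right
0_0_22_27_27_33_16: the plate number as dictionary indices. The first 0 is the province and maps to '皖' in the provinces dictionary. The second 0 is the city-level code and maps to 'A' in the alphabets dictionary. The last 5 values are letters and digits looked up in the ads dictionary: 22 is Y, 27 is 3, 33 is 9, and 16 is S, so the full plate number is 皖AY339S
37 and 15: not used in this post; according to the official CCPD README they are the brightness and blurriness of the plate region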
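The field layout described above can be sketched as a small parser. This is a minimal illustration rather than the dataset's official tooling: the function names `parse_points` and `split_ccpd_name` are made up here, and naming the last two fields brightness and blurriness follows the official CCPD README, not this post.

```python
import os


def parse_points(text):
    """Turn "154&383_386&473" into [(154, 383), (386, 473)]."""
    return [tuple(int(v) for v in pair.split("&")) for pair in text.split("_")]


def split_ccpd_name(filename):
    """Split a CCPD file name into its seven dash-separated fields."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    # Raises ValueError when the name does not have exactly seven fields.
    area, tilt, box, vertices, plate, brightness, blurriness = stem.split("-")
    return {
        "area": area,  # kept as the raw string
        "tilt": tuple(int(v) for v in tilt.split("_")),
        "box": parse_points(box),  # top-left, bottom-right
        "vertices": parse_points(vertices),  # BR, BL, TL, TR
        "plate_indices": [int(v) for v in plate.split("_")],
        "brightness": int(brightness),  # assumed meaning, per the CCPD README
        "blurriness": int(blurriness),  # assumed meaning, per the CCPD README
    }


name = ("025-95_113-154&383_386&473-386&473_177&454_154&383_363&402"
        "-0_0_22_27_27_33_16-37-15.jpg")
info = split_ccpd_name(name)
print(info["box"])            # [(154, 383), (386, 473)]
print(info["plate_indices"])  # [0, 0, 22, 27, 27, 33, 16]
```

Because the unpacking insists on seven fields, files that do not follow the naming scheme fail loudly instead of producing a wrong label.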
Plate dictionaries
# 34 entries: province abbreviations (first plate character)
provinces = ["皖", "滬", "津", "渝", "冀", "晉", "蒙", "遼", "吉", "黑", "蘇", "浙", "京", "閩", "贛", "魯", "豫",
             "鄂", "湘", "粵", "桂", "瓊", "川", "貴", "云", "藏", "陜", "甘", "青", "寧", "新", "警", "學", "O"]
# 25 entries: city-level codes (second plate character)
alphabets = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N',
             'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'O']
# 35 entries: letters and digits for the remaining plate characters
ads = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T',
       'U', 'V', 'W', 'X', 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'O']
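To make the index lookup concrete, here is a minimal sketch that decodes the index list from the example file name with the three dictionaries above. The helper name `decode_plate` is made up for this illustration; the dictionaries are repeated so the snippet runs on its own.

```python
provinces = ["皖", "滬", "津", "渝", "冀", "晉", "蒙", "遼", "吉", "黑", "蘇", "浙", "京", "閩", "贛", "魯", "豫",
             "鄂", "湘", "粵", "桂", "瓊", "川", "貴", "云", "藏", "陜", "甘", "青", "寧", "新", "警", "學", "O"]
alphabets = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N',
             'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'O']
ads = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T',
       'U', 'V', 'W', 'X', 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'O']


def decode_plate(indices):
    """Map a list of integer indices to the plate string."""
    province = provinces[indices[0]]           # first index: province
    city = alphabets[indices[1]]               # second index: city-level code
    rest = "".join(ads[i] for i in indices[2:])  # remaining indices: letters/digits
    return province + city + rest


print(decode_plate([0, 0, 22, 27, 27, 33, 16]))      # 皖AY339S (7-character blue plate)
print(decode_plate([0, 0, 3, 30, 30, 25, 31, 32]))   # 皖AD66178 (8-character green plate)
```

The same function handles 7-character and 8-character plates because everything after the second index is looked up in `ads`.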
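2. Converting CCPD plate regions to COCO format
Note: despite the coco naming, the script below writes one text file per image holding a single line "classid xc yc w h" with normalized coordinates, which is the YOLO-style label layout rather than COCO JSON.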
# 20240703: ccpd dataset to coco format dataset.
import os

import cv2 as cv
import numpy as np

# Every CCPD image is 720 (width) x 1160 (height).
imgw = 720
imgh = 1160
imgsz = imgw, imgh

# 34 entries: province abbreviations
provinces = ["皖", "滬", "津", "渝", "冀", "晉", "蒙", "遼", "吉", "黑", "蘇", "浙", "京", "閩", "贛", "魯", "豫",
             "鄂", "湘", "粵", "桂", "瓊", "川", "貴", "云", "藏", "陜", "甘", "青", "寧", "新", "警", "學", "O"]
# 25 entries: city-level codes
alphabets = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N',
             'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'O']
# 35 entries: plate letters and digits
ads = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T',
       'U', 'V', 'W', 'X', 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'O']


def get_plate_licenses(plate):
    """
    Decode the index strings from a file name into the plate string.

    Ordinary blue plates have 7 characters; new-energy plates have 8:
    https://baike.baidu.com/item/%E8%BD%A6%E7%89%8C/8347320?fr=aladdin
    "Differences between new-energy EV plates and ordinary plates":
    https://www.yoojia.com/ask/4-11906976349117851507.html
    A new-energy plate has three parts: province abbreviation (1 Chinese
    character) + local administrative code (1 letter) + serial (6 characters).
    "D" marks a pure electric vehicle; "F" marks a non-pure-electric vehicle
    (plug-in hybrids, fuel-cell vehicles, and so on).

    :param plate: list of index strings, e.g. ['0', '0', '22', '27', ...]
    :return: decoded plate string
    """
    result = [provinces[int(plate[0])], alphabets[int(plate[1])]]
    result += [ads[int(p)] for p in plate[2:]]
    result = "".join(result)
    # Optional check for new-energy plates; keep it commented out for other plates.
    # if result[2] != 'D' and result[2] != 'F' \
    #         and result[-1] != 'D' and result[-1] != 'F':
    #     print(plate)
    #     print("Error label, Please check!")
    # print(plate, result)
    return result


def ccpd2coco(path):
    dataset_path = os.path.join(path, 'CCPD2020')
    labelpath = os.path.join(dataset_path, 'green_label')
    os.makedirs(labelpath, exist_ok=True)  # the label directory must exist before writing
    for dirpath, subpaths, files in os.walk(dataset_path):
        for filename in files:
            # Skip anything that is not an image, including the labels written below.
            if not filename.endswith('.jpg'):
                continue
            print(f'file in path: {dirpath}, subpath: {subpaths}, filename: {filename}')
            annoinfo = parse_annotation(filename, labelpath)
            # display(dirpath, annoinfo)


def display(filepath, annoinfo):
    """Draw the bounding box on the image and save it to the working directory."""
    filename = annoinfo['filename']
    bboxes = annoinfo['bboxes']  # [xyxy]
    x1, y1, x2, y2 = bboxes[0]
    img = cv.imread(os.path.join(filepath, filename))
    cv.rectangle(img, (x1, y1), (x2, y2), (255, 0, 0))  # (top-left, bottom-right)
    cv.imwrite(filename, img)


def get_bbox(size, box):
    # Convert xyxy box to YOLOv5 xywh box (center and size, normalized to [0, 1])
    dw = 1. / size[0]
    dh = 1. / size[1]
    xc = (box[0] + box[2]) * 0.5 * dw
    yc = (box[1] + box[3]) * 0.5 * dh
    w = (box[2] - box[0]) * dw
    h = (box[3] - box[1]) * dh
    return xc, yc, w, h


def parse_annotation(filename, labelpath):
    """
    Parse a CCPD file name and write the matching one-line label file.

    :param filename: CCPD image file name
    :param labelpath: directory that receives the .txt label file
    :return: annotation info dict
    """
    # e.g. 0014128352490421455-90_90-212&467_271&489-271&489_212&489_212&467_271&467-0_0_3_30_30_25_31_32-79-4.jpg
    annotations = filename.split("-")
    rate = annotations[0]  # ratio of the plate area to the whole image (unused)
    angle = annotations[1].split("_")  # horizontal and vertical tilt of the plate
    box = annotations[2].replace("&", "_").split("_")  # bounding box: top-left, bottom-right
    point = annotations[3].replace("&", "_").split("_")  # four vertices: bottom-right, bottom-left, top-left, top-right
    plate = annotations[4].split("_")  # plate character indices
    plate = get_plate_licenses(plate)
    box = [int(b) for b in box]  # xyxy in pixels
    bbox = get_bbox(imgsz, box)  # normalized xywh
    point = [int(b) for b in point]
    point = np.asarray(point).reshape(-1, 2)
    bboxes = [box]  # [xyxy]
    angles = [angle]
    points = [point]
    plates = [plate]
    labels = ["plate"] * len(bboxes)
    # Class id used in this post. YOLO class ids are zero-based, so use 0 if plate is the only class.
    classid = 1  # plate
    annoinfo = {"filename": filename, "bboxes": bboxes, "points": points,
                "labels": labels, "plates": plates, "angles": angles}
    # Write the label line: "<classid> <xc> <yc> <w> <h>".
    info = f"{classid} {' '.join(f'{x:.6f}' for x in bbox)}\n"
    labelname = os.path.join(labelpath, os.path.splitext(filename)[0] + '.txt')
    with open(labelname, 'w') as labelfile:
        labelfile.write(info)
    return annoinfo


if __name__ == "__main__":
    rootpath = os.path.dirname(os.path.realpath(__file__))
    ccpd2coco(rootpath)
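A worked example helps to check the box conversion in get_bbox. The sketch below applies the same arithmetic to the bounding box from the example file name, (154, 383) to (386, 473) on a 720x1160 image, and then inverts it. The function names `xyxy_to_yolo` and `yolo_to_xyxy` are made up for this illustration; the inverse is not part of the original script.

```python
def xyxy_to_yolo(size, box):
    """Pixel (x1, y1, x2, y2) -> normalized (xc, yc, w, h)."""
    img_w, img_h = size
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h,
            (x2 - x1) / img_w, (y2 - y1) / img_h)


def yolo_to_xyxy(size, yolo):
    """Normalized (xc, yc, w, h) -> pixel (x1, y1, x2, y2), rounded to integers."""
    img_w, img_h = size
    xc, yc, w, h = yolo
    return (round((xc - w / 2) * img_w), round((yc - h / 2) * img_h),
            round((xc + w / 2) * img_w), round((yc + h / 2) * img_h))


size = (720, 1160)            # width, height of a CCPD image
box = (154, 383, 386, 473)    # from the example file name

yolo = xyxy_to_yolo(size, box)
line = "1 " + " ".join(f"{v:.6f}" for v in yolo)
print(line)                   # 1 0.375000 0.368966 0.322222 0.077586

restored = yolo_to_xyxy(size, yolo)
print(restored)               # (154, 383, 386, 473)
```

Reading a few written label files back this way is a cheap check that width and height were not swapped.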
Randomly sampling a fraction of the data
# 20240703: ccpd dataset to coco format dataset.
import os
import random
import shutil

import cv2 as cv
import numpy as np

# Every CCPD image is 720 (width) x 1160 (height).
imgw = 720
imgh = 1160
imgsz = imgw, imgh
percent = 0.05  # fraction of each folder to copy

# 34 entries: province abbreviations
provinces = ["皖", "滬", "津", "渝", "冀", "晉", "蒙", "遼", "吉", "黑", "蘇", "浙", "京", "閩", "贛", "魯", "豫",
             "鄂", "湘", "粵", "桂", "瓊", "川", "貴", "云", "藏", "陜", "甘", "青", "寧", "新", "警", "學", "O"]
# 25 entries: city-level codes
alphabets = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N',
             'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'O']
# 35 entries: plate letters and digits
ads = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T',
       'U', 'V', 'W', 'X', 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'O']
# ccpd2019all = ['ccpd_base', 'ccpd_blur', 'ccpd_challenge', 'ccpd_db', 'ccpd_fn', 'ccpd_np', 'ccpd_rotate', 'ccpd_tilt', 'ccpd_weather']
# ccpd_np (new cars without plates) is left out of the list that is actually used.
ccpd2019 = ['ccpd_base', 'ccpd_blur', 'ccpd_challenge', 'ccpd_db', 'ccpd_fn', 'ccpd_rotate', 'ccpd_tilt', 'ccpd_weather']


def get_plate_licenses(plate):
    """
    Decode the index strings from a file name into the plate string.

    Ordinary blue plates have 7 characters; new-energy plates have 8:
    https://baike.baidu.com/item/%E8%BD%A6%E7%89%8C/8347320?fr=aladdin
    "Differences between new-energy EV plates and ordinary plates":
    https://www.yoojia.com/ask/4-11906976349117851507.html
    A new-energy plate has three parts: province abbreviation (1 Chinese
    character) + local administrative code (1 letter) + serial (6 characters).
    "D" marks a pure electric vehicle; "F" marks a non-pure-electric vehicle
    (plug-in hybrids, fuel-cell vehicles, and so on).

    :param plate: list of index strings, e.g. ['0', '0', '22', '27', ...]
    :return: decoded plate string
    """
    result = [provinces[int(plate[0])], alphabets[int(plate[1])]]
    result += [ads[int(p)] for p in plate[2:]]
    result = "".join(result)
    # Optional check for new-energy plates; keep it commented out for other plates.
    # if result[2] != 'D' and result[2] != 'F' \
    #         and result[-1] != 'D' and result[-1] != 'F':
    #     print(plate)
    #     print("Error label, Please check!")
    # print(plate, result)
    return result


def ccpd2coco2019(path):
    dataset_path = os.path.join(path, 'CCPD2019')
    # labelpath = os.path.join(dataset_path, 'blue_label')
    imagepath = os.path.join(path, 'plate/image')
    labelpath = os.path.join(path, 'plate/label')
    os.makedirs(imagepath, exist_ok=True)  # output directories must exist before copying
    os.makedirs(labelpath, exist_ok=True)
    for typename in ccpd2019:
        print('typename: ', typename)
        subpath = os.path.join(dataset_path, typename)
        files = os.listdir(subpath)
        random.shuffle(files)
        num = len(files)
        print('files: ', num)
        # Keep the first num*percent files of the shuffled list.
        for filename in files[:int(num * percent)]:
            oldpath = os.path.join(subpath, filename)
            newpath = os.path.join(imagepath, filename)
            shutil.copyfile(oldpath, newpath)  # copy image
            annoinfo = parse_annotation(filename, labelpath)  # bbox label file


def ccpd2coco2020(path):
    dataset_path = os.path.join(path, 'CCPD2020')
    green_path = os.path.join(dataset_path, 'ccpd_green')
    # labelpath = os.path.join(dataset_path, 'green_label')
    imagepath = os.path.join(path, 'plate/image')
    labelpath = os.path.join(path, 'plate/label')
    os.makedirs(imagepath, exist_ok=True)
    os.makedirs(labelpath, exist_ok=True)
    for typename in ['test', 'train', 'val']:
        subpath = os.path.join(green_path, typename)
        files = os.listdir(subpath)
        random.shuffle(files)
        num = len(files)
        print('files: ', num)
        # Keep the first num*percent files of the shuffled list.
        for filename in files[:int(num * percent)]:
            oldpath = os.path.join(subpath, filename)
            newpath = os.path.join(imagepath, filename)
            shutil.copyfile(oldpath, newpath)  # copy image
            annoinfo = parse_annotation(filename, labelpath)  # bbox label file


def ccpd2coco(path):
    dataset_path = os.path.join(path, 'CCPD2020')
    labelpath = os.path.join(dataset_path, 'green_label')
    os.makedirs(labelpath, exist_ok=True)
    for dirpath, subpaths, files in os.walk(dataset_path):
        for filename in files:
            # Skip anything that is not an image, including the labels written below.
            if not filename.endswith('.jpg'):
                continue
            print(f'file in path: {dirpath}, subpath: {subpaths}, filename: {filename}')
            annoinfo = parse_annotation(filename, labelpath)
            # display(dirpath, annoinfo)


def display(filepath, annoinfo):
    """Draw the bounding box on the image and save it to the working directory."""
    filename = annoinfo['filename']
    bboxes = annoinfo['bboxes']  # [xyxy]
    x1, y1, x2, y2 = bboxes[0]
    img = cv.imread(os.path.join(filepath, filename))
    cv.rectangle(img, (x1, y1), (x2, y2), (255, 0, 0))  # (top-left, bottom-right)
    cv.imwrite(filename, img)


def get_bbox(size, box):
    # Convert xyxy box to YOLOv5 xywh box (center and size, normalized to [0, 1])
    dw = 1. / size[0]
    dh = 1. / size[1]
    xc = (box[0] + box[2]) * 0.5 * dw
    yc = (box[1] + box[3]) * 0.5 * dh
    w = (box[2] - box[0]) * dw
    h = (box[3] - box[1]) * dh
    return xc, yc, w, h


def parse_annotation(filename, labelpath):
    """
    Parse a CCPD file name and write the matching one-line label file.

    :param filename: CCPD image file name
    :param labelpath: directory that receives the .txt label file
    :return: annotation info dict
    """
    # e.g. 0014128352490421455-90_90-212&467_271&489-271&489_212&489_212&467_271&467-0_0_3_30_30_25_31_32-79-4.jpg
    annotations = filename.split("-")
    rate = annotations[0]  # ratio of the plate area to the whole image (unused)
    angle = annotations[1].split("_")  # horizontal and vertical tilt of the plate
    box = annotations[2].replace("&", "_").split("_")  # bounding box: top-left, bottom-right
    point = annotations[3].replace("&", "_").split("_")  # four vertices: bottom-right, bottom-left, top-left, top-right
    plate = annotations[4].split("_")  # plate character indices
    plate = get_plate_licenses(plate)
    box = [int(b) for b in box]  # xyxy in pixels
    bbox = get_bbox(imgsz, box)  # normalized xywh
    point = [int(b) for b in point]
    point = np.asarray(point).reshape(-1, 2)
    bboxes = [box]  # [xyxy]
    angles = [angle]
    points = [point]
    plates = [plate]
    labels = ["plate"] * len(bboxes)
    # Class id used in this post. YOLO class ids are zero-based, so use 0 if plate is the only class.
    classid = 1  # plate
    annoinfo = {"filename": filename, "bboxes": bboxes, "points": points,
                "labels": labels, "plates": plates, "angles": angles}
    # Write the label line: "<classid> <xc> <yc> <w> <h>".
    info = f"{classid} {' '.join(f'{x:.6f}' for x in bbox)}\n"
    labelname = os.path.join(labelpath, os.path.splitext(filename)[0] + '.txt')
    with open(labelname, 'w') as labelfile:
        labelfile.write(info)
    return annoinfo


if __name__ == "__main__":
    rootpath = os.path.dirname(os.path.realpath(__file__))
    ccpd2coco2020(rootpath)
    ccpd2coco2019(rootpath)
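The script above shuffles with random.shuffle and no seed, so every run copies a different subset. If a repeatable subset is wanted, one option is seeded sampling, sketched below. The helper `sample_files` and its `seed` parameter are hypothetical additions for illustration, not part of the original script.

```python
import random


def sample_files(files, percent, seed=0):
    """Return a reproducible random subset holding `percent` of `files`."""
    if not 0.0 <= percent <= 1.0:
        raise ValueError("percent must be between 0 and 1")
    rng = random.Random(seed)          # private generator, global state untouched
    count = int(len(files) * percent)  # round down, like a slice of the shuffled list
    # Sort first so the result does not depend on the order os.listdir returns.
    return rng.sample(sorted(files), count)


# Stand-in for os.listdir(subpath): 1000 made-up file names.
files = [f"{i:04d}.jpg" for i in range(1000)]
subset = sample_files(files, 0.05, seed=42)
print(len(subset))  # 50
```

Calling it again with the same seed returns the same 50 names, which makes it easier to compare training runs on the sampled data.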
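3. Downloading the dataset
CCPD2019: official original data, mostly blue plates, about 340,000 images
[Download] https://pan.baidu.com/s/1i5AOjAbtkwb17Zy-NQGqkw (extraction code: hm0u)
CCPD2020: official original data, mostly new-energy green plates, about 10,000 images
[Download] https://pan.baidu.com/s/1JSpc9BZXFlPkXxRK4qUCyw (extraction code: ol3j)
[Official dataset repository] https://github.com/detectRecog/CCPD.git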
Dataset directory layout
./
├── CCPD2019
│   ├── ccpd_base
│   ├── ccpd_blur
│   ├── ccpd_challenge
│   ├── ccpd_db
│   ├── ccpd_fn
│   ├── ccpd_np
│   ├── ccpd_rotate
│   ├── ccpd_tilt
│   ├── ccpd_weather
│   ├── LICENSE
│   ├── README.md
│   └── splits
├── CCPD2020
│   ├── ccpd_green
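Before running a conversion it can help to confirm that the extracted folders match the layout above. The following is a minimal sketch; the names `EXPECTED_LAYOUT` and `missing_dirs` are made up for this illustration, and the folder list is taken only from the directory tree shown here.

```python
import os
import tempfile

# Image folders expected under the dataset root, per the tree above.
EXPECTED_LAYOUT = {
    "CCPD2019": ["ccpd_base", "ccpd_blur", "ccpd_challenge", "ccpd_db", "ccpd_fn",
                 "ccpd_np", "ccpd_rotate", "ccpd_tilt", "ccpd_weather"],
    "CCPD2020": ["ccpd_green"],
}


def missing_dirs(root, expected=EXPECTED_LAYOUT):
    """Return the expected folders that do not exist under `root`."""
    missing = []
    for dataset, subsets in expected.items():
        for subset in subsets:
            path = os.path.join(root, dataset, subset)
            if not os.path.isdir(path):
                missing.append(path)
    return missing


# Demo on a temporary directory that holds only one of the ten folders.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "CCPD2019", "ccpd_base"))
    print(len(missing_dirs(root)))  # 9
```

On a real download, calling missing_dirs with the dataset root should return an empty list.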
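Appreciate your own beauty and that of others, and let both flourish together; do not compare yourself with others, expect nothing of others, do not judge others, and do not split hairs.
Keep an upright heart and sincere intent, do what you ought to do and what you love to do, and quietly be a tech woman with ideas of her own.
Copyright notice: please credit the source when reposting: http://www.rzrgm.cn/happyamyhope/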