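Preface
TorchVision supports KeyPointRCNN, a mainstream keypoint detection model for pose estimation, which makes it easy to obtain the 17 keypoints of the human body. Compared with models such as OpenPose, KeyPointRCNN is built on the TorchVision framework, is simple to fine-tune with transfer learning, can be exported to ONNX format in one step, and can be deployed with ONNXRUNTIME and OpenVINO through both the C++ and Python SDKs. In terms of ease of use it holds its own.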

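KeyPointRCNN Model Overview
KeyPointRCNN in Torchvision follows the Mask R-CNN design: a keypoint prediction head is added on top of a detection network (here with a ResNet-50 FPN backbone), and it works very well in practice. The CIF and CAF encodings, which a 2021 paper introduced as its main change to the prediction encoding and decoding of the 2019 version, belong to the OpenPifPaf line of work rather than to this model. The model structure is illustrated below: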

The English description above makes this clear: the model takes an RGB color image as input, and its final output consists of four parts, boxes, labels, scores, and keypoints, structured as follows:

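One more output layer remains to be explained: it holds the score of each individual keypoint. Keypoints whose score falls below the threshold are not trustworthy and should generally be discarded.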
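As a minimal sketch of that filtering step, the function below keeps only the keypoints whose score reaches a cutoff. The name `filter_keypoints` and the `min_score` parameter are assumptions made for illustration, and the cutoff value used here is a placeholder rather than a value taken from the article. Plain Python lists stand in for the arrays returned by the model.

```python
def filter_keypoints(keypoints, keypoint_scores, min_score=0.0):
    """Return (index, x, y) for each keypoint whose score is at least min_score.

    keypoints: sequence of (x, y, v) triples for one person.
    keypoint_scores: one score per keypoint, in the same order.
    min_score: hypothetical cutoff; choose it for your own data.
    """
    if len(keypoints) != len(keypoint_scores):
        raise ValueError("keypoints and keypoint_scores must have the same length")
    kept = []
    for idx, ((x, y, _v), score) in enumerate(zip(keypoints, keypoint_scores)):
        if score >= min_score:
            kept.append((idx, x, y))
    return kept


# Toy data: three keypoints, the second one has a low score.
toy_keypoints = [(10.0, 20.0, 1.0), (30.0, 40.0, 1.0), (50.0, 60.0, 1.0)]
toy_scores = [5.2, -1.3, 2.7]
print(filter_keypoints(toy_keypoints, toy_scores, min_score=0.0))
```

Returning the original index alongside each coordinate keeps the joint identity available after low-scoring keypoints are dropped.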
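In addition, the keypoints output is 17x3 for each detected person, where the 3 values are x, y, and v. Here v indicates visibility: v = 1 means the keypoint is visible and v = 0 means it is not. The connection order and index encoding of the joints are as follows (useful when writing code):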
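The joint order and connections can be written down as data. The sketch below lists the 17 keypoints in COCO order and the same index pairs that the drawing code in this article connects. The names `KEYPOINT_NAMES`, `SKELETON`, and `skeleton_segments` are introduced here for illustration only, and skipping edges with an invisible endpoint is an added assumption (the article's own drawing code draws every edge).

```python
# COCO keypoint order, matching the index comments in the article's drawing code.
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# Index pairs to connect, the same pairs used by the article's cv.line calls.
SKELETON = [
    (0, 1), (1, 3), (0, 2), (2, 4),
    (0, 5), (5, 7), (7, 9),
    (0, 6), (6, 8), (8, 10),
    (5, 11), (11, 13), (13, 15),
    (6, 12), (12, 14), (14, 16),
]


def skeleton_segments(kpts, edges=SKELETON):
    """Return ((x1, y1), (x2, y2)) integer segments for edges with both ends visible.

    kpts: sequence of 17 (x, y, v) triples for one person.
    """
    segments = []
    for a, b in edges:
        xa, ya, va = kpts[a]
        xb, yb, vb = kpts[b]
        if va > 0 and vb > 0:  # assumption: skip edges touching an invisible keypoint
            segments.append(((int(xa), int(ya)), (int(xb), int(yb))))
    return segments


# Toy person with all keypoints visible.
toy_person = [(float(i), float(i * 2), 1.0) for i in range(17)]
print(len(skeleton_segments(toy_person)))
```

Each returned segment is a pair of integer points, so it can be passed directly as the two endpoints of a line-drawing call.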


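KeyPointRCNN Inference Demo
Torchvision officially provides a pretrained model. Once it is downloaded, the script below converts it to ONNX format, and inference can then be run with ONNXRUNTIME.
Step 1: Convert to ONNX format
The script is as follows: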
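import torch
import torchvision
from torchvision.models.detection import KeypointRCNN_ResNet50_FPN_Weights

# Load the pretrained model and switch to inference mode
model = torchvision.models.detection.keypointrcnn_resnet50_fpn(weights=KeypointRCNN_ResNet50_FPN_Weights.DEFAULT)
model.eval()
# Two dummy images of different sizes
x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
predictions = model(x)
# Optionally, if you want to export the model to ONNX:
torch.onnx.export(model, x, "keypoint_rcnn.onnx", opset_version=11)

If this does not work, adapt the conversion script from this article: "TorchVision Object Detection: RetinaNet Inference Demo".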
Step 2: ONNXRUNTIME inference demo
This part is very similar to an earlier article on RetinaNet inference. The code is only slightly modified to add visualization of the keypoints. The inference code is as follows:
import onnxruntime as ort
import cv2 as cv
import numpy as np
import torchvision

# COCO category ids (as strings) mapped to class names
coco_names = {'0': 'background', '1': 'person', '2': 'bicycle', '3': 'car', '4': 'motorcycle', '5': 'airplane', '6': 'bus',
              '7': 'train', '8': 'truck', '9': 'boat', '10': 'traffic light', '11': 'fire hydrant', '13': 'stop sign',
              '14': 'parking meter', '15': 'bench', '16': 'bird', '17': 'cat', '18': 'dog', '19': 'horse', '20': 'sheep',
              '21': 'cow', '22': 'elephant', '23': 'bear', '24': 'zebra', '25': 'giraffe', '27': 'backpack',
              '28': 'umbrella', '31': 'handbag', '32': 'tie', '33': 'suitcase', '34': 'frisbee', '35': 'skis',
              '36': 'snowboard', '37': 'sports ball', '38': 'kite', '39': 'baseball bat', '40': 'baseball glove',
              '41': 'skateboard', '42': 'surfboard', '43': 'tennis racket', '44': 'bottle', '46': 'wine glass',
              '47': 'cup', '48': 'fork', '49': 'knife', '50': 'spoon', '51': 'bowl', '52': 'banana', '53': 'apple',
              '54': 'sandwich', '55': 'orange', '56': 'broccoli', '57': 'carrot', '58': 'hot dog', '59': 'pizza',
              '60': 'donut', '61': 'cake', '62': 'chair', '63': 'couch', '64': 'potted plant', '65': 'bed',
              '67': 'dining table', '70': 'toilet', '72': 'tv', '73': 'laptop', '74': 'mouse', '75': 'remote',
              '76': 'keyboard', '77': 'cell phone', '78': 'microwave', '79': 'oven', '80': 'toaster', '81': 'sink',
              '82': 'refrigerator', '84': 'book', '85': 'clock', '86': 'vase', '87': 'scissors', '88': 'teddy bear',
              '89': 'hair drier', '90': 'toothbrush'}

# Convert an HWC uint8 image to a CHW float tensor in [0, 1]
transform = torchvision.transforms.Compose([torchvision.transforms.ToTensor()])
sess_options = ort.SessionOptions()
# Below is for optimizing performance
sess_options.intra_op_num_threads = 24
# sess_options.execution_mode = ort.ExecutionMode.ORT_PARALLEL
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
# The file name must match the ONNX model produced in the export step
ort_session = ort.InferenceSession("keypointrcnn_resnet50_fpn.onnx", sess_options=sess_options,
                                   providers=['CUDAExecutionProvider'])
src = cv.imread("D:/images/messi_player.jpg")
cv.namedWindow("KeyPointRCNNDetectionDemo", cv.WINDOW_AUTOSIZE)
# OpenCV loads images as BGR; the model expects RGB
image = cv.cvtColor(src, cv.COLOR_BGR2RGB)
blob = transform(image)
c, h, w = blob.shape
input_x = blob.view(1, c, h, w)


def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()


# compute ONNX Runtime output prediction
ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(input_x)}
ort_outs = ort_session.run(None, ort_inputs)
# (N, 4) dimensional array containing the absolute bounding-box
boxes = ort_outs[0]
# labels
labels = ort_outs[1]
# scores
scores = ort_outs[2]
# key_points
multi_key_points = ort_outs[3]
print(boxes.shape, boxes.dtype, labels.shape, labels.dtype, scores.shape, scores.dtype, multi_key_points.shape)
index = 0
for x1, y1, x2, y2 in boxes:
    if scores[index] > 0.5:
        cv.rectangle(src, (np.int32(x1), np.int32(y1)),
                     (np.int32(x2), np.int32(y2)), (140, 199, 0), 2, 8, 0)
        label_id = labels[index]
        label_txt = coco_names[str(label_id)]
        cv.putText(src, label_txt, (np.int32(x1), np.int32(y1)), cv.FONT_HERSHEY_SIMPLEX, 0.75, (0, 0, 255), 1)
        kpts = np.int32(multi_key_points[index])
        # nose -> left_eye -> left_ear. (0, 1), (1, 3)
        cv.line(src, (kpts[0][0], kpts[0][1]), (kpts[1][0], kpts[1][1]), (255, 255, 0), 2, 8, 0)
        cv.line(src, (kpts[1][0], kpts[1][1]), (kpts[3][0], kpts[3][1]), (255, 255, 0), 2, 8, 0)
        # nose -> right_eye -> right_ear. (0, 2), (2, 4)
        cv.line(src, (kpts[0][0], kpts[0][1]), (kpts[2][0], kpts[2][1]), (255, 255, 0), 2, 8, 0)
        cv.line(src, (kpts[2][0], kpts[2][1]), (kpts[4][0], kpts[4][1]), (255, 255, 0), 2, 8, 0)
        # nose -> left_shoulder -> left_elbow -> left_wrist. (0, 5), (5, 7), (7, 9)
        cv.line(src, (kpts[0][0], kpts[0][1]), (kpts[5][0], kpts[5][1]), (255, 255, 0), 2, 8, 0)
        cv.line(src, (kpts[5][0], kpts[5][1]), (kpts[7][0], kpts[7][1]), (255, 255, 0), 2, 8, 0)
        cv.line(src, (kpts[7][0], kpts[7][1]), (kpts[9][0], kpts[9][1]), (255, 255, 0), 2, 8, 0)
        # nose -> right_shoulder -> right_elbow -> right_wrist. (0, 6), (6, 8), (8, 10)
        cv.line(src, (kpts[0][0], kpts[0][1]), (kpts[6][0], kpts[6][1]), (255, 255, 0), 2, 8, 0)
        cv.line(src, (kpts[6][0], kpts[6][1]), (kpts[8][0], kpts[8][1]), (255, 255, 0), 2, 8, 0)
        cv.line(src, (kpts[8][0], kpts[8][1]), (kpts[10][0], kpts[10][1]), (255, 255, 0), 2, 8, 0)
        # left_shoulder -> left_hip -> left_knee -> left_ankle. (5, 11), (11, 13), (13, 15)
        cv.line(src, (kpts[5][0], kpts[5][1]), (kpts[11][0], kpts[11][1]), (255, 255, 0), 2, 8, 0)
        cv.line(src, (kpts[11][0], kpts[11][1]), (kpts[13][0], kpts[13][1]), (255, 255, 0), 2, 8, 0)
        cv.line(src, (kpts[13][0], kpts[13][1]), (kpts[15][0], kpts[15][1]), (255, 255, 0), 2, 8, 0)
        # right_shoulder -> right_hip -> right_knee -> right_ankle. (6, 12), (12, 14), (14, 16)
        cv.line(src, (kpts[6][0], kpts[6][1]), (kpts[12][0], kpts[12][1]), (255, 255, 0), 2, 8, 0)
        cv.line(src, (kpts[12][0], kpts[12][1]), (kpts[14][0], kpts[14][1]), (255, 255, 0), 2, 8, 0)
        cv.line(src, (kpts[14][0], kpts[14][1]), (kpts[16][0], kpts[16][1]), (255, 255, 0), 2, 8, 0)
        # mark every keypoint with a small circle
        for x, y, _ in kpts:
            cv.circle(src, (int(x), int(y)), 3, (0, 0, 255), 2, 8, 0)
    index += 1
cv.imshow("KeyPointRCNNDetectionDemo", src)
cv.waitKey(0)
cv.destroyAllWindows()
The test run results are as follows:


On a 3050 card with GPU inference, speed is the weak point: the model is fairly large and therefore somewhat slow, and real-time detection needs a more powerful NVIDIA GPU.
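To put a number on inference speed for a given machine, a small timing helper such as the one below can be wrapped around the inference call. This is only a sketch: `measure_latency` and its `warmup` and `repeats` parameters are hypothetical names introduced here, and the workload in the example is a stand-in, not the model itself.

```python
import time


def measure_latency(run_once, warmup=3, repeats=10):
    """Return (mean seconds per call, calls per second) for run_once()."""
    for _ in range(warmup):
        run_once()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(repeats):
        run_once()
    elapsed = time.perf_counter() - start
    mean_seconds = elapsed / repeats
    calls_per_second = repeats / elapsed if elapsed > 0 else float("inf")
    return mean_seconds, calls_per_second


# Stand-in workload; in the demo this would be something like
# lambda: ort_session.run(None, ort_inputs)
mean_s, fps = measure_latency(lambda: sum(range(10000)))
print(f"{mean_s * 1000:.3f} ms per call, {fps:.1f} calls/s")
```

The warm-up calls matter on GPU because the first few runs typically include one-time initialization cost that would otherwise distort the average.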

Reviewed and edited by: Liu Qing
Original title: Pose Estimation Made Easy with the KeyPointRCNN Keypoint Detection Model!
Source: WeChat ID CVSCHOOL, WeChat official account OpenCV學堂. Please credit the source when reposting.