
PaddlePaddle | CV Pandemic Special (5): Crowd Density Detection

Date: 2020-05-24


This post is based on the Baidu AIStudio course and is written up here as a record.

Problem statement

In recent years, vision techniques for pedestrian analysis in surveillance scenes have attracted wide attention. Techniques such as person detection, human attribute recognition, and crowd density estimation are already widely used in home, security, new-retail, and other settings. Among them, crowd density estimation for crowded scenes, being far more accurate and faster than counting by eye, is widely deployed in airports, stations, transit vehicles, exhibition halls, and so on: it helps prevent hazards such as crowd crushes and overloading, and also lets retailers measure foot traffic. This problem takes crowd density estimation as its subject. Contestants must build a general crowd counting algorithm that works across dense, sparse, aerial, vehicle-mounted, and other complex scenes, and accurately estimate the total number of people in an input image.

Task description

Contestants must provide an algorithm or model that, given an image, counts the total number of people in it. Training images are provided; contestants train a model on them and predict the most accurate head count for each test image.

Data description

The training and test images all come from typical surveillance scenes, but cover multiple viewpoints (low-altitude, high-altitude, fisheye, etc.), and the relative size of the pedestrians varies considerably. Part of the training data is drawn from public datasets (e.g. ShanghaiTech [1], UCF-CC-50 [2], WorldExpo'10 [3], Mall [4]).

All annotations are stored in the corresponding JSON file. Each training image is annotated in one of two ways:

(1) some images annotate each pedestrian with a bounding box, in the format [x, y, w, h];
(2) other images annotate each pedestrian's head with a single point, in the format [x, y].

In addition, some images carry ignore-region (ignore_region) annotations: polygons given as [x_0, y_0, x_1, y_1, …, x_n, y_n] (note that one image may have several such polygons). The parts of an image inside an ignore region take no part in training or testing.
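The two annotation styles above can be unified into head points with a few lines of Python. This is only an illustrative sketch: the sample dicts are made up, and the 20-pixel head offset for boxes is an assumption, not part of the official spec.

```python
# Hypothetical examples of the two annotation styles described above.
box_ann = {"x": 100, "y": 50, "w": 40, "h": 120}  # bounding-box style
point_ann = {"x": 300, "y": 80}                   # head-point style

def to_head_point(ann):
    # Reduce either style to a single (x, y) head point: for a box, take
    # the horizontal centre and a point slightly below the top edge
    # (roughly where the head is); a point annotation is used as-is.
    # The 20-pixel offset is an assumption for illustration.
    if "w" in ann:
        return (ann["x"] + ann["w"] / 2, ann["y"] + 20)
    return (ann["x"], ann["y"])

print(to_head_point(box_ann))    # (120.0, 70)
print(to_head_point(point_ann))  # (300, 80)
```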

Submission

For submission you must provide the project version of the model code plus a results file. The results file is a CSV; its name is up to you, but its fields must follow the specified format: id is the image file name, and predicted is the number of pedestrians in that image.
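A minimal sketch of a conforming results file, using Python's standard csv module (the file names and counts below are invented for illustration):

```python
import csv

# Hypothetical predictions: image file name -> estimated head count.
predictions = {"a1b2c3.jpg": 28, "d4e5f6.jpg": 7}

with open("results.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "predicted"])
    writer.writeheader()                       # header row: id,predicted
    for name, count in predictions.items():
        writer.writerow({"id": name, "predicted": count})

print(open("results.csv").read())
```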

1. Unzip the dataset

# Unzip the datasets
!unzip -q -o data/data1917/train_new.zip
!unzip -q -o data/data1917/test_new.zip

2. Explore the data

# Imports
import zipfile
import json
import time
import numpy as np
import cv2
import matplotlib.pyplot as plt
from matplotlib import cm as CM
from PIL import Image
import scipy
from scipy.ndimage.filters import gaussian_filter
import paddle
import paddle.fluid as fluid
from paddle.fluid.layer_helper import LayerHelper
from paddle.fluid.dygraph.nn import Conv2D, Pool2D, BatchNorm, Linear, Conv2DTranspose
from paddle.fluid.dygraph.base import to_variable

start = time.time()
# Load the label information for each image into a dict
f = open('/home/aistudio/data/data1917/train.json', encoding='utf-8')
content = json.load(f)
print(content.keys())
print('info:', content['info'])
print('stage:', content['stage'])
print('split:', content['split'])
print(content['annotations'][0].keys())
print(content['annotations'][0]['type'])
print(content['annotations'][0]['id'])
print(content['annotations'][0]['ignore_region'])
print(content['annotations'][0]['name'])
print(content['annotations'][0]['num'])

Output:

dict_keys(['info', 'split', 'annotations', 'stage'])
info: Baidu Star AI Competition 
stage: 1
split: train
dict_keys(['name', 'id', 'num', 'ignore_region', 'type', 'annotation'])
bbox
625
[]
stage1/train/61a4091324d1983534ca23b6f007f841.jpg
28

# Strip the leading "stage1/" prefix from every image path
for j in range(len(content['annotations'])):
    content['annotations'][j]['name'] = content['annotations'][j]['name'].lstrip('stage1').lstrip('/')
print(content['annotations'][1]['name'])

train/71e5bc76196e91f26426b7facbcc0843.jpg

# Read the file list inside the zip archive
zfile = zipfile.ZipFile("data/data1917/train_new.zip")
l = []  # l holds the paths of all images in train
for fname in zfile.namelist()[1:]:
    l.append(fname)
print(l[3])
name = l[3]
im = Image.open(name)
plt.imshow(im)

train/002be7f228f584630bde7582c9dbaffb.jpg
<matplotlib.image.AxesImage at 0x7fc4ad838e90>

# Inspect the annotations for this image
for j in range(len(content['annotations'])):
    if content['annotations'][j]['name'] == name:
        print('id = ', content['annotations'][j]['id'])  # image id
        ann = content['annotations'][j]['annotation']
        print(ann)  # box annotations have x, y, w, h; point annotations only x, y
        print('number of annotations:', len(ann))
# Visualise one of the box annotations by cropping it out
lab = 1
box = (ann[lab]['x'], ann[lab]['y'], ann[lab]['x'] + ann[lab]['w'], ann[lab]['y'] + ann[lab]['h'])
new_img = im.crop(box=box)
plt.imshow(new_img)

id =  668
[{'y': 693, 'x': 1108, 'w': 196, 'h': 373}, {'y': 424, 'x': 1009, 'w': 118, 'h': 448}, {'y': 361, 'x': 864, 'w': 250, 'h': 249}, {'y': 300, 'x': 882, 'w': 104, 'h': 342}, {'y': 128, 'x': 846, 'w': 28, 'h': 99}, {'y': 131, 'x': 870, 'w': 48, 'h': 86}, {'y': 94, 'x': 899, 'w': 22, 'h': 90}, {'y': 97, 'x': 878, 'w': 19, 'h': 74}, {'y': 60, 'x': 827, 'w': 23, 'h': 62}, {'y': 44, 'x': 792, 'w': 16, 'h': 48}, {'y': 46, 'x': 799, 'w': 22, 'h': 59}, {'y': 67, 'x': 778, 'w': 26, 'h': 84}, {'y': 98, 'x': 788, 'w': 38, 'h': 86}, {'y': 148, 'x': 653, 'w': 103, 'h': 114}, {'y': 97, 'x': 712, 'w': 35, 'h': 114}, {'y': 90, 'x': 704, 'w': 26, 'h': 108}, {'y': 89, 'x': 733, 'w': 28, 'h': 130}, {'y': 177, 'x': 637, 'w': 76, 'h': 130}, {'y': 378, 'x': 460, 'w': 240, 'h': 281}, {'y': 527, 'x': 361, 'w': 256, 'h': 332}, {'y': 498, 'x': 182, 'w': 242, 'h': 557}, {'y': 906, 'x': 164, 'w': 410, 'h': 173}, {'y': 861, 'x': 1286, 'w': 213, 'h': 218}]
number of annotations: 23

# Draw every box annotation on the image
width = im.size[0]   # image width
height = im.size[1]  # image height
print(width, height)
for a in range(len(ann)):  # loop over all annotations
    for x in range(width):
        for y in range(height):
            # left edge: a red line from (x, y) to (x, y+h), 5 px wide on each side
            if (x > (ann[a]['x'] - 5) and x < (ann[a]['x'] + 5) and y > ann[a]['y'] and y < (ann[a]['y'] + ann[a]['h'])):
                im.putpixel((x, y), (255, 0, 0))
            # right edge: from (x+w, y) to (x+w, y+h)
            if (x > (ann[a]['x'] + ann[a]['w'] - 5) and x < (ann[a]['x'] + ann[a]['w'] + 5) and y > ann[a]['y'] and y < (ann[a]['y'] + ann[a]['h'])):
                im.putpixel((x, y), (255, 0, 0))
            # top edge: from (x, y) to (x+w, y)
            if (y > (ann[a]['y'] - 5) and y < (ann[a]['y'] + 5) and x > ann[a]['x'] and x < (ann[a]['x'] + ann[a]['w'])):
                im.putpixel((x, y), (255, 0, 0))
            # bottom edge: from (x, y+h) to (x+w, y+h)
            if (y > (ann[a]['y'] + ann[a]['h'] - 5) and y < (ann[a]['y'] + ann[a]['h'] + 5) and x > ann[a]['x'] and x < (ann[a]['x'] + ann[a]['w'])):
                im.putpixel((x, y), (255, 0, 0))
plt.imshow(im)

# Group the images by size, which correlates with the camera source
l_set = []
s_2560_1920 = []  # boxes, fisheye/elevator, 63 images
s_928_576 = []    # points, vending machine, 248 images
s_1024_768 = []   # points, street, 302
s_640_480 = []    # points, indoor, 92
s_2048_2048 = []  # boxes, fisheye/elevator, 41
s_1080_1618 = []  # filtered out, 1
s_1920_1080 = []  # boxes, supermarket, 1240
s_1440_1080 = []  # filtered out, 1
s_1920_1200 = []  # boxes, street, 12
for inde in range(2000):
    imm = Image.open(content['annotations'][inde]['name'])
    l_set.append(imm.size)
    if imm.size == (2560, 1920):
        s_2560_1920.append(content['annotations'][inde]['name'])
    elif imm.size == (928, 576):
        s_928_576.append(content['annotations'][inde]['name'])
    elif imm.size == (1024, 768):
        s_1024_768.append(content['annotations'][inde]['name'])
    elif imm.size == (640, 480):
        s_640_480.append(content['annotations'][inde]['name'])
    elif imm.size == (2048, 2048):
        s_2048_2048.append(content['annotations'][inde]['name'])
    elif imm.size == (1080, 1618):
        s_1080_1618.append(content['annotations'][inde]['name'])
    elif imm.size == (1920, 1080):
        s_1920_1080.append(content['annotations'][inde]['name'])
    elif imm.size == (1440, 1080):
        s_1440_1080.append(content['annotations'][inde]['name'])
    elif imm.size == (1920, 1200):
        s_1920_1200.append(content['annotations'][inde]['name'])
print(len(l_set))
sett = set(l_set)
print(sett)
print(len(s_2560_1920), len(s_928_576), len(s_1024_768), len(s_640_480), len(s_2048_2048), len(s_1080_1618), len(s_1920_1080), len(s_1440_1080), len(s_1920_1200))
print(s_1440_1080)
print(s_1080_1618)

Output:

2000
{(928, 576), (1024, 768), (640, 480), (2560, 1920), (2048, 2048), (1080, 1618), (1920, 1080), (1440, 1080), (1920, 1200)}
63 248 302 92 41 1 1240 1 12
['train/8538edb45aaf7df78336aa5b49001be6.jpg']
['train/377df0a7a9abc44e840e938521df3b54.jpg']

# Collect all samples whose people are annotated with points
point_l = []
for f in range(2000):
    if 'w' not in content['annotations'][f]['annotation'][0]:
        point_l.append(content['annotations'][f]['name'])
print(len(point_l))

# For point annotations, show how a single point marks a person in one image
# name1 = 'train/b179764112252559b76a59db9fa18021.jpg'
name1 = point_l[1]
im1 = Image.open(name1)
for j in range(len(content['annotations'])):
    if content['annotations'][j]['name'] == name1:
        print('id = ', content['annotations'][j]['id'])
        ann1 = content['annotations'][j]['annotation']
        print('number of annotations:', len(ann1))
for a in range(len(ann1)):
    for x in range(im1.size[0]):
        for y in range(im1.size[1]):
            # colour a +/-10-pixel square around each point red
            if (x > (ann1[a]['x'] - 10) and x < (ann1[a]['x'] + 10) and y > ann1[a]['y'] - 10 and y < (ann1[a]['y'] + 10)):
                im1.putpixel((x, y), (255, 0, 0))
plt.imshow(im1)

Output:

id =  628
number of annotations: 7

# Ground-truth points from the annotations in the block above
gt = []
for a in range(len(ann1)):
    gt.append([ann1[a]['x'], ann1[a]['y']])
print(gt)
gt = np.array(gt)
print(gt.shape)

[[43, 257], [98, 206], [333, 247], [102, 236], [247, 1032], [660, 919], [1414, 1057]]
(7, 2)

# Generate a density map by convolving the points with a Gaussian filter
def gaussian_filter_density(gt):
    """Generate a density map using a Gaussian filter transformation."""
    # Initialise the density map
    density = np.zeros(gt.shape, dtype=np.float32)
    # Number of non-zero entries (= number of annotated people)
    gt_count = np.count_nonzero(gt)
    # If gt is all zeros, return an all-zero density map
    if gt_count == 0:
        return density
    pts = np.array(list(zip(np.nonzero(gt)[1].ravel(), np.nonzero(gt)[0].ravel())))
    # A KD-tree could set sigma adaptively from the k nearest neighbours;
    # here a fixed sigma is used instead:
    # tree = scipy.spatial.KDTree(pts.copy(), leafsize=2048)
    # distances, locations = tree.query(pts, k=4)
    for i, pt in enumerate(pts):
        pt2d = np.zeros(gt.shape, dtype=np.float32)
        pt2d[pt[1], pt[0]] = 1.
        if gt_count > 1:
            # sigma = (distances[i][1] + distances[i][2] + distances[i][3]) * 0.1
            sigma = 25
        else:
            sigma = np.average(np.array(gt.shape)) / 2. / 2.  # case: a single point
        # Convolve with the Gaussian filter
        density += scipy.ndimage.filters.gaussian_filter(pt2d, sigma, mode='constant')
    return density

print(gt.shape)
img = plt.imread(name1)
k = np.zeros((img.shape[0], img.shape[1]))
for i in range(0, len(gt)):
    if int(gt[i][1]) < img.shape[0] and int(gt[i][0]) < img.shape[1]:
        k[int(gt[i][1]), int(gt[i][0])] = 1
# Generate the density map
k = gaussian_filter_density(k)

# Visualise the density map
print(k.shape)
groundtruth = k
print(groundtruth.shape)
plt.imshow(groundtruth, cmap=CM.jet)
print("Sum = ", np.sum(groundtruth))

Output:

(1080, 1920)
(1080, 1920)
Sum =  6.7463903

# Image preprocessing:
# 1. resize to a fixed size, 448*448 in this example
# 2. normalise pixel values to [0, 1]
def picture_opt(img, ann):
    size_x, size_y = img.size
    train_img_size = (448, 448)
    img = img.resize(train_img_size, Image.ANTIALIAS)
    img = np.array(img)
    img = img / 255.0
    gt = []
    for b_l in range(len(ann)):
        # If the person is box-annotated, reduce the box to a head point
        # by averaging the horizontal extent and offsetting y from the top
        if 'w' in ann[b_l].keys():
            x = (ann[b_l]['x'] + (ann[b_l]['x'] + ann[b_l]['w'])) / 2
            y = ann[b_l]['y'] + 20
            x = (x * 448 / size_x) / 4
            y = (y * 448 / size_y) / 4
            gt.append((x, y))
        else:
            x = ann[b_l]['x']
            y = ann[b_l]['y']
            x = (x * 448 / size_x) / 4
            y = (y * 448 / size_y) / 4
            gt.append((x, y))
    # Return the resized image and the rescaled gt points
    return img, gt

# Build the density map at 1/4 of the input resolution
def ground(img, gt):
    imgs = img
    x = imgs.shape[0] / 4
    y = imgs.shape[1] / 4
    k = np.zeros((int(x), int(y)))
    for i in range(0, len(gt)):
        if int(gt[i][1]) < int(x) and int(gt[i][0]) < int(y):
            k[int(gt[i][1]), int(gt[i][0])] = 1
    # Generate the density map
    k = gaussian_filter_density(k)
    return k

# Boxes to points
qt = []
img = Image.open(content['annotations'][2]['name'])
ann = content['annotations'][2]['annotation']
print(img.size)
temp = img.resize((112, 112), Image.ANTIALIAS)
im, qt = picture_opt(img, ann)
print(im.shape)
print(qt)
plt.imshow(im)
for a in range(len(qt)):
    for x in range(temp.size[0]):
        for y in range(temp.size[1]):
            # colour a +/-1-pixel square around each point red
            if (x > (qt[a][0] - 1) and x < (qt[a][0] + 1) and y > qt[a][1] - 1 and y < (qt[a][1] + 1)):
                temp.putpixel((x, y), (255, 0, 0))
# plt.imshow(temp)
k = ground(im, qt)
print(type(k))
# plt.imshow(k)
print(np.sum(k))
print(len(ann))

Output:

(928, 576)
(448, 448, 3)
[(40.43103448275862, 48.416666666666664), (34.63793103448276, 36.94444444444444)]
<class 'numpy.ndarray'>
1.76357662

# Data generator
def train_set():
    def inner():
        for ig_index in range(2000):  # loop over all images
            # skip images with only 2 or 3 annotations
            if len(content['annotations'][ig_index]['annotation']) == 2:
                continue
            if len(content['annotations'][ig_index]['annotation']) == 3:
                continue
            # skip the two odd-sized images found earlier
            if content['annotations'][ig_index]['name'] == 'train/8538edb45aaf7df78336aa5b49001be6.jpg':
                continue
            if content['annotations'][ig_index]['name'] == 'train/377df0a7a9abc44e840e938521df3b54.jpg':
                continue
            if content['annotations'][ig_index]['ignore_region']:  # fill ignore regions with zero pixels
                ig_list = []   # vertices of ignore region 1
                ig_list1 = []  # vertices of ignore region 2
                # each image has at most 2 ignore regions; this is the 1-region case
                if len(content['annotations'][ig_index]['ignore_region']) == 1:
                    ign_rge = content['annotations'][ig_index]['ignore_region'][0]  # data of the first ignore region
                    for ig_len in range(len(ign_rge)):  # collect the polygon's vertices
                        ig_list.append([ign_rge[ig_len]['x'], ign_rge[ig_len]['y']])
                    ig_cv_img = cv2.imread(content['annotations'][ig_index]['name'])  # read the image with cv2
                    pts = np.array(ig_list, np.int32)  # convert to ndarray for filling
                    cv2.fillPoly(ig_cv_img, [pts], (0, 0, 0), cv2.LINE_AA)  # fill the ignore region with zeros
                    ig_img = Image.fromarray(cv2.cvtColor(ig_cv_img, cv2.COLOR_BGR2RGB))  # cv2 -> PIL
                    ann = content['annotations'][ig_index]['annotation']  # all annotations for this image
                    ig_im, gt = picture_opt(ig_img, ann)
                    k = ground(ig_im, gt)
                    groundtruth = np.asarray(k)
                    groundtruth = groundtruth.T.astype('float32')
                    ig_im = ig_im.transpose().astype('float32')
                    yield ig_im, groundtruth
                if len(content['annotations'][ig_index]['ignore_region']) == 2:  # two ignore regions
                    ign_rge = content['annotations'][ig_index]['ignore_region'][0]
                    ign_rge1 = content['annotations'][ig_index]['ignore_region'][1]
                    for ig_len in range(len(ign_rge)):
                        ig_list.append([ign_rge[ig_len]['x'], ign_rge[ig_len]['y']])
                    for ig_len1 in range(len(ign_rge1)):
                        ig_list1.append([ign_rge1[ig_len1]['x'], ign_rge1[ig_len1]['y']])
                    ig_cv_img2 = cv2.imread(content['annotations'][ig_index]['name'])
                    pts = np.array(ig_list, np.int32)
                    pts1 = np.array(ig_list1, np.int32)
                    cv2.fillPoly(ig_cv_img2, [pts], (0, 0, 0), cv2.LINE_AA)
                    cv2.fillPoly(ig_cv_img2, [pts1], (0, 0, 0), cv2.LINE_AA)
                    ig_img2 = Image.fromarray(cv2.cvtColor(ig_cv_img2, cv2.COLOR_BGR2RGB))  # cv2 -> PIL
                    ann = content['annotations'][ig_index]['annotation']
                    ig_im, gt = picture_opt(ig_img2, ann)
                    k = ground(ig_im, gt)
                    groundtruth = np.asarray(k)
                    groundtruth = groundtruth.T.astype('float32')
                    ig_im = ig_im.transpose().astype('float32')
                    yield ig_im, groundtruth
            else:
                img = Image.open(content['annotations'][ig_index]['name'])
                ann = content['annotations'][ig_index]['annotation']
                im, gt = picture_opt(img, ann)
                k = ground(im, gt)
                groundtruth = np.asarray(k)
                groundtruth = groundtruth.T.astype('float32')
                im = im.transpose().astype('float32')
                yield im, groundtruth
    return inner

BATCH_SIZE = 16  # 16 images per batch
# Set up the training reader
train_reader = paddle.batch(
    paddle.reader.shuffle(train_set(), buf_size=5),
    batch_size=BATCH_SIZE)

3. Define the model

Take a ResNet, remove the final fully connected layer, and add four transposed-convolution (deconvolution) layers for upsampling, so that the output matches the 112 × 112 density map generated from the annotations.
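The size arithmetic can be checked with the standard transposed-convolution output formula. This sketch assumes the encoder halves the 448 × 448 input six times (down to 7 × 7) and that each decoder layer uses kernel 2, stride 2, padding 0, as in the model below:

```python
def deconv_out(size, kernel=2, stride=2, padding=0):
    # Output size of a transposed convolution.
    return (size - 1) * stride - 2 * padding + kernel

s = 448 // 2 ** 6   # the encoder leaves a 7x7 feature map
for _ in range(4):  # four kernel-2, stride-2 transposed convolutions
    s = deconv_out(s)
print(s)  # 112 = 448 / 4, the density-map resolution
```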

The model:

# Conv + BatchNorm block
class ConvBNLayer(fluid.dygraph.Layer):
    def __init__(self,
                 num_channels,
                 num_filters,
                 filter_size,
                 stride=1,
                 groups=1,
                 act=None,
                 param_attr=fluid.initializer.Xavier(uniform=False)):
        """
        num_channels, number of input channels
        num_filters, number of output channels
        stride, convolution stride
        groups, number of groups for grouped convolution; groups=1 disables it
        act, activation type; act=None means no activation
        """
        super(ConvBNLayer, self).__init__()
        # Convolution layer
        self._conv = Conv2D(
            num_channels=num_channels,
            num_filters=num_filters,
            filter_size=filter_size,
            stride=stride,
            padding=(filter_size - 1) // 2,
            groups=groups,
            act=None,
            bias_attr=False,
            param_attr=param_attr)
        # BatchNorm layer
        self._batch_norm = BatchNorm(num_filters, act=act)

    def forward(self, inputs):
        y = self._conv(inputs)
        y = self._batch_norm(y)
        return y


# Residual block
# Each block applies three convolutions, then adds a shortcut from the input.
# If the third convolution's output shape differs from the input, a 1x1
# convolution reshapes the input to match.
class BottleneckBlock(fluid.dygraph.Layer):
    def __init__(self,
                 name_scope,
                 num_channels,
                 num_filters,
                 stride,
                 shortcut=True):
        super(BottleneckBlock, self).__init__(name_scope)
        # First convolution, 1x1
        self.conv0 = ConvBNLayer(
            num_channels=num_channels,
            num_filters=num_filters,
            filter_size=1,
            act='leaky_relu')
        # Second convolution, 3x3
        self.conv1 = ConvBNLayer(
            num_channels=num_filters,
            num_filters=num_filters,
            filter_size=3,
            stride=stride,
            act='leaky_relu')
        # Third convolution, 1x1, with 4x the output channels
        self.conv2 = ConvBNLayer(
            num_channels=num_filters,
            num_filters=num_filters * 4,
            filter_size=1,
            act=None)
        # shortcut=True when conv2's output matches the block input shape;
        # otherwise a 1x1 convolution reshapes the input to match conv2
        if not shortcut:
            self.short = ConvBNLayer(
                num_channels=num_channels,
                num_filters=num_filters * 4,
                filter_size=1,
                stride=stride)
        self.shortcut = shortcut
        self._num_channels_out = num_filters * 4

    def forward(self, inputs):
        y = self.conv0(inputs)
        conv1 = self.conv1(y)
        conv2 = self.conv2(conv1)
        # If shortcut=True, add inputs directly to conv2's output;
        # otherwise run inputs through the 1x1 convolution first
        if self.shortcut:
            short = inputs
        else:
            short = self.short(inputs)
        y = fluid.layers.elementwise_add(x=short, y=conv2)
        layer_helper = LayerHelper(self.full_name(), act='relu')
        return layer_helper.append_activation(y)


# ResNet model
class ResNet(fluid.dygraph.Layer):
    def __init__(self, name_scope, layers=50, class_dim=1):
        """
        name_scope, module name
        layers, network depth: 18, 50, 101 or 152
        class_dim, number of output classes
        """
        super(ResNet, self).__init__(name_scope)
        self.layers = layers
        supported_layers = [18, 50, 101, 152]
        assert layers in supported_layers, \
            "supported layers are {} but input layer is {}".format(supported_layers, layers)
        if layers == 50:
            # ResNet50: stages 2-5 contain 3, 4, 6, 3 residual blocks
            depth = [3, 4, 6, 3]
        elif layers == 101:
            # ResNet101: stages 2-5 contain 3, 4, 23, 3 residual blocks
            depth = [3, 4, 23, 3]
        elif layers == 152:
            # ResNet152: stages 2-5 contain 3, 8, 36, 3 residual blocks
            depth = [3, 8, 36, 3]
        elif layers == 18:
            # ResNet18
            depth = [2, 2, 2, 2]
        # Output channels of the convolutions in the residual blocks
        num_filters = [64, 128, 256, 512]
        # Stage 1: a 7x7 convolution followed by max pooling
        self.conv = ConvBNLayer(
            num_channels=3,
            num_filters=64,
            filter_size=7,
            stride=2,
            act='relu')
        self.pool2d_max = Pool2D(
            pool_size=3,
            pool_stride=2,
            pool_padding=1,
            pool_type='max')
        # Stages 2 to 5: c2, c3, c4, c5
        self.bottleneck_block_list = []
        num_channels = 64
        for block in range(len(depth)):
            shortcut = False
            for i in range(depth[block]):
                bottleneck_block = self.add_sublayer(
                    'bb_%d_%d' % (block, i),
                    BottleneckBlock(
                        self.full_name(),
                        num_channels=num_channels,
                        num_filters=num_filters[block],
                        # c3, c4, c5 use stride=2 in their first residual block; all other blocks use stride=1
                        stride=2 if i == 0 and block != 0 else 1,
                        shortcut=shortcut))
                num_channels = bottleneck_block._num_channels_out
                self.bottleneck_block_list.append(bottleneck_block)
                shortcut = True
        # Pool the c5 output once more (max pooling instead of global average pooling)
        # self.pool2d_avg = Pool2D(pool_size=7, pool_type='avg', global_pooling=False)
        self.pool2d_avg = Pool2D(pool_size=3, pool_stride=2, pool_padding=1, pool_type='max')
        # Decoder: four transposed convolutions, each followed by a conv on the
        # concatenation with the matching encoder feature map (U-Net-style skips)
        self.upConv1 = Conv2DTranspose(num_channels=2048, num_filters=1024, filter_size=2, stride=2, padding=0, dilation=1, bias_attr=None, act="relu")
        self.exConv1 = ConvBNLayer(num_channels=3072, num_filters=1024, filter_size=3, stride=1, act='relu')
        self.upConv2 = Conv2DTranspose(num_channels=1024, num_filters=512, filter_size=2, stride=2, padding=0, dilation=1, bias_attr=None, act="relu")
        self.exConv2 = ConvBNLayer(num_channels=1536, num_filters=512, filter_size=3, stride=1, act='relu')
        self.upConv3 = Conv2DTranspose(num_channels=512, num_filters=256, filter_size=2, stride=2, padding=0, dilation=1, bias_attr=None, act="relu")
        self.exConv3 = ConvBNLayer(num_channels=768, num_filters=256, filter_size=3, stride=1, act='relu')
        self.upConv4 = Conv2DTranspose(num_channels=256, num_filters=128, filter_size=2, stride=2, padding=0, dilation=1, bias_attr=None, act="relu")
        self.exConv4 = ConvBNLayer(num_channels=384, num_filters=128, filter_size=3, stride=1, act='relu')
        self.feature1 = Conv2D(num_channels=128, num_filters=128, filter_size=3, padding=1, act="relu")
        self.feature2 = Conv2D(num_channels=128, num_filters=1, filter_size=3, padding=1, act="relu")
        # stdv is the std of the fully connected layer's random initialisation
        import math
        stdv = 1.0 / math.sqrt(2048 * 1.0)
        # Fully connected output layer (kept from the classification template; not used in forward)
        self.out = Linear(input_dim=2048, output_dim=class_dim,
                          param_attr=fluid.param_attr.ParamAttr(
                              initializer=fluid.initializer.Uniform(-stdv, stdv)))

    def forward(self, inputs):
        contact_list = list()
        depth = [1, 3, 5, 7]  # block indices whose outputs feed the skip connections
        y = self.conv(inputs)
        y = self.pool2d_max(y)
        count = 0
        for bottleneck_block in self.bottleneck_block_list:
            y = bottleneck_block(y)
            if count in depth:
                contact_list.append(y)  # save skip features
            count += 1
        y = self.pool2d_avg(y)
        y = self.upConv1(y)  # 1024*14*14
        y = fluid.layers.concat((y, contact_list[3]), axis=1)
        y = self.exConv1(y)
        y = self.upConv2(y)
        y = fluid.layers.concat((y, contact_list[2]), axis=1)
        y = self.exConv2(y)
        y = self.upConv3(y)
        y = fluid.layers.concat((y, contact_list[1]), axis=1)
        y = self.exConv3(y)
        y = self.upConv4(y)
        y = fluid.layers.concat((y, contact_list[0]), axis=1)
        y = self.exConv4(y)
        y = self.feature1(y)
        y = self.feature2(y)
        return y

4. Train

with fluid.dygraph.guard():
    model = ResNet("ResNet", layers=18)
    # model = ResNet("ResNet", layers=50, class_dim=65)  # try ResNet50
    model.train()  # training mode
    opt = fluid.optimizer.AdamOptimizer(
        learning_rate=fluid.layers.cosine_decay(
            learning_rate=1e-5, step_each_epoch=120, epochs=10),
        parameter_list=model.parameters())
    epochs_num = 10  # number of epochs
    print("start")
    train_loss = list()
    for pass_num in range(epochs_num):
        for batch_id, datas in enumerate(train_reader()):
            # Stack the samples of the batch into single ndarrays
            for i in range(len(datas)):
                if i == 0:
                    imgs, labels = datas[i]
                    imgs = imgs[np.newaxis, :]
                    labels = labels[np.newaxis, :]  # channel axis
                    labels = labels[np.newaxis, :]  # batch axis
                else:
                    img, label = datas[i]
                    img = img[np.newaxis, :]
                    label = label[np.newaxis, :]
                    label = label[np.newaxis, :]
                    imgs = np.concatenate((imgs, img), axis=0)
                    labels = np.concatenate((labels, label), axis=0)
            imgs = imgs.astype(np.float32)
            images = fluid.dygraph.to_variable(imgs)
            targets = fluid.dygraph.to_variable(labels)
            predict = model(images)
            cost = fluid.layers.square_error_cost(predict, targets)
            # cost = fluid.layers.sqrt(cost)
            avg_loss = fluid.layers.mean(cost)
            train_loss.append(avg_loss.numpy())
            if batch_id != 0 and batch_id % 5 == 0:
                print("train_pass:{},batch_id:{},train_loss:{}".format(pass_num, batch_id, avg_loss.numpy()))
            avg_loss.backward()
            opt.minimize(avg_loss)
            model.clear_gradients()
        if (pass_num + 1) % 2 == 0:
            fluid.save_dygraph(model.state_dict(), 'MyLeNet_{}'.format(pass_num))  # save a checkpoint
    fluid.save_dygraph(model.state_dict(), 'MyLeNet_final')  # save the final model
    print("finished")
    # Plot the training loss
    plt.figure(dpi=120)
    x = range(len(train_loss))
    y = train_loss
    plt.plot(x, y, label='train')
    plt.legend(loc='upper right')
    plt.ylabel('loss')
    plt.xlabel('iteration')
    plt.show()

5. Test the model

# Test on a single image
import numpy as np
from PIL import Image
import paddle.fluid as fluid
import matplotlib.pyplot as plt
import zipfile

test_zfile = zipfile.ZipFile("/home/aistudio/data/data1917/test_new.zip")
l_test = []
for test_fname in test_zfile.namelist()[1:]:
    l_test.append(test_fname)
test_img = Image.open(l_test[2])
plt.imshow(test_img)
test_img = test_img.resize((448, 448))
test_im = np.array(test_img)
test_im = test_im / 255.0
test_im = test_im.transpose().reshape(1, 3, 448, 448).astype('float32')
with fluid.dygraph.guard():
    model, _ = fluid.load_dygraph("MyLeNet_final")
    resnet = ResNet("ResNet", layers=18)
    resnet.load_dict(model)
    resnet.eval()
    images = fluid.dygraph.to_variable(test_im)
    predict = resnet(images)
    print(predict.numpy().sum())

Output: 8.57991

6. Predict on the test set and save a CSV

import numpy as np
from PIL import Image
import paddle.fluid as fluid
import zipfile
import csv

test_zfile = zipfile.ZipFile("/home/aistudio/data/data1917/test_new.zip")
l_test = []
for test_fname in test_zfile.namelist()[1:]:
    l_test.append(test_fname)
data_dict = {}
with fluid.dygraph.guard():
    # Load the trained model
    model, _ = fluid.load_dygraph("MyLeNet_final")
    resnet = ResNet("ResNet", layers=18)
    resnet.load_dict(model)
    resnet.eval()
    for index in range(len(l_test)):
        test_img = Image.open(l_test[index])
        test_img = test_img.resize((448, 448))
        test_im = np.array(test_img)
        test_im = test_im / 255.0
        test_im = test_im.transpose().reshape(1, 3, 448, 448).astype('float32')
        images = fluid.dygraph.to_variable(test_im)
        predict = resnet(images)
        people = np.sum(predict.numpy())  # predicted count = sum of the density map
        name = l_test[index].split("/")[1]
        data_dict[name] = int(people)
with open('results2.csv', 'w') as csvfile:
    fieldnames = ['id', 'predicted']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for k, v in data_dict.items():
        writer.writerow({'id': k, 'predicted': v})
print("finished")

Note: the density map has one hyperparameter, sigma, and its value is closely tied to the quality of the final count. The smaller sigma is, the smaller the error between the density map and the point annotations, but the map also becomes fainter and therefore harder to train on. The density-map approach used here is thus quite limited; rewriting this as an object-detection network could give a further accuracy boost.
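The sigma trade-off can be seen numerically. The pure-NumPy sketch below stands in for scipy's gaussian_filter with mode='constant' (the image size and head points are made up): each head contributes a Gaussian of unit analytic mass, and the larger sigma is, the more of that mass is truncated at the image borders, so the map's sum drifts away from the true count.

```python
import numpy as np

def gaussian_density(shape, points, sigma):
    # Place one Gaussian of unit analytic mass at each head point.
    # With zero ('constant') padding, mass falling outside the image is
    # lost, so large sigmas under-count heads near the borders.
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    density = np.zeros(shape, dtype=np.float64)
    for px, py in points:
        g = np.exp(-((xx - px) ** 2 + (yy - py) ** 2) / (2.0 * sigma ** 2))
        density += g / (2.0 * np.pi * sigma ** 2)
    return density

pts = [(30, 40), (70, 20), (50, 60)]  # three hypothetical head points
for sigma in (2, 10, 25):
    d = gaussian_density((112, 112), pts, sigma)
    print(sigma, round(float(d.sum()), 3))  # the sum drifts below 3 as sigma grows
```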
