
Using the Baidu OCR Interface from Python to Recognize CAPTCHA Images

Date: 2024-01-27 11:42:44

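Last time I got started with OCR image recognition through the pytesseract software and its Python library, covering image loading, format conversion, and image processing. I also ran CAPTCHA recognition experiments: fetching the CAPTCHA, verifying the login, and comparing recognition results under different image-processing steps. The details are in the earlier post: "Using pytesseract from Python for CAPTCHA image recognition" (Cameback_Tang's blog, CSDN).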

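This time I use Baidu's OCR recognition interface, which is called over the internet rather than through locally installed software and its bindings. It is also free and can be called directly. Tested on the same interference-free CAPTCHAs as last time, its direct recognition rate reached 99%, far better than the 76% from pytesseract. Of course, for CAPTCHAs with added interference, the interference has to be dealt with first before the recognition rate improves.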

Source of the OCR interface used here: Text Recognition, General Scene Text Recognition (Baidu AI Open Platform)

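For plain OCR text recognition there are four main interfaces, with "general text recognition" being the primary one; see the link above for details.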

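In the code, the type parameter selects which OCR interface to call. The default used here is the "general text recognition, high accuracy, without position information" interface.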
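As a minimal sketch of how that choice might be expressed, the four `type` values that appear in the code further down can be kept in a lookup table. The helper name `build_ocr_form` and the short keys (`general`, `handwriting`, and so on) are my own labels, not part of Baidu's interface; only the path strings and form fields come from the author's code.

```python
from urllib.parse import urlencode, parse_qs

# Path strings copied from the author's code; the short keys are hypothetical labels.
OCR_TYPES = {
    'general': '/rest/2.0/ocr/v1/accurate_basic',
    'handwriting': '/rest/2.0/ocr/v1/handwriting',
    'document': '/rest/2.0/ocr/v1/doc_analysis_office',
    'webimage': '/rest/2.0/ocr/v1/webimage',
}


def build_ocr_form(img_base64, kind='general', language='ENG'):
    """Return the urlencoded form body for one recognition request."""
    if kind not in OCR_TYPES:
        raise ValueError(f'unknown OCR interface: {kind!r}')
    fields = {
        'image': f'data:image/png;base64,{img_base64}',
        'type': OCR_TYPES[kind],
        'detect_direction': 'false',
        'language_type': language,
    }
    # urlencode percent-escapes ':', '/', ';', ',' and '+' so the data URI survives the POST.
    return urlencode(fields)


if __name__ == '__main__':
    body = build_ocr_form('QUJD', kind='handwriting')
    print(parse_qs(body)['type'][0])
```

Keeping the table in one place means switching interfaces is a one-word change at the call site, and a typo in the interface name fails early instead of producing a confusing server response.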

As before, I wrote two recognition functions: one takes an image file, the other takes the image's base64 encoding.

import requests
import base64
from urllib.parse import urlencode


def get_result_by_baiduOCR(file_path):
    url = '/aidemo'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36 Edg/94.0.992.47',
        'Content-Type': 'application/x-www-form-urlencoded',
        'Host': '',
        'Origin': '',
        # For the other interfaces, see /product/ocr_general
        'Referer': '/product/ocr/general',  # general text recognition, high accuracy, no position
        # 'Referer': '/product/ocr_others/handwriting',  # handwriting interface
        # 'Referer': '/product/ocr/doc_analysis_office',  # document interface
        # 'Referer': '/product/ocr_others/webimage',  # web image interface
        # 'Connection': 'keep-alive',
        # 'Cookie': 'hadhsahjsaj',
    }
    with open(file_path, 'rb') as f:
        img_base64 = base64.b64encode(f.read())
    data = {
        'image': f'data:image/png;base64,{img_base64.decode()}',
        'image_url': 'xxxxxx',
        'type': '/rest/2.0/ocr/v1/accurate_basic',
        # 'type': '/rest/2.0/ocr/v1/handwriting',
        # 'type': '/rest/2.0/ocr/v1/doc_analysis_office',
        # 'type': '/rest/2.0/ocr/v1/webimage',
        'detect_direction': 'false',
        # 'language_type': 'CHN_ENG',
        'language_type': 'ENG',
        # 'detect_direction': False,
    }
    data = urlencode(data)
    # Send image_url as a bare key with no value
    data = data.replace('image_url=xxxxxx', 'image_url')
    html = requests.post(url, data, headers=headers)
    # print(html.text)
    # rsp = {
    #     "errno": 0,
    #     "msg": "success",
    #     "data": {
    #         "words_result": [{"words": "Pi15"}],
    #         "words_result_num": 1,
    #         "log_id": "1515968155725851265"}
    # }
    html = html.json()
    print(html)
    if html.get('errno') == 0:
        result = html.get('data').get('words_result')[0].get('words')
        result = ''.join(list(filter(str.isalnum, result)))  # keep only letters and digits
    else:
        result = ''
    return result


def get_result_by_baiduOCR_base64(img_base64):
    url = '/aidemo'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36 Edg/94.0.992.47',
        'Content-Type': 'application/x-www-form-urlencoded',
        'Host': '',
        'Origin': '',
        'Referer': '/product/ocr/general',
        # 'Connection': 'close'
        # Connections are keep-alive by default. Once the number of HTTP
        # connections exceeds the limit, the server is holding too many of
        # them and cannot open new ones.
        # 'Connection': 'keep-alive',
        # 'Cookie': '',
    }
    data = {
        'image': f'data:image/png;base64,{img_base64}',
        # 'image_url': 'xxxxxx',
        'type': '/rest/2.0/ocr/v1/accurate_basic',
        'detect_direction': 'false',
        # 'language_type': 'CHN_ENG',
        'language_type': 'ENG',
    }
    data = urlencode(data)
    html = requests.post(url, data, headers=headers)
    html = html.json()
    if html.get('errno') == 0:
        result = html.get('data').get('words_result')[0].get('words')
        result = ''.join(list(filter(str.isalnum, result)))  # keep only letters and digits
    else:
        result = ''
    return result
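Both functions index `words_result[0]` directly, which would raise an `IndexError` if the service reports success but recognizes nothing. A slightly more defensive way to read the response might look like the sketch below. It assumes only the response shape shown in the commented `rsp` example above. The function name `extract_captcha_text` is hypothetical, and unlike the author's code it joins every recognized line rather than taking only the first.

```python
def extract_captcha_text(payload):
    """Pull the alphanumeric text out of a decoded response dict.

    Returns '' for an error response or for one with no recognized words.
    """
    if not isinstance(payload, dict) or payload.get('errno') != 0:
        return ''
    words_result = (payload.get('data') or {}).get('words_result') or []
    # Join every recognized line, then keep only letters and digits.
    text = ''.join(item.get('words', '') for item in words_result)
    return ''.join(ch for ch in text if ch.isalnum())


if __name__ == '__main__':
    sample = {
        'errno': 0,
        'msg': 'success',
        'data': {'words_result': [{'words': 'Pi15'}], 'words_result_num': 1},
    }
    print(extract_captcha_text(sample))
```

Returning an empty string for every failure mode keeps the caller's logic simple: an empty result just means "fetch a new CAPTCHA and try again".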

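You may also need to convert between an image object and its base64 encoding; otherwise you would have to save the image to a file and read it back.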

import base64
from PIL import Image
from io import BytesIO


# image: a PIL image object
def image_to_base64(image, fmt='JPEG'):
    output_buffer = BytesIO()
    image.save(output_buffer, format=fmt)
    byte_data = output_buffer.getvalue()
    base64_str = base64.b64encode(byte_data).decode('utf-8')
    return base64_str


def base64_to_image(base64_str):
    byte_data = base64.b64decode(base64_str)
    image_data = BytesIO(byte_data)
    img = Image.open(image_data)
    return img
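One detail worth watching: `image_to_base64` defaults to JPEG, while the two recognition functions above always send a `data:image/png;base64,` prefix. Whether the service is sensitive to that mismatch is not something I can confirm. As a hedged sketch, a hypothetical helper `to_data_uri` could choose the prefix from the leading magic bytes of the encoded image instead of hardcoding it. It uses only the standard library and recognizes just PNG and JPEG.

```python
import base64

PNG_MAGIC = b'\x89PNG\r\n\x1a\n'   # fixed 8-byte PNG signature
JPEG_MAGIC = b'\xff\xd8\xff'       # start-of-image marker plus the next marker byte


def to_data_uri(raw_bytes):
    """Build a data URI whose MIME type matches the actual image bytes."""
    if raw_bytes.startswith(PNG_MAGIC):
        mime = 'image/png'
    elif raw_bytes.startswith(JPEG_MAGIC):
        mime = 'image/jpeg'
    else:
        raise ValueError('unrecognized image format (only PNG and JPEG are handled)')
    encoded = base64.b64encode(raw_bytes).decode('ascii')
    return f'data:{mime};base64,{encoded}'
```

With this, the `image` form field could be filled from `to_data_uri(open(file_path, 'rb').read())` regardless of which format the CAPTCHA was saved in.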

A quick aside on fetching the CAPTCHA:

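import requests
from PIL import Image

# Fetch the CAPTCHA and save the response body to an image file
# (headers is a request-header dict like the one in the functions above)
session = requests.session()
vpic_url = 'https://xxxxxxx/getVerify'
html = session.get(vpic_url, headers=headers)
with open("py016.jpeg", "wb") as f:
    f.write(html.content)
img = Image.open("py016.jpeg")

# Fetch the CAPTCHA using selenium.webdriver and Google Chrome
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

url = 'https://xxxxxx/login'
driver = webdriver.Chrome()
driver.get(url)
# find_element_by_tag_name was removed in Selenium 4.3; use find_element(By.TAG_NAME, ...)
img = driver.find_element(By.TAG_NAME, 'img')
img.screenshot('aaa.jpeg')  # the neat trick: save the element as an image file (Selenium writes PNG data)
verifyCode = get_result_by_baiduOCR_base64(
    img.screenshot_as_base64  # the neat trick: get the element directly as base64
)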

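Another aside, this time on CAPTCHA processing: I noticed that the background interference on these CAPTCHAs is remarkably uniform, and it comes in a single, fairly pale color. Trying grayscale conversion followed by black-and-white binarization gave excellent results.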

With the threshold set to 100, black-and-white binarization gave the following result:

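# Grayscale conversion and black-and-white binarization with a custom threshold
def gray_processing(img, threshold=127):
    img = img.convert('L')
    # threshold = 127  # image.convert('1')
    # threshold = 125
    lookup_table = [0 if i < threshold else 1 for i in range(256)]
    img = img.point(lookup_table, '1')
    return img


# If there are interference lines, 3x3-neighborhood denoising can also be used
# (run it twice if once is not enough), followed by dilation and erosion.
# 3x3-neighborhood removal of noise points
def denoise(image, pixel_node):
    width, height = image.size  # PIL reports size as (width, height)
    noise_pos = []
    for i in range(1, width - 1):
        for j in range(1, height - 1):
            pixel_around = 0  # black pixels in the 3x3 block centered on (i, j)
            for m in range(i - 1, i + 2):
                for n in range(j - 1, j + 2):
                    # Black is 0; white reads back as a non-zero value
                    if image.getpixel((m, n)) == 0:
                        pixel_around += 1
            if pixel_around <= pixel_node:
                noise_pos.append((i, j))
    # Whiten the noise points only after the scan, so counts use the original image
    for pos in noise_pos:
        image.putpixel(pos, 1)
    return image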
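To make the neighbor-count rule easier to check by hand, here is the same idea on a plain list-of-lists grid, with no Pillow dependency. This is a sketch under my own conventions: `denoise_grid` and `max_black` are hypothetical names, 0 means black and 1 means white, and the function returns a new grid instead of modifying its input.

```python
def denoise_grid(grid, max_black):
    """Whiten interior cells whose 3x3 block holds at most max_black black cells.

    grid is a list of rows; 0 is black, 1 is white. Border cells are left alone.
    """
    height = len(grid)
    width = len(grid[0]) if grid else 0
    result = [row[:] for row in grid]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Count black cells in the 3x3 block centered on (x, y), center included.
            black = sum(
                1
                for ny in range(y - 1, y + 2)
                for nx in range(x - 1, x + 2)
                if grid[ny][nx] == 0
            )
            if black <= max_black:
                result[y][x] = 1
    return result


if __name__ == '__main__':
    speck = [[1] * 5 for _ in range(5)]
    speck[2][2] = 0  # a single isolated black pixel
    for row in denoise_grid(speck, max_black=1):
        print(row)
```

A solid 3x3 black block survives `max_black=3`, because even its corner cells see four black cells in their neighborhood, while an isolated speck or a one-pixel-wide line does not. The same threshold would also thin very fine character strokes, so the value is worth tuning per CAPTCHA style.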

That is the end of the asides; this covers roughly everything.
