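1. Goal
On 中财网 (/), fetch the largest shareholder's ownership percentage for a given listed stock and a given year, as shown in the figure below: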
Analyze the XHR request
Inspect the payload
The request takes three parameters. Two of them, contenttype and jzrq, are straightforward; the one that needs attention is stockid.
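As a minimal sketch of how the three parameters end up in the request, the snippet below builds the query string with the standard library only. The value `gdtj` for contenttype and the stockid `7` appear in the code later in this post; the date `2022-06-30` and the helper name `build_query` are assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_query(stockid, jzrq, contenttype="gdtj"):
    """Return the query string for the shareholder-statistics request."""
    # The three parameters seen in the payload: stockid, contenttype, jzrq.
    params = {"stockid": stockid, "contenttype": contenttype, "jzrq": jzrq}
    return urlencode(params)


# Hypothetical report date, used only to show the shape of the query.
print(build_query(7, "2022-06-30"))
```

Passing the same dictionary as `params=` to `requests.get` produces an equivalent query string, which is what the full script does.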
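Why isn't it the familiar six-digit stock code?
The page on the site that lists the stock codes looks like this:
In the source of that page, you can find the stockid that corresponds to each stock code.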
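The lookup can be sketched offline with a regular expression. The HTML fragment below is a made-up sample shaped after the `stock_clickFunc(stockid,'code')` handler that the script in this post searches for; the id/code pairs and the helper name `find_stockid` are hypothetical.

```python
import re

# Hypothetical fragment of the stock-list page source.
SAMPLE_HTML = """
<a onclick="stock_clickFunc(7,'000001')">Stock A</a>
<a onclick="stock_clickFunc(12,'000002')">Stock B</a>
"""


def find_stockid(html, code):
    """Return the internal stockid for a six-digit code, or None if absent."""
    pattern = rf"onclick=\"stock_clickFunc\((\d+),'{re.escape(code)}'\)"
    match = re.search(pattern, html)
    return match.group(1) if match else None


print(find_stockid(SAMPLE_HTML, "000002"))
```

Returning `None` for an unknown code lets the caller decide whether to skip the stock or stop.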
Convert the request to Python code
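import re

import requests

# The site root was omitted in the original post; set it before running.
BASE_URL = ''

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
    'Connection': 'keep-alive',
    'Referer': BASE_URL + '/quote.aspx?actstockid=7&actcontenttype=gdtj&client=pc&searchcode=',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36',
    'sec-ch-ua': '"Google Chrome";v="111", "Not(A:Brand";v="8", "Chromium";v="111"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
}


def getTable(stockid, jzrq):
    """Fetch the shareholder-statistics page for one stockid and report date."""
    params = {
        'stockid': stockid,
        'contenttype': 'gdtj',
        'jzrq': jzrq,
    }
    response = requests.get(BASE_URL + '/quote.aspx', params=params, headers=headers, timeout=10)
    return response.text


def reg_find(text):
    """Return the first percentage in the table, e.g. </td><td>23.67%</td><td>."""
    anss = re.findall(r'</td><td>([\d.]+)%</td><td>', text)
    if len(anss) == 0:
        raise ValueError('no shareholding percentage found in the response')
    return anss[0]


def id2stkid(uid):
    """Map a six-digit stock code to the site's internal stockid."""
    params = {
        't': '12',
    }
    response = requests.get(BASE_URL + '/stockList.aspx', params=params, headers=headers, timeout=10)
    # The handler name must be spelled with a Latin "o" (onclick);
    # the original post contained a look-alike Greek character here.
    ans = re.findall(rf"onclick=\"stock_clickFunc\((\d+),'{uid}'\)", response.text)
    if len(ans) == 0:
        raise ValueError(f'stockid not found for code {uid}')
    # findall returns a list; the request needs a single id.
    return ans[0]


if __name__ == "__main__":
    codes = ['000001', '000002', '000008']
    for i in codes:
        ncode = id2stkid(i)
        # The year is missing from this date in the original post; prepend it (YYYY-06-30).
        text = getTable(ncode, '-06-30')
        ans = reg_find(text)
        print(ans)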
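The script downloads the stock list once per code. One possible refinement, sketched here under assumptions, is to parse the list page a single time into a code-to-stockid dictionary and to convert the extracted percentage to a number. The sample strings are hypothetical stand-ins for the real responses (only the `23.67%` cell comes from the script's docstring), and `build_code_map` and `first_ratio` are illustrative names, not part of the original script.

```python
import re

# Hypothetical stand-ins for the two responses fetched by the script.
SAMPLE_LIST = (
    "<a onclick=\"stock_clickFunc(7,'000001')\">Stock A</a>"
    "<a onclick=\"stock_clickFunc(12,'000002')\">Stock B</a>"
)
SAMPLE_TABLE = "<tr><td>1</td><td>Holder A</td><td>23.67%</td><td>x</td></tr>"


def build_code_map(list_html):
    """Parse the stock-list page once into {six-digit code: stockid}."""
    pairs = re.findall(r"stock_clickFunc\((\d+),'(\d{6})'\)", list_html)
    return {code: stockid for stockid, code in pairs}


def first_ratio(table_html):
    """Return the first percentage cell as a float, or None if there is none."""
    match = re.search(r"</td><td>([\d.]+)%</td><td>", table_html)
    return float(match.group(1)) if match else None


code_map = build_code_map(SAMPLE_LIST)
print(code_map["000001"], first_ratio(SAMPLE_TABLE))
```

With a mapping like this, the main loop would need one request to the list page in total, followed by one table request per stock.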
Screenshot of the run