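Scatter plots are well suited to showing how several attributes are distributed across a large number of samples. Each point represents one sample, and each coordinate axis represents one attribute.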
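# Note: this example uses the pyecharts 0.5.x API (top-level Scatter import, keyword-based add()).
from pyecharts import Scatter
import pandas as pd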
dfboy=pd.DataFrame()
dfboy['weight'] = [56,67,65,70,57,60,80,85,76,64]
dfboy['height'] = [162,170,168,172,168,172,180,176,178,170]
dfgirl=pd.DataFrame()
dfgirl['weight'] = [50,62,60,70,57,45,62,65,70,56]
dfgirl['height'] = [155,162,165,170,166,158,160,170,172,165]
scatter = Scatter(title="Physique data", width=600, height=420)
scatter.add(name="boy", x_axis=dfboy['weight'], y_axis=dfboy['height'])
# Fix the visible range of both axes so the two series share one frame
scatter.add(name="girl", x_axis=dfgirl['weight'], y_axis=dfgirl['height'],
            yaxis_min=130, yaxis_max=200, xaxis_min=30, xaxis_max=100)
scatter.render("scatter_demo.html")
scatter  # displays the chart inline when run in a Jupyter notebook
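To make "one point per sample, one axis per attribute" concrete, the sketch below pairs the two attribute columns into `[x, y]` points and derives rounded axis limits from the data. It uses only the standard library. The helper `axis_limits` and its `step` and `margin` parameters are hypothetical names introduced here for illustration; they are not part of pyecharts.

```python
import math

# Sample data copied from the boys' series in the example above.
weights = [56, 67, 65, 70, 57, 60, 80, 85, 76, 64]
heights = [162, 170, 168, 172, 168, 172, 180, 176, 178, 170]

# One [x, y] pair per sample: x is weight, y is height.
points = [[w, h] for w, h in zip(weights, heights)]


def axis_limits(values, step=10, margin=1):
    """Hypothetical helper: round the data range outward to a multiple
    of `step`, then add `margin` extra steps on each side."""
    low = math.floor(min(values) / step) * step - margin * step
    high = math.ceil(max(values) / step) * step + margin * step
    return low, high


x_min, x_max = axis_limits(weights)
y_min, y_max = axis_limits(heights)
print(points[:3])
print((x_min, x_max), (y_min, y_max))
```

Limits computed this way could be passed as `xaxis_min`, `xaxis_max`, `yaxis_min`, and `yaxis_max` instead of hard-coded numbers, which keeps the frame sensible if the data changes.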
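When samples have more than two attributes, a scatter plot can encode the extra dimensions through the color or size of each point. The demonstration below uses point size for the third dimension.
In this example, the size of each point represents the population of a country.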
from pyecharts import Scatter
import pandas as pd
def custom_formatter(params):
    # params.value holds [x, y, extra_data, extra_name] for the hovered point
    return (params.value[3] + ':' +
            str(params.value[0]) + ',' +
            str(params.value[1]) + ',' +
            str(params.value[2]))
df = pd.DataFrame()
df['country'] = ["China", "USA", "Germany", "France", "UK", "Japan", "Russia", "India", "Australia", "Canada"]
df['life-expectancy'] = [76.9, 79.1, 81.1, 81.9, 81.4, 83.5, 73.13, 66.8, 81.8, 81.7]
df['capita-gdp'] = [13334, 53354, 44053, 37599, 38225, 36162, 23038, 5903, 44056, 43294]
df['population'] = [1376048943, 321773631, 80688545, 64395345, 64715810, 126573481, 143456918,
                    1311050527, 23968973, 35939927]
scatter = Scatter(title="Development level by country", width=600, height=420)
scatter.add(name='',
            x_axis=df['capita-gdp'],                      # params.value[0]
            y_axis=df['life-expectancy'],                 # params.value[1]
            extra_data=df['population'].values.tolist(),  # params.value[2]
            extra_name=df['country'].values.tolist(),     # params.value[3]
            tooltip_formatter=custom_formatter,  # custom tooltip content
            is_visualmap=True,
            visual_orient="horizontal",
            visual_type='size',    # can be 'size' or 'color'
            visual_dimension=2,    # index into params.value: the population
            visual_range=[20000000, 1500000000],  # data range covered by the visual map
            )
scatter.render("extended_scatter_demo.html")
scatter  # displays the chart inline when run in a Jupyter notebook
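The sketch below mirrors, in plain Python, what the example relies on: each point carries a value row `[x, y, extra_data, extra_name]`, the tooltip is built from that row, and the third entry drives the point size. It assumes the size mapping is linear and clamped to the visual range. The functions `format_tooltip` and `symbol_size` and the pixel range `(10, 50)` are hypothetical choices made here for illustration, not values taken from pyecharts.

```python
# First three samples from the example above (country names in English).
gdp = [13334, 53354, 44053]
life = [76.9, 79.1, 81.1]
population = [1376048943, 321773631, 80688545]
country = ["China", "USA", "Germany"]

# One value row per point: [x, y, extra_data, extra_name].
rows = [list(r) for r in zip(gdp, life, population, country)]


def format_tooltip(value):
    """Same layout as custom_formatter: name, then the three numbers."""
    return value[3] + ":" + ",".join(str(v) for v in value[:3])


def symbol_size(x, data_range=(20_000_000, 1_500_000_000), size_range=(10, 50)):
    """Assumed linear mapping from a data value to a point size in pixels.
    Values outside data_range are clamped to its ends."""
    lo, hi = data_range
    s_lo, s_hi = size_range
    x = min(max(x, lo), hi)
    t = (x - lo) / (hi - lo)
    return s_lo + t * (s_hi - s_lo)


for row in rows:
    print(format_tooltip(row), round(symbol_size(row[2]), 1))
```

Setting `visual_dimension=2` in the chart corresponds to reading `row[2]` here, and `visual_range` plays the role of `data_range`.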