Use the read_json, read_jsonline, read_file, write_json, write_jsonline, and write_file functions from the pysenal package to read and write JSON files, JSON Lines files, and plain-text files.
import os
from pysenal import read_json, read_file, read_jsonline

print(os.getcwd())
data1 = read_json('data/shiv.json')
data2 = read_jsonline('data/test.jsonl')
data3 = read_file('data/1.txt')
print(data1[1][0].keys())
print(data2[1].keys())
print(data3)

from pysenal import write_json, write_jsonline, write_file

write_json('data/shiv1.json', data1)
write_jsonline('data/test1.jsonl', data2)
write_file('data/2.txt', data3)
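For readers unfamiliar with the .jsonl format: a JSON Lines file stores one JSON value per line. The sketch below shows the same round trip using only the standard library. It illustrates the file format and is not pysenal's actual implementation; the sample records and the temporary path are made up for the example.

```python
import json
import os
import tempfile

# Hypothetical sample records, only for illustration.
records = [{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.jsonl")

    # Write: one JSON object per line.
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

    # Read: parse every non-empty line separately.
    with open(path, encoding="utf-8") as f:
        loaded = [json.loads(line) for line in f if line.strip()]

print(loaded[1].keys())
```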
Use a pandas DataFrame to open a CSV file:
import pandas as pd

data = pd.read_csv("data/1.csv")
print(type(data))
print(data.keys())
print(data['synonyms'][1])

data = pd.read_table("data/1.csv", sep=",")
print(type(data))
A DataFrame behaves much like a dictionary: the first index is the column name and the second index is the row number.
     name   class  max_speed  num_legs
0  falcon    bird      389.0         2
1  parrot    bird       24.0         2
2    lion  mammal       80.5         4
3  monkey  mammal        NaN         4
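As a minimal sketch of the "column first, then row" indexing described above, the following builds the same table by hand (the original presumably loads it from a file, so the constructor call here is an assumption) and reads a single cell in two ways.

```python
import numpy as np
import pandas as pd

# Rebuild the example table by hand.
df = pd.DataFrame({
    "name": ["falcon", "parrot", "lion", "monkey"],
    "class": ["bird", "bird", "mammal", "mammal"],
    "max_speed": [389.0, 24.0, 80.5, np.nan],
    "num_legs": [2, 2, 4, 4],
})

column = df["name"]            # first index: the column, returned as a Series
cell = df["name"][2]           # second index: the row label within that column
cell_loc = df.loc[2, "name"]   # same cell via .loc, which takes the row label first

print(cell, cell_loc)
```

Note that .loc reverses the order (row label, then column name), whereas the dictionary-style access goes through the column first.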
Adding a row or a column:
df.loc['new_raw'] = ['yasuo', 'monkey', 1, 3]
>>>
           name   class  max_speed  num_legs
0        falcon    bird      389.0         2
1        parrot    bird       24.0         2
2          lion  mammal       80.5         4
3        monkey  mammal        NaN         4
new_raw   yasuo  monkey        1.0         3

df['attack'] = [5, 6, 7, 8, 9]
>>>
           name   class  max_speed  num_legs  attack
0        falcon    bird      389.0         2       5
1        parrot    bird       24.0         2       6
2          lion  mammal       80.5         4       7
3        monkey  mammal        NaN         4       8
new_raw   yasuo  monkey        1.0         3       9
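One detail worth spelling out: a new row needs one value per column, and a new column needs one value per row. The sketch below uses a smaller, made-up frame to show both operations, plus what happens when the lengths do not match. The "defense" column is hypothetical and only there to trigger the error.

```python
import pandas as pd

# A smaller, hypothetical frame to keep the example short.
df = pd.DataFrame({"name": ["falcon", "parrot"], "num_legs": [2, 2]})

# New row: one value per existing column (2 columns -> 2 values).
df.loc["new_raw"] = ["yasuo", 3]

# New column: one value per existing row (3 rows -> 3 values).
df["attack"] = [5, 6, 7]

# A list of the wrong length is rejected with a ValueError.
error = None
try:
    df["defense"] = [1, 2]
except ValueError as exc:
    error = str(exc)

print(df)
print(error)
```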
Deleting a row or a column:
df = df.drop([1])
df = df.drop(['name'], axis=1)  # dropping a column requires the axis=1 argument
>>>
    class  max_speed  num_legs
0    bird      389.0         2
2  mammal       80.5         4
3  mammal        NaN         4
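The same deletion can also be written with the index= and columns= keyword arguments of drop, which some readers find easier to read than axis=1. This is a sketch on a hand-built copy of the table. It also shows that drop returns a new DataFrame and leaves the original untouched unless the result is assigned back.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["falcon", "parrot", "lion", "monkey"],
    "class": ["bird", "bird", "mammal", "mammal"],
    "max_speed": [389.0, 24.0, 80.5, np.nan],
    "num_legs": [2, 2, 4, 4],
})

# drop returns a new frame; df itself still has 4 rows and 4 columns.
trimmed = df.drop(index=[1]).drop(columns=["name"])

print(trimmed)
```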
To change the value of a single cell, or of a whole region, of a DataFrame:
df.loc[0:1, 'name'] = ['modeganqing', 'xxx']
>>>
          name   class  max_speed  num_legs
0  modeganqing    bird      389.0         2
1          xxx    bird       24.0         2
2         lion  mammal       80.5         4
3       monkey  mammal        NaN         4

df['name'][1] = 'modeganqing'  # chained assignment; this approach is discouraged
>>>
          name   class  max_speed  num_legs
0       falcon    bird      389.0         2
1  modeganqing    bird       24.0         2
2         lion  mammal       80.5         4
3       monkey  mammal        NaN         4
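The reason the second form is discouraged is that df['name'][1] = ... first selects a column and then assigns into that intermediate object. Depending on the pandas version, this emits a warning or does not update df at all (with Copy-on-Write enabled, chained assignment never modifies the original). A sketch of the single-step alternatives, on a hand-built copy of the table:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["falcon", "parrot", "lion", "monkey"],
    "class": ["bird", "bird", "mammal", "mammal"],
    "max_speed": [389.0, 24.0, 80.5, np.nan],
    "num_legs": [2, 2, 4, 4],
})

# One cell: row label and column name in a single .loc call.
df.loc[1, "name"] = "modeganqing"

# One cell, scalar access: .at does the same for a single value.
df.at[3, "max_speed"] = 30.0

# A region: label slices in .loc include both endpoints (rows 0 and 1 here).
df.loc[0:1, "num_legs"] = [3, 3]

print(df)
```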
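A DataFrame can be converted to many other formats. For example, once it is converted to a NumPy array, you can read, add, modify, and delete entries using a pair of integer indices.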
dn = df.to_numpy()
>>>
[['falcon' 'bird' 389.0 2]
 ['parrot' 'bird' 24.0 2]
 ['lion' 'mammal' 80.5 4]
 ['monkey' 'mammal' nan 4]]
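A short sketch of what "two integer indices" looks like in practice, on a hand-built copy of the table. Because the columns have mixed types, the resulting array has dtype object. The specific values written and deleted here are arbitrary choices for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["falcon", "parrot", "lion", "monkey"],
    "class": ["bird", "bird", "mammal", "mammal"],
    "max_speed": [389.0, 24.0, 80.5, np.nan],
    "num_legs": [2, 2, 4, 4],
})

dn = df.to_numpy()

first_name = dn[0, 0]        # read: row 0, column 0
dn[2, 3] = 6                 # modify: row 2, column 3 (num_legs of the lion)
without_parrot = np.delete(dn, 1, axis=0)   # delete row 1, returns a new array

print(dn.dtype, dn.shape)
print(without_parrot)
```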
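Converting to a dictionary: the column names become the keys, each value is a nested dictionary holding all the values of that column, and the row labels are the keys of the nested dictionary: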
dn = df.to_dict()
>>>
{'name': {0: 'falcon', 1: 'parrot', 2: 'lion', 3: 'monkey'},
 'class': {0: 'bird', 1: 'bird', 2: 'mammal', 3: 'mammal'},
 'max_speed': {0: 389.0, 1: 24.0, 2: 80.5, 3: nan},
 'num_legs': {0: 2, 1: 2, 2: 4, 3: 4}}
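The nested layout shown above is the default. to_dict also accepts an orient argument for other layouts. The sketch below compares the default with orient="list" and orient="records" on a hand-built copy of the table. Which layout is preferable depends on how the dictionary will be consumed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["falcon", "parrot", "lion", "monkey"],
    "class": ["bird", "bird", "mammal", "mammal"],
    "max_speed": [389.0, 24.0, 80.5, np.nan],
    "num_legs": [2, 2, 4, 4],
})

by_column = df.to_dict()                 # {column: {row label: value}}
as_lists = df.to_dict(orient="list")     # {column: [values in row order]}
as_records = df.to_dict(orient="records")  # [{column: value}, ...], one dict per row

print(by_column["name"][2])
print(as_lists["num_legs"])
print(as_records[0])
```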
A DataFrame can be saved to CSV, Excel, HTML, JSON, and other file formats:
df.to_csv('data/df.csv')
df.to_excel('data/df.xlsx')
df.to_html('data/df.html')
df.to_json('data/df.json')
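One thing to watch when saving to CSV: by default to_csv also writes the row index as an unnamed first column, which reappears as "Unnamed: 0" when the file is read back. The sketch below shows both variants. It uses a made-up two-column frame and a temporary directory instead of the data/ folder used above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical small frame for the round trip.
df = pd.DataFrame({"name": ["falcon", "parrot"], "num_legs": [2, 2]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "df.csv")

    # index=False: only the data columns are written.
    df.to_csv(path, index=False)
    restored = pd.read_csv(path)

    # Default: the row index is written as an extra, unnamed column.
    df.to_csv(path)
    with_index = pd.read_csv(path)

print(restored.columns.tolist())
print(with_index.columns.tolist())
```

If the index column is wanted, passing index_col=0 to read_csv restores it as the index rather than as a data column.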