
Parsing JSON File Data -- Processing Dictionary Data in Python 3.6

程序员文章站 2024-01-12 19:37:46

Below is an entry-level JSON parsing exercise I did about two months ago, written down here for the record.

Task: convert the data in ori1.json and ori2.json (format shown in Figure 1) into the format of res1.json and res2.json (shown in Figure 2).

Figure 1

 

Figure 2

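The data is a dictionary, and converting between the two formats essentially means swapping the dictionary's keys and values. Take the first group as an example: find every key in ori1.json whose value list contains "V0", namely "K6" and "K8"; then "V0" becomes the key, and "K6" and "K8" become its value.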
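As a minimal sketch of that key/value swap, each value in each list becomes a key, and the key it came from is appended to that value's list. The function name `invert` and the small sample dictionary are illustrative assumptions, not part of the original task files.

```python
from collections import defaultdict


def invert(mapping):
    """Turn {key: [values]} into {value: [keys]} (hypothetical helper)."""
    inverted = defaultdict(list)
    for key, values in mapping.items():
        for value in values:
            # Each old value becomes a new key; the old key is appended to its list.
            inverted[value].append(key)
    return dict(inverted)


# A two-entry excerpt in the same shape as ori1.json.
sample = {"K6": ["V0"], "K8": ["V3", "V0"]}
print(invert(sample))
# {'V0': ['K6', 'K8'], 'V3': ['K8']}
```

A value that appears twice under the same key (like "V5" under "K2" in ori1.json) yields that key twice in the result, which matches what res1.json shows.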

The data in ori1.json:

{
  "K2": [
    "V5",
    "V9",
    "V1",
    "V5"
  ],
  "K5": [
    "V8"
  ],
  "K9": [
    "V7",
    "V6",
    "V3"
  ],
  "K1": [
    "V7",
    "V6",
    "V5"
  ],
  "K3": [
    "V7"
  ],
  "K6": [
    "V0"
  ],
  "K4": [
    "V9",
    "V1"
  ],
  "K8": [
    "V3",
    "V0"
  ],
  "K0": [
    "V9",
    "V3",
    "V1"
  ]
}

The data in res1.json:

{
  "V0": [
    "K6",
    "K8"
  ],
  "V1": [
    "K0",
    "K2",
    "K4"
  ],
  "V3": [
    "K0",
    "K8",
    "K9"
  ],
  "V5": [
    "K1",
    "K2",
    "K2"
  ],
  "V6": [
    "K1",
    "K9"
  ],
  "V7": [
    "K1",
    "K3",
    "K9"
  ],
  "V8": [
    "K5"
  ],
  "V9": [
    "K0",
    "K2",
    "K4"
  ]
}

Implementation:

#!/usr/bin/python3
import json


def Rad():
    # Load the original {key: [values]} data from ori1.json.
    with open('ori1.json', mode='r') as f:
        data = json.load(f)
    return data


def Wrt(data):
    # Write the result to 1.json; indent=2 puts each item on its own line.
    with open('1.json', mode='w') as f:
        json.dump(data, f, ensure_ascii=False, indent=2)


def Opt1(data):
    # Collect every value from every list, remove duplicates, and sort.
    # The sorted values become the new keys.
    values = set()
    for value_list in data.values():
        values.update(value_list)
    return sorted(values)


def Opt2(list_1, data):
    # For each new key (an old value), collect the old keys whose list
    # contains it; those old keys become the new value.
    # Iterating over items() avoids looking a key up by its value list,
    # which returns the wrong key when two keys share an identical list.
    dict1 = {}
    for new_key in list_1:
        for old_key, value_list in data.items():
            for value in value_list:
                if value == new_key:
                    dict1.setdefault(new_key, []).append(old_key)
    print(dict1)
    return dict1


def main():
    data = Rad()
    list_1 = Opt1(data)
    dict1 = Opt2(list_1, data)
    Wrt(dict1)


if __name__ == '__main__':
    main()

Output:

{
  "V0": [
    "K6",
    "K8"
  ],
  "V1": [
    "K2",
    "K4",
    "K0"
  ],
  "V3": [
    "K9",
    "K8",
    "K0"
  ],
  "V5": [
    "K2",
    "K2",
    "K1"
  ],
  "V6": [
    "K9",
    "K1"
  ],
  "V7": [
    "K9",
    "K1",
    "K3"
  ],
  "V8": [
    "K5"
  ],
  "V9": [
    "K2",
    "K4",
    "K0"
  ]
}

The data in 1.json matches res1.json exactly in format and in the number of entries. ✌
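The order of the keys inside each list does differ between the two files: "V1" maps to K2, K4, K0 in 1.json but to K0, K2, K4 in res1.json. One way to confirm that the contents still agree is to compare each list after sorting it. The sketch below assumes both files have already been loaded into dictionaries; `same_content` is a hypothetical helper name, and the two small dictionaries are excerpts of the data shown above.

```python
def same_content(a, b):
    """Return True if both dicts have the same keys and, for each key,
    the same items regardless of order (duplicates are counted)."""
    if a.keys() != b.keys():
        return False
    return all(sorted(a[k]) == sorted(b[k]) for k in a)


# Excerpts from 1.json (produced) and res1.json (expected).
produced = {"V1": ["K2", "K4", "K0"], "V5": ["K2", "K2", "K1"]}
expected = {"V1": ["K0", "K2", "K4"], "V5": ["K1", "K2", "K2"]}
print(same_content(produced, expected))  # True
```

Sorting rather than converting to a set keeps the duplicate "K2" under "V5" significant in the comparison.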


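It works, but after reading a more experienced developer's code, my own version feels needlessly cumbersome. Here is that code, clean and elegant: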

import os
import json
import numpy as np
from collections import OrderedDict


keys = ['K{}'.format(i) for i in range(10)]
vals = ['V{}'.format(i) for i in range(10)]


def create_data(keys, vals, iters = 20):
  """randomly create data formatted as follows:
  {
    K1: [
      V1, 
      V2, 
      Vn
    ], 
    K2: [
      V2, 
      V4, 
      Vn
    ],
    Kn: [
      Vi,
      Vn
    ]
  }
  """
  data = {}
  for i in range(iters):
    k_rd = np.random.choice(len(keys))
    v_rd = np.random.choice(len(vals))
    if keys[k_rd] not in data.keys():
      data[keys[k_rd]] = [vals[v_rd]]
    else:
      data[keys[k_rd]].append(vals[v_rd])
  return data
  
ori_fp = open('ori2.json', 'w')
ori_data = create_data(keys, vals)
json.dump(ori_data, ori_fp, indent = 2)
ori_fp.close()


def convert_data(ori):
  """convert data from K-V to V-K
  traverse all keys and their values, so we can map each value to the keys that contain it, for example:
  K1: [
    V1, 
    V2
  ], 
  K3: [
    V2,
    V3
  ]
  we can see that 
    V1 - K1
    V2 - K1, K3
    V3 - K3
    
  then we sort them and save them in an OrderedDict
  """
  res = {}
  for kv in ori.items():
    for vl in kv[1]:
      if vl not in res.keys():
        res[vl] = [kv[0]]
      else:
        res[vl].append(kv[0])
  return res
  
res_fp = open('res2.json', 'w')
sorted_ori = OrderedDict()
for kv in sorted(ori_data.items(), key = lambda x: x[0]):
  sorted_ori[kv[0]] = kv[1]
  
res_data = convert_data(sorted_ori)
sorted_res = OrderedDict()
for kv in sorted(res_data.items(), key = lambda x: x[0]):
  sorted_res[kv[0]] = kv[1]

json.dump(sorted_res, res_fp, indent = 2)
res_fp.close()
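The two OrderedDict loops in the listing above exist only to sort by key. As an alternative sketch (not the original author's code), the same ordering can be obtained by iterating over `sorted(ori)` and passing `sort_keys=True` to `json.dumps` or `json.dump`. The function name `convert_sorted` and the sample data are assumptions for illustration.

```python
import json


def convert_sorted(ori):
    """Invert {K: [V, ...]} into {V: [K, ...]}, visiting the original
    keys in sorted order so each resulting list is sorted too."""
    res = {}
    for key in sorted(ori):
        for val in ori[key]:
            res.setdefault(val, []).append(key)
    return res


# Hypothetical sample in the same shape as ori2.json.
ori = {"K8": ["V3", "V0"], "K6": ["V0"], "K0": ["V3"]}

# sort_keys=True orders the top-level keys (V0, V3, ...) in the output.
text = json.dumps(convert_sorted(ori), indent=2, sort_keys=True)
print(text)
```

Writing to a file works the same way with `json.dump(..., fp, indent=2, sort_keys=True)` inside a `with open(...)` block, which also closes the file automatically.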

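My takeaway is that learning Python means getting thoroughly familiar with every useful library; otherwise some operations become a real headache. The underlying idea in both versions is the same, but working through it step by step by hand is slow and tedious.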