欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  IT编程

Python和Excel结合的一些应用

程序员文章站 2022-08-18 08:34:33
数据展现import xlrdimport pandas as pdimport numpy as npurl ="C:\\Users\\FCX\\Desktop\\君子供货台账 (2).xlsx"data = xlrd.open_workbook(url)sheets = ['营收指标完成情况汇总', '君子合作业务汇总表', '共管账户流水(680)', '共管账户流水(946)', 'DL20190801', 'DL20190802', 'GH20190805', 'DL2019...

数据展现

import xlrd
import pandas as pd
import numpy as np

url ="C:\\Users\\FCX\\Desktop\\君子供货台账 (2).xlsx"

data = xlrd.open_workbook(url)

sheets = ['营收指标完成情况汇总', '君子合作业务汇总表', '共管账户流水(680)', '共管账户流水(946)', 'DL20190801', 'DL20190802', 'GH20190805', 'DL20190803', 'GH20190806', 'DL20190901', 'GH20190901', 'DL20190902', 'GH20190902', '20191001年底结算完毕', '20191002', '20191101', '20191102', '2019', '2020', '发票记录', 'WpsReserved_CellImgList']

# table = data.sheet_by_name("DL20190802")

data = pd.read_excel(url, sheet_name = "DL20190802", header = 0, skiprows = [1,2], skip_footer = 50)
#参数详见  https://blog.csdn.net/brucewong0516/article/details/79096633
#skiprows省略前两行--  skiprows= 2,可以数组,30,索设第一列引列—— index_col=0

#data[['open', 'close']].apply(lambda x: x.max() - x.min(), axis=0)
#axis=0代表往跨行(down),而axis=1代表跨列(across)
data['new'] = data.apply(lambda x: x['进货单价']  * x['数量'], axis=1)
#data.insert(0,'所求',data.apply(lambda x: x['进货单价']  * x['数量'], axis=1)
print(data)
# ls = []
# i = 0
# for i in range(5):
#     ls.append(table.row_values(i))
#     i+=1
    
# print(ls)

a = table.col(1)
# 列表显示第三行
# table.row_values(1)

#a中的每个元素是cell对象,下面用str转为字符,再用split切片,再用float转为数字
# float(str(a[2]).split(":")[1])

openpyxl中的获取表名

# 获取workbook中所有的sheet
import openpyxl

wb = openpyxl.load_workbook('C:\\Users\\FCX\\Desktop\\君子行供货台账 (2).xlsx')


sheets = wb.sheetnames

print(sheets)

pd的read_excel生成的是字典

# pd.read excel
import pandas as pd
url= "C:\\Users\\FCX\\Desktop\\000000\\2019年春节期间安全访维稳工作的通知_附件_117103.xlsx"
data = pd.read_excel(url,sheet_name= [0,1])
#参数为None时,返回全部的表格,是一个表格的字典;
#当参数为list = [0,1,2,3]此类时,返回的多表格同样是字典
data

pd的列计算,用lambda

import pandas as pd

url ="C:\\Users\\FCX\\Desktop\\君子供货台账 (2).xlsx"
data = pd.read_excel(url, sheet_name = "DL20190802", header = 0, skiprows = [1,2], skip_footer = 50)
data['new'] = data.apply(lambda x: x['进货单价']  * x['数量'], axis=1)
data.to_csv("C:\\Users\\FCX\\Desktop\\000.csv")

把excel填到word

import xlrd
from docxtpl import DocxTemplate

url ="C:\\Users\\FCX\\Desktop\\2020君子业务test.xlsm"

workbook = xlrd.open_workbook(url)
sheet1_object = workbook.sheet_by_name(sheet_name="业务详情")

nrows = sheet1_object.nrows

a = sheet1_object.cell_value(rowx=1,colx=1)
b = sheet1_object.cell_value(rowx=1,colx=3)

inputs = input("请输入业务编号在第几列,如A列为第0列:")
dict = {}
# dict[b] = a
for i in range(0,nrows):
    dict[sheet1_object.cell_value(rowx=i,colx=0)] = sheet1_object.cell_value(rowx = i,colx = int(inputs))

# for j in dict.items:
#     if type(j) == "float":
#         j = "{:g}".format(j)
    
# dict
docurl = "H:\\309君子备份\\代理填写模板.docx"
savepathname = "H:\\309君子备份\\代理业务"+ inputs +".docx"

doc = DocxTemplate(docurl) #加载模板文件
doc.render(dict) #填充数据
doc.save(savepathname) #保存目标文件

把固定格式的excel中的填写数据提取出来成一个矩阵,输出结果是数据类型和内容的列表

import pandas as pd
import openpyxl
import xlrd
# from xlrd.sheet import Cell
import numpy as np
import os

file = "C:\\Users\\FCX\\Desktop\\0农课题\\沃柑种植数据采集\\"

# for each in [f,t]:
f = "50亩-100亩"
t = "100亩以上"


filenamelist50 = []
for dir1 in os.walk(file+f):
    filenamelist50.append(dir1)
excelname50 = filenamelist50[0][2] # 很明显这里是文件名 不是路径
filenamelist100 = []
for dir2 in os.walk(file+t):
    filenamelist100.append(dir2) 
excelname100 = filenamelist100[0][2]


# excelname50 = ["沃柑数据采集表- 武鸣合丛-150亩.xlsx"


for filename in excelname50:
    t1 = []
    t2 = []
    t3 = []
    t4 = []
    t5 = []
    t6 = []
    
    data = xlrd.open_workbook(file+f+"\\"+filename)
#     names = data.sheet_names()    #返回book中所有工作表的名字

    table1 = data.sheet_by_name("附表1-种植基础数据采集表")
    for i in range(5):
        t1.append(table1.row_slice(3+i, 2, 5))
        t1.append(table1.row_slice(10+i, 3, 4))
        t1.append(table1.col_slice(3, 15+2*i, 17+2*i))


    table2 = data.sheet_by_name("附表2-农资投入数据采集表")
    for j in range(5):
        t2.append(table2.col_slice(3, 3+j, 4+j))
        t2.append(table2.col_slice(3, 8+j, 9+j))
        t2.append(table2.col_slice(3, 13+2*j, 15+2*j))

    table3 = data.sheet_by_name("附表3-农技投入数据采集表")
    for k in range(5):
        t3.append(table3.col_slice(3, 3+2*k, 5+2*k))
        t3.append(table3.col_slice(3, 13+3*k, 16+3*k))

    table4 = data.sheet_by_name("附表4-产量数据采集表")
    for l in range(5):
        t4.append(table4.col_slice(3, 3+2*l, 5+2*l))


    table5 = data.sheet_by_name("附表5-价格数据采集表")
    for m in range(10):
        t5.append(table5.row_slice(4+m, 1, 16))

    table6 = data.sheet_by_name("附表6-竞品价格采集表")
    for n in range(7):
        t6.append(table6.row_slice(4+n, 0, 6))

    # arr = np.array([t1,t2,t3,t4,t5,t6])
    arr = [t1,t2,t3,t4,t5,t6]
    print(arr)

补充下walk用法

# os.walk使用说明
# import os
url = "C:\\Users\\FCX\\Desktop\\0农课题\\沃柑种植数据采集\\"
a = []
for i in os.walk(url):
    a.append(i)
    
a

输出的是路径下文件夹、路径、文件名的元组列表

本文地址:https://blog.csdn.net/zuopinde/article/details/107442567

相关标签: Python