python:xlwings模块使用
转载翻译自:https://docs.xlwings.org/en/stable/index.html#
目录
通过Python与Excel自动化交互
快速开始
建立与工作簿的连接:
import xlwings as xw
wb = xw.Book() # this will create a new workbook
wb = xw.Book('FileName.xlsx') # connect to an existing file in the current working directory
wb = xw.Book(r'C:\path\to\file.xlsx') # on Windows: use raw strings to escape backslashes
wb.visible = False
wb.save()
wb.close()
# app = xw.App(visible=False, add_book=False)
# wb = app.books.add() # this will create a new workbook
# wb = app.books.open('test.xlsx')
# wb.save()
# wb.close()
# app.quit() # 退出程序,后台进程还在运行
# app.kill() # 强制退出,并杀死EXCEL进程
如果在两个Excel实例中打开了相同的文件,则需要完全限定它并包含应用程序实例。您可以通过xw.apps.keys()找到您的应用实例**(PID):
xw.apps[10559].books['FileName.xlsx']
实例化工作表对象:
sht = wb.sheets['Sheet1']
从ranges读取/写入值:
>>> sht.range('A1').value = 'Foo 1'
>>> sht.range('A1').value
'Foo 1'
>>> sht.range('A1').value = [['Foo 1', 'Foo 2', 'Foo 3'], [10.0, 20.0, 30.0]]
>>> sht.range('A1').expand().value
[['Foo 1', 'Foo 2', 'Foo 3'], [10.0, 20.0, 30.0]]
强大的转换器可以处理大多数感兴趣的数据类型,包括双向的Numpy数组和Pandas DataFrame:
>>> import pandas as pd
>>> df = pd.DataFrame([[1,2], [3,4]], columns=['a', 'b'])
>>> sht.range('A1').value = df
>>> sht.range('A1').options(pd.DataFrame, expand='table').value
a b
0.0 1.0 2.0
1.0 3.0 4.0
Matplotlib图形可以在Excel中显示为图片:
>>> import matplotlib.pyplot as plt
>>> fig = plt.figure()
>>> plt.plot([1, 2, 3, 4, 5])
[<matplotlib.lines.Line2D at 0x1071706a0>]
>>> sht.pictures.add(fig, name='MyPlot', update=True)
<Picture 'MyPlot' in <Sheet [Workbook4]Sheet1>>
连接到工作簿,在活动工作表中读取/写入数据时,不需要工作簿对象:
>>> import xlwings as xw
>>> xw.Range('A1').value = 'something'
xw.Book提供了连接工作簿的最简单方法:如果在多个实例中打开同工作簿,它会在所有应用实例中查找该工作簿并返回错误。要在**的应用程序实例中连接到某一工作簿,请使用xw.books并引用一个特定的应用程序,请使用:
>>> app = xw.App() # or something like xw.apps[10559] for existing apps, get the available PIDs via xw.apps.keys()
>>> app.books['Book1']
- | xw.Book | xw.books |
---|---|---|
New book | xw.Book() | xw.books.add() |
Unsaved book | xw.Book(‘Book1’) | xw.books[‘Book1’] |
Book by (full)name | xw.Book(r’C:/path/to/file.xlsx’) | xw.books.open(r’C:/path/to/file.xlsx’) |
注意:
在Windows上指定文件路径时,您应该在字符串的前面使用r来使用原始字符串,或者使用双反斜杠,例如:C:\\ path \\ to \\ file.xlsx。
语法概述
xlwings对象模型与VBA使用的模型非常相似。
Active Objects
# Active app (i.e. Excel instance)
>>> app = xw.apps.active
# Active book
>>> wb = xw.books.active # in active app
>>> wb = app.books.active # in specific app
# Active sheet
>>> sht = xw.sheets.active # in active book
>>> sht = wb.sheets.active # in specific book
# Range on active sheet
>>> xw.Range('A1') # on active sheet of active book of active app
范围表示
可以使用A1表示法,Excel基于1的索引的元组,命名范围或两个Range对象实例化Range:
xw.Range('A1')
xw.Range('A1:C3')
xw.Range((1,1))
xw.Range((1,1), (3,3))
xw.Range('NamedRange')
xw.Range(xw.Range('A1'), xw.Range('B2'))
圆括号遵循Excel的行为(即从1开始的索引),而方括号则使用Python从0开始的索引/切片。例如,以下表达式都引用相同的范围:
xw.apps[763].books[0].sheets[0].range('A1')
xw.apps(10559).books(1).sheets(1).range('A1')
xw.apps[763].books['Book1'].sheets['Sheet1'].range('A1')
xw.apps(10559).books('Book1').sheets('Sheet1').range('A1')
注意,应用程序**对于您来说是不同的,因为它们是进程ID(PID)。可以通过xw.apps.keys()获取PID列表。
范围对象支持索引和切片,下面是一些示例:
>>> rng = xw.Book().sheets[0].range('A1:D5')
>>> rng[0, 0]
<Range [Workbook1]Sheet1!$A$1>
>>> rng[1]
<Range [Workbook1]Sheet1!$B$1>
>>> rng[:, 3:]
<Range [Workbook1]Sheet1!$D$1:$D$5>
>>> rng[1:3, 1:3]
<Range [Workbook1]Sheet1!$B$2:$C$3>
范围快捷表示方式:
sheet对象通过使用对象上的索引/切片符号为范围对象提供快捷方式。根据传递的是字符串还是索引/切片,这将得出sheet.range或sheet.cells:
>>> sht = xw.Book().sheets['Sheet1']
>>> sht['A1']
<Range [Book1]Sheet1!$A$1>
>>> sht['A1:B5']
<Range [Book1]Sheet1!$A$1:$B$5>
>>> sht[0, 1]
<Range [Book1]Sheet1!$B$1>
>>> sht[:10, :10]
<Range [Book1]Sheet1!$A$1:$J$10>
对象层次结构
以下显示了对象层次结构的示例,即如何从应用程序获取到范围对象并一直返回:
>>> rng = xw.apps[10559].books[0].sheets[0].range('A1')
>>> rng.sheet.book.app
<Excel App 10559>
数据结构
单元格
默认情况下,单个单元格以浮点数,unicode,None或datetime对象的形式返回,具体取决于该单元格是包含数字,字符串,为空还是表示日期:
>>> import datetime as dt
>>> sht = xw.Book().sheets[0]
>>> sht.range('A1').value = 1
>>> sht.range('A1').value
1.0
>>> sht.range('A2').value = 'Hello'
>>> sht.range('A2').value
'Hello'
>>> sht.range('A3').value is None
True
>>> sht.range('A4').value = dt.datetime(2000, 1, 1)
>>> sht.range('A4').value
datetime.datetime(2000, 1, 1, 0, 0)
列表
一维列表:
表示Excel中行或列的范围将以简单列表的形式返回,这意味着一旦它们在Python中,您就会丢失有关方向的信息。如果这是一个问题,那么下一点将展示如何保存此信息:
>>> sht = xw.Book().sheets[0]
>>> sht.range('A1').value = [[1],[2],[3],[4],[5]] # Column orientation (nested list)
>>> sht.range('A1:A5').value
[1.0, 2.0, 3.0, 4.0, 5.0]
>>> sht.range('A1').value = [1, 2, 3, 4, 5]
>>> sht.range('A1:E1').value
[1.0, 2.0, 3.0, 4.0, 5.0]
强制转换单个单元格作为列表,请使用:
>>> sht.range('A1').options(ndim=1).value
[1.0]
注意:以列方向将列表写入Excel, 使用 transpose
sht.range('A1').options(transpose=True).value = [1,2,3,4]
二维列表:
如果必须保留行或列的方向,请在“范围”选项中设置ndim。这会将范围作为嵌套列表(“ 2d列表”)返回:
>>> sht.range('A1:A5').options(ndim=2).value
[[1.0], [2.0], [3.0], [4.0], [5.0]]
>>> sht.range('A1:E1').options(ndim=2).value
[[1.0, 2.0, 3.0, 4.0, 5.0]]
二维范围自动作为嵌套列表返回。将列表分配(嵌套)列表到Excel中的范围时,仅将左上方的单元格指定为目标地址就足够了。此示例还利用索引符号将值读回到Python中:
>>> sht.range('A10').value = [['Foo 1', 'Foo 2', 'Foo 3'], [10, 20, 30]]
>>> sht.range((10,1),(11,3)).value
[['Foo 1', 'Foo 2', 'Foo 3'], [10.0, 20.0, 30.0]]
注意:
尽量减少与Excel的交互次数。进行sht.range(‘A1’).value = [[1,2],[3,4]]总是比sht.range(‘A1’).value = [1,2]和sht.range(‘A2’).value = [3,4]更有效。
范围扩展
可以通过方法expand或options方法中的expand 关键字来动态获取Excel Ranges的尺寸。当expand返回扩展的Range对象时,仅在访问Range的值时才评估选项。最好用一个例子来说明差异:
>>> sht.range('A1').value = [[1,2], [3,4]]
>>> rng1 = sht.range('A1').expand('table') # or just .expand()
>>> rng2 = sht.range('A1').options(expand='table')
>>> rng1.value
[[1.0, 2.0], [3.0, 4.0]]
>>> rng2.value
[[1.0, 2.0], [3.0, 4.0]]
>>> sht.range('A3').value = [5, 6]
>>> rng1.value
[[1.0, 2.0], [3.0, 4.0]]
>>> rng2.value
[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
‘table’扩展为’down’和’right’,其他可用选项分别可用于仅列或行扩展
注意:
将expand()与命名的Range一起用作左上角的单元格,可以在Excel中进行灵活的设置:可以在表格周围移动并更改表格的大小,而无需调整代码,例如通过使用sht.range(‘NamedRange’).expand().value。
NumPy arrays
NumPy arrays work similar to nested lists. However, empty cells are represented by nan instead of None. If you want to read in a Range as array, set convert=np.array in the options method:
>>> import numpy as np
>>> sht = xw.Book().sheets[0]
>>> sht.range('A1').value = np.eye(3)
>>> sht.range('A1').options(np.array, expand='table').value
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
```
#### Pandas DataFrames
```python
>>> sht = xw.Book().sheets[0]
>>> df = pd.DataFrame([[1.1, 2.2], [3.3, None]], columns=['one', 'two'])
>>> df
one two
0 1.1 2.2
1 3.3 NaN
>>> sht.range('A1').value = df
>>> sht.range('A1:C3').options(pd.DataFrame).value
one two
0 1.1 2.2
1 3.3 NaN
# options: work for reading and writing
>>> sht.range('A5').options(index=False).value = df
>>> sht.range('A9').options(index=False, header=False).value = df
Pandas Series
>>> import pandas as pd
>>> import numpy as np
>>> sht = xw.Book().sheets[0]
>>> s = pd.Series([1.1, 3.3, 5., np.nan, 6., 8.], name='myseries')
>>> s
0 1.1
1 3.3
2 5.0
3 NaN
4 6.0
5 8.0
Name: myseries, dtype: float64
>>> sht.range('A1').value = s
>>> sht.range('A1:B7').options(pd.Series).value
0 1.1
1 3.3
2 5.0
3 NaN
4 6.0
5 8.0
Name: myseries, dtype: float64
注意:
在将List,NumPy数组或Pandas DataFrame写入Excel时,只需指定左上角的单元格,如sht.range(‘A1’).value = np.eye(10)
Matplotlib
Using pictures.add(), it is easy to paste a Matplotlib plot as picture in Excel.
Getting started
The easiest sample boils down to:
>>> import matplotlib.pyplot as plt
>>> import xlwings as xw
>>> fig = plt.figure()
>>> plt.plot([1, 2, 3])
>>> sht = xw.Book().sheets[0]
>>> sht.pictures.add(fig, name='MyPlot', update=True)
Note
If you set update=True, you can resize and position the plot on Excel: subsequent calls to pictures.add() with the same name (‘MyPlot’) will update the picture without changing its position or size.
Full integration with Excel
Calling the above code with RunPython and binding it e.g. to a button is straightforward and works cross-platform.
However, on Windows you can make things feel even more integrated by setting up a UDF along the following lines:
@xw.func
def myplot(n):
sht = xw.Book.caller().sheets.active
fig = plt.figure()
plt.plot(range(int(n)))
sht.pictures.add(fig, name=‘MyPlot’, update=True)
return ‘Plotted with n={}’.format(n)
If you import this function and call it from cell B2, then the plot gets automatically updated when cell B1 changes:
Properties
Size, position and other properties can either be set as arguments within pictures.add(), or by manipulating the picture object that is returned, see xlwings.Picture().
For example:
>>> sht = xw.Book().sheets[0]
>>> sht.pictures.add(fig, name='MyPlot', update=True,
left=sht.range('B5').left, top=sht.range('B5').top)
or:
>>> plot = sht.pictures.add(fig, name='MyPlot', update=True)
>>> plot.height /= 2
>>> plot.width /= 2
Getting a Matplotlib figure
Here are a few examples of how you get a matplotlib figure object:
via PyPlot interface:
import matplotlib.pyplot as plt
fig = plt.figure()
plt.plot([1, 2, 3, 4, 5])
or:
import matplotlib.pyplot as plt
plt.plot([1, 2, 3, 4, 5])
fig = plt.gcf()
via object oriented interface:
from matplotlib.figure import Figure
fig = Figure(figsize=(8, 6))
ax = fig.add_subplot(111)
ax.plot([1, 2, 3, 4, 5])
via Pandas:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
ax = df.plot(kind='bar')
fig = ax.get_figure()