Understanding pd.read_parquet

程序员文章站 2022-07-14 12:20:33
  • pandas.read_parquet(path, engine: str = 'auto', columns=None, **kwargs)

Load a parquet object from the file path, returning a DataFrame.

  • Parameters

Param | Format | Meaning
path | str, path object or file-like object | Location of the parquet object to load
engine | {'auto', 'pyarrow', 'fastparquet'}, default 'auto' | With 'auto', try 'pyarrow', falling back to 'fastparquet' if 'pyarrow' is unavailable
columns | list, default None | If not None, only these columns will be read from the file
**kwargs | | Any additional kwargs are passed to the engine
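
To make the parameters concrete, here is a minimal round-trip sketch: it writes a small DataFrame to a temporary parquet file and reads back only two of its columns. The helper name `roundtrip_selected_columns`, the file name, and the sample data are made up for illustration, and the code assumes pandas plus at least one of the two engines is installed.

```python
import os
import tempfile

import pandas as pd


def roundtrip_selected_columns():
    """Write a tiny DataFrame to parquet, then read back a subset of columns.

    Hypothetical helper for illustration; raises ImportError if neither
    'pyarrow' nor 'fastparquet' is installed.
    """
    df = pd.DataFrame(
        {
            "id": [1, 2, 3],
            "name": ["a", "b", "c"],
            "score": [0.5, 0.7, 0.9],
        }
    )
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.parquet")
        # engine='auto' lets pandas choose whichever engine is available.
        df.to_parquet(path, engine="auto", index=False)
        # columns=[...] restricts the read to the listed columns only.
        return pd.read_parquet(path, engine="auto", columns=["id", "score"])
```

Calling `roundtrip_selected_columns()` should return a DataFrame that contains only the `id` and `score` columns; `name` is never loaded.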
  • pyarrow || fastparquet

One of these engines is required; install it with pip:

pip install pyarrow
pip install fastparquet
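
After installing, it can be handy to check which engine is actually importable. The sketch below mimics the fallback order described for engine='auto' (pyarrow first, then fastparquet) using only the standard library. `pick_parquet_engine` is a hypothetical helper written for this article, not a pandas function, and it only approximates what pandas does internally.

```python
import importlib.util


def pick_parquet_engine():
    """Return the first importable engine name, or None if neither is installed.

    Hypothetical helper: checks 'pyarrow' first, then 'fastparquet',
    following the order documented for engine='auto'.
    """
    for name in ("pyarrow", "fastparquet"):
        # find_spec returns None when the top-level package cannot be found.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


engine = pick_parquet_engine()
if engine is None:
    print("No parquet engine found; install pyarrow or fastparquet.")
else:
    print(f"Parquet engine available: {engine}")
```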
  • parquet

At the core is parquet, a relatively new columnar storage format.
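
To give a rough feel for what "columnar" means, the sketch below regroups row-oriented records into one list per column using plain Python. It is a conceptual illustration only: the data and the `to_columnar` helper are made up, and the real parquet file layout is considerably more involved than a dict of lists.

```python
# Row-oriented layout: all fields of one record are kept together.
rows = [
    {"id": 1, "name": "a", "score": 0.5},
    {"id": 2, "name": "b", "score": 0.7},
    {"id": 3, "name": "c", "score": 0.9},
]


def to_columnar(records):
    """Regroup a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


# Column-oriented layout: all values of one field are kept together.
column_store = to_columnar(rows)

# Reading a single column touches only that column's values, which is the
# idea behind the `columns` argument of read_parquet.
scores = column_store["score"]
```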

For details, see "Understanding parquet and its ecosystem" (《理解parquet及其生态》).