Understanding pd.read_parquet
程序员文章站
2022-07-14 12:20:33
...
Load a parquet object from the file path, returning a DataFrame.
Param | Type | Meaning |
---|---|---|
path | str, path object or file-like object | |
engine | {‘auto’, ‘pyarrow’, ‘fastparquet’}, default ‘auto’ | With ‘auto’, try ‘pyarrow’, falling back to ‘fastparquet’ if ‘pyarrow’ is unavailable |
columns | list, default = None | If not None, only these columns will be read from the file |
**kwargs | | Any additional kwargs are passed to the engine |
Reading Parquet requires an engine, which can be installed with pip:
pip install pyarrow
pip install fastparquet
At the core is Parquet, a relatively new columnar storage format.
For details, see "Understanding Parquet and Its Ecosystem".