欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页  >  IT编程

Python使用filetype精确判断文件类型

程序员文章站 2024-03-01 16:14:04
filetype.py Small and dependency free Python package to infer file type and MIME type...

filetype.py

Small and dependency free Python package to infer file type and MIME type checking the  magic numbers signature of a file or buffer.

This is a Python port from filetype Go package. Works in Python  +3 .

一个小巧*开放Python开发包,主要用来获得文件类型。包要求Python 3.+

功能特色

•简单友好的API
•支持宽范围文件类型
•提供文件扩展名和MIME类型判断
•文件的MIME类型扩展新增
•通过文件(图像、视频、音频…)简单分析
•可插拔:添加新的自定义类型的匹配
•快,即使处理大文件
•只需要前261个字节表示的最大文件头,这样你就可以通过一个单字节
•依赖*(只是Python代码,没有C的扩展,没有libmagic绑定)
•跨平台文件识别

安装

pip install filetype

API

详情请查看 annotated API reference .

实例

简单的文件类型识别

import filetype
 
def main():
 kind = filetype.guess('tests/fixtures/sample.jpg')
 if kind is None:
  print('Cannot guess file type!')
  return
 
 print('File extension: %s' % kind.extension)
 print('File MIME type: %s' % kind.mime)
 
if __name__ == '__main__':
 main()

支持类型

图片

• jpg  –  image/jpeg
• png  –  image/png
• gif  –  image/gif
• webp  –  image/webp
• cr2  –  image/x-canon-cr2
• tif  –  image/tiff
• bmp  –  image/bmp
• jxr  –  image/vnd.ms-photo
• psd  –  image/vnd.adobe.photoshop
• ico  –  image/x-icon

视频

• mp4  –  video/mp4
• m4v  –  video/x-m4v
• mkv  –  video/x-matroska
• webm  –  video/webm
• mov  –  video/quicktime
• avi  –  video/x-msvideo
• wmv  –  video/x-ms-wmv
• mpg  –  video/mpeg
• flv  –  video/x-flv

音频

• mid  –  audio/midi
• mp3  –  audio/mpeg
• m4a  –  audio/m4a
• ogg  –  audio/ogg
• flac  –  audio/x-flac
• wav  –  audio/x-wav
• amr  –  audio/amr

资料库

• epub  –  application/epub+zip
• zip  –  application/zip
• tar  –  application/x-tar
• rar  –  application/x-rar-compressed
• gz  –  application/gzip
• bz2  –  application/x-bzip2
• 7z  –  application/x-7z-compressed
• xz  –  application/x-xz
• pdf  –  application/pdf
• exe  –  application/x-msdownload
• swf  –  application/x-shockwave-flash
• rtf  –  application/rtf
• eot  –  application/octet-stream
• ps  –  application/postscript
• sqlite  –  application/x-sqlite3
• nes  –  application/x-nintendo-nes-rom
• crx  –  application/x-google-chrome-extension
• cab  –  application/vnd.ms-cab-compressed
• deb  –  application/x-deb
• ar  –  application/x-unix-archive
• Z  –  application/x-compress
• lz  –  application/x-lzip

字体

• woff  –  application/font-woff
• woff2  –  application/font-woff
• ttf  –  application/font-sfnt
• otf  –  application/font-sfnt

基准测试

使用链接中的文件进行测试,你可以点击获得到它: real files .

Environment: OSX x64 i7 2.7 Ghz
------------------------------------------------------------------------------------------ benchmark: 7 tests ------------------------------------------------------------------------------------------
Name (time in ns)                       Min                     Max                   Mean                StdDev                 Median                   IQR            Outliers(*)  Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_infer_image_from_bytes        357.6279 (1.0)       29,166.5395 (1.0)       1,642.3360 (1.0)        380.9934 (1.0)       1,509.9843 (1.0)        158.9457 (1.0)       9095;13752  102301           6
test_infer_audio_from_bytes        953.6743 (2.67)      96,082.6874 (3.29)     16,534.5880 (10.07)    3,002.1143 (7.88)     15,974.0448 (10.58)      953.6743 (6.00)       4514;6051   41528           1
test_infer_video_from_bytes     13,828.2776 (38.67)    272,989.2731 (9.36)     16,151.3144 (9.83)     3,361.2320 (8.82)     15,020.3705 (9.95)       953.6743 (6.00)       2522;2887   22193           1
test_infer_image_from_disk      15,974.0448 (44.67)    108,957.2906 (3.74)     18,621.0844 (11.34)    3,895.4441 (10.22)    17,166.1377 (11.37)    1,192.0929 (7.50)       1528;1804   10206           1
test_infer_video_from_disk      23,841.8579 (66.67)    229,120.2545 (7.86)     28,691.3476 (17.47)    6,242.9901 (16.39)    25,987.6251 (17.21)    4,053.1158 (25.50)      1987;1247   15651           1
test_infer_zip_from_disk        26,941.2994 (75.33)    230,073.9288 (7.89)     32,123.3861 (19.56)    7,524.4988 (19.75)    29,087.0667 (19.26)    4,768.3716 (30.00)      1349;1292   16132           1
test_infer_tar_from_disk        33,855.4382 (94.67)    164,031.9824 (5.62)     36,884.4401 (22.46)    4,489.4443 (11.78)    36,001.2054 (23.84)      953.6743 (6.00)       1036;1828   14666           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------