
Using the Python pipe module

程序员文章站 2022-07-07 07:59:16

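pipe is not part of Python's standard library. If you have easy_install installed, you can install it directly; otherwise you need to download it yourself: https://pypi.python.org/pypi/pipe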

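This library is worth introducing because it shows a novel way of using iterators and generators: streams. pipe treats iterable data as a stream and, much like a Linux shell, uses '|' to pass the stream along. It defines a set of "stream-processing" functions that accept and process a stream, then either emit a new stream or reduce the stream to a single result. Let's look at some examples.

The first one is very simple: summing with add: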

[python]
>>> from pipe import *
>>> range(5) | add
10

Summing only the even numbers requires where, which works like the built-in filter and keeps only the elements that satisfy a condition:

[python]
>>> range(5) | where(lambda x: x % 2 == 0) | add
6

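Remember the Fibonacci generator we defined? Summing all even numbers below 10000 in the sequence requires take_while, which behaves much like itertools.takewhile: it keeps taking elements until the condition no longer holds: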

def fibonacci():
    # Infinite generator: 1, 1, 2, 3, 5, 8, ...
    a = b = 1
    yield a
    yield b
    while True:
        a, b = b, a + b
        yield b

[python]
>>> fib = fibonacci
>>> (fib() | where(lambda x: x % 2 == 0)
...        | take_while(lambda x: x < 10000)
...        | add)
3382
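For comparison, the same computation can be written with the standard library alone. This is only a sketch for readers who do not have pipe installed; the names evens, below_limit, and even_sum are illustrative and do not come from the original article.

```python
from itertools import takewhile


def fibonacci():
    """Yield 1, 1, 2, 3, 5, ... forever."""
    a = b = 1
    yield a
    yield b
    while True:
        a, b = b, a + b
        yield b


# Lazily keep only the even Fibonacci numbers (the role of `where`).
evens = (x for x in fibonacci() if x % 2 == 0)
# Stop at the first value that is not below 10000 (the role of `take_while`).
below_limit = takewhile(lambda x: x < 10000, evens)
# Reduce the remaining stream to one number (the role of `add`).
even_sum = sum(below_limit)
print(even_sum)  # 3382
```

The nesting reads inside-out here, whereas the pipe version reads left to right in the order the data flows.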


To apply a function to every element, use select, which works like the built-in map. To collect the result into a list, use as_list:

[python]
>>> fib() | select(lambda x: x ** 2) | take_while(lambda x: x < 100) | as_list
[1, 1, 4, 9, 25, 64]


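pipe includes many more stream-processing functions. You can even define your own: write a generator function and apply the Pipe decorator to it. The following defines a stream-processing function that takes elements until the index no longer satisfies the predicate: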

[python]
>>> @Pipe
... def take_while_idx(iterable, predicate):
...     for idx, x in enumerate(iterable):
...         if predicate(idx): yield x
...         else: return
...

Use this stream-processing function to get the first 10 numbers from fib:

[python]
>>> fib() | take_while_idx(lambda x: x < 10) | as_list
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]


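The remaining functions are not covered here. You can read pipe's source file: it is under 600 lines in total, about 300 of which are documentation containing plenty of examples.

pipe is very simple to implement: the Pipe decorator wraps an ordinary generator function (or any function that returns an iterator) in an instance of a plain class that implements the __ror__ method. Simple as it is, the idea is genuinely interesting.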
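The __ror__ mechanism can be sketched in a few lines. The class below, MiniPipe, is a hypothetical stand-in written for illustration; it is not the library's actual source, and the where and add defined here are local re-implementations rather than imports from pipe.

```python
class MiniPipe:
    """Hypothetical minimal stand-in for pipe's decorator class."""

    def __init__(self, function):
        self.function = function

    def __ror__(self, other):
        # Python calls this for `other | self` when `other` has no
        # usable `|` of its own (ranges and generators do not).
        return self.function(other)

    def __call__(self, *args, **kwargs):
        # Bind the extra arguments now; the iterable arrives later via `|`.
        return MiniPipe(
            lambda iterable: self.function(iterable, *args, **kwargs)
        )


@MiniPipe
def where(iterable, predicate):
    # Lazily keep the elements that satisfy the predicate.
    return (x for x in iterable if predicate(x))


@MiniPipe
def add(iterable):
    # Reduce the stream to the sum of its elements.
    return sum(iterable)


even_sum = range(5) | where(lambda x: x % 2 == 0) | add
print(even_sum)  # 6
```

Because the left operand is a plain iterable, Python falls back to the right operand's __ror__, which is what lets any iterable appear at the head of the chain without being wrapped first.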

               

               

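An interview question:

Read a file, count how many times each word appears in it, and sort the words by count in descending order.

On its own this is a fairly plain question, but it becomes interesting when combined with the pipe library just introduced. This kind of data-stream processing suits pipe very well. After spending a little time on it, I wrote the following code: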


               

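#coding=utf-8
from re import split
from pipe import *

# The path is shown as it appears in the source; point it at the file to analyze.
with open(r'c:usersadministratordesktop.py') as f:
    print(f.read()
        # Split the text on runs of non-word characters.
        | Pipe(lambda x: split(r'\W+', x))
        # Drop empty strings left over by the split.
        | Pipe(lambda x: (i for i in x if i.strip()))
        # Group identical words together.
        | groupby(lambda x: x)
        # Turn each group into a (word, count) pair.
        | select(lambda x: (x[0], (x[1] | count)))
        # Most frequent words first.
        | sort(key=lambda x: x[1], reverse=True)
        )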

Output:

               

              [('request', 91), ('post', 81), ('and', 38), ('u', 36), ('if', 33), ('in', 32), ('team', 29), ('line', 23), ('objects', 20), ('gcmgroups', 16), ('get', 14), ('import', 14), ('save', 13), ('str', 12), ('0', 11), ('1', 11), ('i', 11), ('false', 10), ('gcwgroups', 9), ('from', 9), ('group_name', 9), ('path', 9), ('team_groups', 9), ('add', 8), ('else', 8), ('extra_context', 8), ('form2', 8), ('return', 8), ('area', 7), ('baoming', 7), ('cname', 7), ('cname1', 7), ('cname2', 7), ('form1', 7), ('mysql_cur', 7), ('8', 6), ('gender', 6), ('is_del', 6), ('time', 6), ('user', 6), ('20', 5), ('7', 5), ('def', 5), ('depth', 5), ('for', 5), ('gcwteam', 5), ('radio1', 5), ('13', 4), ('16', 4), ('2', 4), ('2013', 4), ('5', 4), ('gb2312', 4), ('gcwmember', 4), ('gcwmemberform', 4), ('gcwteam', 4), ('gcwteamform', 4), ('httpresponseredirect', 4), ('age', 4), ('append', 4), ('area1', 4), ('cad_id', 4), ('csv', 4), ('django', 4), ('email', 4), ('encode', 4), ('fax', 4), ('gr_name', 4), ('lines', 4), ('name', 4), ('ob', 4), ('phone', 4), ('qq', 4), ('response', 4), ('status', 4), ('team_user', 4), ('template_name', 4), ('116', 3), ('12', 3), ('4', 3), ('requestcontext', 3), ('true', 3), ('a', 3), ('areas', 3), ('cname3', 3), ('community', 3), ('create', 3), ('csa', 3), ('diyi', 3), ('filter', 3), ('gcmmember', 3), ('gcw', 3), ('hd_cont', 3), ('id', 3), ('list', 3), ('mysql_db', 3), ('pp', 3), ('radio2', 3), ('radio3', 3), ('radio4', 3), ('radio9', 3), ('render_to_response', 3), ('result', 3), ('shiyun', 3), ('sys', 3), ('t_id', 3), ('textfield10', 3), ('textfield11', 3), ('textfield12', 3), ('textfield13', 3), ('textfield14', 3), ('textfield15', 3), ('textfield16', 3), ('textfield5', 3), ('textfield6', 3), ('textfield7', 3), ('textfield8', 3), ('textfield9', 3), ('title', 3), ('topic', 3), ('writers', 3), ('3', 2), ('50', 2), ('from', 2), ('http404', 2), ('httpresponse', 2), ('mysqldb', 2), ('select', 2), ('where', 2), ('all', 2), ('area2', 2), ('area3', 2), 
('baoming_user', 2), ('close', 2), ('commit', 2), ('context_instance', 2), ('cut_pages', 2), ('diqu', 2), ('except', 2), ('execute', 2), ('ftp', 2), ('ftp_status', 2), ('gcw_baoming_list', 2), ('gcw_team', 2), ('get_full_area', 2), ('group_community', 2), ('group_farmer', 2), ('group_org', 2), ('group_other', 2), ('group_pupils', 2), ('group_students', 2), ('group_tertiary', 2), ('group_troops', 2), ('is_valid', 2), ('len', 2), ('login_required', 2), ('models', 2), ('not', 2), ('page', 2), ('pk', 2), ('recommend_type', 2), ('resu', 2), ('root', 2), ('select_sql', 2), ('select_sql_mem', 2), ('set_gcw_ftpd', 2), ('st2', 2), ('todo', 2), ('try', 2), ('url', 2), ('username', 2), ('utf', 2), ('10', 1), ('11', 1), ('168', 1), ('17', 1), ('18', 1), ('192', 1), ('210', 1), ('9', 1), ('content', 1), ('disposition', 1), ('e', 1), ('qq', 1), ('arraysize', 1), ('attachment', 1), ('auth', 1), ('baoshaowei', 1), ('break', 1), ('charset', 1), ('cleaned_data', 1), ('coding', 1), ('connect', 1), ('contrib', 1), ('cursor', 1), ('d', 1), ('datetime', 1), ('db', 1), ('decorators', 1), ('en', 1), ('excel', 1), ('extend', 1), ('fetchall', 1), ('fetchmany', 1), ('filename', 1), ('forms', 1), ('ftpd', 1), ('gcw130', 1), ('gcw_baoming', 1), ('gcw_baoming_csv', 1), ('gcw_shipin_status', 1), ('gcwteam_set', 1), ('get_object_or_404', 1), ('hbl', 1), ('hbl_cassi', 1), ('host', 1), ('html', 1), ('http', 1), ('insert', 1), ('int', 1), ('is_captain', 1), ('m_author', 1), ('m_name', 1), ('mail', 1), ('method', 1), ('mimetype', 1), ('order_by', 1), ('pages', 1), ('passwd', 1), ('print', 1), ('raise', 1), ('range', 1), ('re', 1), ('recommend_name', 1), ('reload', 1), ('setdefaultencoding', 1), ('shortcuts', 1), ('team_age', 1), ('team_area', 1), ('team_area_id', 1), ('team_man_num', 1), ('team_name', 1), ('team_num', 1), ('team_woman_num', 1), ('template', 1), ('text', 1), ('textfield21', 1), ('textfield22', 1), ('textfield23', 1), ('textfield24', 1), ('textfield25', 1), ('textfield26', 1), 
('textfield61', 1), ('textfield71', 1), ('textfield81', 1), ('topic_gcwmember', 1), ('topic_gcwteam', 1), ('userdb', 1), ('users', 1), ('utf8', 1), ('util', 1), ('views', 1), ('while', 1), ('wohnort3', 1), ('works_long', 1), ('works_name', 1), ('works_type', 1), ('writer', 1), ('writerow', 1), ('writerows', 1)]
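For readers who want to check the result without pipe, here is a rough standard-library counterpart. It is a sketch under assumptions: the function name word_counts and the sample string are made up for illustration, and ties are ordered by first appearance rather than alphabetically as in the grouped output above.

```python
import re
from collections import Counter


def word_counts(text):
    """Return (word, count) pairs, most frequent first."""
    # Split on runs of non-word characters and drop empty strings.
    words = [w for w in re.split(r'\W+', text) if w]
    # most_common() sorts by count in descending order.
    return Counter(words).most_common()


sample = "to be or not to be"  # hypothetical sample text
print(word_counts(sample))  # [('to', 2), ('be', 2), ('or', 1), ('not', 1)]
```

Counter does the grouping and counting in one pass, so the groupby and count stages of the pipeline collapse into a single call.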