WSGI Protocol (PEP 3333) Study Notes

程序员文章站 2022-06-06 18:27:03

WSGI protocol overview

WSGI stands for Web Server Gateway Interface. The protocol defines a standard interface between web servers and Python web applications or frameworks, so that a Python web application can run on a variety of web servers.

This document specifies a proposed standard interface between web servers and Python web applications or frameworks, to promote web application portability across a variety of web servers.

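The WSGI interface has two sides:

  • the server side
  • the application side

The server side invokes a callable object provided by the application side. How the application side supplies that callable differs from server to server, but there are two main forms:

  • The server requires the application's deployer to write a short script that instantiates the server and passes it the application's callable object.
  • The server obtains the application's callable object through a configuration file or a similar mechanism.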

The WSGI interface has two sides: the "server" or "gateway" side, and the "application" or "framework" side. The server side invokes a callable object that is provided by the application side. The specifics of how that object is provided are up to the server or gateway. It is assumed that some servers or gateways will require an application's deployer to write a short script to create an instance of the server or gateway, and supply it with the application object. Other servers and gateways may use configuration files or other mechanisms to specify where an application object should be imported from, or otherwise obtained.
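As a rough sketch of the first deployment style, the short script below uses the standard library's wsgiref.simple_server as the server side and hands it an application callable. The names hello_app and build_server, and the host and port values, are assumptions made for illustration; they do not come from PEP 3333.

```python
from wsgiref.simple_server import make_server


def hello_app(environ, start_response):
    """Hypothetical application callable used only for this sketch."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello world!\n"]


def build_server(host="127.0.0.1", port=8000):
    """Instantiate the server and supply it with the application object."""
    return make_server(host, port, hello_app)


# A deployer would then start serving requests with something like:
#     with build_server() as httpd:
#         httpd.serve_forever()
```

Servers of the second style read a reference to the callable from their configuration instead, so no script of this kind is needed.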

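In addition, middleware components can be added. A middleware component implements both the server side and the application side: to the server it looks like an application, and to the application it looks like a server. It can also provide extended APIs, content transformation, and similar functions.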

In addition to "pure" servers/gateways and applications/frameworks, it is also possible to create "middleware" components that implement both sides of this specification. Such components act as an application to their containing server, and as a server to a contained application, and can be used to provide extended APIs, content transformation, navigation, and other useful functions.

WSGI defines two kinds of strings:

  • "Native" strings (which are always implemented using the type named str ) that are used for request/response headers and metadata.
  • "Bytestrings" (which are implemented using the bytes type in Python 3, and str elsewhere), that are used for the bodies of requests and responses (e.g. POST/PUT input data and HTML page outputs).
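To make the distinction concrete, here is a minimal sketch in which the status and headers are native strings while the body is encoded to bytes before being returned. The name text_app and the sample text are assumptions chosen for illustration.

```python
def text_app(environ, start_response):
    """Hypothetical app: native-string status and headers, bytestring body."""
    text = "na\u00efve caf\u00e9\n"   # a Python str with two non-ASCII characters
    body = text.encode("utf-8")       # the response body must be bytes

    status = "200 OK"                 # native string
    response_headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        # Header values are native strings; the length counts bytes, not characters.
        ("Content-Length", str(len(body))),
    ]
    start_response(status, response_headers)
    return [body]
```

Note that Content-Length is computed from the encoded body: the text has 11 characters but encodes to 13 bytes in UTF-8.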

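Application side

The application side must provide a callable object that accepts two positional arguments and can be invoked more than once. This object can be any of the following: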

  • function
  • method
  • class
  • instance that implements the __call__ method
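The method and instance forms from the list above can be sketched as follows. The class Greeter, its farewell method, and the greeting text are hypothetical names used only for this illustration.

```python
class Greeter:
    """Hypothetical class whose instances and bound methods are WSGI apps."""

    def __init__(self, greeting):
        self.greeting = greeting.encode("utf-8")

    def __call__(self, environ, start_response):
        # Makes every instance usable as an application callable.
        start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
        return [self.greeting]

    def farewell(self, environ, start_response):
        # A bound method also accepts (environ, start_response).
        start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
        return [b"Goodbye!\n"]


instance_app = Greeter("Hello from an instance!\n")  # instance with __call__
method_app = instance_app.farewell                   # bound method
```

Either instance_app or method_app could be handed to a server in place of a plain function.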

PEP 3333 gives two example application callables: one written as a function and one written as a class.

HELLO_WORLD = b"Hello world!\n"

# Application object in function form
def simple_app(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [HELLO_WORLD]
    
# Application object in class form
class AppClass:
    """Produce the same output, but using a class

    (Note: 'AppClass' is the "application" here, so calling it
    returns an instance of 'AppClass', which is then the iterable
    return value of the "application callable" as required by
    the spec.

    If we wanted to use *instances* of 'AppClass' as application
    objects instead, we would have to implement a '__call__'
    method, which would be invoked to execute the application,
    and we would need to create an instance for use by the
    server or gateway.
    """

    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        status = '200 OK'
        response_headers = [('Content-type', 'text/plain')]
        self.start(status, response_headers)
        yield HELLO_WORLD

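What the application examples show:

  • Calling the application object requires two positional arguments, environ and start_response, whose names hint at their purpose. The first carries the environment information passed in by the server; the second is a callback supplied by the server.
  • The return value is an iterable, that is, an object that can be traversed with a for loop.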
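Both points can be checked by calling an application by hand, without any real server. The sketch below builds a test environ with wsgiref.util.setup_testing_defaults, passes a stand-in start_response, and iterates over the result. The helper name call_app and the captured dictionary are assumptions for illustration.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """Same shape as the function-form example above."""
    start_response("200 OK", [("Content-type", "text/plain")])
    return [b"Hello world!\n"]


def call_app(application, path="/"):
    """Invoke a WSGI application once and collect what it produces."""
    environ = {}
    setup_testing_defaults(environ)   # fills in the required CGI and wsgi.* keys
    environ["PATH_INFO"] = path

    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers
        return lambda data: None      # stand-in for the legacy write() callable

    result = application(environ, start_response)
    try:
        body = b"".join(result)       # the return value is just an iterable
    finally:
        if hasattr(result, "close"):
            result.close()
    return captured["status"], captured["headers"], body
```

The status and headers are read only after iteration because a class-form application may call start_response lazily inside __iter__.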

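Server side

Each time the server receives a request from an HTTP client (such as a browser), it invokes the application object once. PEP 3333 also gives an example server side, again written as a function.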

import os, sys

enc, esc = sys.getfilesystemencoding(), 'surrogateescape'

def unicode_to_wsgi(u):
    # Convert an environment variable to a WSGI "bytes-as-unicode" string
    return u.encode(enc, esc).decode('iso-8859-1')

def wsgi_to_bytes(s):
    return s.encode('iso-8859-1')

def run_with_cgi(application):
    # Server-side example: a minimal CGI gateway.
    # Build the environ dictionary from the CGI environment.
    environ = {k: unicode_to_wsgi(v) for k,v in os.environ.items()}
    environ['wsgi.input']        = sys.stdin.buffer
    environ['wsgi.errors']       = sys.stderr
    environ['wsgi.version']      = (1, 0)
    environ['wsgi.multithread']  = False
    environ['wsgi.multiprocess'] = True
    environ['wsgi.run_once']     = True

    if environ.get('HTTPS', 'off') in ('on', '1'):
        environ['wsgi.url_scheme'] = 'https'
    else:
        environ['wsgi.url_scheme'] = 'http'

    headers_set = []
    headers_sent = []

    def write(data):
        out = sys.stdout.buffer

        if not headers_set:
             raise AssertionError("write() before start_response()")

        elif not headers_sent:
             # Before the first output, send the stored headers
             status, response_headers = headers_sent[:] = headers_set
             out.write(wsgi_to_bytes('Status: %s\r\n' % status))
             for header in response_headers:
                 out.write(wsgi_to_bytes('%s: %s\r\n' % header))
             out.write(wsgi_to_bytes('\r\n'))

        out.write(data)
        out.flush()

    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if headers_sent:
                    # Re-raise original exception if headers sent
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None     # avoid dangling circular ref
        elif headers_set:
            raise AssertionError("Headers already set!")

        headers_set[:] = [status, response_headers]

        # Note: error checking on the headers should happen here,
        # *after* the headers are set.  That way, if an error
        # occurs, start_response can only be re-called with
        # exc_info set.

        return write

    result = application(environ, start_response)
    try:
        for data in result:
            if data:    # don't send headers until body appears
                write(data)
        if not headers_sent:
            write(b'')   # send headers now if body was empty
    finally:
        if hasattr(result, 'close'):
            result.close()

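What the server example shows:

  • The server object only needs to accept one positional argument, application, which is the application object.
  • Before invoking the application object, the server builds the two positional arguments the application requires (environ and start_response), and also defines a write function for emitting the content the application returns.
  • The server invokes the application object to obtain the response, then calls write to output each piece of it.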
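For comparison, the standard library's wsgiref.handlers module follows the same build-environ, call, write sequence. The sketch below runs an application through wsgiref.handlers.SimpleHandler with in-memory streams so the raw output can be inspected. The names hello_app and run_in_memory are assumptions, and the in-memory streams stand in for the stdin and stdout a real CGI gateway would use.

```python
import io
from wsgiref.handlers import SimpleHandler
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Hypothetical application used only for this sketch."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello world!\n"]


def run_in_memory(application):
    """Run one request through a stdlib handler and return the raw bytes."""
    environ = {}
    setup_testing_defaults(environ)        # minimal request environment
    stdin = io.BytesIO()                   # empty request body
    stdout = io.BytesIO()                  # captures status line, headers, body
    stderr = io.StringIO()                 # receives wsgi.errors output
    handler = SimpleHandler(stdin, stdout, stderr, environ)
    handler.run(application)               # calls application(environ, start_response)
    return stdout.getvalue()


raw_response = run_in_memory(hello_app)
```

The returned bytes contain a status line, the headers, a blank line, and then the body, which is the same order in which write emits them in the example above.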

Middleware

Things middleware can do include:

  • Routing a request to different application objects based on the target URL, after rewriting the environ accordingly.
  • Allowing multiple applications or frameworks to run side-by-side in the same process
  • Load balancing and remote processing, by forwarding requests and responses over a network
  • Perform content postprocessing, such as applying XSL stylesheets

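The middleware example in PEP 3333 transforms the response returned by the application: text/plain output is converted to pig Latin.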

from piglatin import piglatin

class LatinIter:

    """Transform iterated output to piglatin, if it's okay to do so

    Note that the "okayness" can change until the application yields
    its first non-empty bytestring, so 'transform_ok' has to be a mutable
    truth value.
    """

    def __init__(self, result, transform_ok):
        if hasattr(result, 'close'):
            self.close = result.close
        self._next = iter(result).__next__
        self.transform_ok = transform_ok

    def __iter__(self):
        return self

    def __next__(self):
        if self.transform_ok:
            return piglatin(self._next())   # call must be byte-safe on Py3
        else:
            return self._next()

class Latinator:

    # by default, don't transform output
    transform = False

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):

        transform_ok = []

        def start_latin(status, response_headers, exc_info=None):

            # Reset ok flag, in case this is a repeat call
            del transform_ok[:]

            for name, value in response_headers:
                if name.lower() == 'content-type' and value == 'text/plain':
                    transform_ok.append(True)
                    # Strip content-length if present, else it'll be wrong
                    response_headers = [(name, value)
                        for name, value in response_headers
                            if name.lower() != 'content-length'
                    ]
                    break

            write = start_response(status, response_headers, exc_info)

            if transform_ok:
                def write_latin(data):
                    write(piglatin(data))   # call must be byte-safe on Py3
                return write_latin
            else:
                return write

        return LatinIter(self.application(environ, start_latin), transform_ok)

# Run foo_app under a Latinator's control, using the example CGI gateway
from foo_app import foo_app
run_with_cgi(Latinator(foo_app))

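What the middleware example shows:

  • Middleware has to invoke the application object (this is the part that implements the server side).
  • Middleware must itself be a callable object (this is the part that implements the application side); each call to the middleware takes two positional arguments, environ and start_response.
  • Call relationship between server, middleware, and application: the middleware acts as the "server" for the application and invokes the application object, while the instantiated middleware acts as the "application" that the server invokes.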
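The same relationship is easier to see in a much smaller middleware. The sketch below wraps start_response to append one response header and otherwise passes everything through. The class name AddHeader and the header X-Example are hypothetical.

```python
class AddHeader:
    """Hypothetical middleware that appends one header to every response."""

    def __init__(self, application, name, value):
        self.application = application   # the wrapped application object
        self.name = name
        self.value = value

    def __call__(self, environ, start_response):
        # Application side: the server calls us with (environ, start_response).
        def patched_start_response(status, response_headers, exc_info=None):
            headers = list(response_headers) + [(self.name, self.value)]
            return start_response(status, headers, exc_info)

        # Server side: we call the wrapped application with our own callback.
        return self.application(environ, patched_start_response)


def inner_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello world!\n"]


wrapped_app = AddHeader(inner_app, "X-Example", "1")
```

Since wrapped_app is itself a WSGI callable, it can be passed to a server or wrapped again by further middleware.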

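WSGI details

  • The environ variable must be a built-in Python dict. It must contain certain WSGI-defined variables, and may also contain server-specific variables.
  • The start_response callable takes two required positional arguments and one optional positional argument, conventionally named status, response_headers, and exc_info. The application object must call start_response before its iterable yields the first body bytestring.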
  • The status parameter is a status string of the form "999 Message here"
  • The response_headers is a list of (header_name, header_value) tuples describing the HTTP response header.
  • The optional exc_info parameter is used only when the application has trapped an error and is attempting to display an error message to the browser.
  • The start_response callable must return a write(body_data) callable that takes one positional parameter: a bytestring to be written as part of the HTTP response body.
  • When called by the server, the application object must return an iterable yielding zero or more bytestrings. This can be accomplished in a variety of ways, such as by returning a list of bytestrings, or by the application being a generator function that yields bytestrings, or by the application being a class whose instances are iterable.
  • The server or gateway must transmit the yielded bytestrings to the client in an unbuffered fashion, completing the transmission of each bytestring before requesting another one.
  • The server or gateway should treat the yielded bytestrings as binary byte sequences: in particular, it should ensure that line endings are not altered. The application is responsible for ensuring that the bytestring(s) to be written are in a format suitable for the client.
  • If the iterable returned by the application has a close() method, the server or gateway must call that method upon completion of the current request, whether the request was completed normally, or terminated early due to an application error during iteration or an early disconnect of the browser.
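The exc_info parameter from the list above is the least obvious of these details, so here is a minimal sketch of how an application might use it: when its own code fails, it calls start_response again with sys.exc_info() and returns an error body. The names failing_view and guarded_app and the error text are assumptions for illustration.

```python
import sys


def failing_view(environ):
    """Hypothetical application logic that always fails."""
    raise ValueError("boom")


def guarded_app(environ, start_response):
    """Trap errors and report them through start_response with exc_info."""
    try:
        body = failing_view(environ)
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]
    except Exception:
        # Passing exc_info lets the server replace headers that have not been
        # sent yet, or re-raise the error if output has already started.
        start_response(
            "500 Internal Server Error",
            [("Content-Type", "text/plain")],
            sys.exc_info(),
        )
        return [b"An internal error occurred.\n"]
```

With the run_with_cgi gateway shown earlier, headers have not been sent at this point, so the 500 status simply replaces whatever was set before.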

Reposted from: https://my.oschina.net/alazyer/blog/737969