A Comparison of Web Servers for Python-Based Web Applications
Introduction
In this article we discuss three main things: Python, web servers, and, most importantly, how the available servers compare.
Python Web Server Gateway Interface v1.0 (WSGI)
The Problem
Today, there exist web servers (or modules for servers) in ever-growing numbers that are specifically designed (or adapted) to work with Python web applications interchangeably. However, this has not always been the case. In the old days, developers could not easily switch web servers at will, and each switch came at a cost due to dependencies and limitations. Upon deciding on a framework to build on, you would also have decided, not always willingly or consciously, on the server(s) you could use to serve the application. This was due to the lack of a universally accepted interface specification: a common ground that applications (frameworks) and web servers alike could adopt and use to communicate, allowing components to be interchanged when necessary with possibly zero code changes.
The Birth of a Standard
Then Python Enhancement Proposal (PEP) 333 came along:
This document specifies a proposed standard
interface between web servers and Python web
applications or frameworks, to promote web
application portability across a variety of
web servers.
In other words, it allows portability between (and across) [web] servers and [Python web] applications.
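To make the interface concrete, here is a minimal sketch of both sides of WSGI using only the standard library. The application is a callable that takes `environ` and `start_response`; the helper `call_app` is a hypothetical name introduced here to play the server's role, and it is an illustration rather than how any particular server is implemented.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A minimal WSGI application: a callable taking environ and start_response."""
    body = b"Hello, WSGI!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(application):
    """Play the server's role (hypothetical helper): build an environ,
    call the application, and collect status, headers and body."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys the WSGI spec requires
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    print(call_app(app))
```

Because the application only depends on this calling convention, any WSGI server discussed below should be able to serve `app` unchanged.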
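The Comparison
In this comparison of web servers for Python-based web applications, we look at some of the available choices and what makes each of them stand out.
Web Servers (in alphabetical order)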
CherryPy WSGI Server
What is it?
CherryPy is actually a web framework. Yet it is a fully self-contained one, meaning that it can run on its own, including in production scenarios, without the need for additional software. This is achieved thanks to its own WSGI, HTTP/1.1-compliant web server. The CherryPy project describes it as a “high-speed, production ready, thread pooled, generic HTTP server”. As it is a WSGI server, it can be used to serve any other WSGI Python application as well, without being bound to CherryPy’s application development framework.
Why should you consider using it?
- It is compact and simple.
- It can serve any Python web application running on WSGI.
- It can handle static files and it can just be used to serve files and folders alone.
- It is thread-pooled.
- It comes with support for SSL.
- It is an easy to adapt, easy to use pure-Python alternative which is robust and reliable.
Gunicorn
What is it?
Gunicorn is a stand-alone web server which offers quite a bit of functionality in a remarkably easy-to-operate fashion. It uses the pre-fork model, meaning that a central master process (Gunicorn) is tasked with managing the initiated worker processes (of differing types), which then handle the requests directly. All of this can be configured and adapted to suit your needs and diverse production scenarios.
Why should you consider using it?
- It supports WSGI and can be used with any WSGI-based Python application or framework.
- It can also be used as a drop-in replacement for Paster (e.g. Pyramid), Django’s development server, web2py, and others.
- It offers a choice of various worker types/configurations and automatic worker process management.
- It supports HTTP/1.0 and HTTP/1.1 (Keep-Alive) through synchronous and asynchronous workers.
- It comes with SSL support.
- Extensible with hooks.
- It is transparent and has a clear architecture.
- Supports Python versions 2.6, 2.7, 3, 3.2, 3.3
Tornado (HTTP Server via wsgi.WSGIContainer)
What is it?
Tornado is an application development framework and a networking library designed for handling asynchronous operations, allowing servers to maintain a lot of open connections. It also comes with a WSGI server which other WSGI Python applications (and frameworks) can use to run.
Why should you consider using it?
- If you are building on top of the Tornado framework; or
- If your application needs asynchronous functionality.
Although under these circumstances you might want to choose Tornado’s WSGI server for your project, you can also opt to use Gunicorn with Tornado [Asynchronous] workers.
Twisted Web
What is it?
Twisted Web is the web server that comes with the Twisted networking library. Whereas Twisted itself is “an event-driven networking engine”, the Twisted Web server runs on WSGI and it is capable of powering other Python web applications.
Why should you consider using it?
- It is a simple to use, stable and mature product.
- It will run WSGI Python applications.
- It can act like a Python web server framework, allowing you to program it with the language for custom HTTP serving purposes.
- It offers simple and fast prototyping ability through Python Scripts (.rpy) which are executed upon HTTP requests.
- It comes with proxy and reverse-proxy capabilities.
- It supports Virtual Hosts.
- It can even serve Perl, PHP, et cetera via the twisted.web.twcgi API.
uWSGI
What is it?
Despite its very confusing naming conventions, uWSGI itself is a vast project with many components, aiming to provide a full [software] stack for building hosting services. One of these components, the uWSGI server, runs Python WSGI applications. It is capable of using various protocols, including its own uwsgi wire protocol, which is quasi-identical to SCGI. In order to fulfil the (understandable) demand to use stand-alone HTTP servers in front of application servers, the NGINX and Cherokee web servers are modularised to support uWSGI’s (best performing) uwsgi protocol and have direct control over its processes.
Why should you consider using it?
- uWSGI comes with a WSGI adapter and it fully supports Python applications running on WSGI.
- It links with libpython. It loads the application code on startup and acts like a Python interpreter. It parses the incoming requests and invokes the Python callable.
- It comes with direct support for the popular NGINX web server (along with Cherokee and lighttpd).
- It is written in C.
- Its various components can do much more than running an application, which might be handy for expansion.
- Currently (as of late 2013), it is actively developed and has fast release cycles.
- It has various engines for running applications (asynchronous and synchronous).
- It can mean a lower memory footprint.
Waitress WSGI Server
What is it?
Waitress is a pure-Python WSGI server. At first glance it might not appear to be much different from many others; however, its development philosophy separates it from the rest. It aims to ease the production (and development) burden that web servers place on Python web-application developers. Waitress achieves this by neutralizing issues caused by platform (e.g. Unix vs. Windows), interpreter (CPython vs. PyPy) and Python (version 2 vs. 3) differences.
Why should you consider using it?
- It is a very lean, pure-Python solution.
- It supports HTTP/1.0 and HTTP/1.1 (Keep-Alive).
- It comes ready to be deployed for production with a wide array of platform support.
- Unlike CherryPy, it actually is framework-independent in its nature.
- It runs on Windows and Unix, and on CPython interpreter and PyPy (Unix only).
- It supports Python versions 2 and 3.
Modules for Stand-Alone Servers
mod_python with a WSGI adapter (Apache) (Embedding Python)
What is it?
Simply put, mod_python is an Apache module that embeds Python within the server itself. Although not recommended for various reasons (project is dead and outdated with only very recent intentions to continue the development by original author), it can be used to run WSGI applications on Apache via wrappers.
Why should you consider using it?
You might want to program and extend Apache using Python for a specific reason.
mod_wsgi (Apache) (Embedding Python)
What is it?
Being a WSGI-compliant module, mod_wsgi allows you to run Python WSGI applications on Apache HTTP Server. It achieves this in two ways: the first is similar to how mod_python works, embedding the code and executing it within the child process. The other method offers a daemon-based operation mode whereby the WSGI application has its own distinct process, managed automatically by mod_wsgi.
Why should you consider using it?
- Existing experience with Apache can mean a stable production environment for your operations when it comes to running Python as well. This alone can save the day, making it worth it.
- If you are dependent on Apache or want to take advantage of its stable and rich extension modules, it will be the way to go.
- It can run applications under different system users for further security.
- It is tried, tested and reliable software.
- The web contains a wealth of information and Q&As related to it, which can save you a lot of time when you encounter a real production problem.
- And it comes with all other functionality that Apache offers.
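Conclusion
Our Python framework runs on Python 3.x, so we chose Gunicorn, a web server with good compatibility. Configured with asynchronous workers, Gunicorn can also be pushed to the best performance it is capable of. Its one drawback is that performance drops off quickly on slow network connections. For a LAN application slow clients can be ignored, and where they do occur, Gunicorn can be combined with an intermediate layer (nginx) to avoid the problem. In addition, Supervisor, a process management tool that works with Gunicorn, health-checks the processes and restarts them automatically, which lets the application meet its service level (the service is available at least 99.9% of the time, and a design that uses an F5 load balancer can reach 99.99%).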
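Gunicorn in Detail
License: MIT
Language: Python
Operating system: Linux
Gunicorn (“Green Unicorn”) is a Python WSGI HTTP server for UNIX. It uses a pre-fork worker model ported from Ruby’s Unicorn project. The Gunicorn server is broadly compatible with various web frameworks, simple to run, light on server resources, and fairly fast.
[Figure: architecture diagram]
[Figure: performance comparison with uWSGI]
Features:
- Native support for WSGI, Django, and Paster
- Automatic worker process management
- Simple Python configuration
- Multiple worker configurations
- Various server hooks for extensibility
- Compatible with Python 2.x >= 2.5 and 3.x >= 3.2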
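As an illustration of the "simple Python configuration" and "server hooks" features, the following is a small sketch of a Gunicorn configuration file. The file name `gunicorn.conf.py` and all values are assumptions chosen for this example; `bind`, `workers`, `worker_class`, `timeout` and the `post_fork` hook are Gunicorn settings, but check the settings reference for the version you run.

```python
# gunicorn.conf.py (file name assumed for this sketch)
# Every value below is illustrative, not a recommendation.

bind = "127.0.0.1:8000"   # address and port the master listens on
workers = 4               # number of worker processes to pre-fork
worker_class = "sync"     # default worker type; async types need extra packages
timeout = 30              # seconds before a silent worker is killed and restarted


def post_fork(server, worker):
    """Server hook: called in the worker process just after it has been forked."""
    server.log.info("worker spawned (pid: %s)", worker.pid)
```

Such a file would typically be passed on the command line, for example `gunicorn -c gunicorn.conf.py myapp:app`.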
Installation:
$ pip install gunicorn
$ cat myapp.py
def app(environ, start_response):
    data = b"Hello, World!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(data)))
    ])
    return iter([data])
$ gunicorn -w 4 myapp:app
[2014-09-10 10:22:28 +0000] [30869] [INFO] Listening at: http://127.0.0.1:8000 (30869)
[2014-09-10 10:22:28 +0000] [30869] [INFO] Using worker: sync
[2014-09-10 10:22:28 +0000] [30874] [INFO] Booting worker with pid: 30874
[2014-09-10 10:22:28 +0000] [30875] [INFO] Booting worker with pid: 30875
[2014-09-10 10:22:28 +0000] [30876] [INFO] Booting worker with pid: 30876
[2014-09-10 10:22:28 +0000] [30877] [INFO] Booting worker with pid: 30877
Gunicorn Architecture
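Server Model
Gunicorn is based on the pre-fork worker model. This means that there is a central master process that manages a set of worker processes. The master never knows anything about individual clients. All requests and responses are handled entirely by worker processes.
Master
The master process is a simple loop that listens for various process signals and reacts accordingly. It manages the set of running workers by listening for signals such as TTIN, TTOU, and CHLD. TTIN and TTOU tell the master to increase or decrease the number of running workers. CHLD indicates that a child process has terminated; in this case the master automatically restarts the failed worker.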
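The bookkeeping described above can be sketched as a toy model. This is not Gunicorn's code: `MasterSketch` is a hypothetical class, workers are plain integers rather than processes, and signals are passed in as the names used in the text instead of being delivered by the operating system. Retiring the oldest worker on TTOU and never dropping below one worker are choices made for this sketch.

```python
class MasterSketch:
    """Toy model of a pre-fork master's bookkeeping; no real processes or signals."""

    def __init__(self, num_workers):
        self.num_workers = num_workers  # target size of the worker pool
        self.workers = set()            # ids of workers currently "alive"
        self._next_id = 1
        self.manage_workers()

    def manage_workers(self):
        """Spawn or retire workers until the pool matches the target size."""
        while len(self.workers) < self.num_workers:
            self.workers.add(self._next_id)  # stands in for fork()
            self._next_id += 1
        while len(self.workers) > self.num_workers:
            self.workers.remove(min(self.workers))  # retire the oldest worker

    def handle(self, signame, dead_worker=None):
        """React to one signal, then bring the pool back to the target size."""
        if signame == "TTIN":
            self.num_workers += 1
        elif signame == "TTOU":
            self.num_workers = max(1, self.num_workers - 1)
        elif signame == "CHLD":
            self.workers.discard(dead_worker)  # a child exited
        self.manage_workers()


if __name__ == "__main__":
    master = MasterSketch(2)
    master.handle("TTIN")                 # grow the pool by one
    master.handle("CHLD", dead_worker=2)  # worker 2 died and gets replaced
    print(sorted(master.workers))
```

Against a running Gunicorn, the equivalent operations would be sending the TTIN or TTOU signal to the master's process id.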
Worker Types:
Sync Workers
The most basic and the default worker type is a synchronous worker class that handles a single request at a time. This model is the simplest to reason about, as any errors will affect at most a single request. However, as we describe below, processing only a single request at a time requires some assumptions about how applications are programmed.
Async Workers
The asynchronous workers available are based on Greenlets (via Eventlet and Gevent). Greenlets are an implementation of cooperative multi-threading for Python. In general, an application should be able to make use of these worker classes with no changes.
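Cooperative multi-threading means that a task runs until it voluntarily gives up control, at which point another task continues. The sketch below imitates that idea with plain generators from the standard library. It is only an analogy: greenlets switch implicitly (for example when waiting on I/O) and need no explicit `yield`, and the names `handler` and `run_round_robin` are hypothetical.

```python
from collections import deque


def handler(name, steps, log):
    """A fake request handler that yields control after every step."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # cooperative switch point; stands in for waiting on I/O


def run_round_robin(tasks):
    """Run the tasks in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # task finished; do not reschedule it
        queue.append(task)      # task yielded; put it at the back of the line


if __name__ == "__main__":
    log = []
    run_round_robin([handler("a", 2, log), handler("b", 2, log)])
    print(log)
```

The two handlers interleave inside a single thread, which is why one worker process of this kind can keep many slow connections open at once.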
Tornado Workers
There’s also a Tornado worker class. It can be used to write applications using the Tornado framework. Although the Tornado workers are capable of serving a WSGI application, this is not a recommended configuration.
Choosing a Worker Type
The default synchronous workers assume that your application is resource bound in terms of CPU and network bandwidth. Generally this means that your application shouldn’t do anything that takes an undefined amount of time. For instance, a request to the internet takes an undefined amount of time: at some point the external network will fail in such a way that clients will pile up on your servers.
This resource-bound assumption is why we require a buffering proxy in front of a default-configuration Gunicorn. If you exposed synchronous workers to the internet, a DoS attack would be trivial: an attacker only needs to create a load that trickles data to the servers. For the curious, Slowloris is an example of this type of load.
Some examples of behavior requiring asynchronous workers:
- Applications making long blocking calls (e.g. to external web services)
- Serving requests directly to the internet
- Streaming requests and responses
- Long polling
- Web sockets
- Comet
How Many Workers?
DO NOT scale the number of workers to the number of clients you expect to have. Gunicorn should only need 4-12 worker processes to handle hundreds or thousands of requests per second.
Gunicorn relies on the operating system to provide all of the load balancing when handling requests. Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with. While not overly scientific, the formula is based on the assumption that for a given core, one worker will be reading or writing from the socket while the other worker is processing a request.
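The starting point above is simple arithmetic. As a small sketch, the hypothetical helper `suggested_workers` below computes it, falling back to the core count reported by the standard library when none is given:

```python
import multiprocessing


def suggested_workers(num_cores=None):
    """Return (2 x num_cores) + 1, the starting worker count suggested above."""
    if num_cores is None:
        num_cores = multiprocessing.cpu_count()  # cores visible to this machine
    return 2 * num_cores + 1


if __name__ == "__main__":
    print(suggested_workers(4))  # a 4-core machine gives 2 * 4 + 1 = 9
```

The result could then be passed to Gunicorn with the `-w` option shown earlier. It is only an initial guess to be tuned under load.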
Obviously, your particular hardware and application are going to affect the optimal number of workers. Our recommendation is to start with the above guess and tune using TTIN and TTOU signals while the application is under load.
Always remember, there is such a thing as too many workers. After a point your worker processes will start thrashing system resources, decreasing the throughput of the entire system.
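Supervisor in Detail
Introduction
Overview
Supervisor is a client/server system that allows its users to control a number of processes on UNIX-like operating systems. It was inspired by the following:
Convenience
It is often inconvenient to write rc.d scripts for every single process instance. rc.d scripts are a lowest-common-denominator form of process initialization/autostart/management, but they can be painful to write and maintain. Additionally, rc.d scripts cannot automatically restart a crashed process, and many programs do not restart themselves properly on a crash. Supervisord starts processes as its subprocesses, and can be configured to automatically restart them on a crash. It can also be configured to start processes automatically on its own invocation.
Accuracy
It is often hard to get accurate up/down status for processes on UNIX. Pidfiles often lie. Supervisord starts processes as subprocesses, so it always knows the true up/down status of its children and can be queried conveniently for this data.
Delegation
Users who need to control process state often need only to do that. They do not want or need full shell access to the machine on which the processes are running. Processes which listen on “low” TCP ports often need to be started and restarted as the root user (a UNIX misfeature). It is usually perfectly fine to allow “normal” people to stop or restart such a process, but providing them with shell access is often impractical, and providing them with root or sudo access is often impossible. It is also (rightly) difficult to explain to them why this problem exists. If supervisord is started as root, it is possible to allow “normal” users to control such processes without needing to explain the intricacies of the problem to them. Supervisorctl allows a very limited form of access to the machine.
Process Groups
Processes often need to be started and stopped in groups, sometimes even in a “priority order”. It is often difficult to explain to people how to do this. Supervisor allows you to assign priorities to processes, and allows users to issue commands via the supervisorctl client such as “start all” and “restart all”, which start them in the preassigned priority order. Additionally, processes can be grouped into “process groups”, and a set of logically related processes can be stopped and started as a unit.
Features
Simple
Supervisor is configured through a simple INI-style config file that is easy to learn. It provides many per-process options that make your life easier, such as restarting failed processes and automatic log rotation.
Centralized
Supervisor provides you with one place to start, stop, and monitor your processes. Processes can be controlled individually or in groups. You can configure Supervisor to provide a local or remote command line and web interface.
Efficient
Supervisor starts its subprocesses via fork/exec, and subprocesses do not daemonize. The operating system signals Supervisor immediately when a process terminates, unlike some solutions that rely on troublesome PID files and periodic polling to restart failed processes.
Extensible
Supervisor has a simple event notification protocol that programs written in any language can use to monitor it, and an XML-RPC interface for control. It is also built with extension points that can be leveraged by Python developers.
Compatible
Supervisor works on just about everything except Windows. It is tested and supported on Linux, Mac OS X, Solaris, and FreeBSD. It is written entirely in Python, so installation does not require a C compiler.
Proven
While Supervisor is very actively developed today, it is not new software. It is already in use on many servers.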
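To tie Supervisor back to the Gunicorn setup from the conclusion, the sketch below renders an INI-style program section with Python's standard `configparser`. Supervisor configuration is normally written by hand, so generating it is purely for illustration. The helper name `render_program_section`, the program name, the command line and the directory are all assumptions, while `command`, `directory`, `autostart` and `autorestart` are keys from Supervisor's `[program:x]` section.

```python
import configparser
import io


def render_program_section(name, command, directory):
    """Render a Supervisor [program:<name>] section as INI text (hypothetical helper)."""
    config = configparser.ConfigParser()
    config[f"program:{name}"] = {
        "command": command,      # how Supervisor starts the child process
        "directory": directory,  # working directory for the child
        "autostart": "true",     # start when supervisord itself starts
        "autorestart": "true",   # restart the child if it exits
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # The path below is a placeholder, not a real deployment location.
    print(render_program_section(
        "gunicorn", "gunicorn -w 4 myapp:app", "/path/to/project"))
```

With a section like this in place, `supervisorctl` commands such as “start all” and “restart all” would cover the Gunicorn master as well.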