An HTTP handler for urllib2 that supports HTTP 1.1 and keepalive.
This is keepalive.py from the urlgrabber project, available under the GNU LGPL, ported to Python 3 (with help from 2to3).
>>> import urllib2
>>> from keepalive import HTTPHandler
>>> keepalive_handler = HTTPHandler()
>>> opener = urllib2.build_opener(keepalive_handler)
>>> urllib2.install_opener(opener)
>>>
>>> fo = urllib2.urlopen('http://www.python.org')
To remove the handler, simply re-run build_opener with no arguments, and install that opener.
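As a minimal sketch of that reset step: in Python 3 the urllib2 functions live in urllib.request, so the example below uses that module. The helper name restore_default_opener is hypothetical and is not part of keepalive.py.

```python
import urllib.request


def restore_default_opener():
    """Install a fresh opener built only from urllib's default handlers."""
    # build_opener() with no arguments uses just the standard handlers,
    # so a previously installed keepalive handler is no longer consulted.
    opener = urllib.request.build_opener()
    urllib.request.install_opener(opener)
    return opener


if __name__ == "__main__":
    opener = restore_default_opener()
    # Show which handler classes are now active.
    print(sorted(type(h).__name__ for h in opener.handlers))
```

Connections that the keepalive handler already holds open are presumably not closed by this step, so it may be worth calling close_all() on the old handler first.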
You can explicitly close connections by using the close_connection() method of the returned file-like object (described below) or you can use the handler methods:
close_connection(host)
close_all()
open_connections()
EXTRA ATTRIBUTES AND METHODS
Upon a status of 200, the object returned has a few additional attributes and methods, which should not be used if you want to remain consistent with the normal urllib2-returned objects:

close_connection() - close the connection to the host
readlines() - you know, readlines()
status - the return status (e.g. 404)
reason - English translation of status (e.g. ‘File not found’)
If you want the best of both worlds, use this inside an AttributeError-catching try:

>>> try: status = fo.status
... except AttributeError: status = None
Unfortunately, these are ONLY there if status == 200, so it’s not easy to distinguish between non-200 responses. The reason is that urllib2 tries to do clever things with error codes 301, 302, 401, and 407, and it wraps the object upon return.
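The AttributeError-catching pattern can be wrapped in a small helper so calling code does not repeat the try block. This is only a sketch: response_status and the two stand-in classes are hypothetical names used for illustration, and the stand-ins merely imitate a response with and without the extra attributes.

```python
def response_status(fo, default=None):
    """Return fo.status if the response exposes it, otherwise default."""
    try:
        return fo.status
    except AttributeError:
        # Wrapped (non-200) responses may lack the extra attributes.
        return default


class PlainResponse:
    """Stand-in for a wrapped response without status or reason."""


class KeepaliveResponse:
    """Stand-in for a 200 response carrying the extra attributes."""
    status = 200
    reason = "OK"


if __name__ == "__main__":
    print(response_status(KeepaliveResponse()))
    print(response_status(PlainResponse()))
```

A default of None keeps "attribute missing" distinct from any real status code.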
You can optionally set the module-level global HANDLE_ERRORS to 0, in which case the handler will always return the object directly. If you like the fancy handling of errors, don’t do this. If you prefer to see your error codes, then do.
close_all()
    Close all open connections.

close_connection(host)
    Close the connection to <host>. host is the host:port spec, as in ‘www.cnn.com:8080’, as passed in. No error occurs if there is no connection to that host.

open_connections()
    Return a list of connected hosts.
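The three handler methods can be combined, for example to drop every idle connection except those to a few hosts. The sketch below assumes only the behaviour described above. close_all_except and FakeHandler are hypothetical; FakeHandler stands in for a keepalive HTTPHandler so the example runs without the module.

```python
def close_all_except(handler, keep=()):
    """Close every open connection whose host:port is not in keep."""
    keep = set(keep)
    closed = []
    # open_connections() is documented to return a list of connected hosts.
    for host in handler.open_connections():
        if host not in keep:
            handler.close_connection(host)
            closed.append(host)
    return closed


class FakeHandler:
    """Stand-in that mimics the three documented handler methods."""

    def __init__(self, hosts):
        self._hosts = list(hosts)

    def open_connections(self):
        return list(self._hosts)

    def close_connection(self, host):
        # Like the documented method, closing an unknown host is not an error.
        if host in self._hosts:
            self._hosts.remove(host)

    def close_all(self):
        self._hosts.clear()


if __name__ == "__main__":
    handler = FakeHandler(["www.python.org:80", "www.cnn.com:8080"])
    print(close_all_except(handler, keep=["www.python.org:80"]))
    print(handler.open_connections())
```

Hosts in keep must use the same host:port form that was passed in when the connection was opened.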