This repository has been archived by the owner on Jan 13, 2021. It is now read-only.

Merge pull request #92 from Lukasa/http11
[WIP] HTTP/1.1
Lukasa committed Apr 3, 2015
2 parents d374985 + a81e0a2 commit 1a248b5
Showing 38 changed files with 2,366 additions and 415 deletions.
9 changes: 6 additions & 3 deletions .travis.yml
Original file line number Diff line number Diff line change
@@ -7,13 +7,16 @@ python:
- pypy

env:
- TEST_RELEASE=false
- TEST_RELEASE=true
- TEST_RELEASE=false HYPER_FAST_PARSE=false
- TEST_RELEASE=false HYPER_FAST_PARSE=true
- TEST_RELEASE=true HYPER_FAST_PARSE=false
- TEST_RELEASE=true HYPER_FAST_PARSE=true
- NGHTTP2=true

matrix:
allow_failures:
- env: TEST_RELEASE=true
- env: TEST_RELEASE=true HYPER_FAST_PARSE=true
- env: TEST_RELEASE=true HYPER_FAST_PARSE=false
exclude:
- env: NGHTTP2=true
python: pypy
4 changes: 4 additions & 0 deletions .travis/install.sh
@@ -39,5 +39,9 @@ if [[ "$NGHTTP2" = true ]]; then
    sudo ldconfig
fi

if [[ "$HYPER_FAST_PARSE" = true ]]; then
    pip install pycohttpparser~=1.0
fi

pip install .
pip install -r test_requirements.txt
25 changes: 25 additions & 0 deletions HISTORY.rst
@@ -1,6 +1,31 @@
Release History
===============

Upcoming
--------

*New Features*

- HTTP/1.1 support! See the documentation for more. (`Issue #75`_)
- Implementation of a ``HTTPHeaderMap`` data structure that provides dictionary
style lookups while retaining all the semantic information of HTTP headers.

*Major Changes*

- Various changes in the HTTP/2 APIs:

- The ``getheader``, ``getheaders``, ``gettrailer``, and ``gettrailers``
methods on the response object have been removed, replaced instead with
simple ``.headers`` and ``.trailers`` properties that contain
``HTTPHeaderMap`` structures.
- Headers and trailers are now bytestrings, rather than unicode strings.
- An ``iter_chunked()`` method was added to response objects that allows
iterating over data in units of individual data frames.
- Changed the name of ``getresponse()`` to ``get_response()``, because
``getresponse()`` was a terrible name forced upon me by httplib.

.. _Issue #75: https://github.com/Lukasa/hyper/issues/75

0.2.2 (2015-04-03)
------------------

53 changes: 41 additions & 12 deletions docs/source/advanced.rst
@@ -13,25 +13,54 @@ may want to keep your connections alive only as long as you know you'll need
them. In HTTP/2 this is generally not something you should do unless you're
very confident you won't need the connection again anytime soon. However, if
you decide you want to avoid keeping the connection open, you can use the
:class:`HTTP20Connection <hyper.HTTP20Connection>` as a context manager::
:class:`HTTP20Connection <hyper.HTTP20Connection>` and
:class:`HTTP11Connection <hyper.HTTP11Connection>` as context managers::

    with HTTP20Connection('http2bin.org') as conn:
        conn.request('GET', '/get')
        data = conn.getresponse().read()

    analyse(data)

You may not use any :class:`HTTP20Response <hyper.HTTP20Response>` objects
obtained from a connection after that connection is closed. Interacting with
these objects when a connection has been closed is considered undefined
behaviour.
You may not use any :class:`HTTP20Response <hyper.HTTP20Response>` or
:class:`HTTP11Response <hyper.HTTP11Response>` objects obtained from a
connection after that connection is closed. Interacting with these objects when
a connection has been closed is considered undefined behaviour.

Chunked Responses
-----------------

Plenty of APIs return chunked data, and it's often useful to iterate directly
over the chunked data. ``hyper`` lets you iterate over each data frame of a
HTTP/2 response, and each chunk of a HTTP/1.1 response delivered with
``Transfer-Encoding: chunked``::

    for chunk in response.read_chunked():
        do_something_with_chunk(chunk)

There are some important caveats with this iteration: mostly, it's not
guaranteed that each chunk will be non-empty. In HTTP/2, it's entirely legal to
send zero-length data frames, and this API will pass those through unchanged.
Additionally, by default this method will decompress a response that has a
compressed ``Content-Encoding``: if you do that, each element of the iterator
will no longer be a single chunk, but will instead be whatever the decompressor
returns for that chunk.

If that's problematic, you can set the ``decode_content`` parameter to
``False`` and, if necessary, handle the decompression yourself::

    for compressed_chunk in response.read_chunked(decode_content=False):
        decompress(compressed_chunk)

Very easy!
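
As a rough illustration of handling the decompression yourself, the sketch
below feeds compressed chunks through ``zlib`` incrementally. It does not touch
the network: ``fake_chunks`` is a stand-in for what
``response.read_chunked(decode_content=False)`` might yield, and
``decompress_chunks`` is a hypothetical helper written for this example, not
part of ``hyper``. It also assumes the body uses the zlib-wrapped format.

```python
import zlib


def decompress_chunks(chunks):
    """Incrementally decompress an iterable of zlib-compressed byte chunks."""
    decompressor = zlib.decompressobj()
    for chunk in chunks:
        data = decompressor.decompress(chunk)
        if data:  # a chunk may produce no output until more input arrives
            yield data
    tail = decompressor.flush()
    if tail:
        yield tail


# Stand-in for response.read_chunked(decode_content=False): one compressed
# body split into arbitrary pieces, including a zero-length chunk.
payload = b"hello, hyper! " * 100
compressed = zlib.compress(payload)
fake_chunks = [compressed[:10], b"", compressed[10:]]

body = b"".join(decompress_chunks(fake_chunks))
```

Note that the pieces yielded by the helper do not line up one-to-one with the
input chunks, which mirrors the caveat above about decompressed iteration.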

Multithreading
--------------

Currently, ``hyper``'s :class:`HTTP20Connection <hyper.HTTP20Connection>` class
is **not** thread-safe. Thread-safety is planned for ``hyper``'s core objects,
but in this early alpha it is not a high priority.
Currently, ``hyper``'s :class:`HTTP20Connection <hyper.HTTP20Connection>` and
:class:`HTTP11Connection <hyper.HTTP11Connection>` classes are **not**
thread-safe. Thread-safety is planned for ``hyper``'s core objects, but in this
early alpha it is not a high priority.

To use ``hyper`` in a multithreaded context the recommended thing to do is to
place each connection in its own thread. Each thread should then have a request
@@ -130,7 +159,7 @@ In order to receive pushed resources, the
with ``enable_push=True``.

You may retrieve the push promises that the server has sent *so far* by calling
:meth:`getpushes() <hyper.HTTP20Connection.getpushes>`, which returns a
:meth:`get_pushes() <hyper.HTTP20Connection.get_pushes>`, which returns a
generator that yields :class:`HTTP20Push <hyper.HTTP20Push>` objects. Note that
this method is not idempotent; promises returned in one call will not be
returned in subsequent calls. If ``capture_all=False`` is passed (the default),
@@ -143,11 +172,11 @@ the original response, or when also processing the original response in a
separate thread (N.B. do not do this; ``hyper`` is not yet thread-safe)::

    conn.request('GET', '/')
    response = conn.getheaders()
    for push in conn.getpushes(): # all pushes promised before response headers
    response = conn.get_response()
    for push in conn.get_pushes(): # all pushes promised before response headers
        print(push.path)
    conn.read()
    for push in conn.getpushes(): # all other pushes
    for push in conn.get_pushes(): # all other pushes
        print(push.path)

To cancel an in-progress pushed stream (for example, if the user already has
6 changes: 6 additions & 0 deletions docs/source/api.rst
@@ -19,6 +19,12 @@ Primary HTTP/2 Interface
.. autoclass:: hyper.HTTP20Push
   :inherited-members:

.. autoclass:: hyper.HTTP11Connection
   :inherited-members:

.. autoclass:: hyper.HTTP11Response
   :inherited-members:

Headers
-------

3 changes: 2 additions & 1 deletion docs/source/index.rst
@@ -26,7 +26,8 @@ Simple. ``hyper`` is written in 100% pure Python, which means no C extensions.
For recent versions of Python (3.4 and onward, and 2.7.9 and onward) it's
entirely self-contained with no external dependencies.

``hyper`` supports Python 3.4 and Python 2.7.9.
``hyper`` supports Python 3.4 and Python 2.7.9, and can speak HTTP/2 and
HTTP/1.1.

Caveat Emptor!
--------------
58 changes: 47 additions & 11 deletions docs/source/quickstart.rst
@@ -3,8 +3,10 @@
Quickstart Guide
================

First, congratulations on picking ``hyper`` for your HTTP/2 needs. ``hyper``
is the premier (and, as far as we're aware, the only) Python HTTP/2 library.
First, congratulations on picking ``hyper`` for your HTTP needs. ``hyper``
is the premier (and, as far as we're aware, the only) Python HTTP/2 library,
as well as a very serviceable HTTP/1.1 library.

In this section, we'll walk you through using ``hyper``.

Installing hyper
@@ -46,8 +48,8 @@ instructions from the `cryptography`_ project, replacing references to

.. _cryptography: https://cryptography.io/en/latest/installation/#installation

Making Your First Request
-------------------------
Making Your First HTTP/2 Request
--------------------------------

With ``hyper`` installed, you can start making HTTP/2 requests. At this
stage, ``hyper`` can only be used with services that *definitely* support
@@ -61,7 +63,7 @@ Begin by getting the homepage::
    >>> c = HTTP20Connection('http2bin.org')
    >>> c.request('GET', '/')
    1
    >>> resp = c.getresponse()
    >>> resp = c.get_response()

Used in this way, ``hyper`` behaves exactly like ``http.client``. You can make
sequential requests using the exact same API you're accustomed to. The only
@@ -72,13 +74,12 @@ HTTP/2 *stream identifier*. If you're planning to use ``hyper`` in this very
simple way, you can choose to ignore it, but it's potentially useful. We'll
come back to it.

Once you've got the data, things continue to behave exactly like
``http.client``::
Once you've got the data, things diverge a little bit::

    >>> resp.getheader('content-type')
    'text/html; charset=utf-8'
    >>> resp.getheaders()
    [('server', 'h2o/1.0.2-alpha1')...
    >>> resp.headers['content-type']
    [b'text/html; charset=utf-8']
    >>> resp.headers
    HTTPHeaderMap([(b'server', b'h2o/1.0.2-alpha1')...
    >>> resp.status
    200
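
To make the new lookup behaviour concrete, here is a minimal sketch of a
multi-valued header container. ``TinyHeaderMap`` is a hypothetical stand-in
written for this example, not hyper's ``HTTPHeaderMap``: it assumes bytestring
keys and case-insensitive matching, and only mimics the property shown above
that a lookup returns a list holding every value sent for that header.

```python
class TinyHeaderMap(object):
    """Hypothetical, simplified multi-valued header container."""

    def __init__(self, items=()):
        # Keep every (name, value) pair in order; lowercase names so that
        # lookups are case-insensitive, as HTTP header names are.
        self._items = [(name.lower(), value) for name, value in items]

    def __getitem__(self, name):
        name = name.lower()
        values = [v for k, v in self._items if k == name]
        if not values:
            raise KeyError(name)
        return values

    def __iter__(self):
        return iter(self._items)


headers = TinyHeaderMap([
    (b'Content-Type', b'text/html; charset=utf-8'),
    (b'set-cookie', b'a=1'),
    (b'Set-Cookie', b'b=2'),
])

# A lookup yields a list of bytestrings, so take an element and decode it
# when a text value is needed.
content_type = headers[b'content-type'][0].decode('utf-8')
```

Code that previously called ``getheader()`` and expected a single unicode
string would, under this model, index into the list and decode explicitly.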

@@ -111,6 +112,41 @@ For example::

``hyper`` will ensure that each response is matched to the correct request.

Making Your First HTTP/1.1 Request
-----------------------------------

You can also use ``hyper`` to make HTTP/1.1 requests, and the code is very
similar to the HTTP/2 case. For these examples, we'll use Twitter. To get the
Twitter homepage::

    >>> from hyper import HTTP11Connection
    >>> c = HTTP11Connection('twitter.com:443')
    >>> c.request('GET', '/')
    >>> resp = c.get_response()

The key difference between HTTP/1.1 and HTTP/2 is that when you make HTTP/1.1
requests you do not get a stream ID. This is, of course, because HTTP/1.1 does
not have streams.

Things behave exactly like they do in the HTTP/2 case, right down to the data
reading::

    >>> resp.headers['content-encoding']
    [b'deflate']
    >>> resp.headers
    HTTPHeaderMap([(b'x-xss-protection', b'1; mode=block')...
    >>> resp.status
    200
    >>> body = resp.read()
    b'<!DOCTYPE html>\n<!--[if IE 8]><html clas ....

That's all it takes.

Requests Integration
--------------------

10 changes: 9 additions & 1 deletion hyper/__init__.py
@@ -10,13 +10,21 @@

from .http20.connection import HTTP20Connection
from .http20.response import HTTP20Response, HTTP20Push
from .http11.connection import HTTP11Connection
from .http11.response import HTTP11Response

# Throw import errors on Python <2.7 and 3.0-3.2.
import sys as _sys
if _sys.version_info < (2,7) or (3,0) <= _sys.version_info < (3,3):
    raise ImportError("hyper only supports Python 2.7 and Python 3.3 or higher.")

__all__ = [HTTP20Response, HTTP20Push, HTTP20Connection]
__all__ = [
    HTTP20Response,
    HTTP20Push,
    HTTP20Connection,
    HTTP11Connection,
    HTTP11Response,
]

# Set default logging handler.
import logging
2 changes: 1 addition & 1 deletion hyper/cli.py
@@ -216,7 +216,7 @@ def get_content_type_and_charset(response):
def request(args):
    conn = HTTP20Connection(args.url.host, args.url.port)
    conn.request(args.method, args.url.path, args.body, args.headers)
    response = conn.getresponse()
    response = conn.get_response()
    log.debug('Response Headers:\n%s', pformat(response.getheaders()))
    ctype, charset = get_content_type_and_charset(response)
    data = response.read().decode(charset)
31 changes: 31 additions & 0 deletions hyper/http20/bufsocket.py → hyper/common/bufsocket.py
@@ -76,6 +76,20 @@ def can_read(self):

        return False

    @property
    def buffer(self):
        """
        Get access to the buffer itself.
        """
        return self._buffer_view[self._index:self._buffer_end]

    def advance_buffer(self, count):
        """
        Advances the buffer by the amount of data consumed outside the socket.
        """
        self._index += count
        self._bytes_in_buffer -= count

    def new_buffer(self):
        """
        This method moves all the data in the backing buffer to the start of
@@ -145,6 +159,23 @@ def recv(self, amt):

        return data

    def fill(self):
        """
        Attempts to fill the buffer as much as possible. It will block for at
        most the time required to have *one* ``recv_into`` call return.
        """
        if not self._remaining_capacity:
            self.new_buffer()

        count = self._sck.recv_into(self._buffer_view[self._buffer_end:])
        if not count:
            raise ConnectionResetError()

        self._bytes_in_buffer += count

        return


    def readline(self):
        """
        Read up to a newline from the network and returns it. The implicit
48 changes: 48 additions & 0 deletions hyper/common/decoder.py
@@ -0,0 +1,48 @@
# -*- coding: utf-8 -*-
"""
hyper/common/decoder
~~~~~~~~~~~~~~~~~~~~

Contains hyper's code for handling compressed bodies.
"""
import zlib


class DeflateDecoder(object):
    """
    This is a decoding object that wraps ``zlib`` and is used for decoding
    deflated content.

    The rationale for the existence of this object is pretty unpleasant.
    The HTTP RFC specifies that 'deflate' is a valid content encoding. However,
    the spec _meant_ the zlib encoding form. Unfortunately, people who didn't
    read the RFC very carefully actually implemented a different form of
    'deflate'. Insanely, ``zlib`` handles them using two wbits values. This is
    such a mess it's hard to adequately articulate.

    This class was lovingly borrowed from the excellent urllib3 library under
    license: see NOTICES. If you ever see @shazow, you should probably buy him
    a drink or something.
    """
    def __init__(self):
        self._first_try = True
        self._data = b''
        self._obj = zlib.decompressobj(zlib.MAX_WBITS)

    def __getattr__(self, name):
        return getattr(self._obj, name)

    def decompress(self, data):
        if not self._first_try:
            return self._obj.decompress(data)

        self._data += data
        try:
            return self._obj.decompress(data)
        except zlib.error:
            self._first_try = False
            self._obj = zlib.decompressobj(-zlib.MAX_WBITS)
            try:
                return self.decompress(self._data)
            finally:
                self._data = None
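
The two forms of 'deflate' that the ``DeflateDecoder`` docstring describes can
be reproduced with the standard library alone. The sketch below is an
illustration, not code from this commit: ``inflate`` is a hypothetical one-shot
helper that applies the same "try zlib-wrapped, then fall back to raw" idea,
without the incremental buffering the class performs.

```python
import zlib

payload = b"example body " * 20

# The form the RFC meant: deflate data wrapped in a zlib header and checksum.
zlib_form = zlib.compress(payload)

# The form some servers send instead: a raw deflate stream with no wrapper,
# selected in zlib by passing a negative wbits value.
raw_compressor = zlib.compressobj(
    zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS
)
raw_form = raw_compressor.compress(payload) + raw_compressor.flush()


def inflate(data):
    """Try the zlib-wrapped form first, then fall back to raw deflate."""
    try:
        return zlib.decompressobj(zlib.MAX_WBITS).decompress(data)
    except zlib.error:
        return zlib.decompressobj(-zlib.MAX_WBITS).decompress(data)


from_zlib = inflate(zlib_form)
from_raw = inflate(raw_form)
```

Both calls recover the same payload even though the byte streams differ, which
is why the decoder has to guess the format from the first data it sees.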