gh-135056: Add a --header CLI argument to http.server#135057
gh-135056: Add a --header CLI argument to http.server#135057picnixz merged 61 commits intopython:mainfrom
Conversation
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Add a --cors command line argument to the stdlib http.server module, which will add an `Access-Control-Allow-Origin: *` header to all responses. As part of this implementation, add a `response_headers` argument to SimpleHTTPRequestHandler and HttpServer, which allows callers to add addition headers to the response.
3f11652 to
0d02fbe
Compare
hugovk
left a comment
There was a problem hiding this comment.
(I'd prefer a general headers option, but will comment on the issue or Discourse topic)
|
|
This fixes the breakage to HttpServer as used by wsgiref.
|
test_wsgiref fixed in a3256fd. This should fix any backwards incompatibility errors erroneously introduced in the first commit. |
|
I think it's worth adding to this |
|
For me, I don't think add
|
|
@Zheaoli Please comment in the discussion thread: https://discuss.python.org/t/any-interest-in-adding-a-cors-option-to-python-m-http-server/92120. |
https://discuss.python.org/t/any-interest-in-adding-a-cors-option-to-python-m-http-server/92120/24 |
picnixz
left a comment
There was a problem hiding this comment.
I'm not very fond of how the HTTP server class is growing more and more with more __init__ parameters, but I don't have a better idea for now. Maybe a generic configuration object but this would be an overkill for this class in particular I think.
|
|
||
| .. versionadded:: 3.14 | ||
|
|
||
| .. option:: --cors |
There was a problem hiding this comment.
As Hugo said, since we're anyway exposing response-headers, I think we should also expose it from the CLI. It could be useful for users in general (e.g., --add-header NAME VALUE with the -H alias).
| @@ -0,0 +1,2 @@ | |||
| Add a ``--cors`` cli option to :program:`python -m http.server`. Contributed by | |||
There was a problem hiding this comment.
Let's also update What's New/3.15.rst
There was a problem hiding this comment.
I used blurb to make this entry in NEWS.d, not knowing when it's appropriate to edit the main 3.15.rst file. I think once we know if we're doing --cors / --header , or both, I can make the appropriate update to What's New/3.15.rst
| def __init__(self, server_address, RequestHandlerClass, | ||
| bind_and_activate=True, *, certfile, keyfile=None, | ||
| password=None, alpn_protocols=None): | ||
| password=None, alpn_protocols=None, response_headers=None): |
There was a problem hiding this comment.
| password=None, alpn_protocols=None, response_headers=None): | |
| password=None, alpn_protocols=None, **http_server_kwargs): |
And pass http_server_kwargs to super()
| args = (request, client_address, self) | ||
| kwargs = {} | ||
| response_headers = getattr(self, 'response_headers', None) | ||
| if response_headers: | ||
| kwargs['response_headers'] = self.response_headers | ||
| self.RequestHandlerClass(*args, **kwargs) |
There was a problem hiding this comment.
| args = (request, client_address, self) | |
| kwargs = {} | |
| response_headers = getattr(self, 'response_headers', None) | |
| if response_headers: | |
| kwargs['response_headers'] = self.response_headers | |
| self.RequestHandlerClass(*args, **kwargs) | |
| kwargs = {} | |
| if hasattr(self, 'response_headers'): | |
| kwargs['response_headers'] = self.response_headers | |
| self.RequestHandlerClass(request, client_address, self, **kwargs) |
There was a problem hiding this comment.
@picnixz I made this requested change in 77b5fff. Note though that now HTTPServer will pass response_headers to the RequestHandler class even if response_headers is None or {}. Most RequestHandler implementation constructor implementations don't take this argument, so in order for this to work I added **kwargs as an argument to BaseRequestHandler.__init__. My earlier implementation was trying to prevent this, to keep any changes to only http/server.py and not need to touch anything in socketserver.py. Perhaps the **kwargs addition is ok, or I'm open to other solutions if we can think of better ones.
| def __init__(self, *args, directory=None, response_headers=None, **kwargs): | ||
| if directory is None: | ||
| directory = os.getcwd() | ||
| self.directory = os.fspath(directory) | ||
| self.response_headers = response_headers or {} |
There was a problem hiding this comment.
| def __init__(self, *args, directory=None, response_headers=None, **kwargs): | |
| if directory is None: | |
| directory = os.getcwd() | |
| self.directory = os.fspath(directory) | |
| self.response_headers = response_headers or {} | |
| def __init__(self, *args, directory=None, response_headers=None, **kwargs): | |
| if directory is None: | |
| directory = os.getcwd() | |
| self.directory = os.fspath(directory) | |
| self.response_headers = response_headers |
You're already checking for is not None later
| ServerClass=ThreadingHTTPServer, | ||
| protocol="HTTP/1.0", port=8000, bind=None, | ||
| tls_cert=None, tls_key=None, tls_password=None): | ||
| tls_cert=None, tls_key=None, tls_password=None, response_headers=None): |
There was a problem hiding this comment.
| tls_cert=None, tls_key=None, tls_password=None, response_headers=None): | |
| tls_cert=None, tls_key=None, tls_password=None, | |
| response_headers=None): |
Let's group the parameters per purpose
| handler_args = (request, client_address, self) | ||
| handler_kwargs = dict(directory=args.directory) | ||
| if self.response_headers: | ||
| handler_kwargs['response_headers'] = self.response_headers | ||
| self.RequestHandlerClass(*handler_args, **handler_kwargs) |
There was a problem hiding this comment.
| handler_args = (request, client_address, self) | |
| handler_kwargs = dict(directory=args.directory) | |
| if self.response_headers: | |
| handler_kwargs['response_headers'] = self.response_headers | |
| self.RequestHandlerClass(*handler_args, **handler_kwargs) | |
| self.RequestHandlerClass(request, client_address, self, | |
| directory=args.directory, | |
| response_headers=self.response_headers) |
| ) | ||
| else: | ||
| self.server = HTTPServer(('localhost', 0), self.request_handler) | ||
| self.server = HTTPServer(('localhost', 0), self.request_handler, |
There was a problem hiding this comment.
You must also modify create_https_server appropriately
| server_kwargs = dict( | ||
| response_headers = {'Access-Control-Allow-Origin': '*'} | ||
| ) |
There was a problem hiding this comment.
| server_kwargs = dict( | |
| response_headers = {'Access-Control-Allow-Origin': '*'} | |
| ) | |
| server_kwargs = { | |
| 'response_headers': {'Access-Control-Allow-Origin': '*'} | |
| } |
| server_kwargs = dict( | ||
| response_headers = {'Access-Control-Allow-Origin': '*'} | ||
| ) | ||
| def test_cors(self): |
There was a problem hiding this comment.
| def test_cors(self): | |
| def test_cors(self): |
3024d3d to
5f89c97
Compare
|
@picnixz I have made all your suggested changes in 77b5fff . I have also implemented a generic |
|
I think we should just have |
|
And what are your thoughts on positional args like HTTPie? https://discuss.python.org/t/any-interest-in-adding-a-cors-option-to-python-m-http-server/92120/24 |
| Support of the ``'If-Modified-Since'`` header. | ||
|
|
||
| .. versionchanged:: next | ||
| Support ``response_headers`` as an instance argument. |
There was a problem hiding this comment.
Isn’t this redundant with the entry already under the constructor heading?
There was a problem hiding this comment.
Perhaps - it seems the constructor documentation is used to make a brief mention of each argument and when it was added, with more detail being filled in later. My latest commits make several changes requested elsewhere for other reasons, but if the current version is still too redundant in multiple places I can make some more edits.
There was a problem hiding this comment.
I don't think we should add this information. There is no notion of an "instance argument": it should rather be an instance attribute, and this should be documented through a .. attribute::, below .. attribute:: extensions_map
| self.RequestHandlerClass(request, client_address, self, | ||
| directory=args.directory) | ||
| directory=args.directory, | ||
| response_headers=self.response_headers) |
There was a problem hiding this comment.
Why not do this the same way the --directory or --protocol options are implemented? Either way should avoid adding internal parameters to unrelated HTTPServer and BaseRequestHandler classes.
You could build the response_headers dictionary before the DualStackServerMixin class, and then pass it by referencing the outer scope like is already done with args.directory:
| response_headers=self.response_headers) | |
| response_headers=response_headers) |
Or set the response_headers attribute on the SimpleHTTPRequestHandler class rather than in its constructor, like is done for protocol_verison in the test function.
There was a problem hiding this comment.
I followed your advice, see b1026d2. I made response_headers an argument to SimpleHTTPRequestHandler only, and send the argument to it in the DualStackServerMixin class.
| The *directory* parameter accepts a :term:`path-like object`. | ||
|
|
||
| .. versionchanged:: next | ||
| The *response_headers* parameter accepts an optional dictionary of |
There was a problem hiding this comment.
In previous versions, this was not a valid parameter at all.
| The *response_headers* parameter accepts an optional dictionary of | |
| Added *response_headers*, which accepts an optional dictionary of |
Also, did you consider accepting a list or iterable of (name, value) pairs instead, like returned by http.client.HTTPResponse.getheaders? That would be better for sending multiple Set-Cookie fields.
There was a problem hiding this comment.
Ah yes, sending multiple headers of the same name would indeed be necessary. I updated to use an iterable of name value pairs instead in 7a793f2
|
|
||
| .. versionchanged:: next | ||
| The *response_headers* parameter accepts an optional dictionary of | ||
| additional HTTP headers to add to each response. |
There was a problem hiding this comment.
Might be worth clarifying how these fields interact with other fields such as Server specified under BaseHTTPRequestHandler.send_response, and Last-Modified under do_GET.
Also clarify which responses the fields are included in, assuming it wasn’t your intention to include them for 404 Not Found, 304 Not Modified, lower-level errors, etc.
There was a problem hiding this comment.
In the latest commits, I've noted that the custom headers are only sent in success cases. What do you mean by interaction though? The custom headers currently get sent after Last-Modified, should I mention the placement of the custom headers and that they appear after Last-Modified?
| if self.response_headers is not None: | ||
| for header, value in self.response_headers.items(): | ||
| self.send_header(header, value) |
There was a problem hiding this comment.
Or is moving this to an extended send_response override an option? That way you would include the fields for all responses.
| if directory is None: | ||
| directory = os.getcwd() | ||
| self.directory = os.fspath(directory) | ||
| self.response_headers = response_headers |
There was a problem hiding this comment.
Clarify as an internal private attribute:
| self.response_headers = response_headers | |
| self._response_headers = response_headers |
Or document SimpleHTTPRequestHandler.response_headers as a public attribute.
There was a problem hiding this comment.
So far I have intended response_headers to be a public attribute. Do you mean add documentation to the docstring of SimpleHTTPRequestHandler or more documentation in Doc/library/http.server.rst?
| else: | ||
| self.server = HTTPServer(('localhost', 0), self.request_handler) | ||
| self.server = HTTPServer(('localhost', 0), self.request_handler, | ||
| **self.server_kwargs) |
There was a problem hiding this comment.
This appears to only be testing the undocumented or internal HTTPServer parameter. It would be good to test the new documented SimpleHTTPRequestHandler parameter instead or as well.
There was a problem hiding this comment.
I have removed server_kwargs in the latest updates, and updated the tests. The only external change now is response_headers as an instance arg to SimpleHTTPRequestHandler
@picnixz The new attributes are on the request handler, shouldn't a new request handler be made on each request, so the instances shouldn't be shared on concurrent requests? I did make a test that makes 8 threads to make 8 concurrent requests, and asserts the response headers aren't duplicated. See commit 280cea3 |
|
In seend_request() there is a while loop so I was worried about that. |
|
Also the new test is leaking some resources causing it to break, I'll fix but will have to look at it tomorrow. |
|
@hugovk It seems there was a transient failure (I think?) of Github actions communicating to Ubuntu PPAs, and many of the test runs just failed in this PR. Can you manually re-run the tests please? |
|
Restarted! Canonical/Ubuntu are having some issues: |
|
Looks like it might still be down, Canonical is under a massive attack. I guess I'll keep an eye on the situation and reach out once things are ready again. |
|
I've merged in |
|
@picnixz OK, all tests are passing now. The new test |
There was a problem hiding this comment.
Thanks for adding the tests. I don't remember if I asked but:
- L797: we are not sending additional headers for
MOVED_PERMANENTLY - L847: neither are we for
NOT_MODIFIED
I don't think it's important that custom headers are sent in the above cases but I wanted to note them down. I think we can merge this and then we'll just fix the issues during the beta.
| if not _is_extra and hasattr(self, '_default_response_headers'): | ||
| self._default_response_headers.append((keyword, value)) |
There was a problem hiding this comment.
Did we consider using not hasattr(...) check as for _headers_buffer? it's mostly for a follow-up/refactoring. If we go that path, we would remove the _default_response_headers = [] in handle() and put it in flush_headers() instead, right? It would re-construct the default_response_headers for each request but this also mean that if a request does (in the future) send_header(..., is_extra=True), that is_extra is "local" to the current request and won't pollute next requests. For now it doesn't matter though but I'll write it here to remember.
There was a problem hiding this comment.
I actually didn't look at the pattern used for _headers_buffer. For consistency and correctness, yes it does seem like the right thing to do it that way. I've made that change in 597b973.
picnixz
left a comment
There was a problem hiding this comment.
One minor comment on the test because threaded stuff can fail badly on buildbots. But we can address it later.
| server_thread = threading.Thread( | ||
| target=threaded_server.serve_forever, args=(0.05,), daemon=True | ||
| ) |
There was a problem hiding this comment.
Ah, maybe this can hang. I don't know if this test was written as for other tests, but this may hang in some build bots. The server_thread.join call below should be added as a finalizer just in case. And we should have a short timeout before aborting the thread.
There was a problem hiding this comment.
Ok, I added some addCleanup calls in c40f8b1 to help prevent a hang. If you think there's still a risk of the test hanging, I can just remove the test for now until we come up with a better idea.
There was a problem hiding this comment.
I still saw some resource leaking, so I just ended up removing test_extra_response_headers_concurrent_requests in 347b968 . We can add a more robust test later if need be, but I didn't want to leave a flaky test in that risked breaking.
After that checkin, there was still an env changed leakage in a windows build. I refactored a different test test_extra_response_headers_arg in a249eea to not create a server and thus also less likely to leak. After that checkin, all tests are passing.
|
I will run tje build bots to see if tests are hangijg somehoe ans then I will merge. Come back in a few hours to check but do not commit anything or update the branch meanwhile |
|
🤖 New build scheduled with the buildbot fleet by @picnixz for commit a249eea 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F135057%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
|
Ok, so we still have some file leakge but I don't think it's coming from you in particular. We can fix this is in a follow-up. Other failures are unrelated. |
|
@aisipos Thank you for your contribution! |
|
Thank you for the PR! It was not straightforward, so thank you for your patience! We will fix bugs or other incompatibilities during the beta (but for that we need people testing the interface) |
As proposed in #135056, Add a --cors command line argument to the stdlib http.server module, which will add an
Access-Control-Allow-Origin: *header to all responses.Invocation:
As part of this implementation, add a
response_headersargument toSimpleHTTPRequestHandlerandHTTPServer, which allows callers to add addition headers to the response. Ideally it would have been possible to just have made aCorsHttpServerclass, but a couple of issues made that difficult:http.serverCLI uses more than one HTTP Server class, in order to support TLS/HTTPS. So a single CorsHttpServer child class wouldn't work to support both use cases.RequestHandlerclasses. However, theHttpServerclasses didn't have an easy way to pass arguments down into the instantiated handlers.As a result, this PR updates both
HTTPServerandSimpleHTTPRequestHandlerto accept aresponse_headersargument, which allows callers to specify an additional set of HTTP headers to pass in the response.HTTPServernow overridesfinish_requestto pass this new kwarg down to itsRequestHandler.SimpleHTTPRequestHandlernow accepts aresposnse_headerskwarg, to optionally specify a dictionary of additional headers to send in the response.Care is taken to not pass the
response_headersargument to any instance constructors when not provided, to ensure backwards compatibility. I tried to keep the implementation as short and simple as possible.With the addition of a
response_headersargument, we allow ourselves to have a future possible custom header http argument, such as:📚 Documentation preview 📚: https://cpython-previews--135057.org.readthedocs.build/