The port number in the following URL is clearly malformed, but Hyperlink does this:
>>> hyperlink.URL.from_text("http://example.com: -໑_1\v").port
-11
This happens because ports are parsed with int, which leads to the following unintuitive consequences:
- Whitespace, including all of (' ', '\t', '\v', '\r', '\n') (plus a number of Unicode whitespace characters), is stripped from either side of the port number.
- '-' or '+' can appear just before the first digit in the port number.
- '_' can appear between digits in the port number.
- Some Unicode digits, such as '໑', can appear in port numbers.
All of this violates both the RFC and the WHATWG standard.
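
The consequences listed above can be reproduced with plain int, without Hyperlink, and one possible stricter check is sketched after them. Note that parse_port_strict is a hypothetical helper written for this report, not part of Hyperlink's API; treating an empty port as None and rejecting values above 65535 (the WHATWG limit) are assumptions about the desired behavior.

```python
import re

# Each of these is accepted by int(), which is why the port above parses.
print(int("\t443\n"))     # surrounding whitespace is stripped -> 443
print(int("+8_0"))        # leading sign and underscore separator -> 80
print(int(" -໑_1\v"))     # Lao digit one plus all of the above -> -11


def parse_port_strict(text):
    """Hypothetical helper: accept only ASCII digits, as the port grammar expects.

    Assumptions (not taken from Hyperlink): an empty port means "no port"
    and yields None, and ports above 65535 are rejected.
    """
    if text == "":
        return None
    # [0-9] is spelled out on purpose: \d and str.isdigit() also match
    # non-ASCII digits such as '໑'. fullmatch() rejects a trailing newline.
    if re.fullmatch(r"[0-9]+", text) is None:
        raise ValueError("invalid port: %r" % (text,))
    port = int(text)
    if port > 65535:
        raise ValueError("port out of range: %r" % (text,))
    return port
```

With this check, "8080" still parses to 8080, while " 80", "-80", "8_0" and "໑" all raise ValueError instead of silently producing a number.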