v0.10.0

@jwarcbot jwarcbot released this 30 Mar 13:47

New features

  • WarcParser, HttpParser and ChunkedBody now report the context of parse errors, making them much easier to debug. (Sebastian Nagel)
  • HttpParser now has a lenient parsing mode which copes with various deviations from the HTTP standards including:
    • LF as a separator rather than CRLF
    • spaces between field names and the colon separator
    • normally disallowed characters in field values and the request target
    • a varying number of spaces in the request-line and status-line
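
To make two of the deviations listed above concrete, here is a minimal standalone sketch of lenient header parsing. It does not use jwarc and is not how HttpParser is implemented; the class and method names (LenientHeaderSketch, parseFields, parseRequestLine) are hypothetical and exist only for illustration. It accepts a bare LF as a line terminator, tolerates spaces before the colon, and splits the request-line on runs of spaces or tabs.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of jwarc.
public class LenientHeaderSketch {

    /** Parses header fields, accepting CRLF or bare LF and spaces around the colon. */
    public static List<Map.Entry<String, String>> parseFields(String block) {
        List<Map.Entry<String, String>> fields = new ArrayList<>();
        // "\r?\n" matches both the standard CRLF and a bare LF.
        for (String line : block.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // a blank line ends the header block
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                // Include the offending line so the error is easy to trace.
                throw new IllegalArgumentException("missing colon in header line: " + line);
            }
            // trim() drops spaces between the field name and the colon,
            // as well as whitespace around the value.
            String name = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            fields.add(new SimpleEntry<>(name, value));
        }
        return fields;
    }

    /** Splits a request-line into method, target and version on runs of spaces or tabs. */
    public static String[] parseRequestLine(String line) {
        String[] parts = line.trim().split("[ \t]+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 parts in request-line: " + line);
        }
        return parts;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> f : parseFields("Host : example.org\nAccept:  */*\r\n\r\n")) {
            System.out.println(f.getKey() + " = " + f.getValue());
        }
        System.out.println(String.join("|", parseRequestLine("GET   /index.html  HTTP/1.1")));
    }
}
```

A strict parser would reject both inputs in main; the sketch simply normalises them. It does not attempt to handle disallowed characters in the request target, which would need more careful splitting than whitespace alone.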

Bugs fixed

  • The chunked encoding parser now handles a last-chunk with multiple zeroes. (reported by Sebastian Nagel)
  • WarcTargetRecord.target() and targetURI() now trim angle brackets from WARC-Target-URI for compatibility with implementations that followed the WARC 1.0 grammar.
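
The behaviour behind both fixes can be sketched in a few lines of standalone Java. This is an illustration under assumptions, not jwarc's code: the class and method names (BugFixSketch, isLastChunk, trimAngleBrackets) are hypothetical, and the choice to strip brackets only when both are present is an assumption of this sketch.

```java
// Hypothetical illustration only: not part of jwarc.
public class BugFixSketch {

    /**
     * Returns true if a chunk-size line denotes the last chunk, i.e. the
     * hexadecimal size is zero regardless of how many zero digits it has.
     */
    public static boolean isLastChunk(String chunkSizeLine) {
        // Ignore any chunk extension after ';'.
        int semi = chunkSizeLine.indexOf(';');
        String hex = (semi >= 0 ? chunkSizeLine.substring(0, semi) : chunkSizeLine).trim();
        if (hex.isEmpty()) {
            throw new IllegalArgumentException("empty chunk size");
        }
        // Comparing the numeric value treats "0", "00" and "0000" alike.
        return Long.parseLong(hex, 16) == 0;
    }

    /** Removes one enclosing pair of angle brackets, as written by WARC 1.0 style tools. */
    public static String trimAngleBrackets(String uri) {
        // Assumption: only strip when both the opening and closing bracket are present.
        if (uri.length() >= 2 && uri.startsWith("<") && uri.endsWith(">")) {
            return uri.substring(1, uri.length() - 1);
        }
        return uri;
    }

    public static void main(String[] args) {
        System.out.println(isLastChunk("0000"));
        System.out.println(isLastChunk("1a"));
        System.out.println(trimAngleBrackets("<http://example.org/>"));
    }
}
```

Comparing the parsed value rather than the literal string "0" is what lets a last-chunk written as "000" terminate the body.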