
v0.19.0: Release 0.19.0

@ato released this 12 Sep 06:26
· 100 commits to master since this release

New features

  • jwarc will now attempt to leniently parse HTTP messages that declare Transfer-Encoding: chunked but whose body does not begin with a valid chunk header, by assuming the body is not actually chunked. This improves compatibility with tools like Browsertrix that strip the chunked encoding but leave the HTTP header in place.

  • ExtractTool will now extract multiple records when given multiple offsets.

  • CdxTool gained support for the 'N' (normalized SURT) field.

  • CdxTool gained partial support for pywb's method of encoding request bodies in CDX records. This is still a work in progress and not yet fully compatible with pywb in all cases.
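
The lenient chunked parsing described in the first item can be illustrated with a small sketch. This is not jwarc's actual implementation: the class name ChunkSniffer, the method looksChunked and the 64-byte scan limit are all hypothetical, chosen only to show the idea of checking whether a body starts with something shaped like a chunk header (hex digits, an optional ";extension", then CRLF) before deciding whether to decode it as chunked.

```java
/** Hypothetical helper for illustration only; not part of jwarc's API. */
final class ChunkSniffer {
    /** Arbitrary (assumed) limit on how far to scan for the end of the first chunk-size line. */
    private static final int MAX_HEADER_SCAN = 64;

    /**
     * Returns true if the body begins with something shaped like a chunk header:
     * one or more hex digits, an optional ";extension", then CRLF.
     */
    static boolean looksChunked(byte[] body) {
        int i = 0;
        // chunk-size: at least one hex digit is required
        while (i < body.length && i < MAX_HEADER_SCAN && isHexDigit(body[i])) {
            i++;
        }
        if (i == 0) {
            return false;
        }
        // optional chunk extension: skip ahead to the CR that should end the line
        if (i < body.length && body[i] == ';') {
            while (i < body.length && i < MAX_HEADER_SCAN && body[i] != '\r') {
                i++;
            }
        }
        // the chunk-size line must end with CRLF
        return i + 1 < body.length && body[i] == '\r' && body[i + 1] == '\n';
    }

    private static boolean isHexDigit(byte b) {
        return (b >= '0' && b <= '9') || (b >= 'a' && b <= 'f') || (b >= 'A' && b <= 'F');
    }
}
```

A caller could fall back to treating the body as plain (unchunked) bytes whenever looksChunked returns false. Note that this is only a heuristic: an unchunked body that happens to start with hex digits followed by CRLF would still be mistaken for chunked data.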