v0.19.0: Release 0.19.0
New features
- jwarc now attempts to leniently parse HTTP messages that declare Transfer-Encoding: chunked but whose body does not begin with a valid chunk header, by assuming the body is not actually chunked. This improves compatibility with tools like Browsertrix that strip the chunked encoding but leave the HTTP header in place.
- ExtractTool now extracts multiple records when given multiple offsets.
- CdxTool gained support for the 'N' (normalized SURT) field.
- CdxTool gained partial support for pywb's method of encoding request bodies in CDX records. This is still a work in progress and is not yet fully compatible with pywb in all cases.
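
To illustrate the lenient chunked-encoding handling described in the first item, here is a minimal sketch of how a parser might decide whether a body really starts with a chunk header. It is not jwarc's actual implementation: the class name ChunkSniffer and the method startsWithChunkHeader are hypothetical, and the check is simplified (one or more hex digits, an optional chunk extension, then CRLF).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of jwarc's API.
final class ChunkSniffer {

    // True if b is an ASCII hexadecimal digit.
    private static boolean isHexDigit(byte b) {
        return (b >= '0' && b <= '9')
                || (b >= 'a' && b <= 'f')
                || (b >= 'A' && b <= 'F');
    }

    // Returns true if the body begins with something shaped like a chunk
    // header: 1+ hex digits, an optional ";extension", then CRLF.
    // A caller could fall back to treating the body as plain (unchunked)
    // bytes when this returns false.
    public static boolean startsWithChunkHeader(byte[] body) {
        int i = 0;
        while (i < body.length && isHexDigit(body[i])) {
            i++;
        }
        if (i == 0) {
            return false; // no chunk size at all
        }
        if (i < body.length && body[i] == ';') {
            // Skip the chunk extension up to the carriage return.
            while (i < body.length && body[i] != '\r') {
                i++;
            }
        }
        return i + 1 < body.length && body[i] == '\r' && body[i + 1] == '\n';
    }

    public static void main(String[] args) {
        byte[] chunked = "5\r\nhello\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] stripped = "<html>hello</html>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(startsWithChunkHeader(chunked));
        System.out.println(startsWithChunkHeader(stripped));
    }
}
```

This check is only a heuristic: an unchunked body that happens to begin with hex digits followed by CRLF would still be mistaken for chunked data, so a real parser needs additional safeguards.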