Releases: iipc/jwarc

v0.16.2: Release 0.16.2

08 Sep 01:47

Bugs fixed

  • Fixed calculation of position for uncompressed (W)ARCs when the trailer is not 4 bytes
  • ARC parser now accepts any character except CTLs and spaces in the URL field

v0.16.1: Release 0.16.1

02 Sep 23:19

Bugs fixed

  • The ARC parser now tries to recover if the trailer is missing
  • The ARC parser now copes with the MIME field being missing or a single token
  • Lenient HTTP parser now ignores multiple CRs at the end of header lines
  • CDXTool now skips the record and continues if it fails to parse the HTTP message
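
The lenient handling of repeated CRs at the end of header lines can be pictured with a small sketch. This is not jwarc's parser code: the class and method names below are hypothetical and only illustrate the idea of dropping one LF and then any run of CRs before a header line is interpreted.

```java
// Hypothetical helper, not part of jwarc: shows lenient trimming of a header line ending.
final class LenientHeaderLines {
    /** Removes one trailing LF (if present), then any number of trailing CRs. */
    static String stripLineEnding(String line) {
        int end = line.length();
        if (end > 0 && line.charAt(end - 1) == '\n') {
            end--; // drop the line feed
        }
        while (end > 0 && line.charAt(end - 1) == '\r') {
            end--; // tolerate "\r\r\n" and longer runs of CRs
        }
        return line.substring(0, end);
    }
}
```

With this approach `"Host: example.org\r\r\n"` and `"Host: example.org\r\n"` yield the same header text.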

v0.16.0: Release 0.16.0

02 Sep 11:31

New features

  • WarcReader will now emit a warning and attempt to recover when encountering a record with a missing or truncated trailer
  • Added WarcReader.onWarning(handler) which can be used to report recoverable errors

Bugs fixed

  • The ARC parser now handles the special value "no-type" in the MIME field
  • The ARC parser now accepts URLs containing "[" or "]"

v0.15.0: Release 0.15.0

31 Aug 03:47
@ato ato

New features:

  • Added a validate tool which checks for parse errors and validates digests and other headers #60 (Sebastian Nagel)
  • WarcReader gained a calculateBlockDigest() mode which populates a corresponding WarcRecord.calculatedBlockDigest() #60 (Sebastian Nagel)
  • WarcDigest: SHA-2 support, base64 and encoding auto-detection #59 (Sebastian Nagel)
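
One way to picture digest encoding auto-detection is to compare the length of the encoded value with the digest size of the algorithm. The sketch below is an assumption about how such detection could work, not jwarc's actual logic; `DigestEncodingGuess` and its `guess` method are hypothetical names.

```java
// Hypothetical sketch, not jwarc's implementation: guesses how a digest value is encoded
// from its length (ignoring '=' padding) and its alphabet.
final class DigestEncodingGuess {
    enum Encoding { BASE16, BASE32, BASE64, UNKNOWN }

    /**
     * @param value       the encoded digest, e.g. the part after "sha1:" in a WARC digest header
     * @param digestBytes raw digest size in bytes (20 for SHA-1, 32 for SHA-256)
     */
    static Encoding guess(String value, int digestBytes) {
        String unpadded = value.replaceAll("=+$", "");
        int n = unpadded.length();
        int bits = digestBytes * 8;
        if (n == digestBytes * 2 && unpadded.matches("[0-9a-fA-F]+")) {
            return Encoding.BASE16; // 4 bits per character
        }
        if (n == (bits + 4) / 5 && unpadded.matches("[A-Za-z2-7]+")) {
            return Encoding.BASE32; // 5 bits per character, rounded up
        }
        if (n == (bits + 5) / 6 && unpadded.matches("[A-Za-z0-9+/_-]+")) {
            return Encoding.BASE64; // 6 bits per character, rounded up
        }
        return Encoding.UNKNOWN;
    }
}
```

For a 20-byte SHA-1 digest the three encodings have 40, 32 and 27 unpadded characters respectively, so the length alone already separates them; the alphabet check only guards against malformed values.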

Bugs fixed:

  • Setting the record version after calling date() would produce an incorrect WARC-Date precision #58
  • The lenient HTTP parser now accepts requests missing the HTTP version field (improves compatibility with the non-standard records produced by the ArchiveWeb.page browser extension)

v0.14.0: Release 0.14.0

25 Mar 03:41

New features:

  • Saveback tool for reconstructing WARC records from replay systems

Bugs fixed:

  • Replay proxy failed to start because the sw.js file was not found #57

v0.13.1: Release 0.13.1

28 Jan 00:30

Bugs fixed:

  • GunzipChannel failed on payloads whose uncompressed size exceeds Integer.MAX_VALUE #54 (Sebastian Nagel)

v0.13.0: Release 0.13.0

28 Jan 00:30

New features

  • New tool to extract a WARC record, headers or payload #41 #47 (Sebastian Nagel)
  • Improved logging of MediaType parse errors #43 (Sebastian Nagel)

Bugs fixed

  • Lenient HTTP parser now accepts header names that are empty or contain invalid characters #51
  • GzipChannel.write() now returns the number of consumed bytes instead of compressed bytes written #46 (Sebastian Nagel)
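
The GzipChannel.write() fix follows the general `WritableByteChannel` contract, where the return value counts bytes taken from the source buffer rather than bytes emitted downstream. The sketch below illustrates that contract with the standard `GZIPOutputStream`; it is not jwarc's GzipChannel, and `ConsumedCountGzipChannel` is a hypothetical name.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.WritableByteChannel;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch, not jwarc's GzipChannel: a compressing channel whose write()
// reports how many input bytes were consumed, not how many compressed bytes came out.
final class ConsumedCountGzipChannel implements WritableByteChannel {
    private final GZIPOutputStream gzip;
    private boolean open = true;

    ConsumedCountGzipChannel(OutputStream out) throws IOException {
        this.gzip = new GZIPOutputStream(out);
    }

    @Override
    public int write(ByteBuffer src) throws IOException {
        if (!open) throw new ClosedChannelException();
        int consumed = src.remaining();
        byte[] chunk = new byte[consumed];
        src.get(chunk);      // advances the buffer position by 'consumed'
        gzip.write(chunk);   // compressed output may be far smaller, or buffered for later
        return consumed;     // callers looping "while (buf.hasRemaining())" rely on this
    }

    @Override
    public boolean isOpen() {
        return open;
    }

    @Override
    public void close() throws IOException {
        open = false;
        gzip.close(); // finishes the gzip member and writes the trailer
    }
}
```

Returning the compressed byte count instead would mislead callers that compare the return value with the number of bytes they handed in.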

v0.12.0: Release 0.12.0

28 May 08:15

New features

  • Added contains(name, value) to MessageHeaders to look for values in comma-separated list headers
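
To illustrate what looking up a value in a comma-separated list header involves, here is a minimal sketch over a plain `Map`. It is not the MessageHeaders implementation; the `CommaListHeaders` class is hypothetical, and the case-insensitive comparison of both name and value is an assumption made for the example.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not jwarc's MessageHeaders: checks whether any occurrence of a
// header contains the given token in its comma-separated value list.
final class CommaListHeaders {
    static boolean contains(Map<String, List<String>> headers, String name, String value) {
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            if (!entry.getKey().equalsIgnoreCase(name)) {
                continue; // header names are compared case-insensitively
            }
            for (String line : entry.getValue()) {
                for (String token : line.split(",")) {
                    // Assumption: tokens are matched case-insensitively after trimming spaces
                    if (token.trim().equalsIgnoreCase(value)) {
                        return true;
                    }
                }
            }
        }
        return false;
    }
}
```

For example, with `Connection: keep-alive, Upgrade` a lookup for `upgrade` succeeds, whereas a plain string comparison against the whole header value would not.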

Bugs fixed

  • Eliminated IllegalArgumentException on duplicate media type parameters #40
  • Eliminated usages of sole() when accessing HTTP headers to reduce unnecessary exceptions
  • Allow optional space after chunk-size in chunked transfer-encoding #33 (Sebastian Nagel)
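
The chunk-size leniency can be sketched as follows. This is an illustration under stated assumptions, not jwarc's chunked decoder: `ChunkSizeLine` is a hypothetical class, the input is the chunk-size line with its CRLF already removed, and overflow of very long hex values is not handled.

```java
// Hypothetical sketch, not jwarc's decoder: parses the chunk-size line of a chunked
// transfer-encoded body, tolerating spaces or tabs between the size and any extension.
final class ChunkSizeLine {
    static long parse(String line) {
        int i = 0;
        long size = 0;
        while (i < line.length() && hexValue(line.charAt(i)) >= 0) {
            size = size * 16 + hexValue(line.charAt(i));
            i++;
        }
        if (i == 0) {
            throw new IllegalArgumentException("missing chunk-size: " + line);
        }
        // Lenient step: skip optional whitespace after the hex digits
        while (i < line.length() && (line.charAt(i) == ' ' || line.charAt(i) == '\t')) {
            i++;
        }
        // Anything left must be a chunk extension, which starts with ';'
        if (i < line.length() && line.charAt(i) != ';') {
            throw new IllegalArgumentException("unexpected character after chunk-size: " + line);
        }
        return size;
    }

    /** Returns the value of an ASCII hex digit, or -1 if the character is not one. */
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

Here `"1a"`, `"1A "` and `"1a ;name=value"` all parse to 26, while a line that does not start with a hex digit is rejected.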

v0.11.0: Release 0.11.0

30 Apr 13:38

New features

  • Added http() and UUID refersTo() setter variants to WarcRevisit

Bugs fixed

  • Corrected WarcRevisit constructor to include targetURI. The original constructor is deprecated and will be removed in a future major release.

v0.10.3

30 Apr 00:37

Bugs fixed

  • Payload body had size 0 when the HTTP Content-Length header was missing on SeekableByteStreams (an extra case of #36)