Releases: Hk669/bpetokenizer

v1.2.1

06 Jun 18:34
de22772

What's Changed

  • feat: starttime-endtime added with the throughput on verbose by @Hk669 in #10
  • Updates for the pretrained tokenizers. by @Hk669 in #11
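A minimal sketch of the new verbose timing, assuming a train(text, vocab_size, verbose) call shape suggested by these notes; the corpus string and vocab_size are placeholders, not verified API.

    # Sketch (assumed API): with verbose=True the trainer is now expected
    # to print the start time, end time, and throughput of the merge loop.
    from bpetokenizer import BPETokenizer

    tokenizer = BPETokenizer()
    corpus = "a small placeholder corpus repeated enough to learn merges " * 50
    tokenizer.train(corpus, vocab_size=300, verbose=True)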

Full Changelog: v1.2.0...v1.2.1

v1.2.0

05 Jun 15:15
c7513f3

What's Changed

  • deprecated the version check when loading by @Hk669 in #8

Full Changelog: v1.0.4...v1.2.0

v1.0.4

05 Jun 15:01
e5d5e43

What's Changed

  • feat: from_pretrained enabled with wi17k_base by @Hk669 in #6
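A hedged sketch of loading the wi17k_base tokenizer named in the PR above; the encode/decode round trip is an assumption about the API, only the from_pretrained("wi17k_base") entry point comes from these notes.

    # Sketch: load the bundled pretrained tokenizer enabled in this release.
    from bpetokenizer import BPETokenizer

    tokenizer = BPETokenizer.from_pretrained("wi17k_base")
    ids = tokenizer.encode("hello world")   # assumed round-trip methods
    print(ids)
    print(tokenizer.decode(ids))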

Full Changelog: v1.0.32...v1.0.4

v1.0.32

29 May 15:00

Full Changelog: v1.0.31...v1.0.32

  • Added a min_frequency hyperparameter to control which pairs are merged, avoiding extra vocab entries.
  • The default is set to 2.
  • Made some changes to the tests.
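A sketch of the hyperparameter described above; the keyword name min_frequency comes from these notes, while the rest of the call is assumed.

    # Sketch: pairs seen fewer than min_frequency times are assumed to be
    # skipped during merging, keeping rare pairs out of the vocab.
    from bpetokenizer import BPETokenizer

    tokenizer = BPETokenizer()
    text = "low low lower lowest newer newest " * 20
    tokenizer.train(text, vocab_size=280, min_frequency=2)  # 2 is the stated default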

v1.0.31

29 May 08:42

Full Changelog: v1.0.3...v1.0.31

  • Added a token-visibility feature that lets developers view how the tokens are split, as well as the text chunks produced by the split pattern.
  • Added more samples.
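These notes don't name the entry point for the visibility feature; the sketch below guesses it rides on a verbose flag during encoding, so treat the whole call as an assumption.

    # Hypothetical sketch: verbose output is assumed to show the text
    # chunks produced by the split pattern and the tokens of each chunk.
    from bpetokenizer import BPETokenizer

    tokenizer = BPETokenizer()
    tokenizer.train("some sample training text " * 20, vocab_size=270)
    ids = tokenizer.encode("tokenization example", verbose=True)  # assumed flag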

v1.0.3

28 May 08:28

Added a mode parameter to the save and load methods so developers can save and load the tokenizer's vocab and merges in their desired format.
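A sketch of the mode parameter, assuming "json" is one accepted value; the file name produced and the load call shape are assumptions.

    # Sketch: mode is assumed to pick the serialization format; "json"
    # and the resulting my_tokenizer.json file name are assumptions.
    from bpetokenizer import BPETokenizer

    tokenizer = BPETokenizer()
    tokenizer.train("sample text to build a tiny vocab " * 20, vocab_size=270)
    tokenizer.save("my_tokenizer", mode="json")

    restored = BPETokenizer()
    restored.load("my_tokenizer.json", mode="json")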

Full Changelog: v1.0.21...v1.0.3

v1.0.2

27 May 20:43

Build working correctly, ensuring the upload to PyPI works.

v1.0.10

27 May 17:44

Testing the automatic PyPI package upload.

v1.0.1

27 May 17:28

First release.

Adds the following functionality:

  • BPETokenizer: can be used to build your own tokenizer for an LLM.
  • Tokenizer: a base class that handles saving and loading the tokenizer's vocab and merges.
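A minimal end-to-end sketch of the two classes described above, assuming the usual train/encode/decode surface; the corpus and vocab_size are placeholders.

    # Sketch: build a tokenizer with BPETokenizer (which extends the
    # Tokenizer base class that persists the vocab and merges).
    from bpetokenizer import BPETokenizer

    tokenizer = BPETokenizer()
    tokenizer.train("a small corpus to learn byte-pair merges from " * 30,
                    vocab_size=300)

    ids = tokenizer.encode("byte-pair")
    print(tokenizer.decode(ids))  # expected to round-trip to "byte-pair"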