Merge branch 'main' into stable
scott-griffiths committed May 25, 2024
2 parents b026f7e + 2cf71ac commit cb8faea
Showing 28 changed files with 1,268 additions and 1,184 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -140,7 +140,7 @@ jobs:
- name: Run pytest
run: |
python -m pytest tests/
python -m pytest tests/ --benchmark-disable
all:
name: All successful
2 changes: 1 addition & 1 deletion MANIFEST.in
@@ -1,7 +1,7 @@
include tests/test.m1v
include tests/smalltestfile
include tests/__init__.py
include release_notes.txt
include release_notes.md
include README.md
include bitstring/py.typed
prune doc
120 changes: 53 additions & 67 deletions README.md
@@ -16,26 +16,13 @@ It has been actively maintained since 2006.
[![Pepy Total Downlods](https://img.shields.io/pepy/dt/bitstring?logo=python&logoColor=white&labelColor=blue&color=blue)](https://www.pepy.tech/projects/bitstring)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/bitstring?label=%40&labelColor=blue&color=blue)](https://pypistats.org/packages/bitstring)


News
----
**May 2024**: bitstring 4.2.2 released.

New in version 4.2:

* Dropped support for Python 3.7. Minimum version is now 3.8.
* A new `Dtype` class can be optionally used to specify types.
* The `bitstring.options` object is now the preferred method for changing module options.
* New `fromstring` method as another way to create bitstrings from formatted strings.
* More types can now be pretty printed.
* A range of 8-bit, 6-bit and even 4-bit floating point formats added (beta):
* Performance improvements.

See the [release notes](https://github.com/scott-griffiths/bitstring/blob/main/release_notes.txt) for details. Please let me know if you encounter any problems.
> [!NOTE]
> To see what been added, improved or fixed, and also to see what's coming in the next version, see the [release notes](https://github.com/scott-griffiths/bitstring/blob/main/release_notes.md).

Overview
--------
# Overview

* Efficiently store and manipulate binary data in idiomatic Python.
* Create bitstrings from hex, octal, binary, files, formatted strings, bytes, integers and floats of different endiannesses.
@@ -46,8 +33,8 @@ Overview
* Rich API - chances are that whatever you want to do there's a simple and elegant way of doing it.
* Open source software, released under the MIT licence.

Documentation
-------------
# Documentation

Extensive documentation for the bitstring module is available.
Some starting points are given below:

@@ -57,67 +44,66 @@ Some starting points are given below:

There is also an introductory walkthrough notebook on [binder](https://mybinder.org/v2/gh/scott-griffiths/bitstring/main?labpath=doc%2Fwalkthrough.ipynb).

Release Notes
-------------

To see what been added, improved or fixed, and also to see what's coming in the next version, see the [release notes](https://github.com/scott-griffiths/bitstring/blob/main/release_notes.txt).

Examples
--------
# Examples

### Installation

$ pip install bitstring
```
$ pip install bitstring
```

### Creation

>>> from bitstring import Bits, BitArray, BitStream, pack
>>> a = BitArray(bin='00101')
>>> b = Bits(a_file_object)
>>> c = BitArray('0xff, 0b101, 0o65, uint6=22')
>>> d = pack('intle16, hex=a, 0b1', 100, a='0x34f')
>>> e = pack('<16h', *range(16))
```pycon
>>> from bitstring import Bits, BitArray, BitStream, pack
>>> a = BitArray(bin='00101')
>>> b = Bits(a_file_object)
>>> c = BitArray('0xff, 0b101, 0o65, uint6=22')
>>> d = pack('intle16, hex=a, 0b1', 100, a='0x34f')
>>> e = pack('<16h', *range(16))
```

### Different interpretations, slicing and concatenation

>>> a = BitArray('0x3348')
>>> a.hex, a.bin, a.uint, a.float, a.bytes
('3348', '0011001101001000', 13128, 0.2275390625, b'3H')
>>> a[10:3:-1].bin
'0101100'
>>> '0b100' + 3*a
BitArray('0x866906690669, 0b000')
```pycon
>>> a = BitArray('0x3348')
>>> a.hex, a.bin, a.uint, a.float, a.bytes
('3348', '0011001101001000', 13128, 0.2275390625, b'3H')
>>> a[10:3:-1].bin
'0101100'
>>> '0b100' + 3*a
BitArray('0x866906690669, 0b000')
```

### Reading data sequentially

>>> b = BitStream('0x160120f')
>>> b.read(12).hex
'160'
>>> b.pos = 0
>>> b.read('uint12')
352
>>> b.readlist('uint12, bin3')
[288, '111']
```pycon
>>> b = BitStream('0x160120f')
>>> b.read(12).hex
'160'
>>> b.pos = 0
>>> b.read('uint12')
352
>>> b.readlist('uint12, bin3')
[288, '111']
```

### Searching, inserting and deleting

>>> c = BitArray('0b00010010010010001111') # c.hex == '0x1248f'
>>> c.find('0x48')
(8,)
>>> c.replace('0b001', '0xabc')
>>> c.insert('0b0000', pos=3)
>>> del c[12:16]
```pycon
>>> c = BitArray('0b00010010010010001111') # c.hex == '0x1248f'
>>> c.find('0x48')
(8,)
>>> c.replace('0b001', '0xabc')
>>> c.insert('0b0000', pos=3)
>>> del c[12:16]
```

### Arrays of fixed-length formats

>>> from bitstring import Array
>>> a = Array('uint7', [9, 100, 3, 1])
>>> a.data
BitArray('0x1390181')
>>> a[::2] *= 5
>>> a
Array('uint7', [45, 100, 15, 1])

```pycon
>>> from bitstring import Array
>>> a = Array('uint7', [9, 100, 3, 1])
>>> a.data
BitArray('0x1390181')
>>> a[::2] *= 5
>>> a
Array('uint7', [45, 100, 15, 1])
```


<sub>Copyright (c) 2006 - 2024 Scott Griffiths</sub>
47 changes: 33 additions & 14 deletions bitstring/__init__.py
@@ -55,7 +55,7 @@
THE SOFTWARE.
"""

__version__ = "4.2.2"
__version__ = "4.2.3"

__author__ = "Scott Griffiths"

@@ -111,85 +111,104 @@ def lsb0(self, value: bool) -> None:

sys.modules[__name__].__class__ = _MyModuleType

"""These methods convert a bit length to the number of characters needed to print it for different interpretations."""

# These methods convert a bit length to the number of characters needed to print it for different interpretations.
def hex_bits2chars(bitlength: int):
# One character for every 4 bits
return bitlength // 4


def oct_bits2chars(bitlength: int):
# One character for every 3 bits
return bitlength // 3


def bin_bits2chars(bitlength: int):
# One character for each bit
return bitlength


def bytes_bits2chars(bitlength: int):
# One character for every 8 bits
return bitlength // 8


def uint_bits2chars(bitlength: int):
# How many characters is largest possible int of this length?
return len(str((1 << bitlength) - 1))


def int_bits2chars(bitlength: int):
# How many characters is largest negative int of this length? (To include minus sign).
return len(str((-1 << (bitlength - 1))))


def float_bits2chars(bitlength: Literal[16, 32, 64]):
# These bit lengths were found by looking at lots of possible values
if bitlength in [16, 32]:
return 23 # Empirical value
else:
return 24 # Empirical value

def p3binary_bits2chars(bitlength: Literal[8]):

def p3binary_bits2chars(_: Literal[8]):
return 19 # Empirical value

def p4binary_bits2chars(bitlength: Literal[8]):

def p4binary_bits2chars(_: Literal[8]):
# Found by looking at all the possible values
return 13 # Empirical value

def e4m3mxfp_bits2chars(bitlength: Literal[8]):

def e4m3mxfp_bits2chars(_: Literal[8]):
return 13

def e5m2mxfp_bits2chars(bitlength: Literal[8]):

def e5m2mxfp_bits2chars(_: Literal[8]):
return 19

def e3m2mxfp_bits2chars(bitlength: Literal[6]):

def e3m2mxfp_bits2chars(_: Literal[6]):
# Not sure what the best value is here. It's 7 without considering the scale that could be applied.
return 7

def e2m3mxfp_bits2chars(bitlength: Literal[6]):

def e2m3mxfp_bits2chars(_: Literal[6]):
# Not sure what the best value is here.
return 7

def e2m1mxfp_bits2chars(bitlength: Literal[4]):

def e2m1mxfp_bits2chars(_: Literal[4]):
# Not sure what the best value is here.
return 7

def e8m0mxfp_bits2chars(bitlength: Literal[8]):
# Can range same as float32

def e8m0mxfp_bits2chars(_: Literal[8]):
# Has same range as float32
return 23

def mxint_bits2chars(bitlength: Literal[8]):

def mxint_bits2chars(_: Literal[8]):
# Not sure what the best value is here.
return 10


def bfloat_bits2chars(bitlength: Literal[16]):
def bfloat_bits2chars(_: Literal[16]):
# Found by looking at all the possible values
return 23 # Empirical value


def bits_bits2chars(bitlength: int):
# For bits type we can see how long it needs to be printed by trying any value
temp = Bits(bitlength)
return len(str(temp))

def bool_bits2chars(bitlength: Literal[1]):

def bool_bits2chars(_: Literal[1]):
# Bools are printed as 1 or 0, not True or False, so are one character each
return 1


dtype_definitions = [
# Integer types
DtypeDefinition('uint', Bits._setuint, Bits._getuint, int, False, uint_bits2chars,
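As a rough illustration of what the width helpers in this hunk compute, here is a minimal standalone sketch. The three function bodies are copied from the diff above with their indentation restored; the `-> int` return annotations and the `widths` dictionary are additions made for this example and are not part of the commit.

```python
def hex_bits2chars(bitlength: int) -> int:
    # One character for every 4 bits
    return bitlength // 4


def uint_bits2chars(bitlength: int) -> int:
    # Number of decimal digits in the largest unsigned value of this length
    return len(str((1 << bitlength) - 1))


def int_bits2chars(bitlength: int) -> int:
    # Number of characters in the most negative value, including the minus sign
    return len(str(-1 << (bitlength - 1)))


# Illustrative only: collect (hex, uint, int) print widths for a few bit lengths.
widths = {n: (hex_bits2chars(n), uint_bits2chars(n), int_bits2chars(n))
          for n in (8, 16, 32)}

for n, (hex_w, uint_w, int_w) in widths.items():
    print(f"{n:>2} bits: hex={hex_w} uint={uint_w} int={int_w}")
```

For 8 bits this gives widths of 2, 3 and 4, since the extreme values print as `ff`, `255` and `-128`.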
2 changes: 1 addition & 1 deletion bitstring/__main__.py
@@ -47,4 +47,4 @@ def main() -> None:


if __name__ == '__main__':
main()
main()
18 changes: 12 additions & 6 deletions bitstring/array_.py
@@ -93,6 +93,7 @@ def __init__(self, dtype: Union[str, Dtype], initializer: Optional[Union[int, Ar
self.data += BitArray._create_from_bitstype(trailing_bits)

_largest_values = None

@staticmethod
def _calculate_auto_scale(initializer, name: str, length: Optional[int]) -> float:
# Now need to find the largest power of 2 representable with this format.
@@ -112,12 +113,15 @@ def _calculate_auto_scale(initializer, name: str, length: Optional[int]) -> floa
}
if f'{name}{length}' in Array._largest_values.keys():
float_values = Array('float64', initializer).tolist()
if not float_values:
raise ValueError("Can't calculate an 'auto' scale with an empty Array initializer.")
max_float_value = max(abs(x) for x in float_values)
if max_float_value == 0:
# This special case isn't covered in the standard. I'm choosing to return no scale.
return 1.0
log2 = int(math.log2(max_float_value))
lp2 = int(math.log2(Array._largest_values[f'{name}{length}']))
# We need to find the largest power of 2 that is less than the max value
log2 = math.floor(math.log2(max_float_value))
lp2 = math.floor(math.log2(Array._largest_values[f'{name}{length}']))
lg_scale = log2 - lp2
# Saturate at values representable in E8M0 format.
if lg_scale > 127:
@@ -155,7 +159,7 @@ def _set_dtype(self, new_dtype: Union[str, Dtype]) -> None:
except ValueError:
name_length = utils.parse_single_struct_token(new_dtype)
if name_length is not None:
dtype = Dtype(*name_length)
dtype = Dtype(name_length[0], name_length[1])
else:
raise ValueError(f"Inappropriate Dtype for Array: '{new_dtype}'.")
if dtype.length is None:
@@ -270,7 +274,7 @@ def astype(self, dtype: Union[str, Dtype]) -> Array:
return new_array

def tolist(self) -> List[ElementType]:
return [self._dtype.read_fn(self.data, start=start)
return [self._dtype.read_fn(self.data, start=start)
for start in range(0, len(self.data) - self._dtype.length + 1, self._dtype.length)]

def append(self, x: ElementType) -> None:
@@ -408,9 +412,11 @@ def pp(self, fmt: Optional[str] = None, width: int = 120,
token_list = utils.preprocess_tokens(fmt)
if len(token_list) not in [1, 2]:
raise ValueError(f"Only one or two tokens can be used in an Array.pp() format - '{fmt}' has {len(token_list)} tokens.")
dtype1 = Dtype(*utils.parse_name_length_token(token_list[0]))
name1, length1 = utils.parse_name_length_token(token_list[0])
dtype1 = Dtype(name1, length1)
if len(token_list) == 2:
dtype2 = Dtype(*utils.parse_name_length_token(token_list[1]))
name2, length2 = utils.parse_name_length_token(token_list[1])
dtype2 = Dtype(name2, length2)

token_length = dtype1.bitlength
if dtype2 is not None:
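The `_calculate_auto_scale` hunk earlier in this file replaces `int(math.log2(...))` with `math.floor(math.log2(...))`. The sketch below uses only the standard library to show where the two differ. The helper names `exponent_truncating` and `exponent_flooring` are hypothetical and exist only for this illustration; they are not part of bitstring.

```python
import math


def exponent_truncating(x: float) -> int:
    # Old behaviour: int() rounds towards zero
    return int(math.log2(x))


def exponent_flooring(x: float) -> int:
    # New behaviour: floor() rounds towards negative infinity
    return math.floor(math.log2(x))


for value in (6.0, 1.0, 0.3):
    old = exponent_truncating(value)
    new = exponent_flooring(value)
    print(f"value={value}: int -> {old}, floor -> {new}")
```

For values of 1 or more the two agree. For a value such as 0.3, whose base-2 logarithm is negative and not an integer, `int()` gives -1 while `floor()` gives -2. Only the floored exponent yields a power of two (0.25) that does not exceed the value.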