[Bug] Reading tdms data chunks works only for certain chunk sizes #337

Open
Nikolai-Hlubek opened this issue Sep 25, 2024 · 0 comments

I have a tdms file from which I'm trying to read a channel in chunks.

It works for certain chunk sizes but not for others.

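import numpy as np
import nptdms

# fp (path to the TDMS file) and sensor (channel name) are defined elsewhere.
data_read_sliced = []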

with nptdms.TdmsFile.open(fp) as tdms_file:
    len_data = len(tdms_file['Messdaten'][sensor])

    dt = tdms_file['Messdaten'][sensor].properties['wf_increment']

    # Values tried, one per run (the last assignment is the one in effect).
    len_slice = 2705
    len_slice = 2725
    len_slice = 4000  # reading works with 4000

    for idx in range(int(np.floor(len_data/len_slice))):
        data_slice = tdms_file['Messdaten'][sensor].read_data(offset=len_slice*idx,length=len_slice)
        data_read_sliced.append(data_slice)

    data_read_sliced = np.concatenate(data_read_sliced)
    
    data_read_once = tdms_file['Messdaten'][sensor].read_data()

np.sum(data_read_sliced - data_read_once)

len_slice = 2705

----> 1 np.sum(data_read_sliced - data_read_once)
ValueError: operands could not be broadcast together with shapes (16359840,) (16360000,) 
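
A note on this case: the two shapes differ by 160 samples, which is exactly the remainder of 16360000 divided by 2705. That suggests the mismatch comes from the loop iterating only over full chunks and skipping the tail, rather than from the reader. A minimal sketch of chunk bounds that include the tail follows; `iter_chunks` is a hypothetical helper name, not part of nptdms.

```python
def iter_chunks(total, size):
    """Yield (offset, length) pairs that cover range(total) exactly once."""
    if size <= 0:
        raise ValueError("size must be positive")
    for offset in range(0, total, size):
        # The last chunk is shortened so it never runs past the end.
        yield offset, min(size, total - offset)


# Numbers taken from the report: 16360000 samples, len_slice = 2705.
chunks = list(iter_chunks(16_360_000, 2705))
full_only = 16_360_000 // 2705   # number of iterations in the original loop
print(len(chunks), full_only)    # 6049 chunks vs. 6048 full ones
print(chunks[-1])                # (16359840, 160): the tail the loop skipped
```

In the loop above, each pair would be passed on as `read_data(offset=offset, length=length)`. This only accounts for the shape mismatch with 2705; it does not explain the `ValueError` raised inside `read_data` for 2725.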

len_slice = 2725

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[19], line 11
      8 len_slice = 2725
     10 for idx in range(int(np.floor(len_data/len_slice))):
---> 11     data_slice = tdms_file['Messdaten'][sensor].read_data(offset=len_slice*idx,length=len_slice)
     12     data_read_sliced.append(data_slice)
     14 data_read_sliced = np.concatenate(data_read_sliced)

File /opt/pyenvs/DSS04/lib/python3.10/site-packages/nptdms/tdms.py:604, in TdmsChannel.read_data(self, offset, length, scaled)
    591 """ Reads data for this channel from the TDMS file and returns it as a numpy array
    592 
    593 Indexing into the channel with a slice should be preferred over using
   (...)
    601     For DAQmx data a dictionary of scaler id to raw scaler data will be returned.
    602 """
    603 if self._raw_data is None:
--> 604     raw_data = self._read_channel_data(offset, length)
    605 else:
    606     raw_data = slice_raw_data(self._raw_data, offset, length)

File /opt/pyenvs/DSS04/lib/python3.10/site-packages/nptdms/tdms.py:810, in TdmsChannel._read_channel_data(self, offset, length)
    808 for chunk in self._reader.read_raw_data_for_channel(self.path, offset, length):
    809     if chunk.data is not None:
--> 810         channel_data.append_data(chunk.data)
    811     if chunk.scaler_data is not None:
    812         for scaler_id, scaler_data in chunk.scaler_data.items():

File /opt/pyenvs/DSS04/lib/python3.10/site-packages/nptdms/channel_data.py:92, in NumpyDataReceiver.append_data(self, new_data)
     90 start_pos = self._data_insert_position
     91 end_pos = self._data_insert_position + len(new_data)
---> 92 self.data[start_pos:end_pos] = new_data
     93 self._data_insert_position += len(new_data)

ValueError: could not broadcast input array from shape (200,) into shape (0,)

len_slice = 4000

Works and gives a sum of 0.

Reading the data in one go always works.

For len_slice = 2725 (the one I actually want), the error shows that new_data is supposed to be written to self.data at 2725:2925, but self.data has only 2725 elements. In reader.py, num_chunk is somehow 2 for the last chunk of 200, so that chunk is read twice. Also, end_segment is too large, so the trimming code for the segment doesn't trigger.
That is as far as my debugging got. I tried changing a few things, but it didn't get better, as I don't know anything about the internals of the TDMS format.
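
Since the file is too large to share easily, one way to narrow the problem down might be to record which offset is the first to fail for a given chunk size. The sketch below is only an illustration: `first_failing_chunk` and `fake_read` are hypothetical names, and the reader is passed in as a plain callable so the example runs without a TDMS file.

```python
def first_failing_chunk(read, total, size):
    """Return (offset, length, error) for the first chunk that fails, else None.

    `read(offset, length)` is any callable that returns a sized sequence.
    """
    for offset in range(0, total, size):
        length = min(size, total - offset)
        try:
            data = read(offset, length)
        except ValueError as exc:
            return offset, length, exc
        if len(data) != length:
            # No exception, but the wrong number of samples came back.
            return offset, length, None
    return None


def fake_read(offset, length):
    # Stand-in reader that fails at one specific offset, purely for illustration.
    if offset == 8:
        raise ValueError("simulated failure")
    return list(range(offset, offset + length))


result = first_failing_chunk(fake_read, total=20, size=4)
print(result[:2])  # (8, 4)
```

With the real file, the callable would wrap the channel from the snippet above, for example `lambda o, n: tdms_file['Messdaten'][sensor].read_data(offset=o, length=n)`, called inside the `with` block.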

I can provide the file in question if required, but it is larger than 2 GB, so I can't just upload it here.

I tried with nptdms 1.7.1 and 1.9.0.
