tools.htk API

class nsds_lab_to_nwb.tools.htk.htk_reader.HTKReader(path, channels=None)[source]

Bases: object

HTK interface

Parameters
  • path (str) – Path to HTK folder

  • channels (list, optional) – List of channel ids to import. Defaults to None.

get_data(*, stream=None, dev_conf=None)[source]

Get specified data

Parameters
  • stream (str) – Stream name (not used by HTKReader)

  • dev_conf (dict) – Metadata for the device: nwb_builder.metadata[‘device’][device_name]. Not used for TDT.

Returns

  • data (ndarray) – Data array

  • meta (dict) – Meta data for the data array.

class nsds_lab_to_nwb.tools.htk.readers.htkcollection.HTKChannelIterator(**kwargs)[source]

Bases: hdmf.data_utils.AbstractDataChunkIterator

Custom data chunk iterator to iterate over the channels of an HTK collection.

property dtype

Define the data type of the array

Returns

NumPy style dtype or otherwise compliant dtype string

classmethod from_htk_collection(collection, time_axis_first=False, has_bands=True)[source]

Convenience function to generate an HTKChannelIterator from an existing HTKCollection.

Parameters

collection (HTKCollection) – The input HTKCollection for which we should create an iterator

Returns

HTKChannelIterator for the input HTKCollection

property maxshape

Property describing the maximum shape of the data array that is being iterated over

Returns

NumPy-style shape tuple indicating the maximum dimensions up to which the dataset may be resized. Axes with None are unlimited.

recommended_chunk_shape()[source]

Recommend a chunk shape. This will typically be the most common shape of chunk returned by __next__, but may also be some other value in case one wants to recommend chunk shapes to optimize read rather than write.

recommended_data_shape()[source]

Recommend an initial shape of the data. This is useful when progressively writing data and we want to recommend an initial size for the dataset.
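The idea of iterating channel by channel can be sketched without hdmf. The following is a minimal illustration, not the HTKChannelIterator implementation: iter_channels, read_channel and the fake data are hypothetical names, and the placement of the channel axis for time_axis_first is an assumption based on the parameter name. In hdmf, each step would be wrapped in a DataChunk carrying the data and its selection.

```python
def iter_channels(read_channel, num_channels, time_axis_first=False):
    """Yield (selection, data) pairs, one per channel (hypothetical helper)."""
    for channel in range(num_channels):
        data = read_channel(channel)  # 1D sequence of samples for this channel
        all_samples = slice(0, len(data))
        if time_axis_first:
            selection = (all_samples, channel)  # target layout (time, channel)
        else:
            selection = (channel, all_samples)  # target layout (channel, time)
        yield selection, data


# Fake two-channel recording standing in for a directory of HTK files.
fake_channels = {0: [0.1, 0.2, 0.3], 1: [1.1, 1.2, 1.3]}
chunks = list(iter_channels(fake_channels.__getitem__, num_channels=2))
for selection, data in chunks:
    print(selection, data)
```

Because only one channel is held in memory at a time, this pattern suits recordings that are too large to load in full.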

class nsds_lab_to_nwb.tools.htk.readers.htkcollection.HTKCollection(directory, prefix=None, layout=None, anatomy_file=None, bands_file=None, guess_bands=False, check_consistency=False, sample_rate_base=10000.0, noblock=True, postfix=None)[source]

Bases: object

Class for management of a directory of HTK files from raw or processed neural recordings. All HTK files are expected to have the same size.

Variables
  • directory – Directory where the raw HTK data files are located

  • htk_files – Python list of strings of the paths to all HTK files

  • channel_to_file_map – 2D numpy array of shape (#blocks, #channels) indicating the index of the file associated with the corresponding channel.

  • file_to_channel_map – List of two-valued tuples indicating for each file the block and channel they are associated with. blockindex=self.file_to_channel_map[i][0].

  • data – 2D numpy array with the full data from all channels, or None in case read_data() has not been called.

  • layout – Numpy array describing the physical layout of the grid. By default a rectangular layout is assumed with channels starting at the bottom right of the grid and channel numbers growing from bottom to top.

  • num_samples – Number of samples per channel

  • sample_period – Sample period in 100ns units

  • sample_rate – Sampling rate in Hz. This is the same as 10000/sample_period.

  • sample_size – Number of bytes per sample

  • parameter_kind – Code indicating the sample kind (see HTKFormat for details on parmKind)

  • anatomy – Dictionary describing for different regions of the brain the electrodes that are located in the given region.

  • dtype – Numpy dtype of the HTK data

  • sample_rate_base – None if the sample_period is given in the header. Otherwise, set to the number by which we should divide the sampling rate given in the header in order to convert the rate to the appropriate value in Hz.

  • bands – 1D numpy array with center of the frequency bands

clear_data()[source]

Clear the self.data instance variable to free up memory.

get_anatomy_dict()[source]

Get the anatomy dictionary describing for each region the list of electrodes in the region.

static get_anatomy_map(anatomy_dict, num_electrodes)[source]

Get a numpy array of strings indicating for each electrode the name of the region it is located in. ‘unknown’ is added for electrodes with an unknown region assignment.
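The inversion from a region-to-electrodes dictionary to a per-electrode region list can be sketched as follows. This is an illustration under stated assumptions, not the library code: anatomy_map_sketch and the region names are made up, electrode ids are assumed to be 0-based indices, and a plain list stands in for the numpy array.

```python
def anatomy_map_sketch(anatomy_dict, num_electrodes):
    """Return, for each electrode index, the name of its region (hypothetical helper)."""
    region_of = ["unknown"] * num_electrodes  # default for unassigned electrodes
    for region, electrodes in anatomy_dict.items():
        for electrode in electrodes:
            region_of[electrode] = region
    return region_of


# Made-up anatomy: electrode 2 has no region assignment.
example_anatomy = {"region_a": [0, 1], "region_b": [3]}
regions = anatomy_map_sketch(example_anatomy, num_electrodes=4)
print(regions)
```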

get_block_index(fileindex)[source]

Get the block index for the file with the given index.

Parameters

fileindex – Index of the file of interest

Returns

Integer indicating the block index for the file.

get_channel_index(fileindex)[source]

Get the channel index within a block for the file with the given index.

Parameters

fileindex – Index of the file of interest

Returns

Integer indicating the channel index for the file.
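The relationship between channel_to_file_map, file_to_channel_map and the two index lookups can be sketched as follows. This is a minimal illustration, not the HTKCollection code: build_maps is a hypothetical helper, files are assumed to be ordered block by block, nested lists stand in for the 2D numpy array, and the channel index is assumed to be the second tuple entry (the documentation only states that the block index is the first).

```python
def build_maps(num_blocks, channels_per_block):
    """Build both lookup tables for a block-major file ordering (assumption)."""
    file_to_channel_map = [
        (block, channel)
        for block in range(num_blocks)
        for channel in range(channels_per_block)
    ]
    channel_to_file_map = [[None] * channels_per_block for _ in range(num_blocks)]
    for file_index, (block, channel) in enumerate(file_to_channel_map):
        channel_to_file_map[block][channel] = file_index
    return channel_to_file_map, file_to_channel_map


channel_to_file_map, file_to_channel_map = build_maps(num_blocks=2, channels_per_block=3)

file_index = 4
block_index = file_to_channel_map[file_index][0]    # what get_block_index returns
channel_index = file_to_channel_map[file_index][1]  # assumed for get_channel_index
print(block_index, channel_index, channel_to_file_map[block_index][channel_index])
```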

static get_layout(num_electrodes)[source]

Internal helper function used to define the default layout of the brain grid.

Parameters

num_electrodes – The number of electrodes to be arranged in the layout

get_number_of_blocks()[source]

Get the number of blocks in which all channels are organized.

get_number_of_channels_per_block()[source]

Get the number of channels per block.

get_number_of_files()[source]

Get the number of HTK files associated with the current collection of raw data.

Returns

Integer indicating the number of HTK files. (len(self.htk_files))

has_anatomy()[source]

Check whether anatomy data is available for the collection.

static read_anatomy(anatomy_file)[source]

Read .mat file describing the anatomy of the data and return a dict describing for different brain regions (keys) the set of electrodes that are located in that region (values, stored as numpy arrays).

Parameters

anatomy_file – The name of the .mat file with the description of the anatomy

read_channel(fileindex)[source]

Get the data for the file with the given index.

read_data(print_status=False)[source]

Read all data from file and return the numpy array. This function modifies self.data to save the data retrieved.

Parameters

print_status – One of [True, False, ‘jupyter’]. True prints status messages on read progress to the screen. ‘jupyter’ creates a progress bar in a Jupyter notebook. False shows no progress. Default is False.

class nsds_lab_to_nwb.tools.htk.readers.htkfile.HTKFile(filename, sample_rate_base=10000.0)[source]

Bases: object

Class used for reading HTK format files.

NOTE: The original HTK specification specifies that the sample_period is given in the header in 100ns units. In some cases however, users appear to write the sampling rate in the header with a different base. We therefore allow users to specify the base for the sampling rate and if given we assume that the header contains the sampling rate and we convert accordingly.
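The two interpretations described in this note can be sketched as a small conversion function. This follows the wording of the note and of the sample_rate_base description only; the class itself may compute the rate differently. sample_rate_hz and the header values are hypothetical.

```python
def sample_rate_hz(header_value, sample_rate_base=None):
    """Convert the second header field to a rate in Hz (illustrative sketch)."""
    if sample_rate_base is None:
        # Standard HTK: the header holds the sample period in 100 ns units,
        # so one second corresponds to 1e7 of those units.
        return 1.0e7 / header_value
    # Non-standard files: the header holds a scaled sampling rate that is
    # divided by the base to obtain Hz (assumption taken from the note).
    return header_value / sample_rate_base


print(sample_rate_hz(25000))             # period of 2.5 ms
print(sample_rate_hz(4000000, 10000.0))  # scaled rate with base 10000
```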

Instance Variables:

Variables
  • filename – Name of the HTK file

  • data – Numpy array of the data or None in case that read_data has not been called yet

  • num_samples – Number of samples in the file

  • sample_period – Sample period in 100ns units

  • sample_rate – Sampling rate in Hz. This is the same as 10000/sample_period.

  • sample_size – Number of bytes per sample

  • parameter_kind – Code indicating the sample kind (see HTKFormat for details on parmKind)

  • dtype – Data type

  • vector_length – Vector length

  • A – Compression parameter

  • B – Compression parameter

  • header_length – Total header length

Internal Variables:

Variables
  • __file – The handle to the HTK file

  • __current_pos – Internal variable used to store the current sample position during iteration

read_data()[source]

Get a numpy data array of all the samples

Returns

Numpy data array of all the samples

read_sample(sample_index)[source]

Read the data of a single sample with the given index.

Parameters

sample_index – The index of the sample to be read

Returns

The vector with the data for the sample.
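Reading one sample by index amounts to seeking past the header and the preceding samples. The sketch below is an assumption-laden illustration rather than the HTKFile code: it handles only uncompressed files, assumes big-endian float32 values, and uses an in-memory buffer with made-up contents in place of a real file. read_sample_sketch is a hypothetical name.

```python
import io
import struct

HEADER_LENGTH = 12  # bytes, as documented for HTKFormat.header_length


def read_sample_sketch(stream, sample_index, sample_size):
    """Read one sample vector of big-endian float32 values (assumption)."""
    vector_length = sample_size // 4  # number of float32 values per sample
    stream.seek(HEADER_LENGTH + sample_index * sample_size)
    raw = stream.read(sample_size)
    return struct.unpack(">%df" % vector_length, raw)


# In-memory stand-in for an HTK file: header plus three samples of two values.
samples = [(0.0, 1.0), (2.0, 3.0), (4.0, 5.0)]
payload = struct.pack(">IIHH", len(samples), 25000, 8, 9)
payload += b"".join(struct.pack(">2f", *sample) for sample in samples)

second = read_sample_sketch(io.BytesIO(payload), 1, sample_size=8)
print(second)
```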

class nsds_lab_to_nwb.tools.htk.readers.htkfile.HTKFormat[source]

Bases: object

Specification of base information about the HTK file format.

byte_order = '>'

Byte-order in which the HTK data is written

header = [{'name': 'num_samples', 'format': 'I', 'description': 'Number of samples in the File'}, {'name': 'sample_period', 'format': 'I', 'description': 'Sample period in 100ns units'}, {'name': 'sample_size', 'format': 'H', 'description': 'Number of bytes per sample'}, {'name': 'parameter_kind', 'format': 'H', 'description': 'A code indicating the sample kind'}]

List describing the contents of the HTK file header.

classmethod header_format()[source]

Get the format string to unpack the header of the HTK file.

Returns

String (e.g. ‘>IIHH’) describing the format to be used for unpacking the header.

header_length = 12

Total length in bytes of the file header.
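The header description above maps directly onto Python's struct module. The following sketch packs and unpacks a made-up header in memory; parse_htk_header and the example values are hypothetical, while the format string and field names are the ones listed for HTKFormat.

```python
import struct

# Two unsigned 32-bit integers followed by two unsigned 16-bit integers,
# big-endian, matching the byte_order and header attributes above.
HEADER_FORMAT = ">IIHH"
HEADER_FIELDS = ("num_samples", "sample_period", "sample_size", "parameter_kind")


def parse_htk_header(raw):
    """Unpack the leading header bytes into a dict (illustrative helper)."""
    size = struct.calcsize(HEADER_FORMAT)
    values = struct.unpack(HEADER_FORMAT, raw[:size])
    return dict(zip(HEADER_FIELDS, values))


# Build an example header instead of reading a real file.
example_header = struct.pack(HEADER_FORMAT, 1000, 25000, 4, 9)
parsed = parse_htk_header(example_header)
print(struct.calcsize(HEADER_FORMAT))
print(parsed)
```

The size reported by struct.calcsize for this format is 12 bytes, which agrees with header_length.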

param_kind_base = {'DISCRETE': 10, 'FBANK': 7, 'IREFC': 5, 'LPC': 1, 'LPCDELCEP': 4, 'LPCEPSTRA': 3, 'LPCREFC': 2, 'MELSPEC': 8, 'MFCC': 6, 'USER': 9, 'WAVEFORM': 0}

Dictionary describing the basic parameter kind codes.

  • WAVEFORM = 0 : sampled waveform

  • LPC = 1 : linear prediction filter coefficients

  • LPCREFC = 2 : linear prediction reflection coefficients

  • LPCEPSTRA = 3 : LPC cepstral coefficients

  • LPCDELCEP = 4 : LPC cepstra plus delta coefficients

  • IREFC = 5 : LPC reflection coefficient in 16 bit integer format

  • MFCC = 6 : mel-frequency cepstral coefficients

  • FBANK = 7 : log mel-filter bank channel outputs

  • MELSPEC = 8 : linear mel-filter bank channel outputs

  • USER = 9 : user-defined sample kind

  • DISCRETE = 10 : vector quantised data

param_kind_encoding = {'_A': 512, '_C': 1024, '_D': 256, '_E': 64, '_K': 4096, '_N': 128, '_O': 8192, '_Z': 2048}

Dictionary describing the parameter kind encodings. The bit values listed below are written in octal.

  • _E = 0000100 : has energy

  • _N = 0000200 : absolute energy suppressed

  • _D = 0000400 : has delta coefficients

  • _A = 0001000 : has acceleration (delta-delta) coefficients

  • _C = 0002000 : is compressed

  • _Z = 0004000 : has zero mean static coefficients

  • _K = 0010000 : has CRC checksum

  • _O = 0020000 : has 0th cepstral coefficient
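A parameter_kind code can be split into its base kind and qualifier flags with bit masks. The sketch below copies the two dictionaries from this page; decode_parameter_kind is a hypothetical helper, and the use of the low six bits for the base kind follows the HTK specification as commonly described, so treat it as an assumption with respect to this library.

```python
PARAM_KIND_BASE = {
    "WAVEFORM": 0, "LPC": 1, "LPCREFC": 2, "LPCEPSTRA": 3, "LPCDELCEP": 4,
    "IREFC": 5, "MFCC": 6, "FBANK": 7, "MELSPEC": 8, "USER": 9, "DISCRETE": 10,
}
PARAM_KIND_ENCODING = {
    "_E": 0o100, "_N": 0o200, "_D": 0o400, "_A": 0o1000,
    "_C": 0o2000, "_Z": 0o4000, "_K": 0o10000, "_O": 0o20000,
}


def decode_parameter_kind(code):
    """Return (base kind name, sorted qualifier names) for a header code."""
    base_code = code & 0o77  # low six bits hold the base kind (assumption)
    base_name = next(
        (name for name, value in PARAM_KIND_BASE.items() if value == base_code),
        None,
    )
    qualifiers = sorted(
        name for name, bit in PARAM_KIND_ENCODING.items() if code & bit
    )
    return base_name, qualifiers


# MFCC with energy and delta coefficients.
example = decode_parameter_kind(6 | 0o100 | 0o400)
print(example)
```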

class nsds_lab_to_nwb.tools.htk.readers.instrument.EPhysInstrumentData(htkdir, prefix, postfix, device_name=None, device_image_name=None, description=None, layout=None, location=None, read_on_create=True, **kwargs)[source]

Bases: object

Describe the data for a particular recording device

read_data(create_iterator=False, print_status=False, time_axis_first=True, has_bands=True)[source]

Read the data for all channels

Parameters
  • create_iterator – If set to True, then instead of reading the HTK data we create a HTKChannelIterator object that we can use to iteratively read the channels when we need them

  • time_axis_first – If set to True, use the dimension order (time, electrode, band); if False, use the default ordering (electrode, time, band)

  • print_status – One of [True, False, ‘jupyter’]. True prints status messages on read progress to the screen. ‘jupyter’ creates a progress bar in a Jupyter notebook. False shows no progress. Default is False.

Returns

class nsds_lab_to_nwb.tools.htk.readers.instrument.EPhysInstrumentLayout[source]

Bases: object

Define the layout for different EPhys instruments

static grid(orientation, nelect=64, xspacing=0.2, yspacing=0.2)[source]

Parameters
  • orientation – Char with the channel orientation. One of ‘S’ or ‘R’

  • nelect – Number of electrodes in the grid

  • xspacing – The spacing to be used to compute the electrodes x positions

  • yspacing – The spacing to be used to compute the electrodes y positions

Returns

  • array with the layout index for the grid

  • array with the positions of the electrodes or None

static polytrode(ncols=2)[source]

Get the layout for the polytrode

Parameters

ncols – Integer indicating the number of columns in the polytrode. One of 2,3.

Returns

  • array with the layout index for the polytrode

  • array with the positions of the electrodes or None

static polytrode_position_in_grid(orientiation, xspacing=0.2, yspacing=0.2)[source]

The location of the polytrode with respect to the ephys grid

Parameters
  • orientiation – Char with the channel orientation of the ephys grid. One of ‘S’ or ‘R’

  • xspacing – Spacing of the electrodes in x

  • yspacing – Spacing of the electrodes in y

Returns

Two numpy arrays of two floats each. The first array is the (x,y) index location in the grid, and the second array is the spatial (x,y) location.