tools.htk API

class nsds_lab_to_nwb.tools.htk.htk_reader.HTKReader(path, channels=None)[source]

Bases: object

HTK interface

Parameters

path (str) – Path to HTK folder
channels (list, optional) – List of channel ids to import. Defaults to None.

get_data(*, stream=None, dev_conf=None)[source]

Get specified data

Parameters

stream (str) – Stream name (not used by HTKReader)
dev_conf (dict) – Metadata for the device: nwb_builder.metadata[‘device’][device_name]. Not used for TDT.

Returns

data (ndarray) – Data array
meta (dict) – Meta data for the data array.

class nsds_lab_to_nwb.tools.htk.readers.htkcollection.HTKChannelIterator(**kwargs)[source]

Bases: hdmf.data_utils.AbstractDataChunkIterator

Custom data chunk iterator to iterate over the channels of an HTK collection.

property dtype

Define the data type of the array

Returns: NumPy style dtype or otherwise compliant dtype string

classmethod from_htk_collection(collection, time_axis_first=False, has_bands=True)[source]

Convenience function to generate an HTKChannelIterator from an existing HTKCollection.

Parameters: collection (HTKCollection) – The input HTKCollection for which we should create an iterator
Returns: HTKChannelIterator for the input HTKCollection

property maxshape

Property describing the maximum shape of the data array that is being iterated over

Returns: NumPy-style shape tuple indicating the maximum dimensions up to which the dataset may be resized. Axes with None are unlimited.

recommended_chunk_shape()[source]

Recommend a chunk shape. This will typically be the most common shape of chunk returned by __next__, but may also be some other value in case one wants to recommend chunk shapes that optimize read rather than write.

recommended_data_shape()[source]: Recommend an initial shape of the data. This is useful when progressively writing data and we want to recommend an initial size for the dataset.
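The per-channel iteration described above can be illustrated with a minimal, library-free sketch. The class below is a hypothetical stand-in, not the actual HTKChannelIterator: it does not subclass hdmf's AbstractDataChunkIterator, and the names ChannelChunkIterator and read_channel are assumptions made for illustration. It yields one channel at a time together with the selection where that chunk belongs in the full array.

```python
import numpy as np


class ChannelChunkIterator:
    """Hypothetical stand-in that yields one channel per iteration step."""

    def __init__(self, read_channel, num_channels, num_samples, dtype="float32"):
        self._read_channel = read_channel  # callable: channel index -> 1D data
        self.num_channels = num_channels
        self.num_samples = num_samples
        self.dtype = np.dtype(dtype)
        self._next_channel = 0

    @property
    def maxshape(self):
        # Fixed-size dataset in this sketch: (channels, time)
        return (self.num_channels, self.num_samples)

    def recommended_chunk_shape(self):
        # One full channel per chunk, matching what __next__ returns
        return (1, self.num_samples)

    def recommended_data_shape(self):
        return self.maxshape

    def __iter__(self):
        return self

    def __next__(self):
        if self._next_channel >= self.num_channels:
            raise StopIteration
        channel = self._next_channel
        self._next_channel += 1
        data = np.asarray(self._read_channel(channel), dtype=self.dtype).reshape(1, -1)
        selection = (slice(channel, channel + 1), slice(0, self.num_samples))
        return data, selection


# Usage: fill a target array chunk by chunk from an in-memory source.
source = np.arange(12, dtype="float32").reshape(3, 4)
target = np.zeros((3, 4), dtype="float32")
for chunk, selection in ChannelChunkIterator(lambda c: source[c], 3, 4):
    target[selection] = chunk
```

Yielding the selection alongside the data lets a writer place each chunk without holding the full array in memory.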

class nsds_lab_to_nwb.tools.htk.readers.htkcollection.HTKCollection(directory, prefix=None, layout=None, anatomy_file=None, bands_file=None, guess_bands=False, check_consistency=False, sample_rate_base=10000.0, noblock=True, postfix=None)[source]

Bases: object

Class for management of a directory of HTK files from raw or processed neural recordings. All HTK files are expected to have the same size.

Variables

directory – Directory where the raw HTK data files are located
htk_files – Python list of strings of the paths to all HTK files
channel_to_file_map – 2D numpy array of shape (#blocks, #channels) indicating the index of the file associated with the corresponding channel.
file_to_channel_map – List of two-valued tuples indicating for each file the block and channel it is associated with. blockindex=self.file_to_channel_map[i][0].
data – 2D numpy array with the full data from all channels, or None in case read_data() has not been called.
layout – Numpy array describing the physical layout of the grid. By default a rectangular layout is assumed with channels starting at the bottom right of the grid and channel numbers growing from bottom to top.
num_samples – Number of samples per channel
sample_period – Sample period in 100ns units
sample_rate – Sampling rate in Hz. This is the same as 10000/sample_period.
sample_size – Number of bytes per sample
parameter_kind – Code indicating the sample kind (see HTKFormat for details on parmKind)
anatomy – Dictionary describing for different regions of the brain the electrodes that are located in the given region.
dtype – Numpy dtype of the HTK data
sample_rate_base – None if the sample_period is given in the header. Otherwise, set to the number by which we should divide the sampling rate given in the header in order to convert the rate to the appropriate value in Hz.
bands – 1D numpy array with center of the frequency bands
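The relationship between channel_to_file_map and file_to_channel_map can be sketched as follows. This is a minimal illustration under the assumption that files are ordered block by block and channel by channel; the actual ordering used by HTKCollection may differ, and the sizes below are made up.

```python
import numpy as np

num_blocks, channels_per_block = 2, 3  # illustrative sizes only

# Assumption: file i belongs to block i // channels_per_block
# and channel i % channels_per_block.
file_to_channel_map = [
    (block, channel)
    for block in range(num_blocks)
    for channel in range(channels_per_block)
]

# Inverse lookup: (block, channel) -> file index
channel_to_file_map = np.full((num_blocks, channels_per_block), -1, dtype=int)
for file_index, (block, channel) in enumerate(file_to_channel_map):
    channel_to_file_map[block, channel] = file_index

# Look up block and channel for file 4, then map back to the file index.
block_index = file_to_channel_map[4][0]
channel_index = file_to_channel_map[4][1]
file_index_again = channel_to_file_map[block_index, channel_index]
```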

clear_data()[source]: Clear the self.data instance variable to free up memory.

get_anatomy_dict()[source]: Get the anatomy dictionary describing for each region the list of electrodes in the region.

static get_anatomy_map(anatomy_dict, num_electrodes)[source]: Get a numpy array of strings indicating for each electrode the name of the region it is located in. ‘unknown’ is added for electrodes with an unknown region assignment.
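A minimal sketch of how such a per-electrode region map could be built from an anatomy dictionary. The function name anatomy_map and the region names are hypothetical, and the sketch assumes zero-based electrode indices, which may not match the indexing used in the actual .mat files.

```python
import numpy as np


def anatomy_map(anatomy_dict, num_electrodes):
    """Return an array with one region name per electrode."""
    # Start with 'unknown' everywhere, then overwrite known assignments.
    regions = np.full(num_electrodes, "unknown", dtype=object)
    for region, electrodes in anatomy_dict.items():
        # Assumption: electrode ids are zero-based indices
        regions[np.asarray(electrodes, dtype=int)] = region
    return regions


example_dict = {"STG": np.array([0, 1]), "vSMC": np.array([3])}
region_per_electrode = anatomy_map(example_dict, num_electrodes=4)
```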

get_block_index(fileindex)[source]

Get the block index for the file with the given index.

Parameters: fileindex – Index of the file of interest
Returns: integer indicating the block index for the file.

get_channel_index(fileindex)[source]

Get the channel index within a block for the file with the given index.

Parameters: fileindex – Index of the file of interest
Returns: integer indicating the channel index for the file.

static get_layout(num_electrodes)[source]

Internal helper function used to define the default layout of the brain grid.

Parameters: num_electrodes – The number of electrodes to be arranged in the layout

get_number_of_blocks()[source]: Get the number of blocks in which all channels are organized.

get_number_of_channels_per_block()[source]: Get the number of channels per block.

get_number_of_files()[source]

Get the number of HTK files associated with the current collection of raw data.

Returns: Integer indicating the number of HTK files. (len(self.htk_files))

has_anatomy()[source]: Check whether anatomy data is available for the collection.

static read_anatomy(anatomy_file)[source]

Read .mat file describing the anatomy of the data and return a dict describing for different brain regions (keys) the set of electrodes that are located in that region (values, stored as numpy arrays).

Parameters: anatomy_file – The name of the .mat file with the description of the anatomy

read_channel(fileindex)[source]: Get the data for the file with the given index.

read_data(print_status=False)[source]

Read all data from file and return the numpy array. This function modifies self.data to save the data retrieved.

Parameters: print_status – One of [True, False, ‘jupyter’]. True prints status messages on read progress to the screen. ‘jupyter’ creates a progress bar in a Jupyter notebook. False shows no progress. Default is False.

class nsds_lab_to_nwb.tools.htk.readers.htkfile.HTKFile(filename, sample_rate_base=10000.0)[source]

Bases: object

Class used for reading HTK format files.

NOTE: The original HTK specification states that the sample_period is given in the header in 100ns units. In some cases, however, users appear to write the sampling rate in the header with a different base. We therefore allow users to specify the base for the sampling rate; if it is given, we assume that the header contains the sampling rate and convert accordingly.

Instance Variables:

Variables

filename – Name of the HTK file
data – Numpy array of the data or None in case that read_data has not been called yet
num_samples – Number of samples in the file
sample_period – Sample period in 100ns units
sample_rate – Sampling rate in Hz. This is the same as 10000/sample_period.
sample_size – Number of bytes per sample
parameter_kind – Code indicating the sample kind (see HTKFormat for details on parmKind)
dtype – Data type
vector_length – Vector length
A – Compression parameter
B – Compression parameter
header_length – Total header length

Internal Variables:

Variables

__file – The handle to the HTK file
__current_pos – Internal variable used to store the current sample position during iteration

read_data()[source]

Get a numpy data array of all the samples

Returns: Numpy data array of all the samples

read_sample(sample_index)[source]

Read the data of a single sample with the given index.

Parameters: sample_index – The index of the sample to be read
Returns: The vector with the data for the sample.
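Reading a single sample without loading the whole file can be sketched with a direct seek to the sample's byte offset. This is an illustration, not the HTKFile implementation: the standalone read_sample function is hypothetical, and it assumes uncompressed data stored as big-endian 4-byte floats directly after the 12-byte header.

```python
import os
import struct
import tempfile

HEADER_LENGTH = 12  # bytes, as stated for HTKFormat.header_length


def read_sample(path, sample_index):
    """Read one sample vector by seeking directly to its byte offset."""
    with open(path, "rb") as f:
        num_samples, _period, sample_size, _kind = struct.unpack(
            ">IIHH", f.read(HEADER_LENGTH)
        )
        if not 0 <= sample_index < num_samples:
            raise IndexError("sample_index out of range")
        f.seek(HEADER_LENGTH + sample_index * sample_size)
        raw = f.read(sample_size)
    # Assumption: each sample is a vector of big-endian 4-byte floats.
    vector_length = sample_size // 4
    return struct.unpack(">%df" % vector_length, raw)


# Build a tiny throwaway file: 3 samples, 2 floats (8 bytes) per sample.
samples = [(0.0, 1.0), (2.0, 3.0), (4.0, 5.0)]
with tempfile.NamedTemporaryFile(suffix=".htk", delete=False) as tmp:
    tmp.write(struct.pack(">IIHH", len(samples), 2500, 8, 9))
    for vec in samples:
        tmp.write(struct.pack(">2f", *vec))
    demo_path = tmp.name

second = read_sample(demo_path, 1)
os.remove(demo_path)
```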

class nsds_lab_to_nwb.tools.htk.readers.htkfile.HTKFormat[source]

Bases: object

Specification of base information about the HTK file format.

byte_order = '>': Byte-order in which the HTK data is written

header = [{'name': 'num_samples', 'format': 'I', 'description': 'Number of samples in the File'}, {'name': 'sample_period', 'format': 'I', 'description': 'Sample period in 100ns units'}, {'name': 'sample_size', 'format': 'H', 'description': 'Number of bytes per sample'}, {'name': 'parameter_kind', 'format': 'H', 'description': 'A code indicating the sample kind'}]: List describing the contents of the HTK file header.

classmethod header_format()[source]

Get the format string to unpack the header of the HTK file.

Returns: String (e.g. ‘>IIHH’) describing the format to be used for unpacking the header.

header_length = 12: Total length in bytes of the file header.
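The header description above maps directly onto Python's struct module. The sketch below rebuilds the format string from the field list and unpacks a 12-byte header into a dictionary; the helper name parse_header and the example values are illustrative and not part of the library.

```python
import struct

# Field layout copied from the HTKFormat.header description above.
HEADER_FIELDS = [
    ("num_samples", "I"),
    ("sample_period", "I"),
    ("sample_size", "H"),
    ("parameter_kind", "H"),
]
BYTE_ORDER = ">"  # big-endian, as in HTKFormat.byte_order


def header_format():
    """Build the struct format string, e.g. '>IIHH'."""
    return BYTE_ORDER + "".join(fmt for _, fmt in HEADER_FIELDS)


def parse_header(raw):
    """Unpack the leading header bytes of an HTK file into a dict."""
    fmt = header_format()
    size = struct.calcsize(fmt)  # 12 bytes for '>IIHH'
    values = struct.unpack(fmt, raw[:size])
    return dict(zip((name for name, _ in HEADER_FIELDS), values))


# Example header bytes with made-up values
example = struct.pack(">IIHH", 1000, 2500, 4, 9)
header = parse_header(example)
```

Two unsigned 32-bit integers plus two unsigned 16-bit integers give the 12-byte header length stated above.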

param_kind_base = {'DISCRETE': 10, 'FBANK': 7, 'IREFC': 5, 'LPC': 1, 'LPCDELCEP': 4, 'LPCEPSTRA': 3, 'LPCREFC': 2, 'MELSPEC': 8, 'MFCC': 6, 'USER': 9, 'WAVEFORM': 0}

Dictionary describing the basic parameter kind codes.

WAVEFORM = 0 : sampled waveform
LPC = 1 : linear prediction filter coefficients
LPCREFC = 2 : linear prediction reflection coefficients
LPCEPSTRA = 3 : LPC cepstral coefficients
LPCDELCEP = 4 : LPC cepstra plus delta coefficients
IREFC = 5 : LPC reflection coefficient in 16 bit integer format
MFCC = 6 : mel-frequency cepstral coefficients
FBANK = 7 : log mel-filter bank channel outputs
MELSPEC = 8 : linear mel-filter bank channel outputs
USER = 9 : user-defined sample kind
DISCRETE = 10 : vector quantised data

param_kind_encoding = {'_A': 512, '_C': 1024, '_D': 256, '_E': 64, '_K': 4096, '_N': 128, '_O': 8192, '_Z': 2048}

Dictionary describing the parameter kind encodings.

_E = 0000100 : has energy
_N = 0000200 : absolute energy suppressed
_D = 0000400 : has delta coefficients
_A = 0001000 : has acceleration (delta-delta) coefficients
_C = 0002000 : is compressed
_Z = 0004000 : has zero mean static coefficients
_K = 0010000 : has CRC checksum
_O = 0020000 : has 0th cepstral coefficient
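A parameter_kind code combines one base kind with zero or more qualifier flags. The sketch below splits a code into both parts using the two dictionaries above. The function decode_parameter_kind is hypothetical, and the 6-bit base mask (octal 77) is taken from the HTK file format rather than from this module.

```python
PARAM_KIND_BASE = {
    "WAVEFORM": 0, "LPC": 1, "LPCREFC": 2, "LPCEPSTRA": 3, "LPCDELCEP": 4,
    "IREFC": 5, "MFCC": 6, "FBANK": 7, "MELSPEC": 8, "USER": 9, "DISCRETE": 10,
}
PARAM_KIND_ENCODING = {
    "_E": 0o100, "_N": 0o200, "_D": 0o400, "_A": 0o1000,
    "_C": 0o2000, "_Z": 0o4000, "_K": 0o10000, "_O": 0o20000,
}
BASE_MASK = 0o77  # assumption: the low 6 bits hold the base kind


def decode_parameter_kind(code):
    """Split a parameter_kind code into (base name, sorted qualifier list)."""
    base_code = code & BASE_MASK
    base_name = next(
        (name for name, value in PARAM_KIND_BASE.items() if value == base_code),
        None,  # unknown base code
    )
    qualifiers = sorted(
        name for name, bit in PARAM_KIND_ENCODING.items() if code & bit
    )
    return base_name, qualifiers


# MFCC with energy and delta coefficients: 6 + 64 + 256 = 326
mfcc_e_d = decode_parameter_kind(6 | 0o100 | 0o400)
plain_user = decode_parameter_kind(9)
```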

class nsds_lab_to_nwb.tools.htk.readers.instrument.EPhysInstrumentData(htkdir, prefix, postfix, device_name=None, device_image_name=None, description=None, layout=None, location=None, read_on_create=True, **kwargs)[source]

Bases: object

Describe the data for a particular recording device

read_data(create_iterator=False, print_status=False, time_axis_first=True, has_bands=True)[source]

Read the data for all channels

Parameters

create_iterator – If set to True, then instead of reading the HTK data we create a HTKChannelIterator object that we can use to iteratively read the channels when we need them
time_axis_first – If set to True, use the dimension order (time, electrode, band). If False, use the default ordering (electrode, time, band).
print_status – One of [True, False, ‘jupyter’]. True prints status messages on read progress to the screen. ‘jupyter’ creates a progress bar in a Jupyter notebook. False shows no progress. Default is False.
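The two dimension orders named for time_axis_first differ only by swapping the first two axes. A minimal NumPy sketch with made-up array sizes (this is not the library's own code path):

```python
import numpy as np

# Illustrative array in the default order (electrode, time, band)
electrode_first = np.arange(4 * 5 * 3).reshape(4, 5, 3)

# Move the time axis to the front: (time, electrode, band)
time_first = np.moveaxis(electrode_first, 1, 0)
```

np.moveaxis returns a view, so no data is copied by the reordering itself.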

Returns

class nsds_lab_to_nwb.tools.htk.readers.instrument.EPhysInstrumentLayout[source]

Bases: object

Define the layout for different EPhys instruments

static grid(orientation, nelect=64, xspacing=0.2, yspacing=0.2)[source]

Parameters

orientation – Char with the channel orientation. One of ‘S’ or ‘R’
nelect – Number of electrodes in the grid
xspacing – The spacing to be used to compute the electrodes x positions
yspacing – The spacing to be used to compute the electrodes y positions

Returns

array with the layout index for the grid
array with the positions of the electrodes or None
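How xspacing and yspacing could translate into electrode positions can be sketched for a square grid. This is an illustration under stated assumptions, not the grid implementation: the function grid_positions is hypothetical, it numbers electrodes row by row, and it ignores the orientation argument, whose exact semantics are not described here.

```python
import numpy as np


def grid_positions(nelect=64, xspacing=0.2, yspacing=0.2):
    """Return (layout, positions) for a square grid of electrodes."""
    side = int(round(np.sqrt(nelect)))
    if side * side != nelect:
        raise ValueError("this sketch only handles square grids")
    # Assumption: electrodes are numbered row by row, starting at 0.
    rows, cols = np.divmod(np.arange(nelect), side)
    layout = np.arange(nelect).reshape(side, side)
    positions = np.column_stack((cols * xspacing, rows * yspacing))
    return layout, positions


layout, positions = grid_positions()
```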

static polytrode(ncols=2)[source]

Get the layout for the polytrode

Parameters

ncols – Integer indicating the number of columns in the polytrode. One of 2,3.

Returns

array with the layout index for the polytrode
array with the positions of the electrodes or None

static polytrode_position_in_grid(orientiation, xspacing=0.2, yspacing=0.2)[source]

The location of the polytrode with respect to the ephys grid

Parameters

orientiation – Char with the channel orientation of the ephys grid. One of ‘S’ or ‘R’
xspacing – Spacing of the electrodes in x
yspacing – Spacing of the electrodes in y

Returns

Two numpy arrays of two floats. The first array is the (x,y) index location in the grid, and the second array is the spatial (x,y) location.