API Reference
This section documents the main user-facing functions and classes provided by SpikeSift. All internal modules and helper functions are omitted for clarity.
- class spikesift.Recording(*, binary_file, data_type, probe_geometry, sampling_frequency, num_samples=None, header=0, sample_offset=0, recording_offset=0)
Represents an extracellular recording stored in a flat binary file.
This class manages metadata and provides efficient access to raw voltage data for spike sorting. It assumes a flat binary layout with channels interleaved sample-wise.
- Parameters:
binary_file (str) – Path to the binary file containing the raw recording.
data_type (dtype) – NumPy-compatible data type (e.g., float32, int16).
probe_geometry (ndarray of shape (recording_channels, 2)) – Spatial coordinates (in micrometers) of each recording channel.
sampling_frequency (float) – Sampling rate in Hz. Must be at least 1000.
num_samples (int, optional) – Total number of samples to load. If omitted, the number is inferred from file size and header.
header (int, optional (default=0)) – Number of bytes to skip at the beginning of the file.
sample_offset (int, optional (default=0)) – Number of samples to skip after the header.
recording_offset (int, optional (default=0)) – Logical start time in samples, used for aligning or merging segments. Does not affect how data are read.
Warning
After creation, this object should be treated as read-only.
Binary layout must be flat and channel-interleaved sample-wise.
The order of channels in probe_geometry must match the binary file.
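The layout requirement can be illustrated without SpikeSift. The following is a minimal sketch, using only the Python standard library, that writes a tiny channel-interleaved int16 file and infers the sample count from the file size and header. The helper name `infer_num_samples`, the demo values, and the exact formula (including subtracting `sample_offset`) are assumptions for illustration, not SpikeSift's implementation.

```python
import os
import struct
import tempfile

NUM_CHANNELS = 4                   # rows of probe_geometry (demo value)
HEADER_BYTES = 16                  # value that would be passed as `header`
ITEMSIZE = struct.calcsize("<h")   # int16 occupies 2 bytes


def infer_num_samples(path, *, header, num_channels, itemsize, sample_offset=0):
    """Hypothetical helper: samples left after the header and sample_offset."""
    payload_bytes = os.path.getsize(path) - header
    return payload_bytes // (num_channels * itemsize) - sample_offset


# Sample-wise interleaving: all channels of sample 0, then all of sample 1, ...
samples = [[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]]

with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as f:
    f.write(b"\x00" * HEADER_BYTES)               # opaque header bytes
    for sample in samples:
        f.write(struct.pack("<4h", *sample))      # one value per channel
    demo_path = f.name

inferred = infer_num_samples(
    demo_path, header=HEADER_BYTES, num_channels=NUM_CHANNELS, itemsize=ITEMSIZE
)
os.remove(demo_path)
print(inferred)  # 3
```

A file written channel by channel (all samples of channel 0 first) would have the same size but would not satisfy the layout requirement.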
- VALID_DATA_TYPES = ('int8', 'uint8', 'int16', 'uint16', 'int32', 'uint32', 'int64', 'uint64', 'float32', 'float64')
- read(*, start, num_samples)
Reads a segment of the binary recording.
- Parameters:
start (int) – Sample index to begin reading, after accounting for header and sample_offset.
num_samples (int) – Number of consecutive samples to read.
- Returns:
Extracted signal data as a NumPy array.
- Return type:
ndarray of shape (num_samples, recording_channels)
Warning
This method is intended for debugging and manual inspection only.
SpikeSift handles all necessary data access internally during sorting.
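How `start` combines with `header` and `sample_offset` can be sketched as plain byte arithmetic. The function below is a hypothetical stand-in for the documented behavior, not the library's code: it assumes little-endian int16 data and returns nested lists rather than an ndarray.

```python
import io
import struct


def read_segment(stream, *, start, num_samples, num_channels,
                 header=0, sample_offset=0, fmt="h"):
    """Hypothetical stand-in: read `num_samples` frames beginning at `start`."""
    frame_bytes = num_channels * struct.calcsize("<" + fmt)
    # `start` counts samples after both the header and sample_offset.
    stream.seek(header + (sample_offset + start) * frame_bytes)
    raw = stream.read(num_samples * frame_bytes)
    flat = struct.unpack(f"<{num_samples * num_channels}{fmt}", raw)
    # Regroup the flat values into rows of shape (num_samples, num_channels).
    return [list(flat[i * num_channels:(i + 1) * num_channels])
            for i in range(num_samples)]


# Demo data: 8-byte header, then 5 samples of 2 channels, sample k = (k, -k).
payload = b"".join(struct.pack("<2h", k, -k) for k in range(5))
stream = io.BytesIO(b"\x00" * 8 + payload)

segment = read_segment(stream, start=1, num_samples=2, num_channels=2,
                       header=8, sample_offset=1)
print(segment)  # [[2, -2], [3, -3]]
```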
- validate(*, verbose=False)
Finalizes setup and verifies recording consistency.
- Parameters:
verbose (bool, optional) – If True, prints a summary of the recording.
- Raises:
ValueError – If any of the file, geometry, or offset parameters are invalid.
Warning
This method is called automatically during spike sorting.
Manual calls are typically only necessary for debugging or inspection.
- class spikesift.core.SortedRecording(*, sorted_segments, assignment_chain, probe_geometry)
Represents a fully sorted and drift-corrected extracellular recording.
This class merges spike clusters across multiple independently sorted segments, and provides access to global spike times, amplitude vectors, and segment boundaries.
- Parameters:
sorted_segments (list of SortedSegment (internal)) – List of sorted segments, each containing spike clusters and amplitude representations.
assignment_chain (list of ndarray of shape (num_clusters,)) –
One-to-one mappings between adjacent segments.
Each array maps cluster indices from one segment to the next.
Unassigned entries are marked with -1.
probe_geometry (ndarray of shape (recording_channels, 2)) – 2D electrode layout used for drift compensation.
Warning
Do not modify sorted_segments, assignment_chain, or probe_geometry in place. They are shared across recordings and treated as immutable.
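The assignment_chain format can be illustrated with plain lists. This is a sketch of how one cluster could be followed from the first segment to the last; `follow_chain` and the example chain are hypothetical, and how SortedRecording derives its own cluster IDs from the chain is not documented here.

```python
def follow_chain(assignment_chain, first_index):
    """Follow a first-segment cluster through every later segment.

    Returns the cluster index in each segment, or None if any link is -1.
    """
    path = [first_index]
    for mapping in assignment_chain:
        next_index = int(mapping[path[-1]])
        if next_index == -1:        # unassigned: the cluster is lost here
            return None
        path.append(next_index)
    return path


# Two mappings connect three segments; each list plays the role of one array.
chain = [
    [1, 0, -1],   # segment 0 -> segment 1 (cluster 2 has no counterpart)
    [2, 0, 1],    # segment 1 -> segment 2
]

print(follow_chain(chain, 0))  # [0, 1, 0]
print(follow_chain(chain, 1))  # [1, 0, 2]
print(follow_chain(chain, 2))  # None
```

Under this reading, only clusters whose path never hits -1 are present in every segment.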
- all_spikes()
Returns spike times for all valid clusters.
- Returns:
Dictionary mapping cluster IDs to spike times.
- Return type:
dict of int -> ndarray
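A typical use of this dictionary is a per-cluster summary. The sketch below computes mean firing rates using only the documented methods all_spikes(), start_time(), and end_time(). The helper `firing_rates` and the stub class are hypothetical; the stub exists only so the example runs without SpikeSift. The sampling frequency is passed in explicitly because SortedRecording is not documented as storing it.

```python
def firing_rates(sorted_recording, sampling_frequency):
    """Mean firing rate in Hz for every valid cluster."""
    duration_samples = sorted_recording.end_time() - sorted_recording.start_time()
    duration_s = duration_samples / sampling_frequency
    return {
        cluster_id: len(times) / duration_s
        for cluster_id, times in sorted_recording.all_spikes().items()
    }


class StubSortedRecording:
    """Stand-in exposing three documented methods with made-up values."""

    def all_spikes(self):
        return {0: [100, 5000, 20000], 3: [42, 59000]}

    def start_time(self):
        return 0

    def end_time(self):
        return 60000


rates = firing_rates(StubSortedRecording(), sampling_frequency=30000.0)
print(rates)  # {0: 1.5, 3: 1.0}
```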
- amplitude_vectors(cluster_id)
Returns the amplitude vectors for a single cluster across all segments.
- Parameters:
cluster_id (int) – ID of the cluster.
- Returns:
Amplitude vector for each segment.
- Return type:
ndarray of shape (num_segments, recording_channels)
- Raises:
ValueError – If the cluster ID is not valid for this recording.
Warning
Values reflect both spike-related activity and background fluctuations, and may be nonzero even on channels where the neuron is inactive.
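One way to inspect the returned array is to track the peak channel across segments. The sketch below is an illustration under stated assumptions: the helper `peak_channel_depths` is hypothetical, the data are made up, and the second column of probe_geometry is assumed to be the vertical coordinate. Because the vectors also contain background fluctuations, the peak channel is only a coarse indicator of position.

```python
def peak_channel_depths(amplitudes, probe_geometry):
    """Vertical coordinate of the largest-|amplitude| channel, per segment."""
    depths = []
    for row in amplitudes:
        peak = max(range(len(row)), key=lambda ch: abs(row[ch]))
        depths.append(probe_geometry[peak][1])   # assumed: column 1 is vertical
    return depths


# Made-up geometry: four channels on one column, 20 micrometers apart.
geometry = [[0, 0], [0, 20], [0, 40], [0, 60]]

# Made-up amplitude vectors for one cluster, shape (num_segments, channels).
amplitudes = [
    [-5, -80, -12, -3],   # segment 0: strongest on channel 1
    [-4, -30, -75, -6],   # segment 1: strongest on channel 2
]

depths = peak_channel_depths(amplitudes, geometry)
print(depths)  # [20, 40]
```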
- cluster_ids()
Returns all valid cluster IDs for this recording.
- Returns:
Set of cluster IDs that are valid across the entire recording.
- Return type:
set of int
Warning
IDs may refer to different units across different SortedRecording objects.
To compare clusters between recordings, use map_clusters().
- end_time()
Returns the global end time of the recording (in samples).
- Returns:
End time in samples.
- Return type:
int
- segment_boundaries()
Returns start and end sample indices for all segments.
- Returns:
List of (start_sample, end_sample) pairs, one per segment.
- Return type:
list of tuple
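The boundary pairs can be combined with spike times to count spikes per segment. The sketch below assumes that spike times are sorted and that each pair is half-open (start included, end excluded); neither is stated in this reference, and `spikes_per_segment` is a hypothetical helper.

```python
from bisect import bisect_left


def spikes_per_segment(spike_times, boundaries):
    """Count spikes falling in each (start_sample, end_sample) pair."""
    return [
        bisect_left(spike_times, end) - bisect_left(spike_times, start)
        for start, end in boundaries
    ]


# Made-up values standing in for segment_boundaries() and spikes(cluster_id).
boundaries = [(0, 300000), (300000, 600000)]
spike_times = [10, 299999, 300000, 450000, 599999]

counts = spikes_per_segment(spike_times, boundaries)
print(counts)  # [2, 3]
```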
- spikes(cluster_id)
Returns spike times for the specified cluster.
- Parameters:
cluster_id (int) – The cluster ID to retrieve.
- Returns:
1D NumPy array of spike times for the selected cluster.
- Return type:
ndarray
- Raises:
ValueError – If the cluster ID is not valid for this recording.
Warning
Cluster IDs are only valid within this SortedRecording instance.
To avoid invalid lookups, use .cluster_ids() to retrieve the set of valid cluster IDs.
- split_into_segments()
Splits the recording into its original unmerged segments.
- Returns:
List of recordings, one per original segment.
- Return type:
list of SortedRecording
- start_time()
Returns the global start time of the recording (in samples).
- Returns:
Start time in samples.
- Return type:
int
- valid_cluster_id(cluster_id)
Checks whether a cluster ID is valid across the entire recording.
- Parameters:
cluster_id (int) – The cluster ID to validate.
- Returns:
True if the cluster is consistently matched across all segments; False otherwise.
- Return type:
bool
Warning
A cluster is considered valid only if it is present in every segment of the recording.
Clusters that disappear or fragment in later segments will return False.
- spikesift.perform_spike_sorting(recording, *, min_segment_length=10, detection_sensitivity=10, min_spikes_per_cluster=5, merging_threshold=0.4, max_drift=30, detection_polarity=-1, verbose=True)
Performs complete spike sorting on an extracellular recording.
- Parameters:
recording (Recording) – The input recording object.
min_segment_length (float, optional (default=10)) –
Minimum segment duration (in seconds) for adaptive segmentation.
Controls how the recording is partitioned
Should be at least 0.1 seconds
Values below 0.1 are automatically clipped to 0.1
If the recording itself is shorter than this, it is processed as a single segment
detection_sensitivity (float, optional (default=10)) –
Multiplier for spike detection thresholds.
Must be positive
Higher values reduce false positives, but may miss weaker spikes
Lower values increase sensitivity, but may introduce noise
min_spikes_per_cluster (float, optional (default=5)) –
Minimum number of spikes required for a cluster to be considered valid.
Should be at least 2
Values below 2 are silently clipped to 2
Although spike counts are integers, this threshold is treated as a float and compared directly
merging_threshold (float, optional (default=0.4)) –
Similarity threshold for merging clusters based on spatial waveform differences.
Must be between 0 and 1 (exclusive)
Higher values allow more aggressive merging
Lower values enforce stricter separation
max_drift (float, optional (default=30)) –
Maximum vertical shift (in micrometers) used for aligning clusters across segments.
Must be non-negative
Internally rounded to the nearest multiple of 5
Larger values enable alignment over larger displacements
detection_polarity (float, optional (default=-1)) –
Scalar applied to the signal prior to spike detection.
Use -1.0 to detect negative-going spikes (default)
Use +1.0 to detect positive-going spikes
Any other nonzero value is allowed; only the sign affects detection
verbose (bool, optional (default=True)) – If True, displays a progress bar and recording information.
- Returns:
A fully sorted recording, including spike times, cluster identities, and amplitude vectors.
- Return type:
SortedRecording
- Raises:
ValueError – If any input parameter is invalid or improperly typed.
Warning
Recordings shorter than 10 milliseconds cannot be processed and will raise an error.
SpikeSift requires at least 4 channels for spike sorting.
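The clipping and rounding rules listed for the parameters can be summarized in a few lines. This is a sketch of the documented behavior only: `normalize_parameters` is a hypothetical helper, not part of SpikeSift, and the treatment of exact ties when rounding max_drift (for example 12.5) is not specified in this reference.

```python
def normalize_parameters(*, min_segment_length=10, min_spikes_per_cluster=5,
                         max_drift=30):
    """Apply the documented clipping and rounding to three parameters."""
    if max_drift < 0:
        raise ValueError("max_drift must be non-negative")
    return {
        # Values below 0.1 seconds are clipped to 0.1.
        "min_segment_length": max(float(min_segment_length), 0.1),
        # Values below 2 are clipped to 2; the threshold stays a float.
        "min_spikes_per_cluster": max(float(min_spikes_per_cluster), 2.0),
        # Rounded to the nearest multiple of 5 micrometers. Python's round()
        # sends exact ties to the even neighbor; the library may differ.
        "max_drift": 5.0 * round(max_drift / 5.0),
    }


params = normalize_parameters(min_segment_length=0.01,
                              min_spikes_per_cluster=1,
                              max_drift=32)
print(params)
# {'min_segment_length': 0.1, 'min_spikes_per_cluster': 2.0, 'max_drift': 30.0}
```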
- spikesift.merge_recordings(sorted_recordings, *, max_drift=30)
Aligns and merges multiple independently sorted recordings into a unified result.
- Parameters:
sorted_recordings (list of SortedRecording) –
List of independently sorted recordings to be merged. Each entry must:
Contain at least one valid segment
Use the same probe geometry
Be sorted in time and have non-overlapping segments
max_drift (float, optional (default=30)) –
Maximum vertical shift (in micrometers) allowed when aligning clusters across segments.
Must be non-negative
Internally rounded to the nearest multiple of 5
Higher values allow alignment over larger displacements
- Returns:
A single merged recording containing all aligned spike clusters.
- Return type:
SortedRecording
- Raises:
ValueError – If the input list is empty, contains invalid types, includes inconsistent geometries, or includes overlapping segment time ranges.
Warning
This function assumes all inputs were produced by SpikeSift and remain unmodified.
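The ordering requirement on the inputs can be checked before merging. The sketch below validates a list of (start, end) pairs such as those obtained from start_time() and end_time() of each recording. `check_time_ranges` is a hypothetical helper, and it assumes that an end time equal to the next start time does not count as an overlap.

```python
def check_time_ranges(ranges):
    """Raise ValueError unless ranges are time-ordered and non-overlapping."""
    for index, ((_, prev_end), (next_start, _)) in enumerate(zip(ranges, ranges[1:])):
        if next_start < prev_end:
            raise ValueError(
                f"recording {index + 1} starts at {next_start}, "
                f"before recording {index} ends at {prev_end}"
            )


# Made-up (start_time, end_time) pairs, in samples.
ordered = [(0, 100), (100, 250), (400, 500)]
overlapping = [(0, 100), (90, 250)]

check_time_ranges(ordered)            # passes silently

try:
    check_time_ranges(overlapping)
    overlap_detected = False
except ValueError as error:
    overlap_detected = True
    print(error)
```

Since recording_offset sets the logical start time of a Recording, recordings sorted from separate files presumably need distinct offsets to satisfy this check.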
- spikesift.map_clusters(source, target, *, max_drift=30)
Computes a one-to-one mapping from clusters in source to their counterparts in target.
- Parameters:
source (SortedRecording) – First sorted recording to compare.
target (SortedRecording) – Second sorted recording to compare.
max_drift (float, optional (default=30)) –
Maximum vertical displacement (in micrometers) used during alignment.
Must be non-negative
Internally rounded to the nearest multiple of 5
Higher values permit alignment across larger drift magnitudes
- Returns:
Mapping from cluster IDs in source to corresponding cluster IDs in target. Only valid, unambiguous one-to-one matches are included.
- Return type:
dict of int -> int
- Raises:
ValueError – If inputs are invalid or incompatible (e.g., mismatched geometry).
Warning
This function assumes that both source and target were generated using SpikeSift and have not been manually modified.
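Because only unambiguous matches are included, source clusters absent from the returned dictionary have no counterpart in target. The sketch below shows two small operations on such a dictionary. The helper names and the example values are hypothetical; the dictionary stands in for a map_clusters() result and the set for source.cluster_ids().

```python
def unmatched_clusters(source_ids, mapping):
    """Source cluster IDs that have no counterpart in the mapping."""
    return set(source_ids) - set(mapping)


def invert_mapping(mapping):
    """Target-to-source view of a one-to-one mapping."""
    inverse = {target_id: source_id for source_id, target_id in mapping.items()}
    if len(inverse) != len(mapping):
        raise ValueError("mapping is not one-to-one")
    return inverse


# Made-up stand-ins for source.cluster_ids() and map_clusters(source, target).
source_ids = {0, 1, 2}
mapping = {0: 4, 2: 1}

print(unmatched_clusters(source_ids, mapping))  # {1}
print(invert_mapping(mapping))                  # {4: 0, 1: 2}
```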