torch_specinv.methods

torch_specinv.methods.ADMM(spec, max_iter=1000, tol=1e-06, rho=0.1, verbose=1, eva_iter=10, metric='sc', **stft_kwargs)

Reconstruct spectrogram phase using Griffin–Lim Like Phase Recovery via Alternating Direction Method of Multipliers.

Parameters:

spec (Tensor) – the input tensor of size \((N \times T)\) (magnitude) or \((N \times T \times 2)\) (complex input). If a magnitude spectrogram is given, the phase is first initialized using torch_specinv.methods.phase_init(); otherwise the iteration starts from the complex input.
max_iter (int) – maximum number of iterations before timing out.
tol (float) – tolerance of the stopping condition, based on the L2 loss. Default: 1e-6
rho (float) – non-negative speedup parameter. A small value is preferable when the input spectrogram is noisy (imperfect); setting it to 1 makes the method behave similarly to griffin_lim. Default: 0.1
verbose (bool) – whether to be verbose. Default: True
eva_iter (int) – evaluation interval, in iterations. Every eva_iter iterations, the function selected by metric is evaluated. Default: 10
metric (str) – evaluation function. Currently available functions: 'sc' (spectral convergence), 'snr' or 'ser'. Default: 'sc'
**stft_kwargs – other arguments passed to torch.stft().
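As a rough illustration of what the metric options measure, the sketch below computes spectral convergence and a signal-to-noise ratio from their common definitions. The function names spectral_convergence and snr_db are hypothetical, and the library's own 'sc', 'snr' and 'ser' implementations may differ in detail.

```python
import torch


def spectral_convergence(estimate: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Norm of the error between the spectrograms, relative to the target's norm."""
    return torch.norm(estimate - target) / torch.norm(target)


def snr_db(estimate: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Ratio of the target norm to the error norm, expressed in decibels."""
    return 20 * torch.log10(torch.norm(target) / torch.norm(estimate - target))


# Toy magnitudes: the target has norm 5 and the error has norm 0.5.
target = torch.tensor([[3.0, 0.0], [0.0, 4.0]])
estimate = target + torch.tensor([[0.3, 0.0], [0.0, 0.4]])

print(spectral_convergence(estimate, target))  # approximately 0.1
print(snr_db(estimate, target))                # approximately 20 dB
```

Lower spectral convergence and higher SNR both indicate that the magnitude of the reconstruction is closer to the given spectrogram.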

Returns:

A 1d tensor converted from the given spectrogram

torch_specinv.methods.L_BFGS(spec, transform_fn, samples=None, init_x0=None, outer_max_iter=1000, tol=1e-06, verbose=1, eva_iter=10, metric='sc', **kwargs)

Reconstruct spectrogram phase using Inversion of Auditory Spectrograms, Traditional Spectrograms, and Other Envelope Representations. This implementation uses the torch.optim.LBFGS optimizer provided in PyTorch directly. The method is not restricted to the traditional short-time Fourier transform; it works with any representation (e.g. a Mel-scaled spectrogram) as long as the transform function is differentiable.

Parameters:

spec (Tensor) – the input representation.
transform_fn – a function of the form spec = transform_fn(x), where x is a 1d tensor.
samples (int, optional) – number of time-domain samples. Default: None
init_x0 (Tensor, optional) – a 1d tensor used as the initial time-domain samples. If not provided, a tensor of random values with length equal to samples is used.
outer_max_iter (int) – maximum number of iterations before timing out.
tol (float) – tolerance of the stopping condition, based on the L2 loss. Default: 1e-6
verbose (bool) – whether to be verbose. Default: True
eva_iter (int) – evaluation interval, in iterations. Every eva_iter iterations, the function selected by metric is evaluated. Default: 10
metric (str) – evaluation function. Currently available functions: 'sc' (spectral convergence), 'snr' or 'ser'. Default: 'sc'
**kwargs – other arguments passed to torch.optim.LBFGS.

Returns:

A 1d tensor converted from the given representation
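The idea of optimizing time-domain samples through a differentiable transform_fn can be sketched with plain PyTorch. This is a minimal illustration, not the body of L_BFGS: the STFT settings, the mean-squared-error loss, the test signal, and the small random initialization are all assumptions chosen for the example.

```python
import math
import torch

n_fft, hop = 256, 64
window = torch.hann_window(n_fft)


def transform_fn(x: torch.Tensor) -> torch.Tensor:
    """Differentiable map from a 1d signal to a magnitude spectrogram."""
    stft = torch.stft(x, n_fft, hop_length=hop, window=window, return_complex=True)
    return stft.abs()


torch.manual_seed(0)
t = torch.arange(2048) / 2048.0
reference = torch.sin(2 * math.pi * 40.0 * t)
spec = transform_fn(reference)  # the representation to invert

# Random initial samples, mirroring the behaviour described for init_x0=None.
x = (0.01 * torch.randn(2048)).requires_grad_()
optimizer = torch.optim.LBFGS([x], max_iter=20, line_search_fn="strong_wolfe")


def closure():
    # LBFGS re-evaluates the loss several times per step, hence the closure.
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(transform_fn(x), spec)
    loss.backward()
    return loss


initial_loss = closure().item()
for _ in range(5):  # a few outer iterations
    optimizer.step(closure)
final_loss = closure().item()
print(initial_loss, final_loss)
```

Any other differentiable transform, such as a Mel filterbank applied to the magnitude, could replace transform_fn here without changing the optimization loop.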

torch_specinv.methods.RTISI_LA(spec, look_ahead=-1, asymmetric_window=False, max_iter=25, alpha=0.99, verbose=1, **stft_kwargs)

Reconstruct spectrogram phase using Real-Time Iterative Spectrogram Inversion with Look Ahead (RTISI-LA).

Parameters:

spec (Tensor) – the input tensor of size \((N \times T)\) (magnitude).
look_ahead (int) – how many future frames are considered. -1 sets it to (win_length - 1) / hop_length; 0 disables the look-ahead strategy and falls back to the original RTISI algorithm. Default: -1
asymmetric_window (bool) – whether to apply an asymmetric window on the first iteration for each newly arriving frame.
max_iter (int) – number of iterations for each step.
alpha (float) – speedup parameter used in Fast Griffin-Lim; setting it to zero disables it. Default: 0.99
verbose (bool) – whether to be verbose. Default: True
**stft_kwargs – other arguments passed to torch.stft().

Returns:

A 1d tensor converted from the given spectrogram
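To make the look_ahead=-1 default concrete, the helper below evaluates (win_length - 1) / hop_length for a typical window and hop. The name default_look_ahead is hypothetical, and rounding down to a whole number of frames is an assumption; the documentation above only gives the formula.

```python
def default_look_ahead(win_length: int, hop_length: int) -> int:
    # Assumption: the quotient is rounded down to a whole number of frames.
    return (win_length - 1) // hop_length


# With a 1024-sample window and a hop of 256, each frame overlaps
# three later frames, so three future frames would be used.
print(default_look_ahead(1024, 256))  # 3
```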

torch_specinv.methods.griffin_lim(spec, max_iter=200, tol=1e-06, alpha=0.99, verbose=True, eva_iter=10, metric='sc', **stft_kwargs)

Reconstruct spectrogram phase using the well-known Griffin-Lim algorithm and its variant, Fast Griffin-Lim.

Parameters:

spec (Tensor) – the input tensor of size \((N \times T)\) (magnitude) or \((N \times T \times 2)\) (complex input). If a magnitude spectrogram is given, the phase is first initialized using torch_specinv.methods.phase_init(); otherwise the iteration starts from the complex input.
max_iter (int) – maximum number of iterations before timing out.
tol (float) – tolerance of the stopping condition, based on the L2 loss. Default: 1e-6
alpha (float) – speedup parameter used in Fast Griffin-Lim; setting it to zero disables it. Default: 0.99
verbose (bool) – whether to be verbose. Default: True
eva_iter (int) – evaluation interval, in iterations. Every eva_iter iterations, the function selected by metric is evaluated. Default: 10
metric (str) – evaluation function. Currently available functions: 'sc' (spectral convergence), 'snr' or 'ser'. Default: 'sc'
**stft_kwargs – other arguments passed to torch.stft().

Returns:

A 1d tensor converted from the given spectrogram
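The role of alpha can be seen in a minimal Fast Griffin-Lim loop written with plain PyTorch. This is a sketch under stated assumptions, not the library's implementation: the function name fast_griffin_lim_sketch is hypothetical, a Hann window is assumed, the phase starts from random values rather than phase_init(), and there is no tolerance check or metric evaluation.

```python
import math
import torch


def fast_griffin_lim_sketch(mag, n_fft, hop_length, n_iter=50, alpha=0.99):
    """Illustrative Fast Griffin-Lim loop on an (N, T) magnitude spectrogram."""
    window = torch.hann_window(n_fft)

    def to_signal(c):
        return torch.istft(c, n_fft, hop_length=hop_length, window=window)

    def to_spec(x):
        return torch.stft(x, n_fft, hop_length=hop_length, window=window,
                          return_complex=True)

    # Assumption: random initial phase stands in for phase_init().
    start = torch.polar(mag, 2 * math.pi * torch.rand_like(mag))
    c_prev = to_spec(to_signal(start))
    t = c_prev
    for _ in range(n_iter):
        # Keep the phase of t, impose the known magnitude, then project onto
        # the set of consistent spectrograms (istft followed by stft).
        c = to_spec(to_signal(torch.polar(mag, torch.angle(t))))
        # Momentum step; alpha = 0 reduces this to plain Griffin-Lim.
        t = c + alpha * (c - c_prev)
        c_prev = c
    return to_signal(c_prev)


torch.manual_seed(0)
n_fft, hop = 512, 128
time = torch.arange(4096) / 8000.0
signal = torch.sin(2 * math.pi * 440.0 * time)
mag = torch.stft(signal, n_fft, hop_length=hop, window=torch.hann_window(n_fft),
                 return_complex=True).abs()

x = fast_griffin_lim_sketch(mag, n_fft, hop)
print(x.shape)  # torch.Size([4096])
```

The result is a 1d tensor whose magnitude spectrogram approximates mag; its phase, and therefore its waveform, generally differs from the original signal.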

torch_specinv.methods.phase_init(spec, **stft_kwargs)

A phase initialization function that can be seen as a simplified version of Single Pass Spectrogram Inversion.

Parameters:

spec (Tensor) – the input tensor of size \((* \times N \times T)\) (magnitude).
**stft_kwargs – other arguments passed to torch.stft().

Returns:

The estimated complex-valued spectrogram of size \((N \times T \times 2)\)
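The two sizes used throughout this page can be illustrated with plain PyTorch. The sketch assumes that the trailing dimension of 2 holds the real and imaginary parts, which is the layout produced by torch.view_as_real; the STFT settings and the random test signal are arbitrary choices for the example.

```python
import torch

n_fft, hop = 512, 128
x = torch.randn(4096)
stft = torch.stft(x, n_fft, hop_length=hop, window=torch.hann_window(n_fft),
                  return_complex=True)

magnitude = stft.abs()                # size (N, T): magnitude only
real_imag = torch.view_as_real(stft)  # size (N, T, 2): real and imaginary parts

print(magnitude.shape)  # torch.Size([257, 33])
print(real_imag.shape)  # torch.Size([257, 33, 2])

# The two layouts carry the same information and convert back losslessly.
assert torch.equal(torch.view_as_complex(real_imag), stft)
```

Here N = n_fft / 2 + 1 = 257 frequency bins and T = 33 frames for a 4096-sample signal with centered framing.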