Next: How to Run SIGSPEC Up: SigSpec User's Manual by Previous: Abstract   Contents

What is SIGSPEC?

SIGSPEC (abbreviation of `SIGnificance SPECtrum') is a program that computes a significance spectrum for a time series. It evaluates the Probability Density Function (PDF) of a given DFT amplitude level analytically, making use of the theoretical concept introduced by Reegen (2005, 2007). The False-Alarm Probability, $\Phi_\mathrm{FA}\left( A\right)$, is the probability that an amplitude in the DFT spectrum exceeds a given limit $A$, and is obtained through integration of the PDF (e.g. Scargle 1982). Instead of this frequently used quantity, SIGSPEC calculates the spectral significance (abbreviated by `sig') of an amplitude $A$ by

\begin{displaymath}
\mathrm{sig}\left( A\right) := -\log\left[ \Phi_\mathrm{FA}\left( A\right)\right]\, .
\end{displaymath} (1)

E.g., a sig equal to $5$ indicates that the considered amplitude level is due to noise in one out of $10^5$ cases. This value is used as the default threshold for the termination of the prewhitening sequence.
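
As a minimal sketch of Eq. (1), the conversion between a False-Alarm Probability and a sig can be written in a few lines of Python. The function names are hypothetical and not part of SIGSPEC; the base-10 logarithm is assumed, consistent with the $10^5$ example above.

```python
import math

def sig_from_false_alarm(phi_fa):
    """Spectral significance for a False-Alarm Probability in (0, 1]."""
    return -math.log10(phi_fa)

def false_alarm_from_sig(sig):
    """Inverse relation: False-Alarm Probability for a given sig."""
    return 10.0 ** (-sig)

# A peak that noise alone would produce in one out of 10^5 cases.
example_sig = sig_from_false_alarm(1.0e-5)
```

Here `example_sig` evaluates to 5 (up to floating-point rounding), which is the default threshold mentioned above.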

SIGSPEC performs an iterative process consisting of four steps:

  1. computation of the significance spectrum,
  2. exact determination of the peak with maximum sig,
  3. a MultiSine least-squares fit of the frequencies, amplitudes and phases of all significant signal components detected so far,
  4. prewhitening of the sinusoidal components. The residuals are used as input for the next iteration.
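
The structure of this loop can be illustrated with a strongly simplified Python sketch. It is not SIGSPEC's implementation: the peak is selected by plain DFT amplitude on a fixed frequency grid instead of by sig, and the MultiSine fit of all components is replaced by a linear least-squares fit of a single sinusoid at a fixed frequency. All function names, the amplitude limit, and the demo data are assumptions made for illustration.

```python
import math

def dft_amplitude(times, values, freq):
    """Amplitude of the DFT of (times, values) at a single frequency."""
    re = sum(v * math.cos(2.0 * math.pi * freq * t) for t, v in zip(times, values))
    im = sum(v * math.sin(2.0 * math.pi * freq * t) for t, v in zip(times, values))
    return 2.0 * math.hypot(re, im) / len(times)

def fit_sinusoid(times, values, freq):
    """Least-squares coefficients (a, b) of a*cos + b*sin at a fixed frequency."""
    c = [math.cos(2.0 * math.pi * freq * t) for t in times]
    s = [math.sin(2.0 * math.pi * freq * t) for t in times]
    cc = sum(x * x for x in c)
    ss = sum(x * x for x in s)
    cs = sum(x * y for x, y in zip(c, s))
    yc = sum(v * x for v, x in zip(values, c))
    ys = sum(v * x for v, x in zip(values, s))
    det = cc * ss - cs * cs
    # Solve the 2x2 normal equations by Cramer's rule.
    return (yc * ss - ys * cs) / det, (ys * cc - yc * cs) / det

def prewhiten(times, values, freqs, amp_limit, max_components=10):
    """Iteratively remove the strongest sinusoid until no peak exceeds amp_limit."""
    residuals = list(values)
    components = []
    for _ in range(max_components):
        # Steps 1 and 2 (simplified): scan the spectrum and pick the highest peak.
        peak = max(freqs, key=lambda f: dft_amplitude(times, residuals, f))
        if dft_amplitude(times, residuals, peak) <= amp_limit:
            break
        # Step 3 (simplified): fit one sinusoid instead of a full MultiSine fit.
        a, b = fit_sinusoid(times, residuals, peak)
        components.append((peak, math.hypot(a, b)))
        # Step 4: subtract the fitted sinusoid; the residuals feed the next pass.
        residuals = [r - a * math.cos(2.0 * math.pi * peak * t)
                       - b * math.sin(2.0 * math.pi * peak * t)
                     for t, r in zip(times, residuals)]
    return components, residuals

# Hypothetical demo data: two sinusoids on an equidistant, noise-free grid.
demo_times = [0.1 * i for i in range(200)]
demo_values = [2.0 * math.sin(2.0 * math.pi * 0.5 * t)
               + math.sin(2.0 * math.pi * 1.3 * t + 0.4) for t in demo_times]
demo_freqs = [0.05 * k for k in range(1, 41)]
demo_components, demo_residuals = prewhiten(demo_times, demo_values,
                                            demo_freqs, amp_limit=0.1)
```

In this demo the loop recovers the two input frequencies with their amplitudes and then stops, because the residual spectrum no longer contains a peak above the limit. A real analysis replaces the amplitude criterion by the sig and refits all previously found components at each step.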

If SIGSPEC is called without any special settings, it produces the following output files:

  1. the DFT amplitude spectrum s000000.dat of the original time series, containing also sig and phase,
  2. the DFT amplitude spectrum resspec.dat of the residual time series after prewhitening all significant signal components, containing also sig and phase,
  3. the residual time series residuals.dat after prewhitening all significant signal components,
  4. a result file called result.dat, which contains a list of significant signal components,
  5. MultiSine track files, each of which contains a list of the frequencies, amplitudes and phases for a single sinusoidal component through the prewhitening cascade (pp.[*], [*]).
Further options may be applied to obtain spectra, residuals, and/or result files (p.[*]) for intermediate steps in the prewhitening sequence. The MultiSine fits, which are performed after each prewhitening step, modify the frequencies, amplitudes and phases of previously identified components. If the user examines the resulting signal components and decides not to use all of them, the additional result files provide accurate frequencies, amplitudes and phases for a shorter list of significant sinusoids without re-running the program.

SIGSPEC can produce additional files containing

  1. a spectral window for the given time series (pp.[*], [*]),
  2. a sampling profile (pp.[*], [*]) containing the parameters $\alpha _0\left(\omega\right)$, $\beta _0\left(\omega\right)$, $\theta _0\left(\omega\right)$ determining the dependency of the sig on the time-domain sampling, as well as on frequency and phase in Fourier space (see Reegen 2007),
  3. a preview of the SIGSPEC analysis (pp.[*], [*]),
  4. a Sock Diagram (pp.[*], [*]),
  5. a Phase Distribution Diagram (pp.[*], [*]) containing probability densities for the Fourier phases,
  6. a correlogram for each step of the prewhitening sequence (pp.[*], [*]).
These options are deactivated by default.

Given a sequence of prewhitenings yielding $N$ significant components with associated sigs $\mathrm{sig}\left( A_n\right)$, it is desirable to also know the probability that the entire sequence is valid, i.e. that it does not contain a single erroneous component. The False-Alarm Probability $\Phi _{\mathrm{FA}\,n} = 10^{-\mathrm{sig}\left( A_n\right)}$ of an individual peak is the probability that it is generated by noise. The complementary probability that the considered peak is true is $1 - 10^{-\mathrm{sig}\left( A_n\right)}$. If the individual components are statistically independent, the cumulative probability that all components are real is the product of the individual probabilities,

\begin{displaymath}
1 - \Phi _{\mathrm{FA}} = \prod _{n=1}^N\left(1 - \Phi _{\mathrm{FA}\,n}\right)\: .
\end{displaymath} (2)

Consistently, the cumulative sig is introduced as the negative logarithm of this total False-Alarm Probability for all identified signal components, $\Phi _{\mathrm{FA}}$, and in terms of individual sigs, one obtains
\begin{displaymath}
\mathrm{csig}\left( A_N\right) := - \log\left\lbrace 1 - \prod _{n=1}^N\left[ 1 - 10^{-\mathrm{sig}\left( A_n\right)}\right]\right\rbrace\: .
\end{displaymath} (3)

Consistent with the definition of the sig associated with an amplitude in the DFT spectrum, a cumulative sig of $3$ means that the prewhitening cascade is entirely true in $999$ out of $1\,000$ cases. In other words, in one out of $1\,000$ cases, at least one of the identified components is generated by noise.

Whereas the individual sig of a component in the prewhitening sequence may exceed that of the previously identified maximum, the cumulative sig decreases strictly monotonically with each additional signal component.
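
The cumulative sig can be evaluated directly from a list of individual sigs. The following Python sketch assumes base-10 logarithms and strictly positive sigs; the function name is hypothetical. It uses `math.log1p` and `math.expm1` because the naive expression $1 - \prod\left(1 - 10^{-\mathrm{sig}}\right)$ loses precision when the individual False-Alarm Probabilities are very small.

```python
import math

def cumulative_sig(sigs):
    """Cumulative sig of a non-empty sequence of individual sigs (all > 0)."""
    # Natural log of the probability that every component is real.
    log_all_true = sum(math.log1p(-10.0 ** (-s)) for s in sigs)
    # Total False-Alarm Probability: 1 - product(1 - 10^-sig_n).
    phi_total = -math.expm1(log_all_true)
    return -math.log10(phi_total)

# A single component reproduces its own sig; adding components lowers csig.
csig_one = cumulative_sig([5.0])
csig_two = cumulative_sig([5.0, 5.0])
```

For two components with a sig of 5 each, the cumulative sig is about 4.699, since the total False-Alarm Probability is roughly twice that of a single component.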

The prewhitening loop stops if no sig level above a pre-defined limit is found. As described in ``Program termination'', p.[*], there are three different criteria that may be applied to determine the conditions for program termination:

  1. the number of iterations in the prewhitening sequence,
  2. a lower sig limit for the highest peak in the significance spectrum,
  3. a threshold for the cumulative sig related to a combined probability for all detected frequency components.
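
The three criteria can be combined into a single stopping test, sketched below in Python. The function and parameter names are hypothetical and do not correspond to SIGSPEC keywords, and the rule that any single active criterion ends the loop is an assumption of this sketch. Only the default sig limit of 5 is taken from the text above.

```python
def should_stop(iteration, peak_sig, csig,
                max_iterations=None, sig_limit=5.0, csig_limit=None):
    """Return True if any active termination criterion is met.

    iteration  -- number of prewhitening steps completed so far
    peak_sig   -- sig of the highest peak in the current significance spectrum
    csig       -- cumulative sig of all components accepted so far
    A limit set to None is treated as inactive (assumption of this sketch).
    """
    if max_iterations is not None and iteration >= max_iterations:
        return True
    if sig_limit is not None and peak_sig <= sig_limit:
        return True
    if csig_limit is not None and csig < csig_limit:
        return True
    return False
```

With the defaults, only the sig limit is active, so the loop ends as soon as the highest remaining peak does not exceed a sig of 5.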

The program also supports the subdivision of a time series into a set of intervals and the separate analysis of all these parts in order to monitor frequency changes of signal components with time. This method will be called time-resolved analysis. In this case, the output is somewhat richer, as described in ``Time-resolved Analysis'' (p.[*]).

An inherent problem in the analysis of non-equidistantly sampled time series is aliasing. Due to periodic gaps in the data set, a peak in the amplitude spectrum is accompanied by side peaks. Especially if more than one sinusoidal component is present in the data, the superposition of side peaks may produce a maximum amplitude in the DFT spectrum at a frequency that has nothing in common with the true signal frequencies. Such a misidentification usually corrupts the complete prewhitening sequence from this point on. As pointed out by Reegen (2007), SIGSPEC appears less prone to aliasing than previously used methods, since the noise component is correctly incorporated into the statistical treatment. However, the superposition mentioned above may still lead to erroneous identifications.
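
The origin of the side peaks can be seen in the spectral window of a gapped sampling. The Python sketch below uses the common definition of the spectral window as the DFT amplitude of the sampling itself, normalised to 1 at zero frequency; the function name and the sampling pattern (ten nights with 30 measurements each, time in days) are assumptions made for illustration.

```python
import cmath
import math

def spectral_window(times, freq):
    """Normalised spectral window of a sampling at one frequency."""
    total = sum(cmath.exp(2j * math.pi * freq * t) for t in times)
    return abs(total) / len(times)

# Hypothetical sampling: data only during the first 0.3 d of each of 10 days.
gapped_times = [day + 0.01 * k for day in range(10) for k in range(30)]

window_at_zero = spectral_window(gapped_times, 0.0)  # main lobe
window_at_one = spectral_window(gapped_times, 1.0)   # alias at 1 cycle per day
window_at_half = spectral_window(gapped_times, 0.5)  # between the lobes
```

For this sampling the window at one cycle per day is almost as high as the main lobe, whereas it vanishes halfway in between. Every signal peak is therefore accompanied by side peaks offset by multiples of one cycle per day.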

In order to overcome this potential weakness, SIGSPEC supports the simultaneous calculation of more than one signal component. Instead of picking only the peak associated with maximum sig, a whole set of highest peaks is examined, searching all possible combinations for several iterations in order to obtain the solution providing a minimum rms residual. This function is called AntiAlC (ANTI-ALiasing Correction) mode (p.[*]).

There is a second option to examine multiple peaks simultaneously: a non-sinusoidal periodicity is represented by multiple peaks in the DFT amplitude spectrum. One finds a fundamental frequency, plus one or more harmonics whose frequencies are integer multiples of the fundamental. In astronomical applications, this may occur if shock waves are present in the stellar pulsation or if surface variations are examined. In such a case, it is desirable to take into account not only the fundamental frequency, but also all available harmonics at once. This analysis of harmonics is described on p.[*].

SIGSPEC is capable of analysing multiple time series input files simultaneously. This MultiFile mode (p.[*]) speeds up the computation considerably for time series with the same sampling.

A further option is the evaluation of differential significance spectra (p.[*]). The user may specify target vs. comparison data among the input files. Then SIGSPEC performs a quantitative comparison of the two groups of time series and returns a measure of the probability that a peak in a target dataset is `true', taking into account amplitudes and phases at the corresponding frequency in the comparison spectra. In this context, the term `true' is used in the sense of `not entirely produced by the same variability as present in the comparison data'.

The examples presented here refer to the sample projects available for download at http://www.astro.univie.ac.at/SigSpec.


Piet Reegen 2009-09-23