Periodicity searches - Pulsar search methodology

2.2 Pulsar search methodology

2.2.4 Periodicity searches

Pulsar searching 41

Pulsar searching 42 Performing a DFT through brute-force calculation of Equation 2.8 is computationally expensive, requiringO(N²) arithmetical operations. To reduce the number of operations, and, therefore, the amount of time required to calculate the DFT, we make use of the fast Fourier transform (FFT) algorithm (Cooley and Tukey, 1965). The FFT is a highly efficient algorithm which requires only O(NlogN) operations to calculate an N-point real DFT. Furthermore the FFT may itself be sped up by realisation that Tk ∈ R, meaning that the DFT is symmetric aboutν_Nyquist, with the value ofFN−j simply being the complex conjugate of Fj. Currently the fastest free-ware FFT library available is FFTW⁸. FFTW has become the standard FFT library used in pulsar search software in recent years due to its portability, speed⁹ and ease of use.

In the case of bright normal pulsars, we may detect them through simple examination of the power in each Fourier bin. This is done by creating a power spectrum, where P^j =R(F^j)²+I(F^j)². However, in the case of weak pulsars, especially those with long periods or narrow pulse profiles, we must use several techniques to improve the response of the DFT to pulsar-like signals.

2.2.4.3 Improving DFT frequency response

One of the largest problems associated with using the DFT, is that it has a non-uniform frequency response. For signals with frequencies which fall in the center of a Fourier bin, ν =ν_j, the response is maximised, whereas for signals which fall on the edge of a Fourier bin,ν =νj±1/(2t_obs), the response is minimised. In the worst case scenario the degree of amplitude degradation can be as high as 36% (see e.g. Ransom et al., 2002), meaning that methods should be used to recover this power before searching for periodic signals in the power spectrum.

Through the use ofzero-padding, we may increase the spectral resolution of the DFT. In zero-padding, the time series to be transformed is padded withN_pad data points of mean value, until the frequency resolution of the DFT output, 1/(t_samp[N+N_pad]), reaches a desired level. The addition of such a noiseless stream of constant-valued data to the time series will produce a uniform offset in the amplitude and phase of the Fourier components (see e.g. Bracewell, 2000). As a consequence, the limiting factor in zero-padding is the compute speed and memory usage required to perform very long FFTs.

8http://www.fftw.org/

9FFTW maintains speed by using a variety of different algorithms depending on the size and shape of the data set to be transformed. The algorithm with the lowest number of operations, the so called

‘split-radix’ algorithm (Johnson and Frigo, 2007), is sped up by the ability to factor the data using small prime numbers. It is therefore fastest to perform FFTs on data sets which are a power of two in length.

Pulsar searching 43

0 20 40 60 80 100

Frequency (Hz)

0.0

0.2 0.4 0.6 0.8 1.0

R el at iv e am pl it ud e

40 . 0 Hz (Bin centre)

0 20 40 60 80 100

Frequency (Hz) 40 . 5 Hz (Bin edge)

39.0 39.5 40.0 40.5 41.0

Frequency (Hz)

0.6

0.7 0.8 0.9 1.0

R el at iv e am pl it ud e

non−interpolated interpolated

Figure 2.7: The frequency response of the DFT. Top-left: The response for a signal with a frequency that falls in the centre of a Fourier bin. Top-right: The response for a signal with a frequency that falls halfway between adjacent Fourier bins (solid, blue line) and the recovered response after application of Equation 2.9 (solid, red line).

Bottom: The response as a function of frequency, before (solid, blue line) and after (solid, red line) application of Equation 2.9. Here the width of a Fourier bin is 1 Hz.

Another method of recovering signals that fall off the bin centre, is to compare the power in each bin with that of its two nearest neighbours. By this method, the power in the j^th Fourier component is given by (see e.g. Lorimer and Kramer, 2005),

Pj = max

|Fj−1+Fj|²

2 ,|F|²,|Fj+Fj+1|² 2

. (2.9)

Here, any power from a signal which lies on a bin edge is reconstructed into its nearest whole bin.

The above methods are not mutually exclusive and are often used in tandem when analysing pulsar search data. In certain circumstances, it is useful to pad the time series up to a power-of-two length before processing, as this both increases resolution and can, with certain FFT algorithms, speed up the DFT calculation. After the spectrum has been created, it is always useful to compare neighbouring bins, regardless of the increased spectral resolution.

Pulsar searching 44 2.2.4.4 Spectral whitening

If the observation is predominantly Gaussian noise, then the output of the DFT should also be predominantly Gaussian noise. However, gain fluctuations in the receiver, com-bined with strong, long-period RFI or backend instabilities, can produce large power contributions at the low-frequency end of the Fourier spectrum (Israel and Stella, 1996).

Such excess red noise, significantly alters the statistical properties of the spectrum and can act to mask long-period pulsars from the data. It is therefore necessary to apply methods to whiten the spectrum prior to searching it for periodic signals.

The first step in any whitening procedure is to zero the first bin, the DC bin, in the Fourier spectrum. The value of this bin is simply the mean of the transformed time series¹⁰ and contains, at least for our purposes, no useful information. Dependent on the degree of red-noise power, it is often useful to perform brute-force suppression of the lowest frequency components in the data. In this case the spectrum is zeroed up to the minimum frequency which is to be searched for pulsed signals. Typically this is done up to a frequency of∼0.1 Hz, although it should be noted that for longer observations the red noise components in the data may be more easily modelled, allowing for this limit to be reduced.

After the lowest frequency components have been dealt with, the spectrum may be whitened by subtraction of a running mean or median (see e.g. Israel and Stella, 1996;

Ransom et al., 2002). Although calculation of a running mean is computationally faster than calculation of a running median, the running median is used more often due to its lack of sensitivity to outlying data points. Special care must be taken in selection of the window sizes for the median filter, as signals which drift in frequency spread their power over a number of adjacent Fourier bins. In cases where the filter window is comparable in size to the number of Fourier bins over which a signal has drifted, the application of the filter can result in suppression of the signal. As signals which drift in frequency are characteristic of pulsars in binary systems, the filter window size is often chosen to have frequency-dependent width, with windows at higher frequencies being larger than those at lower frequencies.

After the whitening process the Fourier components will have a mean of zero. By further normalising the spectrum such that it has a variance of one, the S/N of any signal in the power spectrum is simply its Fourier amplitude, p

Pj.

10If we look at Equation 2.8, in the case of the first Fourier bin (i.e. k = 0), the last term becomes e^−i2πj^N⁰ = 1, makingF0= _N¹

N−1

k=0

Tk.

Pulsar searching 45

Figure 2.8: A comparison of spectral whitening using a running median filter for a 3 Hz sine wave in white (Gaussian) and red noise. Top panels: Two time series containing a 3 Hz sine wave. Both time series are composed of Gaussian noise, with the right-hand time series containing strong low-frequency noise. Middle panels: The Fourier spectrum of each time series. Lower panels: The result of the application of a running median filter with frequency dependent width. At 0 Hz the filter width is 3 bins, while at 5 Hz the filter width is 13 bins. A brute-force suppression of the power

below 1.0 Hz is also performed.

2.2.4.5 Harmonic summing

Of the pulsars with measured pulse widths in the ATNF pulsar catalogue¹¹, only 1%

have duty-cycles greater than 30%. A consequence of these narrow duty-cycles is that the signals from pulsars are not a clean tones in the Fourier spectrum. Instead, the power from the pulse is spread over the fundamental and a number of harmonics.

To examine the distribution of power throughout the harmonics, we shall consider the behaviour of a pulsed signal with a 10% duty-cycle and a profile given by a ‘top-hat’

function. The Fourier transform of this top-hat function is proportional to sinc(πδ) (see e.g. Bracewell, 2000), whereδ is the duty-cycle of the pulse. Similarly the Fourier transform of a train of top-hat pulses will be a train of delta functions modulated by the shape of the transform of a single pulse. Figure 2.9 shows the modulation of harmonics for our pulsed signal in the case where it has a fundamental frequency of 10 Hz. It is clear to see from Figure 2.9 that the harmonic envelope has its first null atf = 1/(P δ) = 1/W,

11http://www.atnf.csiro.au/people/pulsar/psrcat/(Manchester et al., 2005)

Pulsar searching 46

0 100 200 300 400 500

Frequency (Hz)

Fourieramplitude

DFT of pulse train DFT of folded profile

0.0 0.2 0.4 0.6 0.8 1.0 Phase

Amplitude Folded profile

Figure 2.9: The modulation of harmonics from a noiseless top-hat pulse with a duty-cycle of 10% and fundamental frequency of 10 Hz

where W is the width of the pulse. If we consider only harmonics up to the first null, then we can say that a top-hat signal of duty-cycle δ and period P, spreads its power over 1/δ harmonics, spaced by 1/P.

To recover the power distributed in these harmonics, the process ofincoherent harmonic summing (Taylor and Huguenin, 1969) is used. Here, the spectrum is summed with a stretched copy of itself such that all second harmonics are added to their corresponding fundamentals. In the case of the noiseless signal shown in Figure 2.9, this results in almost a doubling of the Fourier amplitude. In reality there will be significant Gaussian noise in the observation. As the summation of two Gaussian distributions with standard deviation σ, is another Gaussian distribution with standard deviation √

2σ, the net increase in S/N achieved through a single incoherent harmonic sum is of the order √

2 (Taylor and Huguenin, 1969). By repeating the summing process for both even and odd harmonics, we may recover a significant portion of the power distributed throughout the spectrum. It should be noted that for wider-profile pulsars, such as the majority of MSPs, summing up to the higher-order harmonics will not significantly improve the signal. In this case the higher-order harmonics will simply add noise to the signal.

For each harmonic fold performed, we search the resultant spectrum for statistically significant signals.

2.2.4.6 Identifying significant signals

To determine which periodic signals are statistically significant, we must first consider the statistical properties of the Fourier power spectrum. Groth (1975) showed that, for a time series composed of both signal and Gaussian noise, the PDF of its Fourier power

Pulsar searching 47 spectrum, after nharmonic sums, may be described by

p(P) = Pⁿ⁻¹

Γ(n) exp^−P, (2.10)

where the Fourier powers have been normalised such that the total power is zero (i.e normalisation by NT¯k, where N is the number of samples in time series Tk (see e.g.

Ransom et al., 2002)) and Γ(n) = (n−1)!. By integrating this relation between our lowest significant power Pmin and infinity we find

p(P >Pmin) =

j=1

P_min^j⁻¹

Γ(j) exp^−P^min, (2.11)

wherep(P >Pmin) is the probability that the power in a given Fourier bin exceedsPmin

for n harmonic folds. By setting this probability such that it gives only a single false positive in n_trials independent Fourier trials, we may estimate a minimum significant power level for a pulsar search.

This approach may also be extended to searches which work with Fourier amplitudes rather than powers. It can be shown (see e.g. Lorimer and Kramer, 2005) that in this case the minimum significant S/N is given by

S/Nmin =

pln [n_trials]−p π/4

p1−π/4 . (2.12)

For signals which exceed S/N_min orPmin, a secondary measurement of their significance can be obtained by reconstruction of the time domain pulse (see e.g. Manchester et al., 2001). This is done by performing the inverse DFT of only those Fourier bins which contain the signal, its harmonics and their complex conjugates. As the reconstructed profile considers the phase of the signal, the effects of random RFI superpositions on the signal’s harmonics are suppressed. By the same logic, the reconstructed profile also selects against sinusoidal signals in the data, as very little signal will be present in the harmonics being back-transformed. This effect can be both desirable and detrimental, as while many RFI signals are sinusoidal in nature, the reconstructed profile also selects against pulsars with wide pulse profiles. The S/N of the reconstructed profile can be measured using standard techniques (see Section 2.2.5.2).

2.2.4.7 Acceleration searching

So far, all the methods described to enhance and detect periodic signals in a time series have only considered signals which have approximately uniform frequency throughout

Pulsar searching 48 the course of an observation. However, many of the most interesting pulsars we know of reside in binary systems. In such systems, changes in the line-of-sight velocity due to the orbital motion of the pulsar, creates a time-varying Doppler correction to its intrinsic frequency. This correction takes the form ν(t) = ν₀(1 +v_t/c), where ν(t) is the frequency of the pulsar at time t, ν0 is the intrinsic frequency of the pulsar and vt is the line-of-sight velocity of the pulsar at time t, where vt << c. The change in apparent frequency across an observation is therefore given by ∆ν =aν₀t_obs/c, assuming that the line-of-sight acceleration, a, is constant (an approximation which is valid for observations wheret_obs .P_orb/10 (Ransom et al., 2003)). By expressing ν^. in terms of a number of Fourier bins, Z, we find that a pulsar of intrinsic period P will spread its power over Z =|at²_obs/P c|Fourier bins during an observation of integration time t_obs. Applying methods which can recover power spread across multiple Fourier bins allows us to detect pulsars in relativistic binary systems.

For all acceleration searches in this thesis, we apply the ‘correlation technique’ of matched filtering in the Fourier domain (Ransom et al., 2002). Here the Fourier am-plitudes are cross-correlated with an analytically predicted signal response function of width Z. The correlation is therefore maximised when the pulsar has an acceleration

|a| = ZP c/t²_obs. To make our search sensitive to a range of possible accelerations, matched filters are applied multiple times with different values ofZ.

2.2.4.8 Candidate improvement

Having found significant accelerated or non-accelerated signals in the Fourier domain, it is possible to improve the frequency response of the DFT usingFourier interpolation.

For a hypothetical signal at a frequency νr, which is less than half a Fourier bin from frequencyν_k, the DFT response can be shown (see e.g. Ransom et al., 2002) to be

F^r =Fksinc[π(k−r)]. (2.13)

As this response is sinc-like, the majority of the power from the signal will be distributed among Fourier bins local to Fk. By correlating the nearest bin to r with a predicted response function for the signal, the true frequency of the signal may be recovered (Ransom et al., 2002). Typically the response function covers only a small number of bins (∼32) around the nearest bin tor. This technique may also be extended to account for power distributed throughout higher harmonics.

Pulsar searching 49 2.2.4.9 Candidate sifting

After periodicity searching, the candidates are sifted to remove duplicates and to attempt to ensure that all detected signals are of the fundamental frequency. Furthermore, when periodicity searches have been performed for all trial DMs, the candidates can be sifted to remove signals which exhibit RFI-like behaviour. This can be signals which do not appear at a minimum number of adjacent DM trials, signals which appear strongest below a threshold DM¹²or signals which are at known RFI periods. A true pulsar signal should appear at multiple DM trials, with the S/N in each DM trial being dependent on the true DM of the pulsar, its pulse width and its rotational period (see figures 2.5 and 2.6). At this stage the candidates may be examined graphically or ranked on significance, or S/N, to be passed to a phase-folding algorithm.

Im Dokument Searching for Pulsars with the Eﬀelsberg Telescope (Seite 63-71)