Celestial Emission - Slowing down - Positron-Annihilation Spectroscopy throughout the Milky Way

Slowing down

4.4 Celestial Emission

Gamma-ray data from SPI is completely background dominated, so that an inver-sion of Eq. (3.12) towards the celestial components (deconvolution) is not

straight-forwardly applicable. A projection of the measured photon counts onto a sky map is possible in principle, but would only work if the source is signicantly stronger than the background. Otherwise, the complexity of the imaging response func-tion (mask, shadowgram) will produce arbitrary and imprecise results. In general, there are two possibilities to infer information from data in image space. Either the point-spread-function is used to deconvolve the image towards what is actually occurring in the sky, i.e. producing an "unblurred" image, or the forward-folding method is used to compare models with data, i.e. the assumed model is convolved ("blurred") to the data space. The convolution of a model with a precisely known response function is usually easier to apply and to judge than the deconvolution of background dominated data with unknown contributions from celestial signals.

However, forward-modelling may be biased by ad hoc assumptions which can not be proven in many cases.

Astrophysical Bias

Informational Content

Richardson-Lucy Clean

Algorithm Maximum

Entropy Pixon Method

Model Fitting Information

Field Theory

Figure 4.5: Imaging analysis approaches informational content against astrophysical bias. While methods like the Richardson-Lucy or the Clean algorithm can provide an initial guess or rst imaginations of how the data space looks like in an image, their astrophysical interpretation is in most cases questionable. Likewise, the Maximum Entropy method decomposes an image based on probabilistic theory but is inuenced strongly by the starting point or number of pixels (or pixons) of the iterative algorithm. Information Field Theory is the most modern fully Bayesian approach for the data analysis problem. It starts from very rst principles but can also incorporate prior information about astrophysical systems (dashed extended horizontal box) and can answer very specic questions to the data. Likewise, astrophysical model tting assumes the properties of, for example, a Galaxy made of a bulge and disk, and directly determines the values for the input model components. By incorporating also prior information of the astrophysics that may happen as part of the model, more information can be extracted (dashed extended vertical box).

A degree of bias in gamma-ray image analysis and reconstruction can be dened:

Unbiased imaging methods are the RL deconvolution or information eld theory (IFT). More bias is added in Maximum Entropy (ME) method which may be biased by the starting point of the iterative algorithm or pixons¹. Full-forward model-tting in data space is then biased by ahe imagination of how an astrophysical system might look like. The IFT, for example, allows to answer very specic questions from the data, typically of the form: "Assuming the spectrum of a population sources has a certain shape, where in the sky can those sources be found?" In other words, where does the sky look like from what is expected from any supposed imagination. This is biased as well from the spectral perspective, but can provide detailed and probab-ilistic correct answers. On the other hand, model-tting allows to derive unbiased

1Pixons are image objects like wavelets, spline functions or normal pixels.

spectra of what has been input as (astrophysically) biased morphology. In Fig. 4.5, the quantitative informational content of certain imaging analysis approaches is il-lustrated qualitatively against their bias.

4.4.1 Unbiased Imaging Methods

Richarson-Lucy Deconvolution

The RL method assumes that a measured (blurred) image is described by the con-volution of a latent (real) image with the point-spread-function of the instrument (see Fig. 3.12b), with photons being Poisson distributed. An observed pixel with valued_i at a position i in image space is hence formally written as

d_i =X

p_ijm_j ↔d=p⊗m. (4.1)

Here,p_ij is the point-spread-function which is distributing a fraction of photons with valuesm_j originally at the true positionj to the observed positioni. The right-hand side of Eq. (4.1) is equivalent to the left-hand side, using the shorter notation of the convolution operator. The Poissonian likelihood, Eq. (3.13), is then used to obtain a maximum likelihood estimate of the true image valuem_j, which leads to an iterative formulation,

m^(r+1)_j =m^(r)_j X

p_ij d_i P

jpijm^(r)_j

↔m^(r+1)=m^(r)

p⊗ d p⊗m^(r)

. (4.2)

It has been shown by Shepp & Vardi (1982) that Eq. (4.2) indeed converges to the maximum likelihood solution for m_j in cases when the "Poissonian noise" is not dominant. Otherwise, the RL method will amplify this noise by overtting the data, i.e. squeezing counts into single bins in data space which will produce artefact like images. Consequently, there is no real objective stopping criterion when to interrupt the algorithm. Often, the maximum likelihood or χ² statistics are used to judge whether an image is representative or not.

Knoedlseder (2005) illustrate the performance of the RL algorithm for the 511 keV emission measured with SPI by illustrating the trade-o between maximum likeli-hood value, derived ux, granularity of image, and comparisons to the exposure map for dierent iterations steps. As the most plausible sky image, he chose iteration 17, although all iterations between 10 and 25 are similarly well describing the data.

The map in Fig. 4.6a yields an integrated ux of ≈ 1.4×10⁻³ ph cm⁻² s⁻¹ while the map in Fig. 4.6b counts≈1.5×10⁻³ ph cm⁻² s⁻¹. Statistically, the maps dier by ∆χ² ≈ 10, which, accounting for the dierence in degrees of freedom (1) is a signicant dierence. By considering the total number of data points in the analysis (129599), the two images are statistically indistinguishable. The maps, however, dier in the appearance of the galactic disk in 511 keV, which is not seen at all for

(a) Richardson-Lucy iteration #17. (b) Richardson-Lucy iteration #25.

Figure 4.6: Positron annihilation sky map from Richardson Lucy deconvolution. Shown are two acceptable iterations (#17 left and #25 right) with a5^◦×5^◦boxcar smoothing. Contours are indicating levels of10⁻²,10⁻³, and10⁻⁴ph cm⁻²s⁻¹sr⁻¹from the centre outwards. From Knödlseder et al. (2005).

iteration #17. If the RL algorithm is pushed further, more and more emission is revealed, even in the far outskirts of the disk. Due to the apparently bad likelihood of images beyond iteration≈#40, these have not been considered real in Knödlseder et al. (2005), also because it seemed that the exposure map partly correlated with the derived 511 keV map.

In general, it can be seen that there is a bright emission region in the centre of the galaxy whose longitudinal distribution (integrated over the latitude range) is slightly asymmetric. Empirically, these regions can be described by two 2D Gaussians with a compact component of 5^◦ FWHM, and a broader component with a FWHM of 10-20^◦ (Knödlseder et al. 2005). Knödlseder et al. (2005) also showed by simulations that low surface brightness regions will hardly be traced by the RL method at that time (low exposure in the disk), which could be one reason why the disk in 511 keV was seen only weakly.

Maximum Entropy Method

The ME method exploits the information content of an image by penalising a goodness-of-t criterion with a regularisation function. Because the goodness-of-t criterion (e.g. theχ² measure) has many possible solutions, and because maximum-likelihood-based approaches (e.g. RL or gradient searches) tend to overt gamma-ray data, another gure of merit is introduced which includes the image properties.

This function is chosen to smooth the discretised pixel distribution of an image (Skilling 1989). In particular, the "best" set of proportions p_i (image brightness values) with i = 1,2, ..., N on N a priori equivalent cells (here: image pixels) is obtained by maximising the entropy of the image

S(~p) =−

i=1

pilogpi. (4.3)

This generalises in terms of measured data pixeld_i and proposed model pixelm_i, in which the prior knowledge about the sky intensity d_i is contained, to

S(d, ~~ m) =

i=1

d_i−m_i−d_ilog d_i

m_i

. (4.4)

From Eq. (4.4) it is evident that S(d, ~~ m) is maximal for d~= m~. In total, a Lag-rangian method containing a regularisation with the entropy function,S, and a t quality function, describing the discrepancy between model and data,L, is applied to the data set, aiming to maximise

Q(d) =~ αS(d)~ −L(d).~ (4.5) Here, α is the Lagrange multiplier, balancing between the inuence of the entropy and of the log-likelihood on the resulting image, and is iteratively adjusted during the ME procedure. The log-likelihood function is distributed according to the χ² distribution, L = −¹₂χ², so that the expectation value of L is −¹₂N with N the number of data points, i.e. the "best t" is found whenχ²/N = 1. Historically, the maximum entropy method was hence dened as maximising the entropy function, S, over the constraint that χ² =^! N, which is nally expressed in Eq. (4.5) (e.g.

Frieden 1972; Skilling 1989).

Although the ME method it based on "rst principles", the application of the method is not unique. The entropy function can also take several forms of which Eq. (4.4) is the most popular one. Another problem is the correlation between adjacent pixels in "realistic" images, as their ux might have been emitted from the same (3D) source region in space. But these are treated independently in the ME method. In addition, the algorithm is sensitive to the rst initial guess, i.e.

the "reference" image upon which the entropy is evaluated, so that equivalently well

"tting" images can look quite dierent. This is true even if the images only slightly dier in the value of their entropy, and henceα.

(a) ME image with isotropic start. (b) ME image with maximum likelihood start.

Figure 4.7: Maximum entropy images of the 511 keV sky as started from an isotropic map (a, bare isotropy max-imum), and a "best t" map (b, bare likelihood maximum) in logarithmic scaling. Contours indicate ux levels of30,10,5,3,1, and0.1×10⁻³ ph cm⁻²s⁻¹ sr⁻¹

In Fig. 4.7, the results of the ME method applied to the 511 keV data set described in Sec. 4.2 are compared for the central 508-514 keV, i.e. only representing the line dominant part of the spectrum. The reference image was set to an isotropic map, so that each pixel has the same intensity, i.e. the probabilities are equipartitioned, and the bare entropy is at maximum. Panel (a) shows the resulting image of iteration

#66, with bright emission in the centre and a disk which gradually merges into an

almost omnipresent diuse component. The 10% contours indicate a low surface-brightness disk-like component, extending at least from latitudes≈ −25^◦to≈+25^◦. The Crab as the strongest continuum source in that band can also be identied at its correct position at(l/b) = (−175.44/−5.78). The appearance of the Crab does not necessarily mean that there is annihilation emission from that direction, but rather that emission was found in the 508-514 keV band. There seems to be diuse emission around that position, which may also be attributed to the Crab, but the ME algorithm does not have enough "leverage" to account for it correctly. This shows the limits of the ME method from the perspective of unknown prior knowledge. The faint diuse emission above and below the disk (violet colour) may arise from a halo-like emission or even more so a large scale-height of disk emission. In Fig. 4.7b, the chosen reference image was the "best tting" empirical model from Sec. 4.4.2. This image is already close to the maximum-likelihood, as it ts the data set well. In the Me result, the general shape of the starting image is rather conserved, but the smoothness is reduced towards a more patchy appearance of the disk structure. The Crab as a point source was not included in the starting image and is thus only weakly apparent.

In iteration #220, shown in Fig. 4.7b, the formerly symmetric disk is now stretched towards negative longitudes, i.e. to where the Crab is located. Theχ² values for the two images are1215355(a) and1219534(b), respectively, resulting in aχ² dierence of ≈ 4179. Considering the number of degrees of freedom, they are statistically only marginally distinguishable: The log-likelihood ratio follows a χ² statistic, for which the expectation value equals the number of degrees of freedom, here ν = 1211027. Since this probability distribution has a width (standard deviation; second moment) of√

2ν ≈1100, then×σ tolerance band for an adequate t is 1211027± 1100n. In other words, depending on the realisation of the Poisson-distributed data, χ² dierences for non-nested models² of the order of n×σ, where n is to be chosen according to the number of data points³, are insignicant and compatible with statistical uctuations that are expected. Formally, the image in panel (a) is favoured by<4σ over the image in panel (b). But since the models / reconstructed images are non-nested, this formal criterion cannot be used. Only if large∆χ²values would occur, one could still argue that one or the other representation of the data should be preferred. In fact, neglecting the Crab in the best tting likelihood map as a starting point for the maximum entropy algorithm misleads the ME imaging in this case. The total integrated ux for both images is 4.0×10⁻³ ph cm⁻² s⁻¹ (a) and 3.4×10⁻³ ph cm⁻² s⁻¹ (b), respectively⁴. Given the equivalence of both representations, an estimate of the systematic uncertainty on the line ux of the order of 0.3×10⁻³ ph cm⁻² s⁻¹, i.e. ≈10%, can be derived.

Both reference images lead to a rather pued up disk. This is new in respect to the positron annihilation morphology of the Milky Way (see Sec. 4.6.6 for further interpretation on the morphology). The conventional models for the central part of the Galaxy may also be reconsidered, as the ME results show structures in 511 keV

2Models or parametrisations of data which are not expressible by combinations or sub-model-components of another model are called non-nested. For example,ax²+bx+canddx+eare nested models, whereasax²+bx+cand exp(dx) +eare not.

3In general, the value ofnis arbitrary, depending on the wanted accuracy of the t. For a small number of data points, say10¹−10²,nshould be small, around 1, as higher values would allow any model to t the data.

However, for a large number of data points, say10⁶, the expected number of outliers also rises, so that bigger numbers ofnare also appropriate. In fact, this also depends on the number of tted parameters, but as rule of thumb for number of data points10^N, and number of degrees of freedom.10^N, the appropriate value fornto judge among models isn≈N.

4This will turn out to be exactly the sum of 511 keV annihilation line uxes and continuum in the 508-514 keV band from all sources in the Milky Way (see Sec. 4.6).

−30 −20 −10 0 10 20 30

(a) ME image with isotropic start.

−30 −20 −10 0 10 20 30

(b) ME image with maximum likelihood start.

Figure 4.8: Maximum entropy images of the inner galactic ridge in 511 keV. Left and right panel are identical to Fig. 4.7, but zoomed into a region conned by|l|<30^◦and |b|<30^◦in now square-root scaling for illustration purpose.

emission which are deviant from the smooth Gaussian models, and are reminiscent of known astrophysical source congurations. In Fig. 4.8, the inner 30^◦ ×30^◦ of the ME images of the Milky Way are shown for both reference images. Comparing Fig. 4.8a with the RL deconvolution by Knödlseder et al. (2005), Figs. 4.6a and b, the central part of the Milky Way bulge is represented in rather similar ways, with a symmetric shape in latitude, but slight deviations from spherical symmetry in longitude. Especially, an emission feature around (l/b) = (−5^◦/5^◦) is evident in both (RL and ME) reconstructions. The elongated and tilted emission region in the centre, which was interpreted as originating in the circumnuclear disk, appears less stretched in the ME reconstruction. It may be reminiscent of a galactic bar, viewed at a specic aspect angle (see also Sec. 4.6.6). The disk is now clearly detected and evident: It is indicated by the 0.001 ph cm⁻² s⁻¹ sr⁻¹ contours in both the zoomed and unzoomed image. On the other hand, in Figs. 4.6a and b, the disk quickly fades away for larger longitudes. In the case of Fig. 4.8b, the central part looks again similar to a galactic bar, as the emission maximum is slightly oset towards negative longitudes (l_max ≈ −1^◦), more extended towards positive longitude, and with increasing ux values for larger latitudes. But in addition to the disk, there is emission above and below the region of the galactic bulge, extending to latitudes |b| ≈ 20^◦ with a higher intensity than the disk emission.

This is, however, not seen in Fig. 4.8a. Such bipolar emission is reminiscent of the so-called "Fermi bubbles". These are two large bubble-like emission features, extending50^◦ above and below the galactic centre, with a longitudinal extent of40^◦. They have been detected in gamma-rays between0.3and300 GeVwith Fermi/LAT (Su et al. 2010). If both emission features have a common origin, one reason why the latitudinal extent in 511 keV is not larger, may be due to the fact that the exposure time with SPI, Fig. 4.3, is only long enough to have detected such faint and low contrast emission below 20^◦ latitude. In any case, the Fermi bubbles and the 511 keV emission from this ME reconstruction are spatially correlated. This may thus point towards the same physical origin, which will be discussed in more detail in Sec. 5.2.2. Another emission feature which is also thought to be related to the Fermi bubbles is the "WMAP haze". This is a microwave excess with a spherical, non disk-like morphology, visible up to at least|b| ≈30^◦, interpreted as synchrotron emission from cosmic-ray electrons (Finkbeiner 2004). A third, very similar emission

feature was revealed by the ROSAT 1.5 keV all sky map, showing a bipolar X-ray structure along the inner few tens of degrees towards the galactic centre (Snowden et al. 1997). This was, for example, interpreted as a superbubble, associated with a bipolar outow from the galactic centre region, formed from a superwind with an energy of the order of 10⁵⁴-10⁵⁵ erg (Sofue 2000; Sofue et al. 2016).

(a) Flat intensity model. (b) Shell model.

Figure 4.9: Projected emissivity distribution models for bipolar morphologies in the galactic centre. See text for details. From Su et al. (2010).

Su et al. (2010) presented dierent modelling approaches for the Fermi bubbles, based on the above mentioned measurements in dierent wavelengths. In Fig. 4.9, dierent emission models to explain the bipolar structures from the galactic centre are presented. In panel (a) a toy model for the Fermi bubbles with at projected intensity is shown. The volume emissivity, assuming an optically thin interstellar medium, is proportional to (R² − r²)^−1/2 for r < R, where R = 3.5 kpc is the radius of the bubble, and r the distance to the centres of the northern and south-ern bubble, respectively. Panel (b) describes a bubble model with compressed gas shells of 0.5 kpc thickness, that may originate from inverse Compton scattering of cosmic-ray electrons. In this model, the density distribution must be very non-uniform. The resulting morphology would then be limb-brightened. This, however, is in contrast to high-energy gamma-ray observations, but appears compatible with data from other wavelengths. The WMAP haze model, panel (c), is dominated by synchrotron emission, which depends also on the cosmic-ray electron density, and

the magnetic eld, which was chosen as B =B₀exp(−z/z₀) with z₀ = 2 kpc. Due to the decreasing magnetic eld at larger heights above the galactic disk (higher lat-itudes), the emissivity also decreases for the microwave emission model. Panel (d) is a representation of a bipolar shell model from the ROSAT 1.5 keV measurements.

The limb of the shell here is also brightened, as the gas is entirely contained in the shell "wall" with a thickness of 1 kpc, and no gas is assumed for the interior (see also Sofue 2000; Sofue et al. 2016). Disentangling the bubble-like emission feature from the 511 keV ME map, panels (a) and (c) would be favoured over the shell models (b) and (d). Reasons are: First, because the 511 keV emission is only seen up to |b| < 20^◦ which would indicate a connection with the magnetic eld as the WMAP haze is also only dominant to this latitude scale. And second, because the

Im Dokument Positron-Annihilation Spectroscopy throughout the Milky Way (Seite 115-130)