https://doi.org/10.1007/s10959-020-01017-w
A General Version of Price’s Theorem
A Tool for Bounding the Expectation of Nonlinear Functions of Gaussian Random Vectors
Felix Voigtlaender1,2
Received: 29 September 2019 / Revised: 13 May 2020 / Published online: 26 June 2020
© The Author(s) 2020
Abstract
Assume that $X \in \mathbb{R}^n$ is a centered random vector following a multivariate normal distribution with positive definite covariance matrix $\Sigma$. Let $g : \mathbb{R}^n \to \mathbb{C}$ be measurable and of moderate growth, say $|g(x)| \lesssim (1+|x|)^N$. We show that the map $\Sigma \mapsto \mathbb{E}[g(X)]$ is smooth, and we derive convenient expressions for its partial derivatives, in terms of certain expectations $\mathbb{E}[(\partial^\alpha g)(X)]$ of partial (distributional) derivatives of $g$.
As we discuss, this result can be used to derive bounds for the expectation $\mathbb{E}[g(X)]$ of a nonlinear function $g(X)$ of a Gaussian random vector $X$ with possibly correlated entries. For the case when $g(x) = g_1(x_1)\cdots g_n(x_n)$ has tensor-product structure, the above result is known in the engineering literature as Price's theorem, originally published in 1958. For dimension $n = 2$, it was generalized in 1964 by McMahon to the general case $g : \mathbb{R}^2 \to \mathbb{C}$. Our contribution is to unify these results and to give a mathematically fully rigorous proof. Precisely, we consider a normally distributed random vector $X \in \mathbb{R}^n$ of arbitrary dimension $n \in \mathbb{N}$, and we allow the nonlinearity $g$ to be a general tempered distribution. To this end, we replace the expectation $\mathbb{E}[g(X)]$ by the dual pairing $\langle g, \phi_\Sigma \rangle_{\mathcal{S}',\mathcal{S}}$, where $\phi_\Sigma$ denotes the probability density function of $X$.
Keywords Normal distribution · Gaussian random variables · Nonlinear functions of Gaussian random vectors · Expectation · Price's theorem
AMS subject classification 60G15 · 62H20
Felix Voigtlaender
felix@voigtlaender.xyz
1 Katholische Universität Eichstätt-Ingolstadt, Lehrstuhl Wissenschaftliches Rechnen, Ostenstraße 26, 85072 Eichstätt, Germany
2 Technische Universität Berlin, Institut für Mathematik, Straße des 17. Juni 136, 10623 Berlin, Germany
1 Introduction
In this introduction, we first present a precise formulation of our version of Price's theorem, the proof of which we defer to Sect. 4. We then briefly discuss the relevance of this theorem: in a nutshell, it is a useful tool for estimating the expectation of a nonlinear function $g(X)$ of a Gaussian random vector $X \in \mathbb{R}^n$ with possibly correlated entries. In Sect. 3, we consider a specific example application which illustrates this.
The relation of our result to the classical versions [6,8] of Price's theorem is discussed in Sect. 2.
1.1 Our Version of Price's Theorem

Let us denote by
$$\mathrm{Sym}_n := \{ A \in \mathbb{R}^{n\times n} : A^T = A \}$$
the set of symmetric matrices, and by
$$\mathrm{Sym}_n^+ := \{ A \in \mathrm{Sym}_n : \forall x \in \mathbb{R}^n \setminus \{0\} : \langle x, Ax \rangle > 0 \}$$
the set of (symmetric) positive definite matrices, where we write $\langle x, y \rangle := x^T y$ for the standard scalar product of $x, y \in \mathbb{R}^n$ and $|x| := \sqrt{\langle x, x \rangle}$ for the usual Euclidean norm. For $\Sigma \in \mathrm{Sym}_n^+$, let
$$\phi_\Sigma : \mathbb{R}^n \to (0,\infty), \quad x \mapsto \big((2\pi)^n \cdot \det \Sigma\big)^{-1/2} \cdot e^{-\frac{1}{2}\langle x, \Sigma^{-1} x \rangle}, \tag{1.1}$$
and note that $\phi_\Sigma$ is the density function of a centered random vector $X \in \mathbb{R}^n$ which follows a joint normal distribution with covariance matrix $\Sigma$ — that is, $X \sim N(0,\Sigma)$; see for instance [5, Chapter 5, Theorem 5.1].
Let us briefly recall the notion of Schwartz functions and tempered distributions, which will play an important role in what follows. First, with $\mathbb{N} = \{1,2,\dots\}$ and $\mathbb{N}_0 = \{0\} \cup \mathbb{N}$, any $\alpha \in \mathbb{N}_0^n$ will be called a multiindex, and we write $|\alpha| = \alpha_1 + \cdots + \alpha_n$ as well as $\partial^\alpha = \frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}} \cdots \frac{\partial^{\alpha_n}}{\partial x_n^{\alpha_n}}$, and $z^\alpha = z_1^{\alpha_1} \cdots z_n^{\alpha_n}$ for $z \in \mathbb{C}^n$. Finally, given $\alpha, \beta \in \mathbb{N}_0^n$, we write $\beta \le \alpha$ if $\beta_j \le \alpha_j$ for all $j \in \{1,\dots,n\}$. With this notation, it is not hard to see that the density function $\phi_\Sigma$ from above belongs to the Schwartz class
$$\mathcal{S}(\mathbb{R}^n) = \big\{ g \in C^\infty(\mathbb{R}^n;\mathbb{C}) : \forall \alpha \in \mathbb{N}_0^n \ \forall N \in \mathbb{N} \ \exists C > 0 \ \forall x \in \mathbb{R}^n : |\partial^\alpha g(x)| \le C \cdot (1+|x|)^{-N} \big\}$$
of smooth, rapidly decaying functions; see for instance [3, Chapter 8] for more details on this space. In fact, $\phi_\Sigma(x) = c \cdot e^{-\frac{1}{2}\langle \Sigma^{-1/2} x, \Sigma^{-1/2} x \rangle} = c \cdot \gamma(\Sigma^{-1/2} x)$, where $\gamma$ is the usual Gaussian function $\gamma(x) = e^{-\frac{1}{2}|x|^2}$, which is well-known to belong to $\mathcal{S}(\mathbb{R}^n)$.
The space $\mathcal{S}'(\mathbb{R}^n)$ of tempered distributions consists of all linear functionals $g : \mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$ which are continuous with respect to the usual topology on $\mathcal{S}(\mathbb{R}^n)$; see [3, Sections 8.1 and 9.2] for the details. Since $\phi_\Sigma \in \mathcal{S}(\mathbb{R}^n)$, given any tempered distribution $g \in \mathcal{S}'(\mathbb{R}^n)$, the function
$$\Phi_g : \mathrm{Sym}_n^+ \to \mathbb{C}, \quad \Sigma \mapsto \langle g, \phi_\Sigma \rangle_{\mathcal{S}',\mathcal{S}} \tag{1.2}$$
is well-defined, where $\langle \cdot, \cdot \rangle_{\mathcal{S}',\mathcal{S}}$ denotes the (bilinear) dual pairing between $\mathcal{S}'(\mathbb{R}^n)$ and $\mathcal{S}(\mathbb{R}^n)$. As an important special case, note that if $g : \mathbb{R}^n \to \mathbb{C}$ is measurable and of moderate growth, in the sense that $x \mapsto (1+|x|)^{-N} \cdot g(x) \in L^1(\mathbb{R}^n)$ for some $N \in \mathbb{N}$, then
$$\Phi_g(\Sigma) = \mathbb{E}[g(X)] \tag{1.3}$$
is just the expectation of $g(X)$, where $X \sim N(0,\Sigma)$. Here, we identify as usual the function $g$ with the tempered distribution $\mathcal{S}(\mathbb{R}^n) \to \mathbb{C}, \ \varphi \mapsto \int g(x)\varphi(x)\,dx$.
The main goal of this note is to show for each $g \in \mathcal{S}'(\mathbb{R}^n)$ that the function $\Phi_g$ is smooth, and to derive an explicit formula for its partial derivatives. Thus, at least in the case of Equation (1.3), our goal is to calculate the partial derivatives of the expectation of a nonlinear function $g$ of a Gaussian random vector $X \sim N(0,\Sigma)$, as a function of the covariance matrix $\Sigma$ of the vector $X$.
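To fix ideas, the dependence $\Sigma \mapsto \mathbb{E}[g(X)]$ in (1.3) can be explored numerically. The following Python sketch (an illustration, not part of the paper; the nonlinearity, sample size, and covariance matrix are my own choices) estimates $\Phi_g(\Sigma)$ by Monte Carlo for the moderate-growth nonlinearity $g(x) = x_1^2 x_2^2$, whose expectation has the exact value $\Sigma_{1,1}\Sigma_{2,2} + 2\Sigma_{1,2}^2$ by Isserlis' theorem:

```python
import numpy as np

# Monte Carlo sketch (illustration, not from the paper): the map
# Sigma -> E[g(X)] for X ~ N(0, Sigma) and the moderate-growth nonlinearity
# g(x) = x_1^2 * x_2^2.  By Isserlis' theorem, the exact value is
# Sigma_{1,1} * Sigma_{2,2} + 2 * Sigma_{1,2}^2.
def phi_g(sigma, n_samples=200_000, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.multivariate_normal(np.zeros(2), sigma, size=n_samples)
    return np.mean(X[:, 0]**2 * X[:, 1]**2)

sigma = np.array([[1.0, 0.5], [0.5, 2.0]])
exact = sigma[0, 0] * sigma[1, 1] + 2 * sigma[0, 1]**2   # = 2.5
est = phi_g(sigma)
```

The estimate agrees with the closed form up to Monte Carlo error; varying `sigma` traces out the map whose derivatives Price's theorem describes.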
In order to achieve a convenient statement of this result, we first introduce a bit more notation: Write $[n] := \{1,\dots,n\}$, and let
$$I := \{ (i,j) \in [n] \times [n] : i \le j \}, \quad I_= := \{ (i,i) : i \in [n] \}, \quad I_< := \{ (i,j) \in [n] \times [n] : i < j \}, \tag{1.4}$$
so that $I = I_= \cup I_<$. Since for $n > 1$, the sets $\mathrm{Sym}_n$ and $\mathrm{Sym}_n^+$ have empty interior in $\mathbb{R}^{n\times n}$ (because they only consist of symmetric matrices), it does not make sense to talk about partial derivatives of a function $\Phi : \mathrm{Sym}_n^+ \to \mathbb{C}$, unless one interprets $\mathrm{Sym}_n^+$ as an open subset of the vector space $\mathrm{Sym}_n$, rather than of $\mathbb{R}^{n\times n}$. As a means of fixing a coordinate system on $\mathrm{Sym}_n$, we therefore parameterize the set of symmetric matrices by their "upper half"; precisely, we consider the following isomorphism between $\mathbb{R}^I$ and $\mathrm{Sym}_n$:
$$\Theta : \mathbb{R}^I \to \mathrm{Sym}_n, \quad (A_{i,j})_{1 \le i \le j \le n} \mapsto \sum_{i \le j} A_{i,j} E_{i,j} + \sum_{i > j} A_{j,i} E_{i,j}. \tag{1.5}$$
Here, we denote by $(E_{i,j})_{i,j \in [n]}$ the standard basis of $\mathbb{R}^{n\times n}$, meaning that $(E_{i,j})_{k,\ell} = \delta_{i,k} \cdot \delta_{j,\ell}$ with the usual Kronecker delta $\delta_{i,k}$. Below, instead of calculating the partial derivatives of $\Phi_g$, we will consider the function $\Phi_g \circ \Theta|_U$, where $U := \Theta^{-1}(\mathrm{Sym}_n^+) \subset \mathbb{R}^I$ is open.
In order to achieve a concise formulation of our version of Price's theorem, we need two non-standard notions regarding multiindices $\beta = (\beta_{(i,j)})_{(i,j)\in I} \in \mathbb{N}_0^I$. Namely, we define the flattened version of $\beta$ as
$$\bar\beta := \sum_{(i,j)\in I} \beta_{(i,j)} \, (e_i + e_j) \in \mathbb{N}_0^n \quad \text{with the standard basis } (e_1,\dots,e_n) \text{ of } \mathbb{R}^n, \tag{1.6}$$
and in addition to $|\beta| = \sum_{(i,j)\in I} \beta_{(i,j)}$, we will also use
$$|\beta|_= := \sum_{(i,j)\in I_=} \beta_{(i,j)} = \sum_{i\in[n]} \beta_{(i,i)}. \tag{1.7}$$
With this notation, our main result reads as follows:
Theorem 1 (Generalized version of Price's theorem) Let $g \in \mathcal{S}'(\mathbb{R}^n)$ be arbitrary. Then the function $\Phi_g \circ \Theta|_U : U \to \mathbb{C}$ is smooth, and its partial derivatives are given by
$$\partial^\beta (\Phi_g \circ \Theta)(A) = (1/2)^{|\beta|_=} \cdot \big\langle \partial^{\bar\beta} g, \phi_{\Theta(A)} \big\rangle_{\mathcal{S}',\mathcal{S}} \quad \forall A \in U = \Theta^{-1}(\mathrm{Sym}_n^+) \ \forall \beta \in \mathbb{N}_0^I. \tag{1.8}$$
Here $\partial^{\bar\beta} g$ denotes the usual distributional derivative of $g$.
Remark 1 Note that even if one is in the setting of Equation (1.3) where $g : \mathbb{R}^n \to \mathbb{C}$ is of moderate growth, so that $\Phi_g(\Sigma) = \mathbb{E}[g(X)]$ is a "classical" expectation, it need not be the case that the derivative $\partial^{\bar\beta} g$ is given by a function, let alone one of moderate growth. Therefore, it really is useful to consider the formalism of (tempered) distributions.
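For a smooth nonlinearity, Theorem 1 can be sanity-checked by hand. The following Python sketch (an illustration with my own choice of $g$, not from the paper) takes $g(x) = x_1^2 x_2^2$ and $n = 2$, for which Isserlis' theorem gives the closed form $(\Phi_g \circ \Theta)(A) = A_{1,1}A_{2,2} + 2A_{1,2}^2$; it compares finite-difference derivatives of this map with the right-hand side of (1.8), including the factor $1/2$ for the diagonal coordinate:

```python
# Sanity check of Theorem 1 (illustration with my own choice of g, not from
# the paper): for g(x) = x_1^2 * x_2^2 and Sigma = Theta(A) with
# A = (A11, A12, A22), Isserlis' theorem gives
# (Phi_g o Theta)(A) = A11 * A22 + 2 * A12^2.
def phi(a11, a12, a22):
    return a11 * a22 + 2 * a12**2

A11, A12, A22 = 1.0, 0.3, 2.0
h = 1e-6

# beta = e_{(1,2)} (off-diagonal): |beta|_= = 0, flattened beta = (1,1), so
# (1.8) predicts d Phi / d A12 = E[(d^2 g / dx1 dx2)(X)] = E[4 X1 X2] = 4 * A12
fd_off = (phi(A11, A12 + h, A22) - phi(A11, A12 - h, A22)) / (2 * h)
rhs_off = 4 * A12

# beta = e_{(1,1)} (diagonal): |beta|_= = 1, flattened beta = (2,0), so (1.8)
# predicts d Phi / d A11 = (1/2) * E[(d^2 g / dx1^2)(X)] = (1/2) * E[2 X2^2] = A22
fd_diag = (phi(A11 + h, A12, A22) - phi(A11 - h, A12, A22)) / (2 * h)
rhs_diag = A22
```

Both finite differences match the predictions of (1.8) up to rounding, which in particular exhibits the role of the factor $(1/2)^{|\beta|_=}$.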
1.2 Relevance of Price’s Theorem
An important application of Price's theorem is as follows: For certain values of the covariance matrix $\Sigma$, it is usually easy to precisely calculate the expectation $\mathbb{E}[g(X)]$ — for example if $\Sigma$ is a diagonal matrix, in which case the entries of $X$ are independent. As a complement to such special cases where explicit calculations are possible, Price's theorem can be used to obtain (bounds for) the partial derivatives of the map $\Sigma \mapsto \mathbb{E}[g(X)]$. In combination with standard results from multivariable calculus, one can then obtain bounds for $\mathbb{E}[g(X)]$ for general covariance matrices. Thus, Price's theorem is a tool for estimating the expectation of a nonlinear function $g(X)$ of a Gaussian random vector $X$, even if the entries of $X$ are correlated.
An example for this type of reasoning will be given in Sect. 3. There, we apply our version of Price's theorem to show that if $f(x) = f_\tau(x)$ "clips" $x \in \mathbb{R}$ to the interval $[-\tau,\tau]$ and if $(X_\alpha, Y_\alpha) \sim N(0,\Sigma_\alpha)$ for $\Sigma_\alpha = \begin{pmatrix} 1 & \alpha \\ \alpha & 1 \end{pmatrix}$, then the map $F_\tau : [0,1] \to \mathbb{R}, \ \alpha \mapsto \mathbb{E}[f_\tau(X_\alpha) f_\tau(Y_\alpha)]$ is convex and satisfies $F_\tau(0) = 0$. Thus, $F_\tau(\alpha) \le \alpha \cdot F_\tau(1)$, where $F_\tau(1)$ is easy to bound since $X_1 = Y_1$ almost surely. These facts constitute important ingredients in [4]; see Theorem A.4 and the proof of Lemma A.3 in that paper.
2 Comparison with the Classical Results
The original form of Price's theorem as stated in [8] only concerns the case when the nonlinearity $g(x) = g_1(x_1)\cdots g_n(x_n)$ has a tensor-product structure. In this special case, the formula derived in [8] is identical to the one given by Theorem 1, up to notational differences.
This tensor-product structure assumption concerning $g$ was removed by McMahon [6] and Papoulis [7] in the case of Gaussian random vectors of dimension $n = 2$ with covariance matrix of the form $\Sigma = \Sigma_\alpha = \begin{pmatrix} 1 & \alpha \\ \alpha & 1 \end{pmatrix}$ with $\alpha \in (-1,1)$. Precisely, if $X_\alpha \sim N(0,\Sigma_\alpha)$, then [6] states for $g : \mathbb{R}^2 \to \mathbb{C}$ that
$$\Phi_g : (-1,1) \to \mathbb{C}, \ \alpha \mapsto \mathbb{E}[g(X_\alpha)] \quad \text{is smooth with} \quad \Phi_g^{(n)}(\alpha) = \mathbb{E}\Big[ \frac{\partial^{2n} g}{\partial x_1^n \,\partial x_2^n}(X_\alpha) \Big]. \tag{2.1}$$
Based on the work by Papoulis, Brown [1] showed that Price's theorem holds for Gaussian random vectors $X$ of general dimensionality and unit variance $\Sigma_{i,i} = \mathbb{E}[(X)_i^2] = 1$, if one takes derivatives with respect to the covariances $\Sigma_{i,j} = \mathbb{E}[(X)_i (X)_j]$ where $i \ne j$. In this setting, Brown also showed that Price's theorem characterizes the normal distribution; more precisely, if $(X_\Sigma)_\Sigma$ is a (sufficiently nice) family of random vectors with $\mathrm{Cov}(X_\Sigma) = \Sigma$ which satisfies the conclusion of Price's theorem, then $X_\Sigma \sim N(0,\Sigma)$ is necessarily normally distributed. This extends and corrects the original work of Price [8], where a similar claim was made.
Finally, we mention the article [9] in which a quantum-mechanical version of Price's theorem is established. In Sect. 2 of that paper, the author reviews the "classical" case of Price's theorem, and essentially derives the same formulas as in Theorem 1.
Despite their great utility, the existing versions of Price’s theorem have some shortcomings—at least from a mathematical perspective:
• In [1,6,8], the assumptions regarding the functions $g_1,\dots,g_n$ or $g$ are never made explicit. In particular, it is assumed in [6,8] without justification that $g_1,\dots,g_n$ or $g$ can be represented as the sum of certain Laplace transforms. Likewise, Papoulis [7] assumes that $g$ satisfies the decay condition $|g(x,y)| \lesssim e^{|(x,y)|^\beta}$ for some $\beta < 2$, but does not impose any restrictions on the regularity of $g$. Finally, [1] is mainly concerned with showing that Price's theorem only holds for normally distributed random vectors, and simply refers to [7] for the proof that Price's theorem does indeed hold for normal random vectors.
  None of the papers [1,6–8] explains the nature of the derivative of $g$ (classical, distributional, etc.) which appears in the derived formula.
• In contrast, for calculating the $k$-th order derivatives of $\Sigma \mapsto \mathbb{E}[g(X)]$, it is assumed in [9] that the nonlinearity $g$ is $C^{2k}$, with a certain decay condition concerning the derivatives. This classical smoothness of $g$, however, does not hold in many applications; see Sect. 3.
Differently from [1,6–9], our version of Price's theorem imposes precise, rather mild assumptions concerning the nonlinearity $g$ (namely $g \in \mathcal{S}'(\mathbb{R}^n)$) and precisely explains the nature of the derivative $\partial^{\bar\beta} g$ that appears in the theorem statement: it is just a distributional derivative.
Furthermore, maybe as a consequence of the preceding points, it seems that Price's theorem is not as well-known in the mathematical community as it deserves to be. It is my hope that the present paper may promote this result.
Before closing this section, we prove that—assuming $g$ to be a tempered distribution—the result of [6,7] is indeed a special case of Theorem 1. With similar arguments, one can show that the forms of Price's theorem considered in [1,8,9] are covered by Theorem 1 as well.
Corollary 1 Let $g \in \mathcal{S}'(\mathbb{R}^2)$. For $\alpha \in (-1,1)$, let $\Sigma_\alpha := \begin{pmatrix} 1 & \alpha \\ \alpha & 1 \end{pmatrix}$. Let
$$\Phi_g : (-1,1) \to \mathbb{C}, \quad \alpha \mapsto \langle g, \phi_{\Sigma_\alpha} \rangle_{\mathcal{S}',\mathcal{S}},$$
where $\phi_{\Sigma_\alpha} : \mathbb{R}^2 \to (0,\infty)$ denotes the probability density function of $X_\alpha \sim N(0,\Sigma_\alpha)$.
Then $\Phi_g$ is smooth with $n$-th derivative $\Phi_g^{(n)}(\alpha) = \big\langle \frac{\partial^{2n} g}{\partial x_1^n \partial x_2^n}, \phi_{\Sigma_\alpha} \big\rangle_{\mathcal{S}',\mathcal{S}}$ for $\alpha \in (-1,1)$.
Remark 2 In particular, if both $g$ and the (distributional) derivative $\frac{\partial^{2n} g}{\partial x_1^n \partial x_2^n}$ are given by functions of moderate growth, then Equation (2.1) holds, i.e.,
$$\frac{d^n}{d\alpha^n} \mathbb{E}[g(X_\alpha)] = \mathbb{E}\Big[ \frac{\partial^{2n} g}{\partial x_1^n \partial x_2^n}(X_\alpha) \Big].$$
Proof of Corollary 1 In the notation of Theorem 1, we have $\Phi_g(\alpha) = (\Phi_g \circ \Theta)(A^{(\alpha)})$ with
$$A^{(\alpha)}_{i,j} = \begin{cases} 1, & \text{if } i = j, \\ \alpha, & \text{if } i \ne j \end{cases} \quad \text{for } (i,j) \in I = \{(1,1),(1,2),(2,2)\}.$$
Since $\Theta(A^{(\alpha)}) = \Sigma_\alpha$ is easily seen to be positive definite, we have $A^{(\alpha)} \in U$. Now, setting $\beta := n \cdot e_{(1,2)} \in \mathbb{N}_0^I$ (with the standard basis $e_{(1,1)}, e_{(1,2)}, e_{(2,2)}$ of $\mathbb{R}^I$), the flattened version $\bar\beta$ of $\beta$ satisfies $\bar\beta = n e_1 + n e_2 = (n,n)$; furthermore, $|\beta|_= = 0$, so that the factor $(1/2)^{|\beta|_=}$ in (1.8) equals $1$. Thus, Theorem 1 and the chain rule show that $\Phi_g$ is smooth, with
$$\Phi_g^{(n)}(\alpha) = \frac{d^n}{d\alpha^n} (\Phi_g \circ \Theta)(A^{(\alpha)}) = \partial^\beta (\Phi_g \circ \Theta)(A^{(\alpha)}) = \big\langle \partial^{\bar\beta} g, \phi_{\Theta(A^{(\alpha)})} \big\rangle_{\mathcal{S}',\mathcal{S}} = \Big\langle \frac{\partial^{2n} g}{\partial x_1^n \partial x_2^n}, \phi_{\Sigma_\alpha} \Big\rangle_{\mathcal{S}',\mathcal{S}}. \quad \square$$
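As a non-smooth illustration of Corollary 1 and Remark 2 (my own example, not from the paper), take $g(x,y) = |xy|$, whose distributional derivative $\partial^2 g / \partial x \partial y = \operatorname{sgn}(x)\operatorname{sgn}(y)$ is a bounded function. For the standard bivariate normal with correlation $\alpha$, I use the known closed forms $\mathbb{E}[|XY|] = \frac{2}{\pi}\big(\sqrt{1-\alpha^2} + \alpha \arcsin\alpha\big)$ and $\mathbb{E}[\operatorname{sgn}(X)\operatorname{sgn}(Y)] = \frac{2}{\pi}\arcsin\alpha$ (Sheppard's orthant formula); both are classical and assumed here without proof. The sketch checks that differentiating the first expression indeed yields the second, as (2.1) with $n = 1$ predicts:

```python
import math

# Illustrative example (mine, not from the paper): g(x, y) = |x y| for
# (X, Y) standard bivariate normal with correlation a.  Assumed closed forms:
#   E[|X Y|]         = (2/pi) * (sqrt(1 - a^2) + a * asin(a))
#   E[sgn(X) sgn(Y)] = (2/pi) * asin(a)        (Sheppard's orthant formula)
def e_abs_xy(a):
    return (2 / math.pi) * (math.sqrt(1 - a * a) + a * math.asin(a))

def e_sgn_xy(a):
    return (2 / math.pi) * math.asin(a)

# Remark 2 with n = 1 predicts d/da E[|XY|] = E[sgn(X) sgn(Y)], since
# sgn(x) sgn(y) is the distributional derivative d^2 |xy| / dx dy.
h = 1e-6
for a in (0.0, 0.25, 0.5, 0.75):
    fd = (e_abs_xy(a + h) - e_abs_xy(a - h)) / (2 * h)
    assert abs(fd - e_sgn_xy(a)) < 1e-5
```

Note that $g$ is Lipschitz but not $C^2$, so the classical smoothness hypotheses of [9] would not apply here, while Corollary 1 does.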
3 An Example of an Application of Price’s Theorem
In this section, we derive bounds for the expectation $\mathbb{E}[f_\tau(X_\alpha) f_\tau(Y_\alpha)]$, where $X_\alpha, Y_\alpha$ follow a joint normal distribution with covariance matrix $\begin{pmatrix} 1 & \alpha \\ \alpha & 1 \end{pmatrix}$, and where the nonlinearity $f_\tau$ is just a truncation (or clipping) to the interval $[-\tau,\tau]$. We remark that this example has already been considered by Price [8] himself, but that his arguments are not completely mathematically rigorous, as explained in Sect. 2. Precisely, we obtain the following result:
Lemma 1 Let $\tau > 0$ be arbitrary, and define
$$f_\tau : \mathbb{R} \to \mathbb{R}, \quad x \mapsto \begin{cases} \tau, & \text{if } x \ge \tau, \\ x, & \text{if } x \in [-\tau,\tau], \\ -\tau, & \text{if } x \le -\tau. \end{cases}$$
For $\alpha \in [-1,1]$, set $\Sigma_\alpha := \begin{pmatrix} 1 & \alpha \\ \alpha & 1 \end{pmatrix}$, and let $(X_\alpha, Y_\alpha) \sim N(0,\Sigma_\alpha)$. Finally, define
$$F_\tau : [-1,1] \to \mathbb{R}, \quad \alpha \mapsto \mathbb{E}\big[ f_\tau(X_\alpha) \cdot f_\tau(Y_\alpha) \big].$$
Then $F_\tau$ is continuous and $F_\tau|_{[0,1]}$ is convex with $F_\tau(0) = 0$. In particular, $F_\tau(\alpha) \le \alpha \cdot F_\tau(1)$ for all $\alpha \in [0,1]$.
Proof It is easy to see that $f_\tau$ is bounded and Lipschitz continuous, so that $f_\tau \in W^{1,\infty}(\mathbb{R})$ with weak derivative $f_\tau' = 1_{(-\tau,\tau)}$. Therefore, using the notation $(g \otimes h)(x,y) = g(x)h(y)$, we see that $g_\tau := f_\tau \otimes f_\tau \in W^{1,\infty}(\mathbb{R}^2) \subset \mathcal{S}'(\mathbb{R}^2)$, with weak derivative $\frac{\partial^2 g_\tau}{\partial x_1 \partial x_2} = 1_{(-\tau,\tau)} \otimes 1_{(-\tau,\tau)} = 1_{(-\tau,\tau)^2}$. Directly from the definition of the weak derivative, in combination with Fubini's theorem and the fundamental theorem of calculus, we thus see for each $\phi \in \mathcal{S}(\mathbb{R}^2)$ that
$$\Big\langle \frac{\partial^4 g_\tau}{\partial x_1^2 \partial x_2^2}, \phi \Big\rangle_{\mathcal{S}',\mathcal{S}} = \Big\langle \frac{\partial^2 g_\tau}{\partial x_1 \partial x_2}, \frac{\partial^2 \phi}{\partial x_1 \partial x_2} \Big\rangle_{\mathcal{S}',\mathcal{S}} = \int_{-\tau}^{\tau} \int_{-\tau}^{\tau} \frac{\partial^2 \phi}{\partial x_1 \partial x_2}(t_1,t_2)\, dt_1\, dt_2$$
$$= \int_{-\tau}^{\tau} \Big( \frac{\partial \phi}{\partial x_2}(\tau,t_2) - \frac{\partial \phi}{\partial x_2}(-\tau,t_2) \Big) dt_2 = \phi(\tau,\tau) - \phi(-\tau,\tau) - \phi(\tau,-\tau) + \phi(-\tau,-\tau).$$
Now, Corollary 1 shows that $F_\tau|_{(-1,1)} = \Phi_{g_\tau}$ is smooth with
$$F_\tau''(\alpha) = \Big\langle \frac{\partial^4 g_\tau}{\partial x_1^2 \partial x_2^2}, \phi_{\Sigma_\alpha} \Big\rangle_{\mathcal{S}',\mathcal{S}} = \phi_{\Sigma_\alpha}(\tau,\tau) - \phi_{\Sigma_\alpha}(-\tau,\tau) - \phi_{\Sigma_\alpha}(\tau,-\tau) + \phi_{\Sigma_\alpha}(-\tau,-\tau)$$
for $\alpha \in (-1,1)$. We want to show $F_\tau''(\alpha) \ge 0$ for $\alpha \in [0,1)$. Since $\phi_{\Sigma_\alpha}$ is symmetric, it suffices to show $\phi_{\Sigma_\alpha}(\tau,\tau) - \phi_{\Sigma_\alpha}(-\tau,\tau) \ge 0$, which is easily seen to be equivalent to
$$\exp\Big( -\frac{2\tau^2 - 2\alpha\tau^2}{2(1-\alpha^2)} \Big) \overset{!}{\ge} \exp\Big( -\frac{2\tau^2 + 2\alpha\tau^2}{2(1-\alpha^2)} \Big) \iff 2\tau^2 + 2\alpha\tau^2 \overset{!}{\ge} 2\tau^2 - 2\alpha\tau^2 \iff 4\alpha\tau^2 \overset{!}{\ge} 0,$$
which clearly holds for $\alpha \in [0,1)$.
To finish the proof, we only need to show that $F_\tau$ is continuous with $F_\tau(0) = 0$. To see this, let $(X,Z) \sim N(0, I_2)$, with the 2-dimensional identity matrix $I_2$. For $\alpha \in [-1,1]$, it is then not hard to see that $Y_\alpha := \alpha X + \sqrt{1-\alpha^2}\, Z$ satisfies $(X, Y_\alpha) \sim N(0,\Sigma_\alpha)$. Therefore, we see for $\alpha, \beta \in [-1,1]$ that
$$|F_\tau(\alpha) - F_\tau(\beta)| = \big| \mathbb{E}[g_\tau(X,Y_\alpha)] - \mathbb{E}[g_\tau(X,Y_\beta)] \big| = \big| \mathbb{E}\big[ f_\tau(X) \cdot \big( f_\tau(Y_\alpha) - f_\tau(Y_\beta) \big) \big] \big|$$
$$\overset{(\text{since } |f_\tau(X)| \le \tau)}{\le} \tau \cdot \mathbb{E}\big| f_\tau(Y_\alpha) - f_\tau(Y_\beta) \big| \overset{(\text{since } f_\tau \text{ is 1-Lipschitz})}{\le} \tau \cdot \mathbb{E}|Y_\alpha - Y_\beta|$$
$$\le \tau \cdot |\alpha - \beta| \cdot \mathbb{E}|X| + \tau \cdot \big| \sqrt{1-\alpha^2} - \sqrt{1-\beta^2} \big| \cdot \mathbb{E}|Z| \xrightarrow{\beta \to \alpha} 0,$$
which shows that $F_\tau$ is indeed continuous. Furthermore, we see by independence of $X, Z$ that
$$F_\tau(0) = \mathbb{E}[f_\tau(X) \cdot f_\tau(Z)] = \mathbb{E}[f_\tau(X)] \cdot \mathbb{E}[f_\tau(Z)] = 0,$$
since $\mathbb{E}[f_\tau(X)] = -\mathbb{E}[f_\tau(-X)] = -\mathbb{E}[f_\tau(X)]$, because of $X \sim -X$ and $f_\tau(x) = -f_\tau(-x)$ for $x \in \mathbb{R}$. $\square$
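The conclusion of Lemma 1 is easy to observe numerically. The following Monte Carlo sketch (illustrative parameters of my own choosing, not part of the paper) uses the coupling $Y_\alpha = \alpha X + \sqrt{1-\alpha^2}\, Z$ from the proof and checks the bound $F_\tau(\alpha) \le \alpha \cdot F_\tau(1)$ on a grid, up to sampling error:

```python
import numpy as np

# Monte Carlo illustration (parameters are my own choice) of Lemma 1,
# using the coupling Y_alpha = alpha * X + sqrt(1 - alpha^2) * Z from the proof.
rng = np.random.default_rng(1)
tau = 0.8
X = rng.standard_normal(500_000)
Z = rng.standard_normal(500_000)

def F_tau(alpha):
    # (X, Y) ~ N(0, Sigma_alpha); f_tau is clipping to [-tau, tau]
    Y = alpha * X + np.sqrt(1 - alpha**2) * Z
    return np.mean(np.clip(X, -tau, tau) * np.clip(Y, -tau, tau))

F1 = F_tau(1.0)
for alpha in np.linspace(0.0, 1.0, 11):
    # convexity bound from Lemma 1, up to Monte Carlo error
    assert F_tau(alpha) <= alpha * F1 + 5e-3
```

Reusing the same samples of $(X,Z)$ across all values of $\alpha$ (common random numbers) keeps the comparison stable despite the Monte Carlo noise.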
4 The Proof of Theorem1
The main idea of the proof is to use Fourier analysis, since the Fourier transform $\mathcal{F}\phi_\Sigma$ of the density function $\phi_\Sigma$ will turn out to be much easier to handle than $\phi_\Sigma$ itself. This is similar to the approach in [1,7] but slightly different from the approach in [6,8], where the Laplace transform is used instead.
For the Fourier transform, we will use the normalization
$$\mathcal{F}\varphi(\xi) := \widehat{\varphi}(\xi) := \int_{\mathbb{R}^n} \varphi(x) \cdot e^{-i\langle x,\xi\rangle}\, dx \quad \text{for } \xi \in \mathbb{R}^n \text{ and } \varphi \in L^1(\mathbb{R}^n).$$
It is well-known that the restriction $\mathcal{F} : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ of $\mathcal{F}$ is a well-defined homeomorphism, with inverse $\mathcal{F}^{-1} : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$, where $\mathcal{F}^{-1}\varphi(x) = (2\pi)^{-n} \cdot \mathcal{F}\varphi(-x)$. By duality, the Fourier transform also extends to a bijection $\mathcal{F} : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$, defined¹ by $\langle \mathcal{F}g, \varphi \rangle_{\mathcal{S}',\mathcal{S}} := \langle g, \mathcal{F}\varphi \rangle_{\mathcal{S}',\mathcal{S}}$ for $g \in \mathcal{S}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Further, it is well-known for the distributional derivatives $\partial^\alpha g$ of $g \in \mathcal{S}'(\mathbb{R}^n)$, defined by $\langle \partial^\alpha g, \varphi \rangle_{\mathcal{S}',\mathcal{S}} = (-1)^{|\alpha|} \cdot \langle g, \partial^\alpha \varphi \rangle_{\mathcal{S}',\mathcal{S}}$, that if we set
$$X^\alpha \cdot \varphi : \mathbb{R}^n \to \mathbb{C}, \ x \mapsto x^\alpha \cdot \varphi(x) \quad \text{and} \quad \langle X^\alpha \cdot g, \varphi \rangle_{\mathcal{S}',\mathcal{S}} = \langle g, X^\alpha \cdot \varphi \rangle_{\mathcal{S}',\mathcal{S}}$$
for $g \in \mathcal{S}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{S}(\mathbb{R}^n)$, then we have
$$\mathcal{F}(\partial^\alpha g) = i^{|\alpha|} \cdot X^\alpha \cdot \mathcal{F}g \quad \forall g \in \mathcal{S}'(\mathbb{R}^n), \ \alpha \in \mathbb{N}_0^n. \tag{4.1}$$
These results can be found e.g. in [2, Chapter 14], or (with a slightly different normalization of the Fourier transform) in [3, Sections 8.3 and 9.2].

¹ This definition is motivated by the identity $\int \widehat{f}(x) \cdot g(x)\, dx = \iint f(\xi)\, e^{-i\langle x,\xi\rangle}\, g(x)\, d\xi\, dx = \int f(\xi)\, \widehat{g}(\xi)\, d\xi$, which is valid for $f, g \in L^1(\mathbb{R}^n)$ thanks to Fubini's theorem.
Finally, we will use the formula
$$(2\pi)^n \cdot \mathcal{F}^{-1}\phi_\Sigma(\xi) = \int_{\mathbb{R}^n} e^{i\langle x,\xi\rangle}\, \phi_\Sigma(x)\, dx = \mathbb{E}\big[ e^{i\langle \xi, X\rangle} \big] = e^{-\frac{1}{2}\langle \xi, \Sigma\xi\rangle} =: \psi_\Sigma(\xi) \quad \text{for } \xi \in \mathbb{R}^n, \tag{4.2}$$
which is proved in [5, Chapter 5, Theorem 4.1]; in probabilistic terms, this is a statement about the characteristic function of the random vector $X \sim N(0,\Sigma)$.
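In probabilistic terms, (4.2) is easy to verify empirically. The following Python sketch (an illustration with an arbitrarily chosen $\Sigma$ and $\xi$ of my own, not part of the paper) compares the Monte Carlo estimate of $\mathbb{E}[e^{i\langle\xi,X\rangle}]$ with $\psi_\Sigma(\xi)$:

```python
import numpy as np

# Monte Carlo check (illustrative Sigma and xi of my own choice) of (4.2):
# E[exp(i <xi, X>)] = exp(-1/2 <xi, Sigma xi>) for X ~ N(0, Sigma).
rng = np.random.default_rng(2)
sigma = np.array([[2.0, 0.6], [0.6, 1.0]])
xi = np.array([0.7, -0.4])

X = rng.multivariate_normal(np.zeros(2), sigma, size=400_000)
mc = np.mean(np.exp(1j * (X @ xi)))       # empirical characteristic function
exact = np.exp(-0.5 * xi @ sigma @ xi)    # psi_Sigma(xi); real-valued here
```

Since the summands have modulus one, the Monte Carlo error is uniformly small in $\xi$, so the agreement persists over the whole frequency range.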
Next, by the assumption of Theorem 1, we have $g \in \mathcal{S}'(\mathbb{R}^n)$ and hence $\mathcal{F}g \in \mathcal{S}'(\mathbb{R}^n)$. Thus, by the structure theorem for tempered distributions (see for instance [2, Theorem 17.10]), there are $L \in \mathbb{N}$, certain $\alpha_1,\dots,\alpha_L \in \mathbb{N}_0^n$, and certain polynomially bounded, continuous functions $f_1,\dots,f_L : \mathbb{R}^n \to \mathbb{C}$ satisfying $\mathcal{F}g = \sum_{\ell=1}^{L} \partial^{\alpha_\ell} f_\ell$, i.e., $g = \sum_{\ell=1}^{L} \mathcal{F}^{-1}(\partial^{\alpha_\ell} f_\ell)$. Since both sides of the target identity (1.8) are linear with respect to $g$, we can thus assume without loss of generality that $g = \mathcal{F}^{-1}(\partial^\alpha f)$ for some $\alpha \in \mathbb{N}_0^n$ and some continuous $f : \mathbb{R}^n \to \mathbb{C}$ which is polynomially bounded, say $|f(\xi)| \le C \cdot (1+|\xi|)^N$ for all $\xi \in \mathbb{R}^n$ and certain $C > 0$, $N \in \mathbb{N}_0$. We thus have
$$\Phi_g(\Sigma) = \langle g, \phi_\Sigma \rangle_{\mathcal{S}',\mathcal{S}} = \big\langle g, \mathcal{F}\mathcal{F}^{-1}\phi_\Sigma \big\rangle_{\mathcal{S}',\mathcal{S}} = \big\langle \mathcal{F}g, \mathcal{F}^{-1}\phi_\Sigma \big\rangle_{\mathcal{S}',\mathcal{S}} \overset{\text{(Eq. (4.2))}}{=} (2\pi)^{-n} \cdot \big\langle \partial^\alpha f, \psi_\Sigma \big\rangle_{\mathcal{S}',\mathcal{S}}$$
$$= (-1)^{|\alpha|} \cdot (2\pi)^{-n} \cdot \big\langle f, \partial^\alpha \psi_\Sigma \big\rangle_{\mathcal{S}',\mathcal{S}} = (-1)^{|\alpha|} \cdot (2\pi)^{-n} \cdot \int_{\mathbb{R}^n} f(\xi) \cdot (\partial^\alpha \psi_\Sigma)(\xi)\, d\xi \quad \text{for all } \Sigma \in \mathrm{Sym}_n^+. \tag{4.3}$$
Our first goal in the remainder of the proof is to show that one can justify "differentiation under the integral" with respect to $A_{i,j}$ with $\Sigma = \Theta(A)$ in the last integral in Equation (4.3).
It is easy to see that $A \mapsto \psi_{\Theta(A)}(\xi)$ is smooth, with partial derivative
$$\partial_{A_{i,j}} \psi_{\Theta(A)}(\xi) = e^{-\frac{1}{2}\langle \xi, \Theta(A)\xi\rangle} \cdot \partial_{A_{i,j}} \Big( -\frac{1}{2} \sum_{k,\ell=1}^{n} (\Theta(A))_{k,\ell}\, \xi_k \xi_\ell \Big) = \begin{cases} -\frac{1}{2} \cdot \xi_i \xi_j \cdot \psi_{\Theta(A)}(\xi), & \text{if } i = j, \\ -\xi_i \xi_j \cdot \psi_{\Theta(A)}(\xi), & \text{if } i < j \end{cases}$$
for all $\xi \in \mathbb{R}^n$ and arbitrary $(i,j) \in I$ and $A \in U$. Given $\beta \in \mathbb{N}_0^I$, let us write $\partial_A^\beta$ for the partial derivative of order $\beta$ with respect to $A \in \mathbb{R}^I$. Then, a straightforward induction using the preceding identity shows (with $|\beta|_=$ and $\bar\beta$ as in (1.7) and (1.6)) that
$$\partial_A^\beta \psi_{\Theta(A)}(\xi) = (-1)^{|\beta|} \cdot \Big(\tfrac{1}{2}\Big)^{|\beta|_=} \cdot \xi^{\bar\beta} \cdot \psi_{\Theta(A)}(\xi) \quad \forall \beta \in \mathbb{N}_0^I,\ \xi \in \mathbb{R}^n,\ A \in U. \tag{4.4}$$
Next, we show for arbitrary $\gamma \in \mathbb{N}_0^n$ that there is a polynomial $p_{\alpha,\gamma} = p_{\alpha,\gamma}(\zeta, B)$ in the variables $\zeta \in \mathbb{R}^n$ and $B \in \mathbb{R}^{n\times n}$ that satisfies
$$\partial_\xi^\alpha \big( \xi^\gamma \cdot \psi_\Sigma(\xi) \big) = p_{\alpha,\gamma}(\xi, \Sigma) \cdot \psi_\Sigma(\xi) \quad \forall \Sigma \in \mathrm{Sym}_n^+,\ \xi \in \mathbb{R}^n. \tag{4.5}$$
To see this, we first note that a direct computation using the identity $\partial_{\xi_i}(\xi_k \xi_\ell) = \delta_{i,k}\xi_\ell + \delta_{i,\ell}\xi_k$ and the symmetry of $\Sigma$ shows that $\partial_{\xi_i} \psi_\Sigma(\xi) = -(\Sigma\xi)_i \cdot \psi_\Sigma(\xi)$. By induction, and since $(\Sigma\xi)_i$ is a polynomial in $\xi, \Sigma$, we therefore see that for each $\beta \in \mathbb{N}_0^n$ there is a polynomial $p_\beta = p_\beta(\zeta, B)$ in the variables $\zeta \in \mathbb{R}^n$ and $B \in \mathbb{R}^{n\times n}$ satisfying $\partial_\xi^\beta \psi_\Sigma(\xi) = \psi_\Sigma(\xi) \cdot p_\beta(\xi,\Sigma)$. Therefore, the Leibniz rule shows
$$\partial_\xi^\alpha \big( \xi^\gamma \psi_\Sigma(\xi) \big) = \sum_{\beta \in \mathbb{N}_0^n \text{ with } \beta \le \alpha} \binom{\alpha}{\beta}\, \partial^\beta \psi_\Sigma(\xi) \cdot \partial^{\alpha-\beta} \xi^\gamma = \psi_\Sigma(\xi) \sum_{\beta \in \mathbb{N}_0^n \text{ with } \beta \le \alpha} \binom{\alpha}{\beta}\, p_\beta(\xi,\Sigma)\, \partial^{\alpha-\beta} \xi^\gamma,$$
which proves Equation (4.5).
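The first-order identities that drive the induction behind (4.4) — the factor $-\frac{1}{2}\xi_i\xi_j$ for diagonal coordinates versus $-\xi_i\xi_j$ for off-diagonal ones — are easy to check numerically. The following Python sketch (a sanity check of my own, not part of the paper) does this for $n = 2$ via finite differences:

```python
import math

# Numerical sanity check (illustration, not from the paper) of the two
# first-order identities behind Equation (4.4), for n = 2.  With the
# coordinates A = (a, c, b) for the entries (1,1), (1,2), (2,2) of the
# symmetric matrix Theta(A) = [[a, c], [c, b]], one has
#   psi_{Theta(A)}(xi) = exp(-1/2 * (a*xi1^2 + 2*c*xi1*xi2 + b*xi2^2)).
def psi(a, c, b, x1, x2):
    return math.exp(-0.5 * (a * x1 * x1 + 2 * c * x1 * x2 + b * x2 * x2))

a, c, b = 1.0, 0.3, 1.5      # positive definite: a, b > 0 and a*b - c^2 > 0
x1, x2 = 0.8, -0.5
h = 1e-6

# diagonal coordinate A_{1,1}: derivative carries the factor -1/2 * xi_1^2
fd_diag = (psi(a + h, c, b, x1, x2) - psi(a - h, c, b, x1, x2)) / (2 * h)
assert abs(fd_diag - (-0.5 * x1 * x1) * psi(a, c, b, x1, x2)) < 1e-8

# off-diagonal coordinate A_{1,2}: no factor 1/2, since the two entries
# Theta(A)_{1,2} and Theta(A)_{2,1} move simultaneously
fd_off = (psi(a, c + h, b, x1, x2) - psi(a, c - h, b, x1, x2)) / (2 * h)
assert abs(fd_off - (-x1 * x2) * psi(a, c, b, x1, x2)) < 1e-8
```

This makes the origin of the factor $(1/2)^{|\beta|_=}$ in (4.4) — and hence in Theorem 1 — concrete: it appears once for every diagonal derivative.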
Now we are ready to justify differentiation under the integral (as in [3, Theorem 2.27]) for the last integral appearing in Equation (4.3), with $\Sigma = \Theta(A)$, that is, for the function
$$U \to \mathbb{C}, \quad A \mapsto \int_{\mathbb{R}^n} f(\xi) \cdot (\partial^\alpha \psi_{\Theta(A)})(\xi)\, d\xi.$$
Indeed, let $A_0 \in U$ be arbitrary. Since $U$ is open, there is some $\varepsilon > 0$ satisfying $\overline{B_\varepsilon}(A_0) \subset U$, for the closed ball $\overline{B_\varepsilon}(A_0) = \{ A \in \mathbb{R}^I : |A - A_0| \le \varepsilon \}$, with the Euclidean norm $|\cdot|$ on $\mathbb{R}^I$. The open ball $B_\varepsilon(A_0)$ is defined similarly.
Now, with
$$\sigma_{\min}(A) := \inf_{x \in \mathbb{R}^n,\ |x| = 1} \langle x, Ax \rangle \quad \text{for } A \in \mathbb{R}^{n\times n},$$
we have for $A, B \in \mathbb{R}^{n\times n}$ and arbitrary $x \in \mathbb{R}^n$ with $|x| = 1$ that
$$\sigma_{\min}(A) \le \langle x, Ax \rangle = \langle x, Bx \rangle + \langle x, (A-B)x \rangle \le \langle x, Bx \rangle + \|A - B\|.$$
Since this holds for all $|x| = 1$, we get $\sigma_{\min}(A) \le \sigma_{\min}(B) + \|A - B\|$, and by symmetry $|\sigma_{\min}(A) - \sigma_{\min}(B)| \le \|A - B\|$. Therefore, the continuous function $A \mapsto \sigma_{\min}(\Theta(A))$ has a positive(!) minimum on the compact set $\overline{B_\varepsilon}(A_0)$, so that $\langle \xi, \Theta(A)\xi \rangle \ge c \cdot |\xi|^2$ for all $\xi \in \mathbb{R}^n$ and $A \in \overline{B_\varepsilon}(A_0)$, for some $c > 0$. Furthermore, there is some $K = K(A_0) > 0$ with $\|\Theta(A)\| \le K$ for all $A \in \overline{B_\varepsilon}(A_0)$.
Now, since the map $U \times \mathbb{R}^n \ni (A,\xi) \mapsto \psi_{\Theta(A)}(\xi) \in \mathbb{C}$ is smooth, we have (in view of Equations (4.4) and (4.5)) for arbitrary $\beta \in \mathbb{N}_0^I$, $A \in U$, and $\xi \in \mathbb{R}^n$ that
$$\partial_A^\beta \big( f(\xi) \cdot (\partial^\alpha \psi_{\Theta(A)})(\xi) \big) = f(\xi) \cdot \partial_\xi^\alpha \big( \partial_A^\beta \psi_{\Theta(A)}(\xi) \big) \overset{\text{(Eq. (4.4))}}{=} f(\xi) \cdot (-1)^{|\beta|} \cdot \Big(\tfrac{1}{2}\Big)^{|\beta|_=} \cdot \partial_\xi^\alpha \big( \xi^{\bar\beta} \cdot \psi_{\Theta(A)}(\xi) \big)$$
$$\overset{\text{(Eq. (4.5))}}{=} f(\xi) \cdot (-1)^{|\beta|} \cdot \Big(\tfrac{1}{2}\Big)^{|\beta|_=} \cdot p_{\alpha,\bar\beta}(\xi, \Theta(A)) \cdot \psi_{\Theta(A)}(\xi). \tag{4.6}$$
Using the polynomial growth restriction concerning $f$, we thus see that there is a constant $C_{\alpha,\beta} > 0$ and some $M_{\alpha,\beta} \in \mathbb{N}$ with
$$\big| \partial_A^\beta \big( f(\xi) \cdot (\partial^\alpha \psi_{\Theta(A)})(\xi) \big) \big| = \Big| f(\xi) \cdot \Big(\tfrac{1}{2}\Big)^{|\beta|_=} \cdot p_{\alpha,\bar\beta}(\xi, \Theta(A)) \cdot \psi_{\Theta(A)}(\xi) \Big|$$
$$\le C \cdot (1+|\xi|)^N \cdot C_{\alpha,\beta} \cdot \big(1 + |\xi| + \|\Theta(A)\|\big)^{M_{\alpha,\beta}} \cdot e^{-\frac{1}{2}\langle \xi, \Theta(A)\xi\rangle} \le C_{\alpha,\beta}\, C \cdot (1+|\xi|)^N \cdot (1+|\xi|+K)^{M_{\alpha,\beta}} \cdot e^{-\frac{c}{2}|\xi|^2} =: h_{\alpha,\beta,A_0,f}(\xi)$$
for all $\xi \in \mathbb{R}^n$ and all $A \in \overline{B_\varepsilon}(A_0)$. Since $h_{\alpha,\beta,A_0,f}$ is independent of $A \in B_\varepsilon(A_0)$ and since we clearly have $h_{\alpha,\beta,A_0,f} \in L^1(\mathbb{R}^n)$, [3, Theorem 2.27] and Equation (4.3) show that the function
$$B_\varepsilon(A_0) \to \mathbb{C}, \quad A \mapsto (-1)^{|\alpha|} \cdot (2\pi)^n \cdot \Phi_g(\Theta(A)) = \int_{\mathbb{R}^n} f(\xi) \cdot (\partial^\alpha \psi_{\Theta(A)})(\xi)\, d\xi$$
is smooth, with partial derivative of order $\beta \in \mathbb{N}_0^I$ given by
$$\partial_A^\beta \Big( (-1)^{|\alpha|} \cdot (2\pi)^n \cdot \Phi_g(\Theta(A)) \Big) = \int_{\mathbb{R}^n} \partial_A^\beta \Big( f(\xi)\, (\partial_\xi^\alpha \psi_{\Theta(A)})(\xi) \Big)\, d\xi$$
$$\overset{\text{(Eq. (4.6))}}{=} (-1)^{|\beta|} \cdot \Big(\tfrac{1}{2}\Big)^{|\beta|_=} \cdot \int_{\mathbb{R}^n} f(\xi) \cdot \partial_\xi^\alpha \big( \xi^{\bar\beta} \cdot \psi_{\Theta(A)}(\xi) \big)\, d\xi = (-1)^{|\beta|} \cdot \Big(\tfrac{1}{2}\Big)^{|\beta|_=} \cdot \big\langle f,\ \partial^\alpha \big( X^{\bar\beta} \cdot \psi_{\Theta(A)} \big) \big\rangle_{\mathcal{S}',\mathcal{S}}$$
$$\overset{\text{(Eq. (4.2))}}{=} (-1)^{|\beta|+|\alpha|} \cdot \Big(\tfrac{1}{2}\Big)^{|\beta|_=} \cdot (2\pi)^n \cdot \big\langle X^{\bar\beta} \cdot \partial^\alpha f,\ \mathcal{F}^{-1}\phi_{\Theta(A)} \big\rangle_{\mathcal{S}',\mathcal{S}}$$
$$\overset{\big( g = \mathcal{F}^{-1}(\partial^\alpha f),\ \text{Eq. (4.1), and } (-1)^{|\beta|} = i^{|\bar\beta|} \big)}{=} \Big(\tfrac{1}{2}\Big)^{|\beta|_=} \cdot (2\pi)^n \cdot (-1)^{|\alpha|} \cdot \big\langle \mathcal{F}[\partial^{\bar\beta} g],\ \mathcal{F}^{-1}\phi_{\Theta(A)} \big\rangle_{\mathcal{S}',\mathcal{S}}.$$
In combination, this shows that $\Phi_g \circ \Theta$ is smooth on $B_\varepsilon(A_0)$, with partial derivatives given by
$$\partial^\beta (\Phi_g \circ \Theta)(A) = \Big(\tfrac{1}{2}\Big)^{|\beta|_=} \cdot \big\langle \mathcal{F}[\partial^{\bar\beta} g],\ \mathcal{F}^{-1}\phi_{\Theta(A)} \big\rangle_{\mathcal{S}',\mathcal{S}} = \Big(\tfrac{1}{2}\Big)^{|\beta|_=} \cdot \big\langle \partial^{\bar\beta} g,\ \phi_{\Theta(A)} \big\rangle_{\mathcal{S}',\mathcal{S}},$$
as claimed. Since $A_0 \in U$ was arbitrary, the proof is complete. $\square$
Acknowledgements Open Access funding provided by Projekt DEAL. The author acknowledges support by the European Commission-Project DEDALE (contract no. 665044) within the H2020 Framework. The author is grateful to Martin Genzel for bringing up the topic discussed in this paper, and to Ali Hashemi for pointing out the original paper by Price. Last but not least, the author would like to thank the anonymous referees for valuable suggestions that led to an improved presentation and for suggesting the references [1,7].
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References
1. Brown, J.: Generalized form of Price's theorem and its converse. IEEE Trans. Inform. Theory 13(1), 27–30 (1967)
2. Duistermaat, J.J., Kolk, J.A.C.: Distributions. Birkhäuser Boston Inc., Boston, MA (2010)
3. Folland, G.B.: Real Analysis: Modern Techniques and Their Applications. Pure and Applied Mathematics, 2nd edn. Wiley, New York (1999)
4. Genzel, M., Kutyniok, G., März, M.: ℓ1-analysis minimization and generalized (co-)sparsity: when does recovery succeed? Appl. Comput. Harmon. Anal. (2020). https://doi.org/10.1016/j.acha.2020.01.002
5. Gut, A.: An Intermediate Course in Probability. Springer Texts in Statistics, 2nd edn. Springer, New York (2009)
6. McMahon, E.: An extension of Price's theorem (corresp.). IEEE Trans. Inform. Theory 10(2), 168–168 (1964)
7. Papoulis, A.: Comments on 'An extension of Price's theorem' by E. L. McMahon. IEEE Trans. Inform. Theory 11(1), 154–154 (1965)
8. Price, R.: A useful theorem for nonlinear devices having Gaussian inputs. IRE Trans. Inform. Theory 4, 69–72 (1958)
9. Vladimirov, I.G.: A quantum mechanical version of Price's theorem for Gaussian states. In: Control Conference (AUCC), 2014 4th Australian, pp. 118–123. IEEE (2014)
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.