
Statistical strategies and stochastic predictive models for the MARK-AGE data


Academic year: 2022


Mechanisms of Ageing and Development 151 (2015) 45–53

Contents lists available at ScienceDirect

Mechanisms of Ageing and Development

journal homepage: www.elsevier.com/locate/mechagedev

Review

Statistical strategies and stochastic predictive models for the MARK-AGE data

Enrico Giampieri b,e,∗, Daniel Remondini b,e, Maria Giulia Bacalini a,b, Paolo Garagnani a,b, Chiara Pirazzini a,b, Stella Lukas Yani a,b,c, Cristina Giuliani a,b, Giulia Menichetti b,e, Isabella Zironi b,e, Claudia Sala b,e, Miriam Capri a,b, Claudio Franceschi a,b, Alexander Bürkle d, Gastone Castellani b,e

a Department of Experimental, Diagnostic and Specialty Medicine, Via S. Giacomo, 12, University of Bologna, Bologna, Italy

b Interdepartmental Center Galvani “CIG”, Via Selmi, 3, University of Bologna, Bologna, Italy

c Institute for Biomedical Aging Research, University of Innsbruck, Austria

d Molecular Toxicology Group, Department of Biology, University of Konstanz, 78457 Konstanz, Germany

e Physics and Astronomy Department, Viale Berti Pichat 6/2, University of Bologna, Bologna, Italy

Article info

Article history:

Received 2 December 2014

Received in revised form 26 May 2015

Accepted 9 July 2015

Available online 21 July 2015

Keywords:

Biomarkers; Biological age; Chronological age; Statistics models; MARK-AGE

Abstract

MARK-AGE aims at the identification of biomarkers of human aging capable of discriminating between the chronological age and the effective functional status of the organism. To achieve this, given the structure of the collected data, a proper statistical analysis has to be performed, as the structure of the data is non-trivial and the number of features under study is close to the number of subjects used, requiring special care to avoid overfitting. Here we describe some of the possible strategies suitable for this analysis.

We also include a description of the main techniques used, to explain and justify the selected strategies.

Among other possibilities, we suggest modeling and analyzing the data with a three-step strategy:

© 2015 Published by Elsevier Ireland Ltd.

1. Introduction

MARK-AGE (European Study to Establish Biomarkers of Human Aging) aims at the identification of biomarkers of human aging capable of distinguishing between chronological and biological aging, as described thoroughly in this special issue, where the chronological age represents the amount of time from birth and the biological age is linked to the underlying aging processes happening in the body.

For this prediction, systemic and tissue-related parameters are taken into account, not only regarding biological samples (blood, urine, buccal mucosa cells of volunteers), but also with anthropometric, health-reported, cognitive, and functional assessments.

∗ Corresponding author at: Interdepartmental Center Galvani “CIG”, Via Selmi, 3, University of Bologna, Bologna, Italy.

E-mail address: enrico.giampieri@unibo.it (E. Giampieri).

To achieve this objective, a robust human model with clear-cut assumptions was conceptualized accordingly. Theoretically, the model is based on three different aging rates related to three different populations:

i.) a population representing the “normal” aging, or randomly recruited age-stratified individuals from the general population (RASIG), covering the age range 35–74 years;

ii.) a population representing the successful or “decelerated” aging: subjects born from a long-living parent belonging to a family with long-living sibling(s) already recruited in the framework of the GEHA (Genetics of Healthy Aging) project (Skytthe et al., 2011). These individuals (“GEHA Offspring” or GO) were recruited together with their spouses or SGO (“Spouses of GEHA Offspring”), who represent the best control to evaluate possible lifestyle effects, since they have shared the same environment with their partner for many years;

iii.) a population representing accelerated “segmental” aging, i.e. patients with progeroid syndromes (Cockayne, Werner, and Down syndromes) (see in this issue Capri et al., 2015).

http://dx.doi.org/10.1016/j.mad.2015.07.001

0047-6374/© 2015 Published by Elsevier Ireland Ltd.

Published in: Mechanisms of Ageing and Development; 151 (2015), pp. 45-53. https://dx.doi.org/10.1016/j.mad.2015.07.001

Konstanzer Online-Publikations-System (KOPS) URL: http://nbn-resolving.de/urn:nbn:de:bsz:352-0-312547

Expected results are tightly related both to the biological meaning and to the relative power of tracking aging-rate-related changes of each investigated parameter (among the hundreds analyzed in MARK-AGE). Thus, the definition and the role of biomarkers as a panel of measurements that captures and quantifies features of chronological versus biological aging are at the core of the MARK-AGE project.

The former and the latter are two faces of the same coin, i.e. the aging process, but only their combination can empower their predictive value for determining successful or unsuccessful aging. This is a critical step, still far from obtaining a general consensus and validation, being also closely connected to the fast production of new potential biomarkers driven by the enhancement of high-throughput technology (Deelen et al., 2013).

Recently published data have highlighted the complexity of the genetics of aging (Capri et al., 2014) and the role of new classes of biomarkers for the detection of differences between chronological and biological aging, such as epigenetic changes, N-glycans and metabolite profiles from blood. The recent discovery of a subset of CpG sites that together form an aging clock in blood (Hannum et al., 2013; Weidner et al., 2014) and in a wide range of tissues (Horvath, 2013) has theoretically opened the possibility of dating the age of these tissues, predicted to be differently aged in the same individual (Cevenini et al., 2008). Further, the methylation levels at specific CpG sites of the ELOVL2 and FHL2 genes showed the strongest correlation with age in a population of about 500 donors ranging from newborns to centenarians (Garagnani et al., 2012), suggesting that some mechanisms, like methylation at 5-cytosine in the nuclear DNA, could better represent the chronological age of humans. Another important class of biological age markers is represented by microRNAs (miRs), in particular those miRs able to modulate the inflammatory response with aging, or inflamma-miRs (Olivieri et al., 2012, 2013). Concomitantly, N-glycans from blood serum have received the attention of many research groups in recent years. In particular, the increase of agalactosylated N-glycan structures during aging appears to be confirmed in many studies (Dall’Olio et al., 2013), and specific N-glycan structures are used for different predictive models (Vanhooren et al., 2007; Krištić et al., 2014). Lastly, metabonomics and lipidomics technologies have recently brought to light many blood metabolites, such as phospho/sphingolipids (Collino et al., 2013; Montoliu et al., 2014), that could potentially be biological markers and modulators of healthy aging.

Currently, the challenge is the use of ad hoc advanced statistical models to properly process the huge amount of available data. The MARK-AGE project has faced this challenge by exploiting the best fitting and modeling of data, combining both chronological and biological aging markers in the above-described “human models”.

The database resulting from the MARK-AGE project includes both qualitative and quantitative data belonging to several categories:

• Clinical and social data: this category includes mainly qualitative/categorical/ordinal data, such as demographic information (family composition, marital status, education, occupation, housing conditions), lifestyle information (use of tobacco and alcohol, daily activities), health status information (present and past diseases, self-perceived health, number and type of prescribed drugs) and cognitive/functional status (activities of daily living, Norton scale, STROOP test, 15-picture learning test, ZUNG depression scale).

• Anthropometric data: this category includes quantitative data relative to classical candidate markers of aging, such as waist and hip circumferences, blood pressure and heart rate at rest, lung capacity, near vision, five-times chair standing, and handgrip strength.

• Molecular biomarkers: this category includes a wide range of both qualitative and quantitative data. Qualitative measurements that result from the analysis of the APOE genotype should be managed using statistical tools specific for genetic data. The vast majority of molecular measurements are expressed as quantitative data, either on an infinite scale (for example albumin levels, which are expressed as g/l) or on a finite scale (for example methylation levels, which are expressed as continuous numbers ranging from 0 to 1).

2. Statistical methods

Here we discuss some statistical methods, along with their potential and limitations for datasets like MARK-AGE. These methods cover all the required steps of the analysis, from data preparation, through feature selection, modeling, and biological age assessment, to prediction and divergence from chronological age.

Selecting the most important steps of the analysis is fundamental for the choice of the proper analysis strategy. Dealing with this kind of problem involves a long sequence of analyses and, because of this chain structure, the robustness of the final results is bounded above by the robustness of the frailest component. Even an otherwise spotless analysis can be severely distorted by a single inaccurate step. The steps of our analysis will be the following:

I) Variable pre-selection, to remove inappropriate variables from the analysis.

II) Feature extraction from the raw variables, to obtain more biologically relevant information.

III) Selection of the appropriate method for the analysis, with a proper standardization of the variables to make them follow the assumptions of the method, where possible.

IV) Feature selection, to identify the most relevant derived features for the analysis.

V) Parameter estimation, to adapt the model to the data.

VI) Model selection, to select the most appropriate model among all the proposed ones (sets of features and parameters); this step will include prior biological knowledge to help in the discrimination of good models from non-meaningful ones.

VII) Prediction robustness estimation, to verify how well the proposed model is able to generalize to the population, verifying the biological hypothesis underlying the project.

2.1. Classical test limits

All statistical tests rely on specific hypotheses. These hypotheses represent the assumptions that the test needs in order to operate. Most classical tests, for example, rely on the assumption that the data can be described (or at least approximated) with a Normal distribution. These tests cannot be applied to data that do not respect this hypothesis, as the results would be unreliable. These hypotheses, on the other hand, allow more information to be included in the test, increasing its power. A subset of the classical tests tries to use a smaller set of hypotheses, typically removing the requirement of adherence to a specific distribution in favor of considering only the ranking of the values in the population. These tests are usually referred to as non-parametric: most distributions are described by specific “parameters” (like mean and variance), so the tests that assume a specific distribution are implicitly testing the values of these parameters.

It is now known that several biological variables do not respect the requirements of the classical tests, such as normality (or the possibility of normalizing them with an appropriate transformation), even when common transformations (location-scale, logarithm, square root) are used. When the proper distribution of the data is unknown, or the data cannot be described with numerical (ideally continuous or real) values, we are forced to use a non-parametric test for our hypothesis.

In the MARK-AGE database we have several parameters that might not conform to the requirements of parametric testing: the numerical continuous data, measurements mainly based on biochemical methods, or those arising from continuous data, such as blood pressure, body weight, etc., can show significant deviations from the Gaussian distribution and are not immediately ascribable to any known distribution.

This problem is common in biomedical data analysis, as most biological data, such as cell volumes, protein molecular weights, gene lengths, and other biochemical variables, are not distributed in a Normal way (they do not follow a Normal or, equivalently, Gaussian distribution). This deviation from the Normal distribution was historically regarded as an exception, while nowadays it is considered very common (Zhang and Popp, 1994; Koch, 1966; Bahr et al., 1987; Russo et al., 2012, 2011). A simple intuition for this behavior can be obtained by observing that the majority of these measurements are strictly positive with an average close to zero, and thus cannot be described by a Gaussian distribution, as this would assign a non-zero probability to negative values as well. Metabolites, for example, usually have a distribution closer to an Exponential distribution, with a strong peak at the value of zero, and, being a measure of concentration, can of course only be positive.
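The intuition about negative values can be checked numerically. The following minimal sketch computes the probability mass that a Gaussian model would place below zero; the means and standard deviations are illustrative assumptions, not MARK-AGE measurements.

```python
from statistics import NormalDist

def negative_mass(mean, sd):
    """Probability that a Gaussian with the given mean and sd falls below zero."""
    return NormalDist(mu=mean, sigma=sd).cdf(0.0)

# Hypothetical concentration-like variable: strictly positive in reality,
# with a mean that is close to zero relative to its spread.
p_close = negative_mass(mean=1.0, sd=1.0)

# Same spread, but a mean far from zero: the Gaussian model is far less harmful.
p_far = negative_mass(mean=5.0, sd=1.0)

print(f"mean=1, sd=1: P(X < 0) = {p_close:.3f}")
print(f"mean=5, sd=1: P(X < 0) = {p_far:.2e}")
```

When the mean is only one standard deviation away from zero, a non-negligible share of the modeled probability lies on values that can never be observed, which is one reason a Gaussian description may be inadequate for such variables.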

2.2. Non-parametric correlations

The most widely known type of correlation is Pearson’s product-moment correlation coefficient, usually referred to as Pearson’s r. This value measures the linear dependence between the variables, and lies between −1 and 1.

This quantity has the limitation of losing any information about non-linear relationships between the variables, and requires the data to be numerical. To circumvent these problems, a common solution is to use a non-parametric correlation method. These methods usually require only information about the ordering of the data, working both for numerical and ordinal variables. Working only on the ranking of the data, these methods allow non-linear (but still monotonic) effects to be taken into account.

The two most common non-parametric correlation measures are Kendall’s Tau and Spearman’s Rho. The former (Kendall’s Tau) generates the pairs of observed ranks of each variable, then compares all the pairs to see whether they are concordant (both values of one pair are greater, or both are smaller, than those of the other pair); the statistic is based on the difference between the number of concordant and discordant pairs, which is expected to be zero when there is no relationship between the variables. The latter (Spearman’s Rho) is defined as the Pearson’s correlation between the rankings of the two variables.

Kendall’s Tau is usually more conservative, and so less sensitive to errors in the data and less biased, but it has less statistical power and takes more time to compute in the presence of large datasets like MARK-AGE, as the time required for a straightforward computation grows with the square of the number of observations. These two analyses are not equivalent, and the most appropriate one should be chosen depending on the goal of the analysis (Xu et al., 2013).
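To make the difference between the three coefficients concrete, the sketch below implements them with the Python standard library only and applies them to a monotonic but strongly non-linear relationship. The data are synthetic, the function names are illustrative, and the Kendall variant shown is the simple tau-a with the quadratic pairwise loop mentioned in the text.

```python
import math

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs, O(n^2)."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)

# Synthetic example: y grows monotonically but very non-linearly with x.
x = [float(i) for i in range(1, 21)]
y = [math.exp(v / 2) for v in x]

print("Pearson r   :", round(pearson(x, y), 3))
print("Spearman rho:", round(spearman(x, y), 3))
print("Kendall tau :", round(kendall_tau(x, y), 3))
```

Both rank-based coefficients reach their maximum because the ordering is perfectly preserved, while Pearson’s r is pulled down by the curvature of the relationship.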

2.3. Robust regression for truncated data

With the term “robust regression” we refer to a large family of methods for estimating the linear relationship between two or more variables. Regression methods are in general performed by searching for the combination of parameters that minimizes the sum of all the prediction errors. Classical linear regression uses as an estimate of the prediction error the square of the difference between the predicted value and the observed one. This measure corresponds to hypothesizing that all the errors have a Normal distribution. Evaluating the errors with this method can be very fragile, as any anomaly in the distribution of the errors can heavily affect the regression results. This can be due, for example, to outliers or to uneven deviations that may arise when the dependent variable has a hard limit, like 0 for the concentration of molecules. To circumvent this problem several methods have been proposed over the years. Most of them modify the underlying error function in a way that makes it less affected by outliers. Some robust regressions use a Student’s t distribution, others a truncated exponential, and so on. These correspond to different hypotheses on the underlying distribution of the estimation error.

Most of the observed variables in the database do not conform to the Gaussian hypothesis; hence, using safe assumptions on the error distribution is the best approach, without removing outliers from the dataset, a practice that is frowned upon as it is completely arbitrary and can lead to unpredicted biases in the model (Fig. 1).

Fig. 1. Comparison between parametric distributions: on the left, a standardized Gaussian distribution with zero mean and unit standard deviation, followed by two non-Gaussian distributions. In the center is a LogNormal distribution with location parameter equal to zero and scale parameter equal to 1. On the right is a Gamma distribution with shape parameter equal to 2 and scale parameter equal to 1.

The regression also needs to consider an important feature of the data: when the regression is done with age as a function of a set of regressors, the dependent variable (the age) does not respect the assumptions of ordinary regression methods, namely that it is randomly sampled given the independent variables. Having a precise limit on the value of the dependent variable transforms this problem into one where there are missing not at random (MNAR) data. This can lead to a severe loss of power of the predictor, as the missing data are those in the extreme positions of the spectrum, the most important for the prediction. This would also lead to strong biases in the estimation, as the underlying data are biased. The effects of this bias can be seen in Fig. 2, where a toy model is used to show the severity of the distortion. Methods to deal with missing data are well known (Schafer and Graham, 2002; Ibrahim et al., 2005), and include adjusted maximum likelihood, multiple imputation, and full Bayesian methods. All of these methods are based on modeling the mechanism that can lead to the missing data, and can significantly decrease the error in the regression (Mason et al., 2012).
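The attenuation caused by truncating the dependent variable can be reproduced with a small simulation in the spirit of the toy model of Fig. 2 (x distributed as N(0, 1), y as x + N(0, 1)). This is a sketch under stated assumptions: the symmetric, quantile-based cut on y, the sample size and the seed are choices made here, and since the exact truncation scheme behind Fig. 2 is not specified, the slopes it prints are not expected to match the values quoted in the caption.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def truncated_slope(x, y, keep_fraction):
    """Keep only the central `keep_fraction` of y (symmetric cut), then refit."""
    ys = sorted(y)
    cut = (1.0 - keep_fraction) / 2.0
    lo = ys[int(cut * len(ys))]
    hi = ys[int((1.0 - cut) * len(ys)) - 1]
    kept = [(a, b) for a, b in zip(x, y) if lo <= b <= hi]
    return ols_slope([a for a, _ in kept], [b for _, b in kept])

rng = random.Random(0)
n = 20000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [xi + rng.gauss(0.0, 1.0) for xi in x]   # the true slope is 1

slope_full = ols_slope(x, y)
slope_95 = truncated_slope(x, y, 0.95)       # extreme 5% of y removed
slope_68 = truncated_slope(x, y, 0.68)       # extreme 32% of y removed

print("all data     :", round(slope_full, 2))
print("central 95%  :", round(slope_95, 2))
print("central 68%  :", round(slope_68, 2))
```

The estimated slope shrinks as more of the extremes are discarded, even though the underlying relationship is unchanged, which is the bias that truncation-aware methods are meant to correct.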

2.4. Multiplex network approaches

Since, in the MARK-AGE project, we are interested in multivariate analyses that take into account the relations between variables (e.g. biochemical, behavioral) and the relations between samples (e.g. gender, age group, health history, lifestyle, nationality), a network-based approach can be fully exploited for such purposes.

Very recently the concept of a network, namely a set of elements (nodes) that share specific relationships with each other (links), has been extended to multiplex networks (Boccaletti et al., 2014; Menichetti et al., 2014a,b; Castellani et al., 2014). In a multiplex network the same nodes are represented in each layer, and the layers correspond to different types of relations or interactions. A possible analysis involves a multiplex in which the nodes are the measured parameters (e.g. biochemicals), and the links in each layer are defined by similarity distances, evaluated through correlations or similar measurements (Menichetti et al., 2014a,b). Each layer could be associated with a particular population subgroup, e.g. GO, SGO, RASIG, etc. (see Fig. 3). In this way we can characterize the parameters and the relations that are common to all the groups into which we divide the dataset, by means of specific multiplex observables.

Fig. 2. The effect of the truncation of data on the predicted variable, for different amounts of truncated data. This figure depicts a trivial model, where both variables are normally distributed, with the dependent one equal to the first one with the addition of white noise: x is distributed as N(0, 1) and y as x + N(0, 1). The effect of the truncation on the regression is evident even when only the extreme 5% is excluded. When 32% of the data from the extremes is removed (one standard deviation from the mean), the regression line, which should be close to the bisecting line, is almost flat. While the true regression has a slope of 1, the estimated slope from the truncated data is 0.27. Even using 95% of the data, the slope of the regression line is 0.67.

An example is shown in (Menichetti et al., 2014a,b), in which we characterized the relation between node connectivity degree (the sum of the links for a node) and its strength (the sum of the weights for a node) for the set of links shared between more than one layer, or specific to each layer.

This multiplex approach also makes it possible to define more statistical observables useful for evaluating the quality of our models. One of the possible measurements is the Network Entropy (Menichetti et al., 2015), which is related to the selectivity of the observed characteristics in the real dataset, i.e. it estimates the number of structures (networks or multiplex networks) compatible with some given real features.

These network approaches can be used in the context of our analysis strategy to include biologically relevant information on the relationships between variables and to quantify the meaningfulness of a set of variables (Fig. 4).

Fig. 3. A multiplex is a set of networks or layers with common nodes (the observed variables). Each layer corresponds to a given population subgroup. The links are generated by correlation and causal relationships. The relationship between the layers can be used to improve the estimation of the relationship between the nodes.

Fig. 4. Bayesian investigation of the fairness of a coin. The investigator starts with a mild prior hypothesis over the frequency of heads, as being balanced, but with a margin of uncertainty. The observations are 2 heads and 7 tails, which correspond to the depicted likelihood. The final knowledge about the coin is represented by the posterior distribution, given by the normalized product of the prior and the likelihood. This posterior distribution shows that after these observations the confidence in the fairness of the coin is reduced, but it is still a plausible hypothesis.

2.5. Training, testing and validation

Statistical analysis can be used to reach two major, distinct objectives: explanation and prediction (Shmueli, 2010). Most scientific research focuses on explanatory analysis, which tries to explain the causal relationships among variables. In this case our interest is in the prediction of the biological age of a subject, and thus the statistical approach should be different from the standard one for explanatory analysis.

Predictive analysis is not concerned so much with causal relationships as with reliable correlations (Domingos, 2012). This is easily quantified for simple linear models by the adjusted r-squared statistic, which conveys the r-squared that would be expected on a new sample from the same process. It can be considered a measure of the true correlation after removing the effects of overfitting.

This kind of analytical estimate is not possible for general non-linear or non-parametric models. The best solution so far to this problem is to split the data into two subsets (usually unequal in size), fit the statistical model on one set (called the train subset) and verify how well the model performs on new, unobserved data (called the test subset). This procedure is usually repeated several times with different splits of the data, to make the estimate more robust and not dependent on a specific split.

The division of the data into train and test can be done in several ways. One of the most used is k-fold cross-validation, where the data are partitioned into k different subsets, of which one is used as test and the others as train, repeating the procedure until all the subsets have been used as test. An extreme version of this procedure is the Leave One Out method, where the test corresponds to a single observation and the train to all the others. This kind of approach gets the most out of the data, but can easily become unfeasible as the number of observations increases.
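The k-fold procedure can be sketched in a few lines of standard-library Python. The data below are synthetic (a hypothetical marker that depends linearly on age), the model is a plain one-variable least-squares line, and the function names are illustrative; setting k equal to the number of observations gives the Leave One Out case.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is in the test fold once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

def cross_validated_error(x, y, k=5):
    """Mean squared test error of a one-variable least-squares line."""
    errors = []
    for train, test in k_fold_indices(len(x), k):
        mx = sum(x[i] for i in train) / len(train)
        my = sum(y[i] for i in train) / len(train)
        sxx = sum((x[i] - mx) ** 2 for i in train)
        slope = sum((x[i] - mx) * (y[i] - my) for i in train) / sxx
        intercept = my - slope * mx
        # The error is measured only on observations not used for the fit.
        sq = [(y[i] - (intercept + slope * x[i])) ** 2 for i in test]
        errors.append(sum(sq) / len(sq))
    return sum(errors) / len(errors)

rng = random.Random(1)
x = [rng.uniform(35, 74) for _ in range(200)]      # hypothetical ages
y = [0.5 * xi + rng.gauss(0, 2) for xi in x]       # hypothetical marker, noise sd = 2

cv_mse = cross_validated_error(x, y, k=5)
print("5-fold cross-validated MSE:", round(cv_mse, 2))
```

Because the noise added to the synthetic marker has standard deviation 2, the cross-validated mean squared error should land near 4, slightly inflated by the uncertainty of the fitted line.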

A similar approach can be used in the context of model selection, and is referred to as the train-test-validation split. By model selection we mean choosing one model among several others; an example of this may be the selection of the regressors to use in a linear model, or the choice between parametric and non-parametric models. For the sake of the explanation, let the problem be the selection of the variables to be used as regressors in a linear model. We want to use the train set to fit each possible model, and then use the test set to choose the best performing model. The resulting estimate of the prediction power is biased upward, as we chose the best performing model among the set of tested ones. To avoid this bias, we perform a second estimation of the expected error using a third set, called the validation set, that is separate from both the test and the train ones. A typical division of the data would be 60% for train, 20% for validation, and 20% for test.
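A minimal sketch of this three-way protocol follows, keeping the naming used in the text (fit on train, choose on test, report on validation). The candidate models are single-regressor lines predicting age, and the feature names and data are hypothetical, generated only to make the example runnable.

```python
import random

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope

def mse(model, x, y):
    """Mean squared prediction error of a fitted line."""
    intercept, slope = model
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y)) / len(x)

def take(values, rows):
    return [values[i] for i in rows]

rng = random.Random(42)
n = 300
age = [rng.uniform(35, 74) for _ in range(n)]
# Hypothetical candidate regressors: one informative, two pure noise.
features = {
    "marker_a": [0.1 * a + rng.gauss(0, 1) for a in age],
    "noise_1": [rng.gauss(0, 1) for _ in range(n)],
    "noise_2": [rng.gauss(0, 1) for _ in range(n)],
}

idx = list(range(n))
rng.shuffle(idx)
train, test, valid = idx[:180], idx[180:240], idx[240:]   # 60% / 20% / 20%

# 1) fit every candidate on the train set
models = {name: fit_line(take(col, train), take(age, train))
          for name, col in features.items()}
# 2) choose the best candidate on the test set
test_err = {name: mse(m, take(features[name], test), take(age, test))
            for name, m in models.items()}
best = min(test_err, key=test_err.get)
# 3) report the error of the chosen model on the untouched validation set
validation_error = mse(models[best], take(features[best], valid), take(age, valid))

print("selected model:", best)
print("test MSE of selected model      :", round(test_err[best], 1))
print("validation MSE of selected model:", round(validation_error, 1))
```

The validation figure is the one to report, since the test error of the selected model was already used to make the choice.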

Using these strategies we can evaluate the expected error for any kind of model, independently of how complicated it is. These methods include a form of automatic Occam’s razor: the greater the number of parameters included, the greater the overfitting of the model, and thus the greater the error on the test set.

A proper procedure of model selection should also consider clinical practice, as all these parameters require exams that are expensive in terms of time and money. Two variables with the same statistical properties can be very different in terms of cost of measurement, and thus the selection should be informed by the relative cost of the two. Often variables are measured in batches; in that case removing one variable from the analysis would not save any cost, and it is worth including even if the improvement to the model is small.

2.6. Feature selection, feature extraction and the curse of dimensionality

In a biological analysis, it is often assumed that a higher number of features (the variables used) leads to a better model for the data.

This idea is wrong, especially when the included features are not directly linked to the problem under study.

The first, and by far the most important, problem is known as the curse of dimensionality. This problem is due to the number of examples needed to sufficiently sample the possible cases describable using a set of variables. If we consider only two variables, each of which can assume only two values, we have four possible combinations. If we assume that we need roughly ten subjects for each case for the statistical methods to work properly, we would need around forty subjects. If we are using six different binary variables, we have 64 combinations and thus need 640 subjects for minimal coverage. With ten variables we would need around 10,000 subjects to maintain the same power as before.
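The arithmetic behind these figures is simply the number of combinations multiplied by the subjects required per combination. A short sketch, assuming binary variables and ten subjects per combination as in the text (the function name is illustrative):

```python
def subjects_needed(n_variables, levels=2, per_cell=10):
    """Subjects required to observe each combination of values `per_cell` times."""
    return per_cell * levels ** n_variables

for k in (2, 6, 10):
    print(f"{k:2d} binary variables -> {subjects_needed(k)} subjects")
```

For ten binary variables the exact count is 10,240, which is the "around 10,000" of the text; the requirement doubles with every additional binary variable.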

This, and other effects, make it dangerous to indiscriminately increase the number of features included, as we will not have enough subjects to easily discriminate which variables are relevant and which are not. This is even worse when a few features are similar to each other. In this case the algorithm may have a hard time discriminating between them, and reach an undetermined state. In this kind of situation it is more fruitful to summarize all these similar and related features in a single one. This would at the same time reduce the number of features used and improve the behavior of most predictors. This kind of procedure is referred to as feature extraction, and is a key procedure in high-dimensional data analysis. The term feature extraction refers to the process of generating new features from the existing ones, which are then usually replaced by the new ones. One of the easiest and most common kinds of feature extraction is principal component analysis (usually referred to by its acronym, PCA), an unsupervised method that generates new features by combining features that are strongly correlated with each other. This method is best suited to work with features that have a similar scaling, as the variance of each individual feature becomes a weighting factor in the process.

This makes PCA appropriate for refining raw measurements of a common concept, for example methylation levels (where several probes are used for each gene) and body fat composition (where most analyses cover overlapping regions).
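As an illustration of how PCA can merge several correlated raw measurements into one feature, the sketch below extracts the first principal component with a plain power iteration. The power iteration is a choice made here to keep the example dependency-free (a statistical library would normally be used), and the data are synthetic: three hypothetical probes measuring the same latent signal plus one unrelated variable. The columns are standardized first, in line with the remark above about feature scaling.

```python
import math
import random

def standardize(columns):
    """Center each column and scale it to unit variance."""
    out = []
    for col in columns:
        m = sum(col) / len(col)
        s = math.sqrt(sum((v - m) ** 2 for v in col) / (len(col) - 1))
        out.append([(v - m) / s for v in col])
    return out

def first_component(columns, n_iter=200):
    """Leading eigenvector of the covariance matrix of centered columns."""
    p, n = len(columns), len(columns[0])
    cov = [[sum(columns[a][i] * columns[b][i] for i in range(n)) / (n - 1)
            for b in range(p)] for a in range(p)]
    vec = [1.0] * p
    for _ in range(n_iter):
        new = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    return vec

rng = random.Random(3)
n = 500
latent = [rng.gauss(0, 1) for _ in range(n)]                 # hypothetical gene-level signal
probes = [[z + rng.gauss(0, 0.3) for z in latent] for _ in range(3)]
unrelated = [rng.gauss(0, 1) for _ in range(n)]

data = standardize(probes + [unrelated])
loadings = first_component(data)
# The extracted feature: one score per subject, replacing the three probes.
scores = [sum(loadings[j] * data[j][i] for j in range(len(data))) for i in range(n)]

print("loadings:", [round(v, 2) for v in loadings])
```

The three probes receive nearly equal loadings while the unrelated variable receives a loading close to zero, so the score acts as a single summary of the correlated group.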


2.7. Bayesian estimation

Bayesian statistics is an alternative approach to the whole standard set of analyses commonly performed (Kruschke, 2010). The whole approach is based on a single way of considering the analysis, which is implemented differently depending on the case. Most of standard statistics (referred to as frequentist statistics) can be seen as a special case of this more general approach.

The main ideas of Bayesian statistics are easy to grasp, and allow several questions to be answered in a direct and intuitive way. First, the analyst creates a model describing how the data can be generated. This model contains a probabilistic description of the phenomenon in terms of certain parameters. A model is composed of two parts: the likelihood function, which describes how likely an observed value is given the value of the parameters, and the prior distributions, which express the plausible values of the unknown parameters. The Bayesian analysis updates the plausibility of the parameters after the observations. All the classical tests can be rewritten in terms of this kind of parameter estimation.

If one wants to study the plausibility of an adverse development of an illness under a new treatment compared to a placebo, one would estimate the frequency of adverse development in each of the two cases and check whether these frequencies do in fact differ.

First, one needs to describe the plausible values of the frequencies before the observations. This knowledge is represented as a probability distribution over the possible values that the frequency can have. In Bayesian statistics probability is not considered an objective entity, but rather an expression of our knowledge about the process. This distribution, representing the previous knowledge, is called the prior distribution of the parameter. The choice of the priors is one of the most delicate parts of Bayesian modeling, and it is good practice to test the model with different priors, ranging from very wide distributions (or a completely flat one) that do not express any preference for any possible value of the parameter, to more committed ones that can include previous observations, accepted knowledge in the field or the evaluation of experts.

The likelihood function describes the plausibility of observing a certain number of adverse developments in the population under study given a certain frequency, using the Binomial distribution. This represents how the researcher thinks the data are generated. The choice of the likelihood function represents the model, and determines the parameters under consideration. Different models can imply very different likelihood functions, with very different statistical properties. It is therefore always recommended not to limit the analysis to a single model, but to take different ones into consideration.

The final result of a Bayesian analysis is the posterior distribution of the parameters, which encodes all the available information about them. In real practice it is necessary to take a decision based not on the distribution, but rather on the best possible estimate of the parameter based on the information. To do this, it is necessary to synthesize the information from the posterior distribution into a single point estimate. This can be done by choosing a risk function that describes the penalty incurred in choosing a value of the parameter when a different value is the true one. The final estimate is the value that minimizes the expected risk, weighted over the plausibility of the other values as indicated by the posterior distribution.

Different risk functions will generate different estimates.
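The whole chain (prior, likelihood, posterior, risk-based point estimate) can be sketched on the coin example of Fig. 4, with 2 heads and 7 tails. The Beta(5, 5) prior is an assumption made here to stand in for the "mild" prior of the figure, whose exact form is not given; the grid approximation and the function names are likewise illustrative.

```python
def grid_posterior(grid, prior_a, prior_b, heads, tails):
    """Posterior over the head frequency: prior times likelihood, normalized."""
    prior = [p ** (prior_a - 1) * (1 - p) ** (prior_b - 1) for p in grid]
    like = [p ** heads * (1 - p) ** tails for p in grid]   # Binomial kernel
    unnorm = [a * b for a, b in zip(prior, like)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def point_estimate(grid, posterior, loss):
    """Grid value that minimizes the posterior expected loss."""
    def expected(guess):
        return sum(w * loss(guess, true) for true, w in zip(grid, posterior))
    return min(grid, key=expected)

grid = [i / 200 for i in range(1, 200)]      # candidate frequencies in (0, 1)
# Assumed mild prior centered on a fair coin: Beta(5, 5).
posterior = grid_posterior(grid, 5, 5, heads=2, tails=7)

est_squared = point_estimate(grid, posterior, lambda g, t: (g - t) ** 2)
est_absolute = point_estimate(grid, posterior, lambda g, t: abs(g - t))
prob_near_fair = sum(w for p, w in zip(grid, posterior) if 0.45 <= p <= 0.55)

print("estimate under squared loss :", est_squared)
print("estimate under absolute loss:", est_absolute)
print("posterior mass in [0.45, 0.55]:", round(prob_near_fair, 2))
```

With this conjugate prior the exact posterior is Beta(7, 12), so the squared-loss estimate approximates its mean (7/19) and the absolute-loss estimate approximates its median, showing how the choice of risk function changes the reported value. The mass remaining near 0.5 is reduced but not negligible, consistent with the reading of Fig. 4.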

2.8. Prediction using expert mixtures

A statistical model used to predict the outcome of an experiment or an intervention is often referred to as an Expert System.

Each possible model is a partial representation of reality, which includes certain elements in the prediction while neglecting others, and uses different approximations of the phenomenon. Linear regression models hypothesize that all effects are linear and independent, neural networks approximate the response as the superposition of binary signals, and so on.

To improve the performance of the predictions, a common strategy is to improve the model used, including more and more details in it and reducing the approximations used, trying to get closer to the truth. This approach is useful with very simple models, but has a very low payoff for more complicated ones, especially when considering complicated effects like human aging.

Research has shown (Masoudnia and Ebrahimpour, 2014) that a more effective approach is to pool several simple models together, weighting the predictions based on their accuracy. By combining several of these independent models into a higher-level one, the performance boost can be significant. In the case of biological age prediction, each model returns an estimate of the real biological age. Each one of these predictions will be imprecise but, if the model is good, should also be unbiased and independent from the others. A better predictor can be generated by taking the weighted average of the estimates.
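A minimal sketch of such a weighted average is given below. The text only states that predictions are weighted by accuracy; weighting each expert by the inverse of its squared error is one common choice and is an assumption made here. The expert names, estimates and errors are hypothetical.

```python
def combine_experts(estimates, errors):
    """Weighted average of expert estimates, with weights 1 / error**2."""
    weights = [1.0 / (e ** 2) for e in errors]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, estimates)) / total

# Hypothetical biological-age estimates (years) for one subject, one per
# feature group, with each expert's typical error measured on held-out data.
estimates = {"clinical": 58.0, "biochemical": 61.0, "genetic": 66.0, "epigenetic": 60.0}
errors = {"clinical": 6.0, "biochemical": 5.0, "genetic": 10.0, "epigenetic": 3.0}

names = list(estimates)
combined = combine_experts([estimates[k] for k in names], [errors[k] for k in names])
print("combined estimate:", round(combined, 1))
```

The combined value stays within the range of the individual estimates and is pulled toward the experts with the smallest error; if the experts are unbiased and independent, its own error is smaller than that of the best single expert.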

3. Proposed statistical approach

Given the volume of data and the complexity of the research question, several different predictive modeling approaches will have to be used on MARK-AGE data.

To increase the replication power of the analysis, it should be focused on robust statistical methods, like non-parametric ones, to prevent outliers, due to exceptional situations or human error, from influencing the final results too much (see Sections 2.1 and 2.2). The results of these feature selections should be tested with a train-test protocol to assess the robustness and replicability of the method (see Section 2.5). In particular, the tested signatures should be expected to separate young and elderly people with a high degree of accuracy in order to be meaningful.

The steps necessary for a proper modeling of the MARK-AGE data are thus the following:

1. clean the dataset, selecting the relevant features and generating new biologically relevant ones;

2. divide the dataset into train, test, and validation sets, balancing for gender, country, and other covariates;

3. divide the features into groups of interest: clinical, biochemical, genetic, and epigenetic features. These groups will be the basis for the multi-expert evaluation of the biological age;

4. for each one of these sets, generate all the models that use these features (up to two or three features per model) to predict the chronological age. These models should use prior biological knowledge, where available, through Bayesian modeling, and should be evaluated with models that can correct for the truncation effect of the dependent variable;

5. combine all of these models in a multiplex network of the different population groups (RASIG, GO, SGO, etc.) and use the information between these layers to choose a reliable model (only the information common to most layers should be considered real and useful);

6. generate a higher-level model for each feature set, and combine these into a general, composable model, robust to data collection issues.
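Step 4 of the list above amounts to enumerating small feature subsets within each group. A sketch of that enumeration follows; the group and feature names are hypothetical placeholders, since the actual MARK-AGE variable names are not listed in this text.

```python
from itertools import combinations
from math import comb

def candidate_models(feature_groups, max_features=3):
    """Yield (group, feature subset) for every subset of size 1..max_features."""
    for group, features in feature_groups.items():
        for size in range(1, max_features + 1):
            for subset in combinations(sorted(features), size):
                yield group, subset

# Hypothetical feature names, for illustration only.
feature_groups = {
    "clinical": ["grip_strength", "lung_capacity", "blood_pressure", "waist"],
    "epigenetic": ["cpg_1", "cpg_2", "cpg_3"],
}

models = list(candidate_models(feature_groups, max_features=2))
print(len(models), "candidate models")

# The count grows quickly with the group size, which is why the subset size is
# capped: a group of 100 features already gives this many models of up to 3.
print(sum(comb(100, s) for s in (1, 2, 3)))
```

Each enumerated subset would then be fitted on the train set and compared on the test set, following the protocol of Section 2.5.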

Statistical software

Statistical analysis can be seen on two different levels: high-level decisions about the analysis workflow, and number crunching of the data. Various software packages have been created to lift the burden of number crunching from the analyst as much as possible, and to automate the most common higher-level procedures. Different systems take different approaches to this problem, but we can roughly divide them into two broad categories: low-level programming suites and high-level automated interfaces.

The authors maintain that data analysts should focus on programming environments rather than on higher-level interfaces. This increases the marginal cost of a new analysis, but in the long term it forces the analyst to keep a precise and reproducible trace of the analysis done. It also helps other analysts to check the correctness of the operations performed, and eases long-distance collaboration through code sharing.

The most common programming environments are:

R (www.r-project.org)

Python (www.python.org)

SAS (www.sas.com)

There are other suitable environments, such as MATLAB and Mathematica, but they are less common choices among data analysts.

Two main distinctions can be made between these suites.

Firstly, both R and Python are free and open source, while SAS is a commercial product. This means that SAS is not free to use, but it gives more support to paying users and is more standardized. R and Python, on the other hand, rely more on their communities for support; both communities are very active and willing to help, but there is no guarantee that someone will have the competence to solve the user's problem.

On the upside, they allow a richer and faster development of solutions, taking advantage of the concurrent nature of development from multiple sources. Their libraries are often implemented by the original authors of the methods and become available quickly, but managing and installing these libraries is left to the user (even if improvements are made each year to simplify library management).

Secondly, SAS and R are domain-specific languages for data analysis, while Python is a general-purpose programming language.

This makes R and SAS slightly more comfortable for interactive analysis, with a richer set of dedicated models. Python, by contrast, has a less rich set of bleeding-edge models, but it eases the incorporation of the analysis into more complicated pipelines, including data management and mangling, output production, and so on.

3.1. Data correction, feature extraction and dimensionality reduction

The database should first be checked for any irregularity in the data, both for compliance with the standard range of values and for differences between nationalities, study groups, and genders. Due to the strong differences found between genders during aging, all the analyses should be performed separately for the two groups. This avoids incurring Simpson's paradox, where an imbalance in a covariate can lead to wrong, even paradoxical, results.
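The Simpson's paradox mentioned above can be illustrated with a toy example. The sketch below uses invented numbers and two hypothetical groups sampled over different age ranges; it is not MARK-AGE data. Within each group the marker decreases with age, while the pooled data suggest the opposite trend.

```python
# Toy illustration with invented numbers: the trend inside each group is
# negative, but pooling the two groups yields a positive slope.

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def split(pairs):
    """Turn a list of (age, marker) pairs into two parallel lists."""
    return [p[0] for p in pairs], [p[1] for p in pairs]

# Hypothetical (age, marker) pairs for two groups with unbalanced ages
group_a = [(40, 10.0), (45, 9.0), (50, 8.0)]
group_b = [(60, 20.0), (65, 19.0), (70, 18.0)]

slope_a = slope(*split(group_a))                 # negative
slope_b = slope(*split(group_b))                 # negative
slope_pooled = slope(*split(group_a + group_b))  # positive
```

Analysing the two groups separately, as recommended in the text, recovers the within-group trend that the pooled fit hides.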

The first step is data cleaning: clinical studies can have a relatively high percentage of missing data in some relevant biological and anthropometric variables, and most analysis techniques have problems dealing with missing values. Removing all the patients with missing values is impossible because, given the number of parameters, it would mean dropping the whole database. One solution is to select a subset of variables and remove all the patients that are missing values in those variables. Multiple signature selections based on different variable subsets should then be tried. These analyses should be performed both on subsets of high-coverage samples and on extended sample sets with data imputation based on external information.
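A minimal sketch of the subset-based cleaning described above, using hypothetical records and variable names (`bmi`, `glucose`, `grip` are placeholders, not MARK-AGE fields). It shows how requiring complete data on every variable can discard almost all subjects, while a smaller variable subset keeps more of them.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"id": 1, "bmi": 24.1, "glucose": 5.2, "grip": None},
    {"id": 2, "bmi": None, "glucose": 4.9, "grip": 31.0},
    {"id": 3, "bmi": 27.3, "glucose": None, "grip": 28.5},
    {"id": 4, "bmi": 22.8, "glucose": 5.6, "grip": 35.2},
]

def complete_cases(rows, variables):
    """Keep only the rows with no missing value in the chosen variables."""
    return [r for r in rows if all(r[v] is not None for v in variables)]

# Requiring every variable leaves a single subject (id 4) ...
full = complete_cases(records, ["bmi", "glucose", "grip"])
# ... while a two-variable subset keeps subjects 1 and 4.
subset = complete_cases(records, ["bmi", "glucose"])
```

Repeating the selection for several variable subsets gives one complete sub-table per candidate signature, without imputation.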

Where meaningful, new derived variables not directly encoded in the database should be generated, such as a factorized expression of multiple highly correlated methylation sites obtained via dimensionality reduction methods (see Section 2.6), or a combination of anthropometric values via medically driven reasoning, such as the average strength of the dominant hand. This procedure increases the robustness of the estimation of correlation coefficients and avoids discrimination problems due to highly correlated variables. It also increases the interpretability of the results of the signature selection.

Several related variables, for example the expression of neighboring methylation probes, should be grouped with hierarchical clustering and reduced to a single value with Principal Component Analysis (see Section 2.6). This value is supposed to be more biologically relevant, as it removes spurious correlation due to variability. This approach should be verified with a train-test validation and is expected to yield a significant improvement in the robustness and reliability of the measure.
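The reduction of a group of correlated probes to a single value can be sketched as follows. This is only an illustration under stated assumptions: the probes are taken as already grouped (the hierarchical clustering step is not shown), the data are invented, and the first principal component is obtained with a plain power iteration rather than with a dedicated library.

```python
def first_principal_component(rows, n_iter=200):
    """Loadings and per-row scores of the first principal component."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p)
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration converges to the leading eigenvector of cov
    v = [1.0] * p
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores

# Hypothetical methylation levels: rows are subjects, columns are three
# neighboring probes that vary together.
probes = [
    [0.10, 0.12, 0.11],
    [0.20, 0.19, 0.22],
    [0.30, 0.33, 0.29],
    [0.40, 0.41, 0.43],
    [0.50, 0.48, 0.52],
]
loadings, scores = first_principal_component(probes)
```

Each subject is then described by one score instead of three nearly redundant probe values.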

Several relevant anthropometric and biochemical quantities should be analyzed with robust linear regression and logistic regression to assess their interdependence and to choose which variables should be used as covariates in the following analysis.
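As an illustration of what a robust linear regression buys in the presence of data-entry errors, the sketch below uses the Theil-Sen estimator (median of pairwise slopes). The choice of this estimator is an assumption made here for the example; the text does not prescribe a specific one, and the data are invented.

```python
from itertools import combinations
from statistics import median

def theil_sen(xs, ys):
    """Theil-Sen fit: median of pairwise slopes, median-based intercept."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
              if x2 != x1]
    b = median(slopes)
    a = median(y - b * x for x, y in zip(xs, ys))
    return a, b

# Hypothetical data following y = 2x + 1, with one gross error in the
# last value (150 instead of 15).
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [3, 5, 7, 9, 11, 13, 150]
intercept, slope_ts = theil_sen(xs, ys)
```

Because the estimate is a median over pairs, the single corrupted value leaves the fitted line on the uncorrupted trend, whereas an ordinary least-squares fit would be pulled strongly toward the outlier.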

3.2. Dataset split in train-test and validation

As noted in Section 2.5, dividing the dataset gives a more solid estimate of the predictive ability of the model under analysis. The selection should be random, but it should maintain a balance between the most relevant variables of the dataset, such as age, smoking habit, and country. This ensures that the test is not altered by an imbalance between these groups.
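A balanced random split of this kind can be sketched as a stratified split: subjects are shuffled and divided within each combination of the balancing variables. The cohort, the field names, and the 60/20/20 proportions below are assumptions for illustration; a continuous variable such as age would first have to be binned into classes.

```python
import random
from collections import defaultdict

def stratified_split(subjects, keys, fractions=(0.6, 0.2), seed=0):
    """Split subjects into train/test/validation sets within each stratum.

    fractions gives the train and test shares; the rest goes to validation.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for s in subjects:
        strata[tuple(s[k] for k in keys)].append(s)
    train, test, valid = [], [], []
    for members in strata.values():
        rng.shuffle(members)
        n_train = round(fractions[0] * len(members))
        n_test = round(fractions[1] * len(members))
        train += members[:n_train]
        test += members[n_train:n_train + n_test]
        valid += members[n_train + n_test:]
    return train, test, valid

# Hypothetical cohort: 10 subjects per (gender, country) combination
subjects = []
for gender in ("F", "M"):
    for country in ("IT", "DE"):
        for _ in range(10):
            subjects.append(
                {"id": len(subjects), "gender": gender, "country": country})

train, test, valid = stratified_split(subjects, keys=("gender", "country"))
```

Fixing the seed keeps the split reproducible, so that the validation set stays untouched across successive analyses.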

The various population groups (RASIG, GO, SGO, etc.) should be kept completely separate: a good predictive model is expected to perform properly on all these groups, but with a relevant bias toward lower ages in the GO group and toward higher ages in the progeric-like groups. This is a key feature of the models: the base biological assumption is that part of the discrepancy between the predicted and observed age is due to the underlying biological age, but without a test population it would be impossible to distinguish this from a bad predictive model.
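The expected group-specific bias can be checked with a very simple summary: the mean difference between predicted and chronological age within each group. The values below are invented for illustration and only show the pattern described above (no bias in the reference group, a negative bias in the GO group).

```python
from statistics import mean

# Hypothetical (group, chronological age, predicted age) triples
predictions = [
    ("RASIG", 50, 51.0), ("RASIG", 60, 59.0), ("RASIG", 70, 70.0),
    ("GO", 55, 52.0), ("GO", 65, 61.0), ("GO", 72, 69.0),
]

def mean_bias(rows, group):
    """Average of (predicted - chronological) age within one group."""
    return mean(pred - age for g, age, pred in rows if g == group)

bias_rasig = mean_bias(predictions, "RASIG")  # close to zero
bias_go = mean_bias(predictions, "GO")        # negative: predicted younger
```

A model that shows no such difference between groups, or a large bias already in the reference group, would point to a poor predictor rather than to a biological effect.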

3.3. Feature grouping

There are two main reasons for feature grouping: differential aging and clinical availability. Firstly, it is suspected that aging is not a single process, but rather a multi-spectrum process that can develop at different speeds in different parts of the body. A good predictor should not try to predict all of them at once, as this would mean losing statistical power by mixing effects that are potentially independent of one another and averaging them into a null effect.

Secondly, different measurements require different sets of exams, which are not always available or convenient. A single model that uses all the information available in the best-case scenario, where all the exam results are available, may not be useful in practice.

3.4. Age signature

To generate the feature signature for the prediction of the biological age, single, couple, and triple correlations among all the variables should be computed and ranked on the basis of robust, non-parametric regression methods (see Section 2.3). Their validity should then be checked with a train-test setup, including in both the regression and the test the appropriate considerations for the truncated data (see Section 2.3). For each subset a full sub-matrix should be generated, with only the regressors used, to avoid imputation and the removal of subjects unless absolutely necessary. Each signature should be tested both for its regression ability and for its capability to discriminate between young and elderly subjects in the control group. All these correlations should be computed separately by gender, and the country of origin should be used as a covariate.
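The enumeration and ranking step can be sketched as follows, under several assumptions: the feature names and values are invented, single features are ranked by the absolute Spearman rank correlation with age (one possible non-parametric criterion, chosen here for the example), and the scoring of couples and triples, which would need a multivariate robust regression and the truncation correction, is not shown.

```python
from itertools import combinations

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: chronological age and three candidate features
age = [35, 42, 50, 58, 66, 74]
features = {
    "marker_a": [1.0, 1.4, 2.1, 2.0, 3.5, 9.0],  # increasing, one extreme value
    "marker_b": [5.0, 4.1, 4.6, 3.9, 4.4, 4.0],  # weak trend
    "marker_c": [9.0, 8.0, 7.5, 6.0, 5.5, 4.0],  # decreasing with age
}

# Single features ranked by the strength of their monotone relation with age
ranking = sorted(features, key=lambda f: abs(spearman(features[f], age)),
                 reverse=True)
# Every candidate signature of one, two, or three features
signatures = [c for k in (1, 2, 3) for c in combinations(features, k)]
```

Because the criterion is based on ranks, the extreme value in `marker_a` does not dominate its score. The number of candidate signatures grows quickly with the number of features, which is why the text limits them to two or three features.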

Having obtained a complete set of small signatures, a network-based method can be used to generate a bigger signature. The information contained in the relationship network can be used to weight the performance of each subset, and to combine the subsets into biologically relevant supersets.

Different signatures should be generated from the biochemical and the anthropometric variables, and from a superset of these. The biological ages predicted with these variable sets should be compared to understand the amount of agreement between the approaches. These predictions should also be tested using the GEHA Offspring (GO) group, their spouses (SGO), and the Progeric groups: under the biological hypothesis of the project, these groups should behave differently in a reliable way if the predicted age is representative of the underlying biological age. This biological comparison allows several spurious correlations to be removed from the prediction, especially when subjects from a similar environment are compared.

A reliable way to compare the predictors of different populations is to start from a reasonable set of initial assumptions. These can be obtained by direct estimation and update using Bayesian methods (see Section 2.7) and by using the information present in other databases through multiplex methods (see Section 2.4).

3.5. Multiplex model selection

The final model for each feature set has to be generated from a set of well-performing submodels. A good tradeoff between exploring the model space and time requirements is the ensemble of all the couples of features (with the appropriate covariates included). This allows us to generate a network among all the features, where the features are the nodes and the prediction quality of the models defines the links. Generating a different network for each population group yields a multiplex.

We can use this multiplex to select the most relevant feature set by weighting two parameters. Firstly, a link can be considered relevant if it is present in most layers with the same value; this excludes highly predictive results due only to random outcomes. Secondly, a subset of features can be considered relevant when the clustering of these features is higher than average. This means that each couple of features in the selected group is relevant by itself; this selection limits the number of correlated features that other methods would risk including. The selected group should then be checked against the validation group of RASIGs to assess its real predictive power.
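The two selection criteria can be sketched on a toy multiplex. Everything below is an assumption made for illustration: the feature names, the quality scores, the threshold of 0.6, and the simplification of "present in most layers with the same value" into "reaches the threshold in more than half of the layers". The clustering is measured with the standard local clustering coefficient.

```python
from itertools import combinations

# Hypothetical layers: for each population group, the predictive quality
# (a score in [0, 1]) of the model built on each couple of features.
layers = {
    "RASIG": {("a", "b"): 0.80, ("a", "c"): 0.70, ("b", "c"): 0.75, ("c", "d"): 0.90},
    "GO":    {("a", "b"): 0.82, ("a", "c"): 0.72, ("b", "c"): 0.70, ("c", "d"): 0.20},
    "SGO":   {("a", "b"): 0.78, ("a", "c"): 0.69, ("b", "c"): 0.77, ("c", "d"): 0.30},
}

def consensus_edges(layers, threshold=0.6):
    """Links whose quality reaches the threshold in more than half the layers."""
    counts = {}
    for edges in layers.values():
        for pair, quality in edges.items():
            if quality >= threshold:
                key = frozenset(pair)
                counts[key] = counts.get(key, 0) + 1
    return {e for e, c in counts.items() if c > len(layers) / 2}

def clustering(node, edges):
    """Local clustering coefficient: share of neighbor pairs that are linked."""
    neigh = {n for e in edges if node in e for n in e if n != node}
    pairs = list(combinations(sorted(neigh), 2))
    if not pairs:
        return 0.0
    return sum(frozenset(p) in edges for p in pairs) / len(pairs)

edges = consensus_edges(layers)
coeff = {n: clustering(n, edges) for n in "abcd"}
```

In this toy case the link between `c` and `d` scores very well in one layer only and is discarded, while `a`, `b`, and `c` form a fully connected group that would be carried forward to the validation set.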

3.6. Prediction with mixed models

Having different predictors that come from different ranges of exams allows these predictions to be combined as an expert set. The coherence between their predictions allows a better prediction, more robust not only to the intrinsic distortion of each method, but also to the lack of information caused by the impossibility (due to cost or health risk) of performing certain tests (see Section 2.8).
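One simple way to combine such experts while tolerating missing exams is an inverse-variance weighted mean over the predictions that are available. This is a stand-in chosen for the example, not necessarily the method of Section 2.8; it assumes independent experts with known variances, and the sub-model names and numbers are hypothetical.

```python
def combine_experts(predictions):
    """Inverse-variance weighted mean of the available expert predictions.

    predictions maps an expert name to (estimate, variance), or to None
    when the corresponding exams were not performed.
    """
    available = [p for p in predictions.values() if p is not None]
    if not available:
        raise ValueError("no expert prediction available")
    weights = [1.0 / var for _, var in available]
    total = sum(weights)
    estimate = sum(w * est for w, (est, _) in zip(weights, available)) / total
    return estimate, 1.0 / total

# Hypothetical sub-model outputs: (biological age in years, variance)
subject = {
    "clinical": (62.0, 16.0),
    "biochemical": (58.0, 4.0),
    "epigenetic": None,  # exam not available for this subject
}
age_estimate, age_variance = combine_experts(subject)
```

The combined estimate leans toward the more precise expert, and its variance is smaller than that of any single available expert; a missing expert simply drops out of the weighting.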

These analyses should also test for nonlinear behavior in the trends, which can arise from survivor bias, where the less healthy subjects are removed from the viable study subjects due to health issues. This can reduce the sensitivity of the method at older ages, which should be taken into consideration during model selection and final testing.

4. Conclusions

MARK-AGE is an ambitious project, and to reach the proposed goals a huge ensemble of data has been collected. These data have a complex and heterogeneous structure, being composed of clinical, social, anthropometric, and biochemical data. These data cannot be described by any of the standard distributions; in the case of ordinal data, it is often improper to consider an analytical distribution at all. Given the intrinsic difficulties of human data collection, both the database and the future observations on the subjects will contain some non-observed values as well as wrong values derived from errors in data entry. The chosen strategy should be designed to include a proper statistical approach that is robust to this kind of error in the data. This leads to a preference for robust and nonparametric methods, which are more conservative when facing non-ordinary data structures. This robustness is crucial when dealing with high-dimensional data, as the sampling of the variable space will be uneven and the variability overwhelming.

Since the objective of the MARK-AGE project is to predict the aging status of the individual, a great deal of attention should be given to the predictive capability and robustness of the prediction, employing techniques apt to reduce the prediction error as much as possible.

It is also worth noting that the MARK-AGE predictor will hopefully find application in several fields of public health management. When policy makers are called to take a decision on the basis of the prediction of the MARK-AGE model, they will have to take into consideration the effects and the risks associated with this estimate of the biological age. It is thus necessary for this model to be as honest as possible and to give not a single point estimate of the biological age, but rather a whole posterior distribution of its plausible values. This will allow the users of this model to make an adequate decision based on a risk evaluation that encompasses the whole prediction, and not only a part of it.
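The difference between a point estimate and a posterior distribution can be made concrete with the simplest conjugate case. The sketch below assumes a normal prior and a normal observation error, and all the numbers are hypothetical; the actual MARK-AGE models may use a different likelihood.

```python
from statistics import NormalDist

def posterior(prior_mean, prior_sd, obs, obs_sd):
    """Normal-normal conjugate update for a single noisy observation."""
    w_prior, w_obs = 1 / prior_sd ** 2, 1 / obs_sd ** 2
    var = 1 / (w_prior + w_obs)
    mean = var * (w_prior * prior_mean + w_obs * obs)
    return NormalDist(mean, var ** 0.5)

# Hypothetical numbers: prior centered on the chronological age (60 years,
# sd 10) and a model prediction of 52 years with a standard error of 5.
post = posterior(prior_mean=60.0, prior_sd=10.0, obs=52.0, obs_sd=5.0)

low, high = post.inv_cdf(0.025), post.inv_cdf(0.975)  # 95% credible interval
p_younger = post.cdf(60.0)  # probability that the biological age is below 60
```

A decision maker can then work with quantities such as `p_younger` or the width of the credible interval, instead of treating the posterior mean as if it were certain.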

To obtain a prediction as unbiased and informed as possible, the analytic strategy must also allow previous biological knowledge to be included as explicit information: as a prior in the parameter estimation, as a plausible expected distribution for the missing values, and in a biologically informed model selection.

The analysis strategy that we propose should be able to cope with all these challenges, and outperform simpler, more naive approaches in terms of predictive accuracy and robustness of the results. Dividing the model into several sub-models will improve the robustness of the prediction to missing data, and will also allow the model to be used in situations where only partial data are available.

The division into submodels is also important in clinical practice, where the cost-benefit ratio of performing a test can be crucial in model selection. Our goal should be not only to develop a precise model, but one that can and will be used in clinical practice.

Acknowledgements

We wish to thank the European Commission for financial support through the FP7 large-scale integrating project “European Study to Establish Biomarkers of Human Ageing” (MARK-AGE; grant agreement no.: 200880).

References

Boccaletti, S., Bianconi, G., Criado, R., Del Genio, C.I., Gómez-Gardeñes, J., Romance, M., Sendiña-Nadal, I., Wang, Z., Zanin, M., 2014. The structure and dynamics of multilayer networks. Phys. Rep. 544 (1), 1–122.
