Werkzeuge der Informatik
XML - Extensible Markup Language
Prof. Dr. Sven Hartmann
TU Clausthal, Institut für Informatik
Lehrstuhl für Datenbanken und Informationssysteme
XML Data
Werkzeuge der Informatik: XML, WS 2009/10
- A semi-structured data tuple (and a possible visualisation through a web browser):
    <Employee>
      <Name>Jerry</Name>
      <Position>Mouse</Position>
      <Email>jerry@turiteaConsulting.co.nz</Email>
      <Phones>
        <Phone>3501111</Phone>
        <Phone>3541112</Phone>
        <Phone>2113333</Phone>
      </Phones>
      <Qualification>Master of Arts</Qualification>
      <Skills>
        <Skill>Hiding</Skill>
        <Skill>Running</Skill>
        <Skill>Teasing</Skill>
      </Skills>
      <Photo>figures/jerry.jpg</Photo>
    </Employee>
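To see how a program might read such a tuple, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The choice of Python and the variable names are assumptions for illustration only; the slides do not prescribe any tool.

```python
import xml.etree.ElementTree as ET

# The Employee tuple from the slide, held in a string.
EMPLOYEE_XML = """
<Employee>
  <Name>Jerry</Name>
  <Position>Mouse</Position>
  <Email>jerry@turiteaConsulting.co.nz</Email>
  <Phones>
    <Phone>3501111</Phone>
    <Phone>3541112</Phone>
    <Phone>2113333</Phone>
  </Phones>
  <Qualification>Master of Arts</Qualification>
  <Skills>
    <Skill>Hiding</Skill>
    <Skill>Running</Skill>
    <Skill>Teasing</Skill>
  </Skills>
  <Photo>figures/jerry.jpg</Photo>
</Employee>
"""

employee = ET.fromstring(EMPLOYEE_XML)

# The tags tell us what each piece of text means.
name = employee.findtext("Name")
phones = [phone.text for phone in employee.iter("Phone")]
skills = [skill.text for skill in employee.iter("Skill")]
print(name, phones, skills)
```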
XML Elements
- XML stands for Extensible Markup Language; describing data with XML is sometimes called XML-ification
- We have chosen markup tags to specify the logical structure of the data
  - the staff details of an employee consist of a name, a position, etc.
  - hence we have chosen the corresponding tags to mark up the respective data items
- The essential information is the text between the tags, while the tags represent meta-information that helps to understand the text
- Any piece of XML code is called an XML fragment
  - however, there are certain rules for forming XML code
- Markup tags usually come in pairs and mark up XML elements, such as
    <Skill>Hiding</Skill>
  - herein, <Skill> is the start tag, and </Skill> the end tag
  - the text in between is the content of the XML element
XML Elements
- The content of an XML element might be
  - pure text
  - a mixture of pure text and markup
  - further XML elements
  - nothing
- XML elements may be nested into one another, such as
    <Phones>
      <Phone>3501111</Phone>
      <Phone>3541112</Phone>
      <Phone>2113333</Phone>
    </Phones>
- An XML element without content is called an empty XML element
  - in this case, we use only a single markup tag, such as
    <Retired/>
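The four kinds of content can be told apart programmatically. The following is a rough sketch in Python (standard library only); the function name `content_kind` and the `Note` element are made up for this illustration, and whitespace-only text is treated as no text.

```python
import xml.etree.ElementTree as ET

def content_kind(element):
    """Classify the content of an element as text, mixed, elements or empty."""
    has_children = len(element) > 0
    # Text may appear before the first child (element.text)
    # or after any child (child.tail).
    has_text = bool((element.text or "").strip()) or any(
        (child.tail or "").strip() for child in element
    )
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "elements"
    if has_text:
        return "text"
    return "empty"

print(content_kind(ET.fromstring("<Skill>Hiding</Skill>")))
print(content_kind(ET.fromstring("<Retired/>")))
```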
Attributes of XML Elements
- An XML element may have attributes to capture further properties
  - they are stored as attribute-value pairs in the start tag
    <Employee>
      <Name>Jerry</Name>
      <Position Since="2000">Mouse</Position>
      <Email>jerry@turiteaConsulting.co.nz</Email>
      <Phones>
        <Phone Kind="work">3501111</Phone>
        <Phone Kind="work">3541112</Phone>
        <Phone Kind="mobile">2113333</Phone>
      </Phones>
      <Qualification>Master of Arts</Qualification>
      <Skills>
        <Skill>Hiding</Skill>
        <Skill>Running</Skill>
        <Skill>Teasing</Skill>
      </Skills>
      <Photo>figures/jerry.jpg</Photo>
    </Employee>
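Attribute values can be read separately from the element content. As a small sketch (Python standard library, names chosen for illustration), the phone numbers from the slide are grouped by their Kind attribute:

```python
import xml.etree.ElementTree as ET

phones = ET.fromstring("""<Phones>
  <Phone Kind="work">3501111</Phone>
  <Phone Kind="work">3541112</Phone>
  <Phone Kind="mobile">2113333</Phone>
</Phones>""")

# Group the element content (the number) by the attribute value (the kind).
by_kind = {}
for phone in phones.findall("Phone"):
    by_kind.setdefault(phone.get("Kind"), []).append(phone.text)

print(by_kind)
```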
XML Documents
- XML code is stored in XML documents
- An XML document consists of three parts:
  - its XML declaration
  - its processing instructions
  - its root element
- An XML document must have a root element, while the XML declaration and the processing instructions are optional
- Usually, the XML declaration looks as follows:
    <?xml version="1.0" encoding="UTF-8"?>
  - it indicates the version of XML being used, here 1.0
  - and it states in which encoding the document is written
- The processing instructions could be declarations of style sheets, etc.
- For the root element, just choose a name and form it like any other XML element:
    <DB> ... </DB>
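As a small illustration, the sketch below parses a document consisting of an XML declaration and the root element DB. It uses Python's standard `xml.etree.ElementTree` module; the nested Employee element is only a placeholder for real content.

```python
import xml.etree.ElementTree as ET

# A complete document: optional XML declaration, then the root element.
# It is passed as bytes so that the parser applies the declared encoding.
document = b'<?xml version="1.0" encoding="UTF-8"?>\n<DB><Employee/></DB>'

root = ET.fromstring(document)
root_tag = root.tag                         # name of the root element
child_tags = [child.tag for child in root]  # its direct children
print(root_tag, child_tags)
```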
XML Documents
- An XML document must be well-formed, that is,
  - there is exactly one root element
  - start and end tags must match
  - start and end tags must nest properly
- The following XML fragments are not well-formed:
    <apple></pear>
    <apple><pear></apple></pear>
- XML is case-sensitive (this is different from HTML)
- The following XML fragment is not well-formed:
    <Apple></apple>
- In future, whenever we talk about an XML document, we mean a well-formed one
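Any XML parser rejects fragments that are not well-formed, so a parser can serve as a quick check. A minimal sketch with Python's standard library follows; the helper name `is_well_formed` is an assumption for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if the parser accepts the fragment as a single element tree."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True

# The fragments from the slide, plus one properly nested variant.
for fragment in ["<apple></pear>",
                 "<apple><pear></apple></pear>",
                 "<Apple></apple>",
                 "<apple><pear></pear></apple>"]:
    print(fragment, is_well_formed(fragment))
```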
XML Repositories
    <?xml version="1.0" encoding="UTF-8"?>
    <Employee>
      <Name>Jerry</Name>
      <Position Since="2000">Mouse</Position>
      <Email>jerry@turiteaConsulting.co.nz</Email>
      <Phones>
        <Phone Kind="work">3501111</Phone>
        <Phone Kind="work">3541112</Phone>
        <Phone Kind="mobile">2113333</Phone>
      </Phones>
      <Qualification>Master of Arts</Qualification>
      <Skills>
        <Skill>Hiding</Skill>
        <Skill>Running</Skill>
        <Skill>Teasing</Skill>
      </Skills>
      <Photo>figures/jerry.jpg</Photo>
    </Employee>
- Store the XML element Employee in an XML document (jerry.xml)
- Similarly, create an XML document for each staff member
XML Repositories
    <?xml version="1.0" encoding="UTF-8"?>
    <Employee>
      <Name>Tom</Name>
      <Position Since="2000">Cat</Position>
      <Email>tom@turiteaConsulting.co.nz</Email>
      <Phones>
        <Phone Kind="work">3502222</Phone>
        <Phone Kind="home">3542222</Phone>
      </Phones>
      <Skills>
        <Skill>Constructing mouse traps</Skill>
        <Skill>Eating</Skill>
      </Skills>
      <Photo>figures/tom.gif</Photo>
    </Employee>
- An XML repository is a collection of XML documents (that are somehow related)
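A repository can be pictured as a mapping from document names to documents. The sketch below keeps two shortened documents in memory instead of on disk; jerry.xml is the name used on the slides, while tom.xml and the dictionary layout are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory repository: document name -> document text.
repository = {
    "jerry.xml": '<Employee><Name>Jerry</Name>'
                 '<Position Since="2000">Mouse</Position></Employee>',
    "tom.xml": '<Employee><Name>Tom</Name>'
               '<Position Since="2000">Cat</Position></Employee>',
}

# Because all documents share a similar structure, one query works for all.
names = sorted(ET.fromstring(text).findtext("Name")
               for text in repository.values())
print(names)
```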
Describing Data Types
- We observe:
  - there are lots of employees having different staff details, but in all cases the structure of their staff details looks similar
  - classification abstraction means to describe the common structure
  - we aim to describe the common data type (as far as possible)
  - then, this data type can serve as a schema for the XML data tuples, which will be instances of the data type
- After analysing the structure of the Employee elements, we declare:
    <!ELEMENT Employee (Name, Position, Email, Phones, Qualification, Skills, Photo)>
  - this may serve as a common data model for all staff
- We observe:
  - this is a complex data type, so we also need to declare data types for Name, Position, etc.
  - Qualification is only optional, so we need to indicate this
XML Element Declarations
- An element declaration has the general form:
    <!ELEMENT elementname contentmodel>
- The element name is the name inside the start and end tag
  - it must be a valid XML name, that is,
    - start with an alphabetical character or an underscore
    - but not with the string "xml"
    - it may contain any alphanumerical character or _ or - or .
    - but no blanks, no reserved symbols such as < or > or & or "
- The content model specifies what may occur between the start and end tag:
  - pure text
  - anything (any mixture of pure text and markup)
  - further XML elements
  - nothing
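The naming rules above can be approximated with a regular expression. This is a simplified sketch restricted to ASCII characters and to the rules listed on the slide; the full XML specification permits further characters (for example many non-ASCII letters). The helper name `is_valid_name` is made up for this example.

```python
import re

# Start: letter or underscore; afterwards letters, digits, "_", "-" or ".".
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

def is_valid_name(name):
    """Check a name against the simplified rules from the slide."""
    # Names beginning with "xml" (in any mix of upper and lower case) are reserved.
    if name.lower().startswith("xml"):
        return False
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["Employee", "_id", "first-name", "2ndPhone", "xmlData", "Phone Number"]:
    print(candidate, is_valid_name(candidate))
```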
XML Element Declaration
- We use <!ELEMENT elementname (#PCDATA)> if the content is pure text
  - #PCDATA stands for parsed, or better, parsable character data
- We use <!ELEMENT elementname ANY> if the content may be anything
  - this is very convenient, but not very informative...
- We use <!ELEMENT elementname EMPTY> if there is no content
  - but wait, till we can add attributes...
- We use <!ELEMENT elementname childelements> if the content are further XML elements
  - these elements are referred to as child elements or children
  - as an example, we recall our data type for the staff details:
    <!ELEMENT Employee (Name, Position, Email, Phones, Qualification, Skills, Photo)>
Declaring Child Elements
- Recall that we need to indicate that Qualification is an optional child
- We use regular expressions to describe the permitted combinations of child elements
    <!ELEMENT elementname regexpression>
- Regular expressions can be built as follows:
  - start with #PCDATA, EMPTY or any valid XML names
  - form sequences
  - form alternations
  - indicate optionality
  - indicate iteration
  - indicate non-empty iteration
  - add parentheses
- In practice, the regular expressions used for XML elements are often rather simple
Declaring Child Elements
- Here are some easy-to-follow rules of thumb:
- To describe a sequence of elements of types child_1, ..., child_n, use
    <!ELEMENT element-name (child_1, ..., child_n)>
- To describe the alternative of elements of types child_1, ..., child_n, use
    <!ELEMENT element-name (child_1 | ... | child_n)>
- To indicate an option, attach a ? to one or more child elements
  - such an element may or may not appear
- To indicate an iteration, attach a * to one or more child elements
  - such an element may occur a finite number of times (or not at all)
- To indicate a non-empty iteration, attach a + to one or more child elements
  - such an element may occur a non-zero, finite number of times
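Because content models are regular expressions over element names, they can be checked with an ordinary regular-expression engine. The sketch below is one possible translation into Python's `re` syntax, not how an actual XML validator works internally: it handles only content models built from element names (no #PCDATA, ANY or EMPTY), and the function name `matches_content_model` is an assumption.

```python
import re

NAME = r"[A-Za-z_][A-Za-z0-9_.-]*"

def matches_content_model(model, child_names):
    """Check a list of child element names against an element-only content model."""
    compact = re.sub(r"\s+", "", model)
    # Turn every element name into a group that matches "name;".
    pattern = re.sub(NAME, lambda m: "(?:" + re.escape(m.group(0)) + ";)", compact)
    # A comma means plain sequence, which is simple juxtaposition in a regex.
    # The symbols | ? * + ( ) keep their meaning.
    pattern = pattern.replace(",", "")
    children = "".join(name + ";" for name in child_names)
    return re.fullmatch(pattern, children) is not None

model = "(Name, Position, Email, Phones, Qualification?, Skills, Photo)"
with_qualification = ["Name", "Position", "Email", "Phones",
                      "Qualification", "Skills", "Photo"]
without_qualification = ["Name", "Position", "Email", "Phones", "Skills", "Photo"]
print(matches_content_model(model, with_qualification))
print(matches_content_model(model, without_qualification))
```

The terminating ";" after each name keeps names that share a prefix, such as Phone and Phones, from being confused with each other.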
Our Example
- We indicate that Qualification is only optional:
    <!ELEMENT Employee (Name, Position, Email, Phones, Qualification?, Skills, Photo)>
- We declare data types for the child elements Name, Position, etc.
    <!ELEMENT Name (#PCDATA)>
    <!ELEMENT Position (#PCDATA)>
    <!ELEMENT Email (#PCDATA)>
    <!ELEMENT Phones (Phone*)>
    <!ELEMENT Qualification (#PCDATA)>
    <!ELEMENT Skills (Skill*)>
    <!ELEMENT Photo (#PCDATA)>
- We declare data types for the grandchild elements Phone and Skill
    <!ELEMENT Phone (#PCDATA)>
    <!ELEMENT Skill (#PCDATA)>
Our Example
- We check the suitability of the data type:
    <Employee>
      <Name>Tom</Name>
      <Position Since="2000">Cat</Position>
      <Email>tom@turiteaConsulting.co.nz</Email>
      <Phones>
        <Phone Kind="work">3502222</Phone>
        <Phone Kind="home">3542222</Phone>
      </Phones>
      <Skills>
        <Skill>Constructing mouse traps</Skill>
        <Skill>Eating</Skill>
      </Skills>
      <Photo>figures/tom.gif</Photo>
    </Employee>

    <!ELEMENT Employee (Name, Position, Email, Phones, Qualification?, Skills, Photo)>
    <!ELEMENT Name (#PCDATA)>
    <!ELEMENT Position (#PCDATA)>
    <!ELEMENT Email (#PCDATA)>
    <!ELEMENT Phones (Phone*)>
    <!ELEMENT Phone (#PCDATA)>
    <!ELEMENT Qualification (#PCDATA)>
    <!ELEMENT Skills (Skill*)>
    <!ELEMENT Skill (#PCDATA)>
    <!ELEMENT Photo (#PCDATA)>
Attribute Declaration
- XML elements can have attributes to capture particular properties of these elements, such as
    <!ATTLIST Position Since CDATA #REQUIRED>
- An attribute declaration has the general form:
    <!ATTLIST elementname attributespecifications>
  - the element name specifies the element whose attributes we want to declare
  - the list of attribute specifications contains exactly one for each attribute; each attribute specification has the form
      attributename attributetype attributeconstraint
  - the attribute name is the name chosen for this attribute
  - the attribute name must be a valid XML name (as explained above)
  - naturally, any two attributes of the same element should have distinct names
Attribute Declaration
- There are three kinds of attribute values: strings, enumerated, and tokens
- Strings: the attribute's value is a character string
  - we use the simple data type CDATA
  - blanks are allowed
  - any text is allowed except for reserved symbols
- Enumerated: the attribute's value must be chosen from a user-specified list
    <!ELEMENT Car EMPTY>
    <!ATTLIST Car Make   CDATA    #REQUIRED
                  Colour CDATA    #REQUIRED
                  New    (yes|no) #REQUIRED>
- Tokens: the attribute's value is a special-purpose character string
  - NMTOKEN can be used for a valid XML name
  - ENTITY can be used for a reference to an external file
  - ID, IDREF and IDREFS are explained later on
Attribute Declaration
- The attribute constraint is one of
  - #REQUIRED if the attribute must occur in every element
  - #IMPLIED if the attribute is optional
  - a default value for the attribute
  - #FIXED value, if the attribute always has the given value
  - (#CURRENT, under which the attribute takes the value most recently assigned to it, is an SGML keyword and is not available in XML DTDs)
- For our example
  - we can simply choose:
      <!ATTLIST Position Since CDATA #REQUIRED>
      <!ATTLIST Phone Kind CDATA #IMPLIED>
  - thus, Since is a compulsory attribute, and Kind is an optional attribute
  - alternatively we could also choose:
      <!ATTLIST Phone Kind (work|home|mobile) #IMPLIED>
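To make the effect of #REQUIRED, #IMPLIED and an enumerated type concrete, here is a hand-written check in Python. It is only a sketch: the dictionary `ATTLISTS` is a hypothetical in-memory stand-in for the two ATTLIST declarations (using the enumerated variant for Kind), and a real validating parser would read them from the DTD instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical representation of the declarations:
# element name -> {attribute name: (type, constraint)}
# The type is "CDATA" or a tuple of permitted values (enumerated).
ATTLISTS = {
    "Position": {"Since": ("CDATA", "#REQUIRED")},
    "Phone": {"Kind": (("work", "home", "mobile"), "#IMPLIED")},
}

def attribute_errors(element):
    """Return a list of violations of the attribute declarations for one element."""
    errors = []
    declared = ATTLISTS.get(element.tag, {})
    for name in element.attrib:
        if name not in declared:
            errors.append(f"{element.tag}: attribute {name} is not declared")
    for name, (attr_type, constraint) in declared.items():
        value = element.get(name)
        if value is None:
            # Only #REQUIRED attributes must be present.
            if constraint == "#REQUIRED":
                errors.append(f"{element.tag}: attribute {name} is required")
        elif attr_type != "CDATA" and value not in attr_type:
            errors.append(f"{element.tag}: {name}={value!r} is not a permitted value")
    return errors

print(attribute_errors(ET.fromstring('<Position Since="2000">Cat</Position>')))
print(attribute_errors(ET.fromstring('<Position>Cat</Position>')))
```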
Our Example
- We check the suitability of the data type again:
    <Employee>
      <Name>Tom</Name>
      <Position Since="2000">Cat</Position>
      <Email>tom@turiteaConsulting.co.nz</Email>
      <Phones>
        <Phone Kind="work">3502222</Phone>
        <Phone Kind="home">3542222</Phone>
      </Phones>
      <Skills>
        <Skill>Constructing mouse traps</Skill>
        <Skill>Eating</Skill>
      </Skills>
      <Photo>figures/tom.gif</Photo>
    </Employee>

    <!ELEMENT Employee (Name, Position, Email, Phones, Qualification?, Skills, Photo)>
    <!ELEMENT Name (#PCDATA)>
    <!ELEMENT Position (#PCDATA)>
    <!ATTLIST Position Since CDATA #REQUIRED>
    <!ELEMENT Email (#PCDATA)>
    <!ELEMENT Phones (Phone*)>
    <!ELEMENT Phone (#PCDATA)>
    <!ATTLIST Phone Kind CDATA #IMPLIED>
    <!ELEMENT Qualification (#PCDATA)>
    <!ELEMENT Skills (Skill*)>
    <!ELEMENT Skill (#PCDATA)>
    <!ELEMENT Photo (#PCDATA)>