

3. Results and Discussion

3.2. Molecular dynamic simulations of the CTLD of perlucin and MBP-A

3.2.2. Secondary structure of the CTLD of perlucin and MBP-A

The first characteristic extracted from the simulated trajectories was the average secondary structure of each residue (see Nelson & Cox [2013] or Richardson [2007] for general information on secondary structure). The AMBER “ptraj” software module uses the DSSP algorithm (Kabsch & Sander [1983]) to assign secondary structure elements to the residues of a trajectory. “ptraj” discriminates between the following elements: parallel strand, anti-parallel strand, α-helix, 3/10-helix, π-helix and turn. Each of these structural elements is classified according to its hydrogen bond pattern (see Kabsch & Sander [1983] for details). The relevant hydrogen bonds are those formed between the hydrogen bound to the backbone nitrogen of residue j and the backbone oxygen of another residue i.

The helical structures differ in the number of residues between the two residues that are involved in the hydrogen bond. In the familiar α-helix, the backbones of residues i and i + 4 are connected via one hydrogen bond. The 3/10-helix is “tighter” since residues i and i + 3 are connected. In contrast, the π-helix is “looser” than the α-helix since residues i and i + 5 are connected. At least two consecutive hydrogen bonds must be formed to define a helix. If only one hydrogen bond is formed, the pattern is classified as a turn.
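The helix and turn rules described above can be illustrated with a short Python sketch. It is a simplified illustration and not the DSSP or “ptraj” implementation: the function name, the 0-based residue indices and the representation of hydrogen bonds as index pairs are assumptions made for this example, and the energy criterion that DSSP uses to decide whether a hydrogen bond exists is not reproduced.

```python
from typing import Dict, Set, Tuple

# Offsets n of the i -> i+n hydrogen bond for the helix types named in the text.
HELIX_OFFSETS: Dict[str, int] = {"3/10": 3, "alpha": 4, "pi": 5}


def classify_helix_pattern(hbonds: Set[Tuple[int, int]],
                           n_residues: int,
                           offset: int) -> str:
    """Return one label per residue: 'H' (helix), 'T' (turn) or '-' (neither).

    hbonds holds pairs (i, j) meaning that the backbone oxygen of residue i
    is hydrogen-bonded to the backbone N-H of residue j (assumed input format).
    Only pairs with j == i + offset are evaluated.
    """
    # bond_at[i] is True if residues i and i+offset are connected.
    bond_at = [(i, i + offset) in hbonds for i in range(n_residues)]
    labels = ["-"] * n_residues
    for i in range(n_residues):
        if not bond_at[i]:
            continue
        # A single hydrogen bond defines a turn over the enclosed residues.
        for k in range(i + 1, min(i + offset, n_residues)):
            if labels[k] == "-":
                labels[k] = "T"
        # Two consecutive hydrogen bonds (starting at i-1 and i) define a helix.
        if i > 0 and bond_at[i - 1]:
            for k in range(i, min(i + offset, n_residues)):
                labels[k] = "H"
    return "".join(labels)


# Two consecutive i -> i+4 bonds give a minimal alpha-helix, one bond a turn.
helix_example = classify_helix_pattern({(2, 6), (3, 7)}, 10, HELIX_OFFSETS["alpha"])
turn_example = classify_helix_pattern({(2, 6)}, 10, HELIX_OFFSETS["alpha"])
```

With two consecutive bonds the residues between them are labelled as helix, whereas an isolated bond only yields turn labels, which mirrors the distinction made in the text.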

Parallel and anti-parallel strands differ in the orientation of the residue segments that are connected by hydrogen bonds. In the first case the C- and N-terminal ends of both segments point in the same direction, whereas in the latter case they point in opposite directions. Note that according to the definitions that underlie DSSP a β-strand is composed of successive residues that are in a “β-bridge” conformation. In the following both terms are used interchangeably except when stated otherwise. A β-bridge is characterised by two hydrogen bonds formed between two non-overlapping sequences of three residues. For completeness, the STRIDE algorithm (Frishman & Argos [1995]) additionally uses backbone dihedral angle information for secondary structure classification.

To condense the secondary structure information obtained from a trajectory as much as possible, the following steps were performed. “ptraj” computes, for each residue, the percentage of frames of the whole trajectory in which the residue is classified as one of the above-mentioned secondary structure elements. Here the analysed trajectories comprised 5010 frames each, beginning with the input structure, extending over the restart structures from the restrained heating phase and including every frame from the unconstrained simulation. The influence of the ten restart structures from the first 220 ps on the subsequent 5000 frames from the unconstrained simulation was considered to be negligible. To condense the information further, only the total strand (sum of the percentages of parallel and anti-parallel conformation; strictly, the sum of parallel and anti-parallel β-bridges), total helical (sum of all helical conformations) and turn conformation per residue were considered. The time dependency of the secondary structure conformation per residue was not considered further. Since several MD simulations were performed with the same initial structure (see Table 3.2.2.), the results of all MD simulations with the same initial structure were averaged.
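The condensation steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the “ptraj” code: the one-letter codes per frame and residue, the input format (one string per frame) and all function names are hypothetical and chosen only for this example.

```python
from typing import Dict, List

# Hypothetical one-letter codes per residue and frame (assumed for this sketch).
STRAND_CODES = {"b", "B"}      # parallel and anti-parallel strand
HELIX_CODES = {"G", "H", "I"}  # 3/10-helix, alpha-helix, pi-helix
TURN_CODES = {"T"}
CLASSES = ("strand", "helix", "turn")


def class_percentages(frames: List[str]) -> List[Dict[str, float]]:
    """Percentage of frames each residue spends in strand, helix or turn.

    frames holds one string per frame with one character per residue.
    """
    n_frames = len(frames)
    n_residues = len(frames[0])
    result = []
    for r in range(n_residues):
        counts = {name: 0 for name in CLASSES}
        for frame in frames:
            code = frame[r]
            if code in STRAND_CODES:
                counts["strand"] += 1
            elif code in HELIX_CODES:
                counts["helix"] += 1
            elif code in TURN_CODES:
                counts["turn"] += 1
        result.append({name: 100.0 * counts[name] / n_frames for name in CLASSES})
    return result


def average_runs(runs: List[List[Dict[str, float]]]) -> List[Dict[str, float]]:
    """Average the per-residue percentages over runs with the same initial structure."""
    n_runs = len(runs)
    n_residues = len(runs[0])
    return [
        {name: sum(run[r][name] for run in runs) / n_runs for name in CLASSES}
        for r in range(n_residues)
    ]


# Toy data: two "runs" of four frames each for a two-residue "protein".
run_a = class_percentages(["HB", "GB", "Tb", "-T"])
run_b = class_percentages(["HB", "HB", "HB", "HB"])
averaged = average_runs([run_a, run_b])
```

Summing the helix types and the strand types before averaging discards the distinction between, for example, α- and 3/10-helix, which is exactly the loss of detail accepted in the text in exchange for a compact per-residue summary.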

The following two figures, one for perlucin with four calcium ions (run09) and one for MBP-A with three calcium ions (run07), show for every residue the average percentage of frames in which the residue is in a strand, helical or turn conformation. The figures for the remaining MD simulations can be found in Appendix III.R.3. and are omitted here to maintain readability.

 

 

Fig. 3.2.3. Average secondary structure conformation from six 10.2 ns simulations of perlucin with four calcium ions (run09). For every residue the percentage of frames in which the residue adopts one of the following conformations is given. The “general helical” (violet) conformation is the sum of the α-helix, 3/10-helix and π-helix conformations and the “general strand” (yellow) is the sum of parallel and anti-parallel β-strands (strictly it is the sum of parallel and anti-parallel β-bridges). The third conformation is the “turn” (cyan) conformation. Note that due to the graphical representation with “columns” or “bars” the residue number marker on the bottom axis is positioned on the left side of the corresponding column/bar. For better orientation the (presumed) identifiers of the characteristic SSEs of CTLDs according to Zelensky et al. (Zelensky & Gready [2003], Fig. 2a therein) are given at the top of the graph.

 

 

Fig. 3.2.4. Average secondary structure conformation from three 10.2 ns simulations of the CTLD of MBP-A (PDB code 1KWV, chain A, residues 104-221) with three calcium ions (run07). For every residue the percentage of frames in which the residue adopts one of the following conformations is given. The “general helical” (violet) conformation is the sum of the α-helix, 3/10-helix and π-helix conformations and the “general strand” (yellow) is the sum of parallel and anti-parallel β-strands (strictly it is the sum of parallel and anti-parallel β-bridges). The third conformation is the “turn” (cyan) conformation. Note that due to the graphical representation with “columns” or “bars” the residue number marker on the bottom axis is positioned on the left side of the corresponding column/bar. The crosses in either violet or yellow positioned at the 100% value of some residues indicate the secondary structure obtained for the crystal structure 1KWV (chain A) from the PDB web site. Here again all helix types are subsumed in the violet crosses, and all β-strand and β-bridge content is subsumed in the yellow crosses. Note that the crosses are attached on the left-hand side of the corresponding column. For better orientation the (presumed) identifiers of the characteristic SSEs of CTLDs according to Zelensky et al. (Zelensky & Gready [2003], Fig. 2a therein) are given at the top of the graph.

 

As can be seen in Figures 3.2.3. and 3.2.4., the secondary structure elements that are expected for CTLDs in the long form (perlucin with β0-strand) and short form (MBP-A without β0-strand) can be identified. During the simulations of the CTLD of MBP-A, deviations from the secondary structure conformations obtained from the crystal structure can be observed. This can be seen in Fig. 3.2.4. by comparing the crosses reflecting the secondary structure of the crystal structure with the height of the bars/columns representing data from the simulation. First of all it has to be stated that the total number of simulated protein models/structures (twelve for perlucin and nine for MBP-A in total) is low compared to the number of molecules present at typical concentrations in typical laboratory experiments. Additionally, initial models/structures might have partially non-native conformations due to modelling/crystallization. Since only one simulation parameter set was used in this thesis, its influence on the simulated proteins could not be inferred from the data. Therefore it cannot be expected that the distribution of the secondary structure conformations simulated here reflects the situation in a protein crystal used for experimental structure determination.

It was desirable to assign one unique secondary structure to each residue of the simulated structures/models. Since the time dependency of the secondary structure was not evaluated in this thesis, an arbitrary threshold was chosen to assign a “general helical” (α-helix, 3/10-helix, π-helix) or “general strand” (parallel and anti-parallel β-strands including β-bridges) conformation to the residues of the simulated proteins. Referring to the averaged results of an MD simulation series, e.g. the results presented in Fig. 3.2.3. and 3.2.4., a conformation was assigned to a residue if the residue was, on average, in this particular conformation in at least 75% of the frames of the analysed trajectories. The result of this assignment is shown in Figure 3.2.5. For every MD simulation series every residue of the simulated protein was assigned an “h” (general helical) or “e” (general strand) if appropriate. This can be compared to the expected secondary structure. For perlucin this secondary structure could only be inferred from the alignment with templates that was used during the modelling process (see section 3.1. and Fig. 3.1.4.). In the case of the CTLD of MBP-A the secondary structure was obtained from the PDB web page for the structures 1KWT and 1KWV. The PDB provides sequences annotated according to the DSSP algorithm. Note that 1KWT and 1KWV have identical sequences as well as identical secondary structures.
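The threshold rule described above can be written down in a few lines of Python. This is a sketch under stated assumptions, not the procedure actually used to produce the figure: the function name and the input format (one dictionary of averaged percentages per residue) are hypothetical.

```python
from typing import Dict, List


def assign_labels(averaged: List[Dict[str, float]],
                  threshold: float = 75.0,
                  blank: str = " ") -> str:
    """Assign 'h' (general helical) or 'e' (general strand) per residue.

    averaged holds, per residue, the average percentage of frames in the
    "helix" and "strand" classes (assumed input format). A label is given
    only if the percentage reaches the threshold; otherwise the residue
    stays blank. Because the percentages of one residue cannot add up to
    more than 100, at most one class can reach a threshold above 50.
    """
    labels = []
    for residue in averaged:
        if residue["helix"] >= threshold:
            labels.append("h")
        elif residue["strand"] >= threshold:
            labels.append("e")
        else:
            labels.append(blank)
    return "".join(labels)


# Toy data: the third residue (74.2% strand) narrowly misses the threshold.
example = assign_labels([
    {"helix": 80.0, "strand": 0.0},
    {"helix": 0.0, "strand": 75.0},
    {"helix": 0.0, "strand": 74.2},
    {"helix": 40.0, "strand": 40.0},
])
```

A hard threshold makes residues just below 75% disappear from the summary, which is why borderline cases may deserve a separate marker.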

 

A) perlucin
number        | 1 10 20 30 40 50 60 70 80 90 100 110 120 130
PERLUCIN      | GCPLGFHQNRRSCYWFSTIKSSFAEAAGYCRYLESHLAIISNKDEDSFIRGYATRLGEAFNYWLGASDLNIEGRWLWEGQRRMNYTNWSPGQPDNAGGIEHCLELRRDLGNYLWNDYQCQKPSHFICEKER
w/ 4 calcium  | eee eeeee e hhhhhhhhhh ee hhhhhhhhh eee ee ee e eee eee e eeee
w/ 2 calcium  | ee eeeee e hhhhhhhhhh ee hhhhhhh ee ee ee (e) eee eee eeeeee
w/o calcium   | ee eeeee e hhhhhhhhhh ee hhhhhhh e ee ee eeee eeee e eeeee
DSSP expct.   | EEE EEEEE B HHHHHHHHHH EE HHHHHHHHHh h EEEEEE EE b B ggg eeEEEE ggg EEEE ‡EEEEEEE
SSE-Id        | b0 b1 a1 b1' a2 b2 b2'' b3 b4 b5
PERLUCIN      | GCPLGFHQNRRSCYWFSTIKSSFAEAAGYCRYLESHLAIISNKDEDSFIRGYATRLGEAFNYWLGASDLNIEGRWLWEGQRRMNYTNWSPGQPDNAGGIEHCLELRRDLGNYLWNDYQCQKPSHFICEKER

B) CTLD of MBP-A (1KWT, 1KWV)
number        | 1 10 20 30 40 50 60 70 80 90 100 110 118
1KWV/T chn. A | GKKSGKKFFVTNHERMPFSKVKALCSELRGTVAIPRNAEENKAIQEVAKTSAFLGITDEVTEGQFMYVTGGRLTYSNWKKDEPNDHGSGEDCVTIVDNGLWNDISCQASHTAVCEFPA
w/ 3 calcium  | ee eeehhhhhhhhhh ee hhhhhhhhh ee ee ee e eeee hhh eeee ee ee
w/ 1 calcium  | eee eeehhhhhhhhhh ee hhhhhhhhh ee ee ee e eeee hhh eeee ee eee
w/o calcium   | --- ee eeehhhhhhhhhh ee hhhhhhhhhh ee ee ee e eeee eeee ee ee
DSSP (1KWV/T) | EEEEEEEEEEHHHHHHHHHH EE HHHHHHHHHHH EEEEEE EE B B EEEE GGG EEEE EEEEEEEE
SSE-Id        | b1 a1 b1' a2 b2 b2'' b3 b4 b5
1KWV/T chn. A | GKKSGKKFFVTNHERMPFSKVKALCSELRGTVAIPRNAEENKAIQEVAKTSAFLGITDEVTEGQFMYVTGGRLTYSNWKKDEPNDHGSGEDCVTIVDNGLWNDISCQASHTAVCEFPA

Fig. 3.2.5. Summary of the secondary structure conformations of the CTLD of perlucin (A) and MBP-A (B) obtained from the MD simulations. Each part is organised as follows. The first line labelled “number” contains the residue numbering. The first residue of the simulated protein is assigned the number “1”. The next line contains the sequence of perlucin (residues 1 to 131) and MBP-A (residues 104 to 221 in PDB numbering). The following lines contain the secondary structure of every residue as obtained from the simulations. For the two proteins the number of associated calcium ions differed in each simulation series. A general helical (“h”) or general strand (“e”) conformation is assigned only if the residue is, on average, in that conformation in at least 75% of the frames of the analysed trajectories. The line containing “DSSP” in its label holds the expected secondary structure of each residue. In the case of perlucin these conformations were taken from the alignment of the perlucin sequence with templates during the modelling process (cf. section 3.1. and Fig. 3.1.4.). In the case of MBP-A the conformations were taken from the crystal structures 1KWT and 1KWV (both structures have identical secondary structure conformations). The PDB web page offers the sequences of the structures with conformational annotations made by the DSSP algorithm. “E” represents a β-strand, “B” a β-bridge, “H” an α-helix and “G” a 3/10-helix. In the case of perlucin lower-case letters are used to indicate that only one template has the corresponding conformation instead of both. The last line, labelled “SSE-Id”, refers to the SSE notation scheme for CTLDs as described by Zelensky et al. (Zelensky & Gready [2003]). The exceptional character “‡” signals that one template residue is in a β-strand and the other in a β-bridge conformation. “(e)” in A) means that the residue was in strand conformation in 74.2% of the frames on average.

To provide more information, the secondary structure elements were divided into β-strands (“E”), β-bridges (“B”), α-helices (“H”) and 3/10-helices (“G”). As already introduced in Fig. 3.1.4., a lower-case letter for the expected secondary structure of a perlucin residue implies that only one template has this conformation and not both of the templates. The SSEs that are characteristic for the CTLD (e.g. Zelensky & Gready [2003]) are given as well in Fig. 3.2.5.

Two conclusions can be drawn from Fig. 3.2.5. For both perlucin and MBP-A, deviations of the average secondary structure assigned to each residue from the expected secondary structure can be observed. First of all it has to be pointed out that both simulated proteins lack a considerable structural segment: perlucin lacks the C-terminal region, for which no structural information was available, and MBP-A lacks the N-terminal helical region. For the latter protein this might influence at least the stability of the first strand or even other parts of the protein, depending on the native state of MBP-A. It is suggested that this protein can form oligomers (see e.g. Heise et al. [2000], Weis & Drickamer [1994]). In the case of perlucin the native structure is not known; therefore nothing can be said about the influence of the C-terminal region on the overall protein stability.

For MBP-A the secondary structure reference was obtained from a crystal structure. Since a protein crystal is not a native environment for proteins, the observed deviations might reflect the influence of the simulated environment on the overall protein structure. However, the results of the MD simulations with MBP-A as a reference protein set the frame for the best results that can be expected from the simulation protocol that is used in this thesis.

With respect to perlucin, the most obvious explanation for any deviation is that the generated model has some shortcomings and differs from a native solution structure or the energetically most favourable one.

Nonetheless, every SSE characteristic for CTLDs can be identified in every MD simulation series of perlucin and MBP-A, and the number of deviations is of the same order of magnitude (if one naively counts the number of residues in Fig. 3.2.5. that are not in the secondary structure conformation expected for CTLDs and omits the 3/10-helices). Therefore the obtained secondary structure assignment is considered to be reasonable. No obvious influence of the calcium ions on the secondary structure is apparent.
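The naive count mentioned above could be performed along the following lines. The exact counting rules are not spelled out in the text, so this Python sketch rests on assumptions: both rows are taken as equally long, column-aligned strings, the expected codes “E”, “B” and “‡” (in either case) count as strand, “H” as helix, residues with an expected 3/10-helix (“G”) are skipped, and every remaining mismatch counts as one deviation. The function name is hypothetical.

```python
def count_deviations(assigned: str, expected: str) -> int:
    """Count residues whose assigned class differs from the expected class.

    assigned uses 'h', 'e' or any other character for "no assignment";
    expected uses DSSP-style codes in upper or lower case. Residues whose
    expected code is a 3/10-helix ('G' or 'g') are omitted from the count.
    """
    if len(assigned) != len(expected):
        raise ValueError("rows must be aligned and of equal length")

    def expected_class(code: str) -> str:
        upper = code.upper()
        if upper in ("E", "B") or code == "‡":
            return "e"
        if upper == "H":
            return "h"
        return ""  # no secondary structure expected

    def assigned_class(code: str) -> str:
        return code if code in ("h", "e") else ""

    deviations = 0
    for a, x in zip(assigned, expected):
        if x.upper() == "G":
            continue  # 3/10-helices are omitted
        if assigned_class(a) != expected_class(x):
            deviations += 1
    return deviations


# Toy rows: one helix residue is lost, two 3/10-helix positions are skipped.
n_dev = count_deviations("hh ee ", "HHHEgG")
```

Such a count treats every residue equally and ignores whether a deviation shortens an existing element or removes it altogether, so it only supports an order-of-magnitude comparison.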

A final remark concerns the subsuming of the α-helix, 3/10-helix and π-helix conformations into a “general helical” class. The π-helix conformation is not encountered to a relevant extent during the MD simulations. In contrast, the 3/10-helical conformation is observed more frequently. In the Appendix, Figures III.R.10. to III.R.12. show (non-representative) examples from the conducted MD simulations.

Especially in the α2 helix of the CTLD fold, residues adopt a 3/10-helical conformation or switch between the 3/10- and α-helical conformations. This feature might be linked to the overall stability of the α2 helix. In a short review of α- and 3/10-helices in polypeptides, Bolin and Millhauser (Bolin & Millhauser [1999]) conclude, among other things, that the 3/10-helix could be an intermediate state between the unfolded and α-helical conformation of polypeptides.

Therefore future investigations should address whether the instability of the C-terminal end of the α2 helix is the result of a modelling shortcoming or actually a protein feature. Remember that in the perlucin model the loop region between α2 and β2 lacked a template during the modelling process (see also end of section 3.1.3.). As will become clear in section 3.2.5., this region shows high positional fluctuations. It would be interesting to investigate the behaviour of the structure of OC-17 (PDB accession number 1GZ2), which has a 15-residue-long segment between its α2 helix and β2 strand.

 

3.2.3.  Solvent  accessible  surface  area  estimation  of  the  CTLD  of  perlucin  and