Design and Performance of the 6 GS/s Waveform Digitizing Chip DRS4
Stefan Ritt
Paul Scherrer Institute, Switzerland
at 40 mW per channel
Switched Capacitor Array
• Cons
  • No continuous acquisition
  • No precise timing
  • External (commercial) ADC needed
• Pros
  • High speed (6 GHz), high resolution (11.5 bit)
  • High channel density (9 channels on 5x5 mm²)
  • Low power (10-40 mW / channel)
  • Low cost (~ 10$ / channel)
Oct. 21st, 2008 IEEE/NSS Dresden 3
DRS4
• Fabricated in 0.25 µm 1P5M MMC process (UMC), 5 x 5 mm², radiation hard
• 8+1 ch. each 1024 cells
• Differential inputs, differential outputs
• Sampling speed 500 MHz … 6 GHz, PLL stabilized
• Readout speed 30 MHz, multiplexed or in parallel
[Figure: DRS4 functional block diagram. Inputs IN0 … IN8 feed channels 0 … 8; a domino wave circuit with PLL (REFCLK, PLLLCK, PLLOUT, DSPEED) drives the write shift register; stop, read and config shift registers, a MUX and LVDS stage lead to OUT0 … OUT7 and OUT8/MUXOUT]
ROI readout mode
[Figure: readout shift register and trigger stop position, for a normal trigger stop after latency and for a delayed trigger stop; readout at 33 MHz]
e.g. 100 samples @ 33 MHz → 3 µs dead time
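The dead-time figure follows directly from the number of samples read and the readout clock. A minimal sketch of that arithmetic, assuming the dead time is simply the number of samples divided by the readout rate (the function name is illustrative, not part of any DRS software):

```python
def readout_dead_time_us(n_samples, readout_mhz=33.0):
    """Time needed to clock n_samples out of the chip, in microseconds.

    Assumes one sample per readout clock and no additional overhead.
    """
    return n_samples / readout_mhz


roi = readout_dead_time_us(100)    # region-of-interest readout from the slide
full = readout_dead_time_us(1024)  # all cells of one channel, for comparison
print(f"100 samples: {roi:.1f} us, 1024 samples: {full:.1f} us")
```

Reading only the region of interest is what keeps the dead time near 3 µs; a full 1024-cell readout at the same clock would take roughly ten times longer.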
Daisy-chaining of channels
[Figure: channels 0 … 7 (1024 cells each) driven by a common domino wave generator]
Deeper sampling depth can be reached by multiplexing channels
[Figure: daisy-chained configurations of channels 0 … 7 on one domino wave, with per-channel enable inputs]
Single Channel
[Figure: channels 0 … 7 chained into a single channel on one domino wave, with per-channel enable bits]
Connect channels externally (on the PCB or with analog switches) to keep the high bandwidth, which is limited by the bond wires
DRS4 can be partitioned in: 8x1024, 4x2048, 2x4096, 1x8192 cells
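To make the trade-off between channel count and sampling depth concrete, the partitions can be enumerated together with the time window each one records. This is only an illustrative sketch: it assumes the window length is depth divided by sampling speed, and the helper names are hypothetical.

```python
TOTAL_CELLS = 8 * 1024  # eight signal channels of 1024 cells each (from the slide)


def partitions():
    """Return (channels, depth) pairs obtained by chaining 1, 2, 4 or 8 channels."""
    return [(8 // k, 1024 * k) for k in (1, 2, 4, 8)]


def window_ns(depth, fs_ghz):
    """Length of the recorded time window in ns at sampling speed fs_ghz."""
    return depth / fs_ghz


for n_ch, depth in partitions():
    assert n_ch * depth == TOTAL_CELLS
    print(f"{n_ch} x {depth}: {window_ns(depth, 6.0):.0f} ns at 6 GHz")
```

At 6 GHz a single 1024-cell channel covers about 171 ns, while the fully chained 1x8192 configuration covers about 1.4 µs.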
Chip Daisy Chaining
[Figure: three DRS4 chips linked through their SROUT and SRIN pins]
Virtually unlimited sampling depth
Simultaneous Write/Read
[Figure: channels 0 … 7 used as an 8-fold analog multi-event buffer under FPGA control; one channel is written while another (channel 0 readout) is read]
Expected crosstalk ~few mV
Trigger and DAQ on same board
• Using a multiplexer in DRS3, input signals can simultaneously be digitized at 65 MHz and sampled in the DRS
• FPGA can make a local trigger (or a global one) and stop the DRS upon a trigger
• DRS readout (6 GHz samples) through the same 8-channel FADCs
[Figure: analog front end feeding the DRS and, through a MUX, a 12 bit 65 MHz FADC; FPGA with trigger, LVDS, SRAM and a global trigger bus]
“Free” local trigger capability without additional hardware
DRS4
Test Results
On-chip PLL
• Reference clock: fclk = fsamp / 2048
• PLL jitter « 100 ps (Spartan-3 jitter 150 ps)
• “Dead Band” free
[Figure: PLL loop around the DRS4 with phase detector (up/down), loop filter and Vspeed control voltage; simulation compared with measurement]
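The relation fclk = fsamp / 2048 fixes the reference clock needed for a given sampling speed. A small sketch of that conversion over the sampling range quoted for the chip (500 MHz … 6 GHz); the function name is illustrative only:

```python
def reference_clock_mhz(fsamp_ghz, divider=2048):
    """Reference clock in MHz for a sampling speed in GHz, using fclk = fsamp / 2048."""
    return fsamp_ghz * 1000.0 / divider


for fs in (0.5, 2.0, 6.0):
    print(f"fsamp = {fs} GHz -> fclk = {reference_clock_mhz(fs):.3f} MHz")
```

For 6 GHz sampling this gives a reference clock of about 2.93 MHz, and about 0.24 MHz at the 500 MHz lower end.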
Bandwidth
Bandwidth is determined by bond wire and internal bus resistance/capacitance:
850 MHz (QFP), 950 MHz (QFN), ??? (flip-chip)
[Figure: simulated and measured frequency response; 850 MHz (-3 dB) for the QFP package with final bus width]
Timing jitter
[Figure: sampling intervals \Delta t_1 … \Delta t_5 along the inverter chain]
• Inverter chain has transistor variations
  → \Delta t_i between samples differ
  → “Fixed pattern aperture jitter”
• “Differential temporal nonlinearity”: TD_i = \Delta t_i - \Delta t_{nominal}
• “Integral temporal nonlinearity”: TI_i = \sum_{k \le i} \Delta t_k - i \cdot \Delta t_{nominal}
• “Random aperture jitter” = variation of \Delta t_i between measurements
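The two nonlinearity measures can be computed from a list of measured sample intervals. The sketch below assumes the differential value is each interval minus the nominal interval, and the integral value is the running sum of the intervals minus i times the nominal interval (with i counted from 1). The numbers are made up for illustration and are not DRS4 data.

```python
from itertools import accumulate


def temporal_nonlinearity(dt, dt_nominal):
    """Return (TD, TI) lists for measured sample intervals dt.

    TD[i] is the deviation of interval i from nominal; TI[i] is the
    accumulated deviation after i + 1 intervals.
    """
    td = [d - dt_nominal for d in dt]
    ti = [s - (i + 1) * dt_nominal for i, s in enumerate(accumulate(dt))]
    return td, ti


# Hypothetical intervals in ps around a nominal 200 ps spacing
td, ti = temporal_nonlinearity([210, 190, 205, 195], 200)
print(td)  # [10, -10, 5, -5]
print(ti)  # [10, 0, 5, 0]
```

Because the integral value accumulates the differential one, small per-cell deviations can add up to a much larger error along the chain.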
Fixed jitter calibration
• Fixed jitter is constant over time, can be measured and corrected for
• Several methods are commonly used
• Most use sine wave with random phase and correct for TDi on a statistical basis
Fixed Pattern Jitter Results
• TDi typically ~50 ps RMS @ 5 GHz
• TIi goes up to ~600 ps
• Inter-channel variation on same chip is very small since all channels are driven by the same domino wave
[Figure: DRS4 functional block diagram; all channels share one domino wave circuit]
Random Jitter Results
• Sine curve frequency fitted for each measurement (PLL jitter compensation)
• Encouraging result for DRS3: 2.7 ps RMS (best channel), 3.9 ps RMS (worst channel)
• Differential measurement t1 – t2 adds a factor √2, needs to be verified by measurement
• Measurement of n points on a rising edge of a signal improves by √n
Measurements for DRS4 currently going on, expected to be slightly better
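The two scaling statements can be written out numerically. This sketch assumes independent, equally distributed jitter on each sample, which is the usual condition behind a √2 penalty for a difference and a √n gain for an average. It is a back-of-the-envelope illustration, not a measured result.

```python
import math


def differential_jitter(sigma_ps):
    """RMS jitter of t1 - t2 when both times carry independent jitter sigma_ps."""
    return math.sqrt(2.0) * sigma_ps


def averaged_jitter(sigma_ps, n_points):
    """RMS jitter after averaging n_points independent samples on an edge."""
    return sigma_ps / math.sqrt(n_points)


for sigma in (2.7, 3.9):  # best and worst channel quoted for DRS3
    print(f"{sigma} ps single -> {differential_jitter(sigma):.2f} ps differential, "
          f"{averaged_jitter(sigma, 4):.2f} ps with 4 points")
```

Under these assumptions, averaging only two points on an edge would already offset the √2 penalty of a differential measurement.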
Experiments using DRS chip
• MAGIC-II: 1200 channels DRS2
• MEG: 3000 channels DRS2
• BPM for XFEL@PSI: 1000 channels DRS4 (planned)
Availability
• DRS4 will become available in larger quantities in November 2008
• Chip can be obtained from PSI on a “non-profit” basis
• Delivery “as-is”
• Reference design (schematics) from PSI
• Costs ~ 10-15$/channel
• VME boards from industry in 2009
[Figure: 64-channel 65 MHz/12 bit digitizer “boosted” by DRS4 chip to 6 GHz]
[Figure: DRS4 board with input, ext. trigger and USB 2.0 connectors]
Conclusions
• Fast waveform digitizing with SCA chips will have a big impact on experiments in the near future
• DRS4 chip solves all known issues of DRS3 and adds more flexibility
• DRS4 offers 6 GHz sampling, 1024 sampling cells per channel, 9 channels per chip, 11.5 bit vertical resolution, 3 ps timing resolution
• ~4000 DRS channels are already used in several experiments; we hope that other experiments can benefit from this technology
http://midas.psi.ch/drs
A bit of history…
• Timeline (2001, 2004, 2006, 2008): DRS1, DRS2, DRS3, DRS4
• MEG Experiment searching for μ → eγ down to 10^-13
• 3000 channels with GHz sampling
DRS4 packaging
[Figure: pin configurations (top view, not to scale) for the 64-lead LQFP and 64-lead QFN packages, labelled DRS3 and DRS4; dimensions shown: 9 mm, 18 mm, 4.2 mm]
DRS4 flip-chip
[Figure: flip-chip pin assignment with differential inputs IN0± … IN8±, differential outputs OUT0± … OUT8±, REFCLK±, PLL, shift register, supply and ground pins]
“Residual charge” problem
• “Ghost pulse”: 2% @ 2 GHz
• After sampling a pulse, some residual charge remains in the capacitors on the next turn and can mimic wrong pulses
• Solution: clear before write (implemented in DRS4)
Sine Curve Fit Method
S. Lehner, B. Keil, PSI

\chi^2 = \sum_{j=0}^{500} \sum_{i=0}^{1024} \left( y_{ji} - \left( a_j \sin(2\pi f_j i + \alpha_j + \beta_i) + o_j \right) \right)^2 \rightarrow \min

y_{ji} : i-th sample of measurement j
a_j, f_j, \alpha_j, o_j : sine wave parameters
\beta_i : phase error → fixed jitter

“Iterative global fit”:
• Determine rough sine wave parameters for each measurement by fit
• Determine \beta_i using all measurements where sample “i” is near zero crossing
• Make several iterations
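The quantity minimised by the global fit can be expressed as a short function. This is only a sketch of the objective, not the authors' fitting code. It assumes a model of amplitude times a sine plus an offset, with the per-cell phase error added inside the sine. The names `alpha` (per-measurement phase) and `beta` (per-cell phase error) are chosen here for illustration, and the data are synthetic.

```python
import math


def model(i, a, f, alpha, o, beta_i):
    """Sine value expected at cell i; f is in cycles per cell."""
    return a * math.sin(2 * math.pi * f * i + alpha + beta_i) + o


def chi2(measurements, params, beta):
    """Sum of squared residuals over all measurements j and cells i.

    measurements[j][i] is sample i of measurement j, params[j] is the tuple
    (a, f, alpha, o) for measurement j, and beta[i] is the per-cell phase
    error shared by all measurements.
    """
    total = 0.0
    for y, (a, f, alpha, o) in zip(measurements, params):
        for i, y_ji in enumerate(y):
            total += (y_ji - model(i, a, f, alpha, o, beta[i])) ** 2
    return total


# Synthetic check: data generated with a known beta are matched exactly by it.
n_cells = 16
true_beta = [0.05 * math.sin(i) for i in range(n_cells)]
params = [(1.0, 0.11, 0.3 * j, 0.5) for j in range(5)]
data = [[model(i, *p, true_beta[i]) for i in range(n_cells)] for p in params]
chi2_true = chi2(data, params, true_beta)
chi2_zero = chi2(data, params, [0.0] * n_cells)
print(chi2_true, chi2_zero)
```

An iterative fit would alternate between adjusting the per-measurement sine parameters and the shared per-cell phase errors until this sum stops decreasing.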
Signal-to-noise ratio (DRS3!)
“Fixed pattern” offset error of 5 mV RMS can be reduced to 0.35 mV by offset correction in FPGA
SNR: 1 V linear range / 0.35 mV = 69 dB (11.5 bits)
[Figure: analog output [V] vs. bin number (0 to 1000), with crosstalk from the trigger signal marked, and occurrence histograms illustrating the offset correction]
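The quoted 69 dB and 11.5 bits both follow from the ratio of the linear range to the residual noise. A short sketch of that arithmetic, assuming the dB value is 20·log10 of the ratio and the bit count is log2 of the same ratio:

```python
import math


def snr_db(full_scale_v, noise_v):
    """Signal-to-noise ratio in dB for a voltage range and RMS noise."""
    return 20.0 * math.log10(full_scale_v / noise_v)


def equivalent_bits(full_scale_v, noise_v):
    """Number of bits whose step count equals the range-to-noise ratio."""
    return math.log2(full_scale_v / noise_v)


print(f"corrected:   {snr_db(1.0, 0.35e-3):.1f} dB, {equivalent_bits(1.0, 0.35e-3):.1f} bits")
print(f"uncorrected: {snr_db(1.0, 5e-3):.1f} dB, {equivalent_bits(1.0, 5e-3):.1f} bits")
```

With the uncorrected 5 mV offset error the same calculation gives about 46 dB, or under 8 bits, which shows how much the offset correction contributes.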
Global Timing Clock
[Figure: a 20 MHz reference clock signal recorded together with the PMT hit; 8 inputs, shift register, MUX and domino wave, with the domino wave stopping after the trigger latency]
• PLL jitter O(100 ps)
• Measuring timing differences between signals sampled by different chips requires a global reference clock
Datasheet
Interleaved sampling
• Delays (200 ps / 8 = 25 ps)
• 6 GSPS * 8 = 48 GSPS
• Possible with DRS4 if delay is implemented on PCB
G. Varner et al., Nucl. Instrum. Meth. A583, 447 (2007)
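The delay needed per channel depends on the sampling interval of a single channel. The sketch below assumes equally spaced delays across n channels fed with the same signal; the function names are illustrative. The slide's 200 ps / 8 = 25 ps corresponds to a 200 ps sample interval (5 GS/s), whereas at 6 GS/s the interval is about 167 ps and the step about 20.8 ps.

```python
def interleave_delays_ps(fs_gsps, n_channels=8):
    """Per-channel input delays in ps that spread n_channels evenly over one sample interval."""
    step = 1000.0 / fs_gsps / n_channels
    return [k * step for k in range(n_channels)]


def effective_rate_gsps(fs_gsps, n_channels=8):
    """Combined sampling rate when n_channels sample the same signal interleaved."""
    return fs_gsps * n_channels


print(interleave_delays_ps(5.0))             # steps of 25 ps
print(round(interleave_delays_ps(6.0)[1], 1))  # step at 6 GS/s
print(effective_rate_gsps(6.0))              # 48 GSPS
```

Interleaving uses all eight channels for one signal, so the higher effective rate comes at the cost of channel count.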
Comparison with other chips

                        MATACQ         LABRADOR            DRS4
                        (D. Breton)    (G. Varner)
Bandwidth (-3 dB)       300 MHz        > 1000 MHz          950 MHz
Sampling frequency      1 or 2 GHz     10 MHz … 3.5 GHz    500 MHz … 6 GHz
Full scale range        ±0.5 V         +0.4 … 2.1 V        +0.1 … 1.1 V
Effective #bits         12 bit         10 bit              12 bit
Sample points           1 x 2520       9 x 256             9 x 1024
Channels per board      4              N/A                 64
Digitization            5 MHz          N/A                 30 MHz
Readout dead time       650 µs         150 µs              3 µs – 370 µs
Integral nonlinearity   ±0.1 %         ±0.1 %              ±0.05 %
Radiation hard          No             No                  Yes (chip)
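On-line waveform display
[Figure: “virtual oscilloscope” display for 848 PMTs, with click, template fit and pedestal histogram views]
Constant Fraction Discr.
[Figure: constant fraction discriminator on clocked 12 bit data, built from latches, a multiplier (MULT) and adders; the delayed signal and the inverted signal are summed and the sign change of the sum (< 0) is detected]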
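The principle of a constant fraction discriminator on sampled data can be sketched in a few lines. This is a generic software illustration and not the firmware shown in the diagram. It assumes a positive pulse, forms the sum of the delayed signal and an inverted, scaled copy, and interpolates linearly where the sum crosses zero. The delay, fraction and sample values are hypothetical.

```python
def cfd_time(samples, delay=2, fraction=0.5):
    """Return the interpolated sample index of the CFD zero crossing, or None.

    The bipolar signal is samples[i - delay] - fraction * samples[i]; for a
    positive pulse it goes negative first and then crosses zero upwards.
    """
    bipolar = [samples[i - delay] - fraction * samples[i]
               for i in range(delay, len(samples))]
    for k in range(1, len(bipolar)):
        if bipolar[k - 1] < 0 <= bipolar[k]:
            # Linear interpolation between the two samples around the crossing
            frac = -bipolar[k - 1] / (bipolar[k] - bipolar[k - 1])
            return delay + (k - 1) + frac
    return None


pulse = [0, 0, 1, 3, 6, 8, 6, 3, 1, 0, 0]
t_small = cfd_time(pulse)
t_large = cfd_time([2 * v for v in pulse])
print(t_small, t_large)  # both 5.25
```

The crossing time is the same for both pulse heights, which is the property that makes this scheme attractive for timing: it removes the amplitude-dependent time walk of a fixed threshold.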