Minutes 09/05/00 (Birmingham)
Present: A. Baird, Y. Fleming, S. Kolya, D. Mercer,
P. Newman, D. Sankey, A. Schoening, R. Staley
Project Status and Time Schedules
André reported on the outcome of the May PRC meeting. The PRC
has formally approved the level 2 feasibility document. Level 1
is accepted for the time being, since there is no reason to stop
it. The PRC was not totally convinced by our preliminary
feasibility report, but accepted that things are not yet well
defined. We are asked to report back to the October PRC as
explained below.
The time scheduling of the project was reviewed in light of recent
developments. The main deadlines are:
- October 2000 (DESY PRC):
Algorithm fully simulated in Quartus.
Final decision on level 1 trigger viability.
- January 2001:
First prototypes for testing in UK.
- March 2001 (HERA Machine studies):
Prototypes installed at DESY.
- June 2001 (Luminosity begins):
Start of Mass Production.
- December 2001:
Full system operational.
It was agreed that we urgently need a much more detailed time
schedule (Gantt chart?) covering all aspects of the project.
`Micro-milestones' should be better defined by working back from
the completion date and making reasoned estimates of how long
everything will take. Dave was asked to produce such a chart in
consultation with all concerned.
Front End Module Design
It was strongly felt that design of the Front End Module should
begin more or less immediately. This implies that the level 1
trigger design (fairly undefined so far) should be kept as
separate as possible from the ADC and segment finding
(fairly well defined). This probably means that the generic
module cannot be used for the level 1 trigger functionality.
The necessary FPGA size and the I/O are basically known for
the Q/t, segment finding etc., and much of the work can be done
without detailed knowledge of the algorithm. It should therefore
be possible to start on a first design and schematics now. It is
estimated that this will take 6 months. Adam said that it should
be possible to get going very soon, but Brian Claxton first has
to finish other tasks. The end of May was given as a speculative
date for Brian to become available.
Adam explained the present plan for the FEM and much
discussion ensued (see section below on System
Architecture). The following (at least!) became clear
by the end of the meeting.
- Input in analogue form through front of module.
- Each module contains 30 ADCs working at 80 MHz (8 bits).
- A farm of FPGAs sits behind the ADCs. We expect to
use 1 FPGA (20k600) per triplet of wires.
- All digital data I/O internal to the L1 FTT will be
moved across the backplane.
- An extra FPGA may be needed to control the readout etc.
- Two parallel LVDS outputs will feed the level 1 trigger
and the level 2 FTT respectively.
Algorithm Simulation
Yves and Richard have been working on simulations of various
parts of the algorithm using the Altera Quartus software.
There is presently a problem with the Windows NT PCs which are
running the software: they regularly crash with a parity
exception. This is being investigated and should not pose a
long term problem.
It is still felt that the Quartus / VHDL design is likely to be
relatively trivial compared to finalising the algorithm concept.
Even so, at least Yves should attend a course in Quartus design
etc. Yves (and Richard?) were asked to look into available
courses and decide which they think would be most useful.
System Architecture
The overall system architecture was discussed at length,
starting from an update of the
diagram from Adam showing a possible crate layout and
necessary bus connections. This scheme is based on the concept
of a distributed level 1 trigger - the system is divided into
6 regions in phi and the segments within a phi region are
moved over a dedicated custom backplane into a local trigger
module (same card as the FEM?). The local trigger module builds a
(16 x 10) Kappa-Phi histogram, which is then passed to a further
card for final trigger decisions, where the full (16 x 60)
histogram is built. This scheme ensures that the necessary
high data flow only takes place in local regions and over short
distances (we estimated a maximum of around 1 GByte / second
to pass all information from one phi region to the trigger
module at each bunch crossing). Connections between the different
phi sectors would only be needed to pass neighbouring
shift registers
and the information from the edgemost phi bin for the trigger.
This could be done by linking all modules within a single trigger
layer (e.g. all CJC2 modules) in a ring on separate buses.
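As a rough illustration of the two-stage histogramming described
above, the sketch below fills one (16 x 10) Kappa-Phi histogram per
phi region and then joins the six regions into the full (16 x 60)
histogram. The bin counts are taken from the discussion; the
function names, the bin ordering (region r occupying phi columns
10r to 10r+9) and the example segment are assumptions made purely
for illustration. The sharing of edge bins between neighbouring
regions is not modelled.

```python
N_KAPPA = 16       # kappa bins (from the minutes)
N_PHI_LOCAL = 10   # phi bins per region (from the minutes)
N_REGIONS = 6      # phi regions (from the minutes)


def empty_local_histogram():
    """Return a 16 x 10 histogram of zeros, indexed [kappa][phi]."""
    return [[0] * N_PHI_LOCAL for _ in range(N_KAPPA)]


def fill_local(hist, kappa_bin, phi_bin):
    """Count one segment in a local (per-region) histogram."""
    hist[kappa_bin][phi_bin] += 1


def merge_regions(local_hists):
    """Join the region histograms along phi into one 16 x 60 histogram.

    Assumed ordering: region r fills phi columns 10*r .. 10*r + 9.
    """
    if len(local_hists) != N_REGIONS:
        raise ValueError("expected one histogram per phi region")
    full = []
    for k in range(N_KAPPA):
        row = []
        for region in local_hists:
            row.extend(region[k])
        full.append(row)
    return full


# Hypothetical example: a single segment in region 2.
local_hists = [empty_local_histogram() for _ in range(N_REGIONS)]
fill_local(local_hists[2], kappa_bin=5, phi_bin=3)
full = merge_regions(local_hists)
```

In this picture only the small local histograms leave each phi
region, which is the point of the distributed scheme.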
Though the distributed trigger idea is appealing from the point
of view of minimising data movement, there are also disadvantages.
In particular, it was felt that the proposed re-use of the
generic module for the trigger as well as the FEM would
unnecessarily delay the design of the FEM. The required custom
backplane in this scenario was also felt to be a bit awkward.
It was therefore decided that we should proceed directly with
the FEM design and separately try to deal with the large data
flows involved in accumulating all 20 MHz track segments in one
place for the trigger module. We have previously thought about
solutions involving merger cards. It may also be possible to use
the algorithm (maybe even the final design) that has already
been proposed to solve the same problem at level 2.
Algorithm Design
A `minimal path' was defined, ensuring that the level 1 trigger
problem does not interfere with the more fundamental task of
providing track segments to level 2. For the first design at
least, we therefore assume that:
- The main segment finding task (20MHz and 80MHz)
does not begin until AFTER L1Keep.
- A level 1 trigger is still planned in which the 20 MHz
segment finding takes place `on the fly' (pivot elements etc).
This may or may not use the same FPGA space as the post L1Keep
segment finding.
The first priority is therefore to build a working algorithm
to perform the 20MHz and 80MHz segment finding based on frozen
shift registers with the bunch crossing of origin known a priori.
This should later be generalised to also deal with 20MHz segment
finding prior to L1Keep.
André introduced his ideas for a generalisation of the solution
proposed in the PRC feasibility study, such that a similar
algorithm takes place on the fly to produce segments for the
level 1 trigger. His talk is here and the block diagram of the
algorithm is here. In this scheme, a full shift register analysis
is performed at each bunch-crossing, so the pivot layer technique
is not used.
Each shift register is divided into two (or more if
necessary) and each combination of half registers from the three
layers forms the input to a CAM (<=31 bits). The geometry probably
makes some of the half-register combinations safely ignorable. An
unencoded scan forms the input for the level 1 trigger algorithm.
Encoded mode is used to pass the information on to the 80MHz
refinement step. The level 1 algorithm looks very quick in this
scenario. It would probably be the responsibility of the level 1
trigger decision unit to determine the event T0 (e.g. by finding
the peak in a histogram of number of valid segments against time).
It would therefore not be necessary to pass information on the
duration of a valid segment around: all valid segments are
found at each clock cycle.
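The T0 determination suggested above (peak of the number of valid
segments against time) could look something like the following
minimal sketch. It assumes the trigger decision unit has one
segment count per bunch crossing available as a simple list; the
function name, the tie-breaking rule (earliest crossing wins) and
the example counts are all hypothetical.

```python
def find_event_t0(segment_counts):
    """Return the index of the bunch crossing with the most valid segments.

    segment_counts[i] is the number of valid segments found at
    bunch crossing i. On a tie the earliest crossing is returned
    (an arbitrary choice made for this sketch).
    """
    if not segment_counts:
        raise ValueError("no bunch crossings supplied")
    best_bx = 0
    for bx, count in enumerate(segment_counts):
        if count > segment_counts[best_bx]:
            best_bx = bx
    return best_bx


# Hypothetical counts over seven consecutive bunch crossings.
counts = [0, 1, 3, 9, 4, 1, 0]
t0_bx = find_event_t0(counts)
```

A real implementation would presumably also need a threshold on
the peak height, which was not discussed.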
The multiple match mode can handle 2 or more valid signals in
the same CAM, but it is probably still desirable to implement
some load balancing in the CAMs by carefully choosing which
register elements get sent to which CAM. To help decide this, it
would be very useful to make a table of which valid masks
correspond to which kappa-phi values. Yves was asked to produce
this as an extension to his mask generation code.
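One possible shape for such a table is sketched below: valid masks
are grouped by the kappa-phi bin they correspond to. The output
format of the mask generation code is not known here, so the input
is assumed to be a list of (mask bits, kappa bin, phi bin) tuples;
that format, the function name and the example masks are invented
for illustration only.

```python
from collections import defaultdict


def build_mask_table(masks):
    """Group valid masks by their (kappa_bin, phi_bin) value.

    masks: iterable of (mask_bits, kappa_bin, phi_bin) tuples.
    This input format is an assumption, not the real generator output.
    """
    table = defaultdict(list)
    for mask_bits, kappa_bin, phi_bin in masks:
        table[(kappa_bin, phi_bin)].append(mask_bits)
    return dict(table)


# Hypothetical masks, written as bit patterns.
example_masks = [
    (0b0010010, 4, 1),
    (0b0100100, 4, 1),
    (0b1000001, 7, 2),
]
table = build_mask_table(example_masks)

# Print one line per kappa-phi bin with its list of masks.
for (kappa_bin, phi_bin), mask_list in sorted(table.items()):
    print(kappa_bin, phi_bin, [format(m, "07b") for m in mask_list])
```

Counting the masks per bin in such a table would give a first
indication of how evenly a given assignment loads the CAMs.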
In any case, this proposal seems to be a fairly natural
extension to the post L1Keep plan. It therefore sits nicely as
something to investigate once the basic post L1Keep plan is
simulated in Quartus.
André also pointed out an error in the PRC feasibility study
document. When estimating the numbers of valid masks in the
shift registers (table 1), we gave the numbers of distinct valid
patterns rather than the total numbers of valid masks. These
were clearly the wrong numbers for the post L1Keep frozen
shift register `minimal' solution (slapped wrist for Paul!).
The number of valid masks thus increases by typically a factor
of 3 compared to our estimates in the feasibility study.
We still expect to fit the algorithm into one 20k600
FPGA per wire triplet, but the gate usage now looks like 80-90%,
which starts to look a little more tricky!
Next meetings
No dates have yet been defined for future meetings, though we
clearly need to spend a longer period discussing the issues above
as soon as possible. Dave is asked to arrange next meeting dates
in parallel with defining the detailed project schedule. As things
become better defined, it should eventually be possible to have
smaller scale meetings on specific aspects of the project.
Compiled by P. Newman, 11/5/00