LCG - LHC Computing Grid Project
Status of High Level Planning
LHCC, 5 July 2002
Les Robertson, Project Leader
CERN, IT Division
[email protected]
last update: 07/11/2015 04:07
Fundamental Goal of the LCG
To help the experiments' computing projects get the best, most reliable and accurate physics results from the data coming from the detectors.
- Phase 1 (2002-05): prepare and deploy the environment for LHC computing
- Phase 2 (2006-08): acquire, build and operate the LHC computing service

Phase 1 - High-level Goals
To prepare and deploy the environment for LHC computing:
- development/support for applications - libraries, tools, frameworks, data management (inc. persistency), common components
- develop/acquire the software for managing a distributed computing system on the scale required for LHC - the local computing fabric, and the integration of the fabrics into a global grid
- put in place a pilot service - a "proof of concept" of the technology and the distributed analysis environment, a platform for learning how to manage and use the system, and a solid service for physics data challenges
- produce a TDR describing the distributed LHC computing system for the first years of LHC running
- maintain opportunities for re-use of developments outside the LHC programme

Background & Environment
- The detailed requirements of the applications component of the project are being defined with the experiments. This work started early this year, but it will be another 18 months before the full scope is defined.
- Basic requirements for the computing facility come from the report of the LHC Computing Review (February 2001). They are evolving due to reviews of trigger rates, event sizes and experience with program prototypes, and will continue to change as experience is gained with applications and as the analysis model is developed.
- Technology is in continuous evolution, driven by market forces (processors, storage, networking, ..) and by government-funded research (grid middleware). We have to follow these developments - remain flexible, open to change.

Background & Environment (ii)
- Regional Computing Centres: impressive experience of providing distributed services to LHC experiments - they must now learn how to collaborate much more closely to provide the integrated service promised by the Grid vision; established user communities - wider than LHC - many external constraints
- Project funding comes from many sources, each with its own constraints
The project is getting under way in an environment where:
- there is already a great deal of activity
- requirements are changing as understanding and experience develop
- some fundamental parts of the environment are evolving more or less independently of the project and LHC

Funding Sources
- Regional centres - providing resources for LHC experiments; in many cases the facility is shared between experiments (LHC and non-LHC), and maybe with other sciences
- Grid projects - suppliers and maintainers of middleware
- CERN - personnel and materials, including special contributions from member and observer states, and industrial contributions
- Experiment resources - people participating in common applications developments, data challenges, .., and computing resources provided through Regional Centres
Notes:
- The project has differing degrees of management control and influence over these sources
- Some of the funding has been provided because HEP and LHC are seen as ground-breakers for Grid technology - so we must deliver for LHC and show relevance for other sciences, and must also be sensitive to potential opportunities for non-HEP funding of Phase 2

The LHC Computing Grid Project - Organisation
[Organisation chart: the LHCC (reviews) and the Common Computing RRB (funding agencies - resources) report to and from the Project Overview Board; beneath it sit the Project Execution Board and the Software and Computing Committee (SC2), which sets requirements and monitors.]

SC2 & PEB Roles
- SC2 includes the four experiments and the Tier 1 Regional Centres
- SC2 identifies common solutions and sets requirements for the project
  - may use an RTAG (Requirements and Technical Assessment Group): limited scope, two-month lifetime with an intermediate report, one member per experiment plus experts
- PEB manages the implementation: organising projects and work packages, coordinating between the Regional Centres, collaborating with Grid projects, organising grid services
- SC2 approves the work plan and monitors progress

SC2 Monitors Progress of the Project
- Receives regular status reports from the PEB
- Written status report every 6 months: milestones, performance, resources; estimates of time and cost to complete
- Organises a peer review about once a year: presentations by the different components of the project, review of documents, review of planning data

RTAG Status
In the application software area:
- data persistency - finished, 05 Apr 02
- software support process - finished, 06 May 02
- mathematical libraries - finished, 02 May 02
- detector geometry description - started
- Monte Carlo generators - starting
- applications architectural blueprint - started
In the fabric area:
- mass storage requirements - finished, 03 May 02
In the Grid technology and deployment area:
- Grid technology use cases - finished, 07 Jun 02
- Regional Centre categorisation - finished, 07 Jun 02
Current status of RTAGs (and available reports) on www.cern.ch/lcg/sc2
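The data persistency RTAG settled on a hybrid event store: bulk event objects streamed to files (ROOT files in the real design), with a relational database acting as the catalogue. As a toy sketch of that split, assuming nothing about the real LCG implementation (the class name, schema and use of pickle in place of ROOT I/O are all invented for illustration):

```python
# Toy "hybrid event store": event objects in files, catalogue in an RDBMS.
# Not LCG code -- pickle stands in for ROOT I/O, sqlite3 for the RDBMS.
import os
import pickle
import sqlite3
import tempfile

class HybridEventStore:
    def __init__(self, db_path, data_dir):
        self.data_dir = data_dir
        self.db = sqlite3.connect(db_path)
        # Relational side: a catalogue mapping event ids to files.
        self.db.execute("CREATE TABLE IF NOT EXISTS events "
                        "(event_id INTEGER PRIMARY KEY, filename TEXT)")

    def write(self, event_id, event):
        # Object-streaming side: the event itself goes to a file.
        filename = os.path.join(self.data_dir, "event_%d.pkl" % event_id)
        with open(filename, "wb") as f:
            pickle.dump(event, f)
        # Record where it went in the catalogue.
        self.db.execute("INSERT INTO events VALUES (?, ?)",
                        (event_id, filename))
        self.db.commit()

    def read(self, event_id):
        # Look up the file in the catalogue, then stream the object back.
        (filename,) = self.db.execute(
            "SELECT filename FROM events WHERE event_id = ?",
            (event_id,)).fetchone()
        with open(filename, "rb") as f:
            return pickle.load(f)

# Minimal demo in a temporary directory.
demo_dir = tempfile.mkdtemp()
store = HybridEventStore(os.path.join(demo_dir, "catalogue.db"), demo_dir)
store.write(1, {"energy_gev": 42.0, "n_tracks": 3})
event = store.read(1)
```

The point of the hybrid choice is exactly this division of labour: fast bulk object I/O on one side, queryable metadata on the other.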
Project Execution Organisation
Four areas, each with an area project manager:
- Applications
- Grid Technology
- Fabrics
- Grid Deployment

Applications Area
- Area manager - Torre Wenaus
- Open weekly applications area meeting
- Software Architects Committee - process for taking LCG-wide software decisions
- Importance of RTAGs to define scope
- Common projects: everything that is not an experiment-specific component is a potential candidate for a common project
- Important changes are under way: the new persistency strategy, and the evolution from Geant 3 towards Geant 4 and FLUKA
- A good time to define common solutions, though there will be inevitable delays in agreeing requirements and organising common resources; long-term advantages in use of resources, support and maintenance

Applications Area - Key Work Packages
- Object persistency system - agreement on a hybrid solution (ROOT plus a relational database management system)
- Software process
- Common frameworks for simulation and analysis - proposal on an event generation RTAG
- Grid middleware requirements defined
- Architectural blueprint RTAG started - opening the way to RTAGs/work on analysis components?

Candidate RTAGs (from the launch workshop)
Simulation tools, event processing framework, detector description & model, distributed analysis interfaces, conditions database, distributed production systems, data dictionary, small-scale persistency, interactive frameworks, software testing, statistical analysis, software distribution, detector & event visualisation, OO language usage, physics packages, LCG benchmarking suite, framework services, online notebooks, C++ class libraries.
Completing the RTAGs - setting the requirements - will take about 2 years.

What is a Grid?
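The "virtual computing centre" picture in the following slides - the user sees a single cluster and never deals with data location or site policies - can be sketched in a few lines. Everything below is invented for illustration (the catalogue, the site names' capacities and the broker logic are not real grid middleware):

```python
# Toy sketch of the single-virtual-cluster idea: the user submits work
# against a dataset name; a broker -- not the user -- picks the centre,
# using a replica catalogue and the centres' published free capacity.

# Hypothetical replica catalogue: dataset name -> centres holding a copy.
REPLICA_CATALOGUE = {
    "higgs-candidates": ["CERN", "FNAL"],
    "minbias-mc": ["IN2P3", "RAL"],
}

# Hypothetical free-CPU counts, as each fabric might publish them.
FREE_CPUS = {"CERN": 40, "FNAL": 120, "IN2P3": 15, "RAL": 60}

def submit(dataset, job):
    """Run `job` on `dataset` at the least-loaded centre holding a replica.
    The user never names a site: location, interconnects and hardware stay
    the broker's business, just as on a single local cluster."""
    candidates = REPLICA_CATALOGUE[dataset]
    site = max(candidates, key=lambda s: FREE_CPUS[s])
    return site, job(dataset)

# The caller only names the dataset and the work to do on it.
site, result = submit("higgs-candidates", lambda d: "analysed %s" % d)
```

The design point is that `submit` is the whole user-visible surface: scheduling policy, replica placement and site heterogeneity all live behind it.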
The MONARC Multi-Tier Model (1999)
[Diagram: Tier 0 (recording, reconstruction) at CERN; Tier 1 (full service) at centres such as IN2P3, FNAL and RAL; Tier 2 at universities, laboratories and departments; desktops below.]
MONARC report: http://home.cern.ch/~barone/monarc/RCArchitecture.html

Building a Grid - The Collaborating Computer Centres
[Diagram: the collaborating computer centres, joined by a grid, present the experiments' virtual organisations (e.g. the ALICE VO and the CMS VO) with a single virtual LHC Computing Centre.]

Virtual Computing Centre
The user sees the image of a single cluster, and does not need to know:
- where the data is
- where the processing capacity is
- how things are interconnected
- the details of the different hardware
and is not concerned by the local policies of the equipment owners and managers.

Grid Technology Area
- Area manager - Fabrizio Gagliardi
- Ensures that the appropriate middleware is available
- Dependency on deliverables supplied and maintained by the "Grid projects": many R&D projects in Europe and the US, with strong HEP participation/leadership
- Immature technology - evolving, parallel developments - conflict between new functionality and stability; scope for divergence, especially trans-Atlantic
- It is proving hard to get the first "production" grids going - from demonstration to service
- Can these projects provide long-term support and maintenance?
- HICB (High Energy & Nuclear Physics Intergrid Collaboration Board)
- GLUE - recommendations for compatible US-European middleware
- LCG will have to make hard decisions on middleware towards the end of this year

Fabric Area
- Area manager - Bernd Panzer
- Tier 1/2 centre collaboration: develop and share experience of installing and operating a Grid; exchange information on planning and experience of large fabric management; look for areas for collaboration and cooperation
- Grid-fabric integration middleware
- Technology assessment: likely evolution, cost estimates
- CERN Tier 0+1 centre: automated systems management package; evolution and operation of the CERN prototype - integrating the base LHC computing services into the LCG grid

Grid Deployment Area
- Area manager - not yet appointed
- The job is to set up and operate a Global Grid Service: a stable, reliable, manageable Grid for data challenges and regular production work, integrating the computing fabrics at Regional Centres; learn how to provide support, maintenance and operation
- Grid Deployment Board - Mirco Mazzucato: Regional Centre senior management; Grid deployment standards and policies - authentication, authorisation, formal agreements, computing rules, sharing, reporting, accounting, ..
  - first meeting in September

Grid Deployment Teams - the Plan
[Diagram: suppliers' integration teams (common applications software, Trillium - US grid middleware, DataGrid middleware) provide tested releases to the LCG certification, build & distribution activity; LCG infrastructure coordination & operation spans grid operation, a user support call centre, and fabric operation at each regional centre (A, B, ... X, Y).]

Status of Planning
- Launch workshop in March 2002 established the broad priorities
- Establishing the high-level goals, deliverables and milestones
- Beginning to build the PBS and WBS, as the staff builds up and the detailed requirements and possibilities emerge
- Detailed planning will take some time (~end of 2002, beginning of 2003) - many things are not yet clear:
  - applications requirements - need further work by the SC2 (RTAGs)
  - Grid Technology - negotiation of deliverables from the Grid projects
  - Grid Deployment - agreements with the Regional Centres (GDB)
- This is computing - success requires flexibility: getting the right balance between reliable, tested, solid technology; exploiting leading-edge developments that give major benefits; and early recognition of de facto standards

Proposed High Level Milestones
[Timeline chart - detailed as the Level 1 milestones below.]

Tactics
- First data is in 2007, so the LCG should focus on the long-term goals - the difficult problems of distributed data analysis: unpredictable (chaotic) usage patterns, masses of data, batch and interactive use, and reliable, stable, dependable services
- LCG must leverage current solutions and set realistic targets
- Short term (this year):
  - use current (classic) solutions for the physics data challenges (event productions)
  - consolidate (stabilise, maintain) middleware - and see it used for physics
  - learn what a "production grid" really means by working with the Grid R&D projects
  - get the new data persistency prototype going
- Medium term (next year):
  - make a first release of the persistency system
  - set up a reliable global grid service with limited but well-understood functionality - not too many nodes, but on three continents
  - stabilise it
  - grow it to include all active Tier 2 centres, with support for some Tier 3 centres

Proposed Level 1 Milestones
- M1.1 - June 03 - First Global Grid Service (LCG-1) available (this milestone and M1.3 to be defined in detail by end 2002)
- M1.2 - June 03 - Hybrid Event Store (Persistency Framework) available for general users
- M1.3a - November 03 - LCG-1 reliability and performance targets achieved
- M1.3b - November 03 - Distributed batch production using grid services
- M1.4 - May 04 - Distributed end-user interactive analysis (detailed definition of this milestone by November 03)
- M1.5 - December 04 - "50% prototype" (LCG-3) available (detailed definition of this milestone by June 04)
- M1.6 - March 05 - Full Persistency Framework
- M1.7 - June 05 - LHC Global Grid TDR
[Chart: the same milestones shown on a quarterly timeline, 2002-2005, split into grid and applications tracks.]

Major Risks
- Complexity of the project: Regional Centres, Grid projects, experiments, funding sources and funding motivations
- Grid technology immaturity: the number of development projects; US-Europe compatibility
- Phase 1 funding at CERN: about 60% of the materials funding is not yet identified - this includes the investments to prepare the CERN Computer Centre for the giant computing fabrics needed in Phase 2 - but the personnel requirements are largely fulfilled by special contributions

LCG and the LHCC
- LCG Phase 1 was approved by Council; the deliverables are common applications tools and components, and the TDR for the Phase 2 computing facility
- We do not have an LHCC-approved proposal as a starting point
- LHCC referees have been appointed
- During the rest of this year, while the detailed planning is being done, we need discussion with the referees to:
  - ensure that the LHCC has the background and planning information it needs
  - agree on the Level 1 milestones to be tracked by the LHCC
  - agree on the reporting style and frequency