Main steps to gridify an Application
Miklos Kozlovszky
MTA SZTAKI
[email protected]
Outline
This talk gives a high-level view of application development
in Grids
Contents
• Review of concepts: grids and grid applications
• Types of Grid applications
• Challenges to researchers who write applications
• General steps of application gridification
• Practical: Preparing and submitting a job based on a simple non-grid application.
Acknowledgements
• Gergely Sipos, SZTAKI, Hungary
• Mike Mineter, University of Edinburgh, UK, “Application Development and Aspects of gLite 3.0”, 2006
• Vladimir Dimitrov, IPP-BAS, Bulgaria, “Practical: Porting applications to the GILDA grid”, Introduction to Grid Computing, EGEE and Bulgarian Grid Initiatives, Plovdiv, 2006
• Douglas Thain, The University of Notre Dame
• GILDA team
What is a grid application?
Definition
Software that interacts with grid services to
achieve requirements that are specific to a
particular VO or user.
Grids: a foundation for e-Research
Enabling a whole-system approach
Collaborative research / engineering / public service …
[Diagram, derived from Ian Foster’s slide: the Grid links computers, software, sensor nets, instruments, colleagues, and shared data archives]
The vital layer
[Layer diagram, top to bottom]
• Application
• Application toolkits, …
• Higher-level grid services (brokering, …)
• Basic Grid services: AA, job submission, info, …
Annotations on the diagram:
• Where computer science meets the application communities!
• VO-specific developments built on higher-level tools and core services
• Makes Grid services usable by non-specialists
• Grids provide the compute and data storage resources
• Production grids provide these core services.
Types of grid applications
Complexities of grid applications
1. Simple jobs – submitted to WMS to run in batch mode
2. Job invokes grid services
   • To read & write files on SE
   • Monitoring
   • For outbound connectivity (interactive jobs)
   • To manage metadata
   • …
3. Complex jobs
   • An environment controls multiple jobs on users’ behalf
     • High-level services
     • Portals with workflow
     • Software written for the VO (or by the user)
     • …
Invocation of applications
From the UI
• Command Line Interfaces / Scripts
• APIs
• Higher level tools
From desktop Windows applications
• Use Grids without awareness of them!
• But gLite does not (yet) fully support Windows (more info: http://jessica.trigrid.it/grid2win/)
From portals
• For recurring tasks: “core grid services” as well as the application layer
• Accessible from any browser
• Tailored to applications
• Different portal solutions, with a wide range of capabilities.
Second part of this course: P-GRADE Portal
Multi-Grid P-GRADE Portal
• The portal can be connected to multiple grids
• Different jobs of a workflow can be executed in different grids
[Diagram: the P-GRADE Portal connected to an EGEE Grid (e.g. VOCE) and to the UK NGS; sites shown: London, Rome, Athens]
Characteristics of VOs
What is being shared?
• resources of storage and/or compute cycles
• software and/or data
Distinct groups of developers and of users?
• Some VOs have distinct groups of developers and users…
  • Biomedical applications used by clinicians, …
• … some don’t
  • Physics application developers who share data but write their own analyses
• Effect: need to
  • hide complexity from the 1st type of VOs
  • expose functionality to the 2nd type of VOs
  • deal with many security issues
Different Goals for App. Development
• I need resources for my research
• I need richer functionality
  • MPI, parametric sweeps, …
  • Data and compute services together…
• I provide an application for (y)our research
  • How!?
    • Pre-install executables?
    • Hosting environment?
    • Share data
    • Use it via portal?
• We provide applications for (y)our research
  • Also need:
    • Coordination of development
    • Standards
    • …
Engineering challenges increasing
Challenges to researchers who write grid applications
Challenges
Research software is often
• Created for one user: the developer
• Familiarity makes it usable
• Short-term goals: used until papers are written and then discarded
Grid applications are often used
• by a VO
• Without support from the developer
• In new contexts and workflows
Grid application developers are
• In a research environment
• Yet their s/w must have:
  • Stability
  • Documentation
  • Usability
  • Extendability
  i.e. Production quality
Need expertise in:
• software engineering
• application domain
• grid computing
Consequences
Team work!
Engaged in world-wide initiatives – reuse, don’t make your own! Cross disciplines for solutions.
From research to production software: ~5 times the effort.
• “80% of the time for last 10% of the functionality & reliability”
Standardisation is key
• For re-use, for dynamic configuration of services, …
• Both for middleware and domain specific
Need to follow a deliberate development process
• Waterfall? Rapid prototyping?
• Requirements engineering, design, implementation, validation, deployment
• Engaged with the user community
More about gLite services
gLite Grid Middleware Services
[Diagram: gLite services by group]
• Access: CLI, API
• Security Services: Authentication, Authorization, Auditing
• Information & Monitoring Services: Information & Monitoring, Application Monitoring
• Data Management: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
• Workload Mgmt Services: Accounting, Job Provenance, Package Manager, Computing Element, Workload Management
• Connectivity
More about gLite services (contd.)
Today the focus is on:
• New functionality in gLite 3.0 Workload Management
• Accessing data on SEs
  • Can have massive files, too big to copy
  • How to access these?
• Management of metadata
  • May have many thousands of files
  • Need to access and re-use them based on characteristics… more than by their logical file names.
• Monitoring of applications
  • May be running many long jobs
  • What’s happening?!
Workload Management System
Helps the user access computing resources
• resource brokering
• management of input and output
• management of complex workflows
Support for MPI jobs even if the file system is not shared between CE and Worker Nodes (WN) – easy JDL extensions
Web Service interface via WMProxy
WMProxy
WMProxy is a SOAP Web service providing access to the Workload Management System (WMS)
Job characteristics specified via JDL
• jobRegister
  • create id
  • map to local user and create job dir
  • register to L&B
  • return id to user
• input files transfer
• jobStart
  • register sub-jobs to L&B
  • map to local user and create sub-job dirs
  • unpack sub-job files
  • deliver jobs to WM
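The register, transfer, start ordering listed above can be illustrated with a small in-memory model. This is only a teaching sketch: `ToyWMProxy`, `ToyJob`, the method names and the id format are hypothetical, nothing here contacts a real WMProxy server, and the real service's client API is not shown in these slides.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ToyJob:
    job_id: str
    jdl: str
    state: str = "REGISTERED"
    sandbox: dict = field(default_factory=dict)


class ToyWMProxy:
    """In-memory stand-in for the register / transfer / start sequence.

    Hypothetical teaching model only: it makes no network calls and is
    not the real WMProxy client API.
    """

    def __init__(self):
        self._ids = itertools.count(1)
        self.jobs = {}

    def job_register(self, jdl):
        # "create id", "create job dir" (here: an empty dict), "return id to user"
        job_id = f"toy-job-{next(self._ids)}"
        self.jobs[job_id] = ToyJob(job_id, jdl)
        return job_id

    def transfer_input(self, job_id, name, data):
        # Input files are only accepted between registration and start.
        job = self.jobs[job_id]
        if job.state != "REGISTERED":
            raise RuntimeError("input files must be transferred before jobStart")
        job.sandbox[name] = data

    def job_start(self, job_id):
        job = self.jobs[job_id]
        if job.state != "REGISTERED":
            raise RuntimeError(f"job already {job.state}")
        job.state = "STARTED"  # stands in for "deliver jobs to WM"
        return job.state


proxy = ToyWMProxy()
jid = proxy.job_register('Executable = "MatrixDemo";')
proxy.transfer_input(jid, "INPUT1", b"2, 2, 1, 0, 0, 1")
print(jid, proxy.job_start(jid))
```

The point of the two-phase design is that the job id exists before any file is uploaded, so the input files can be attached to a known job before execution begins.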
Complex Workflows
A Directed Acyclic Graph (DAG) is a set of jobs where the input, output, or execution of one or more jobs depends on one or more other jobs
[Diagram: example DAG with nodes nodeA, nodeB, nodeC, nodeD, nodeE]
A Collection is a group of jobs with no dependencies
• basically a collection of JDLs
A Parametric job is a job having one or more attributes in the JDL that vary their values according to parameters
Using compound jobs it is possible to have one-shot submission of a (possibly very large, up to thousands) group of jobs
• Submission time reduction
  • Single call to WMProxy server
  • Single Authentication and Authorization process
  • Sharing of files between jobs
• Availability of both a single Job Id to manage the group as a whole and an Id for each single job in the group
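To make the DAG and parametric-job definitions above concrete, here is a minimal sketch using Python's standard `graphlib` module (Python 3.9+). The edges between nodeA … nodeE are hypothetical, since the arrows of the slide's diagram are not preserved in this transcript, and the `_PARAM_` placeholder and the helper names are illustrative assumptions rather than a statement of how the WMS expands parametric jobs.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each key may run only after all jobs in its set.
dag = {
    "nodeB": {"nodeA"},
    "nodeC": {"nodeA"},
    "nodeD": {"nodeB", "nodeC"},
    "nodeE": {"nodeD"},
}


def execution_waves(dependencies):
    """Group jobs into waves; jobs in the same wave do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


def expand_parametric(template, start, stop, step=1):
    """Substitute the placeholder once per parameter value."""
    return [template.replace("_PARAM_", str(value))
            for value in range(start, stop, step)]


print(execution_waves(dag))
# [['nodeA'], ['nodeB', 'nodeC'], ['nodeD'], ['nodeE']]
print(expand_parametric("input_PARAM_.txt", 0, 6, 2))
# ['input0.txt', 'input2.txt', 'input4.txt']
```

A Collection corresponds to the degenerate case where every job has an empty dependency set, so all jobs land in a single wave.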
Basic tasks while Porting applications to the Grid
1. Developing a non-grid application (or inheriting and updating an ancient one);
2. Go/no-Go decision about gridification
   • Is it suitable for the Grid environment?
   • “Cost/profit” analysis / feasibility study
   Typical Question Groups:
   • Current structure of the application
   • Dependencies of the application
   • Available resources (manpower, knowledge, etc.)
   • Requirements for the gridified application
   • Expected impact of gridification
   • Requirements for the grid infrastructure
More info: Application Description Template
http://www.lpds.sztaki.hu/gasuc/?m=4
Basic tasks while Porting applications to the Grid (contd.)
3. Grid environment access
   • Requesting Certificates / VO membership
   • Accessing Grid environment
     • Appropriate VO UI machine account for command line
     • Portal GUI account;
4. Executing, Testing and Debugging the application;
   • Testing the non-grid application (debugging in Grid environment is a hard task), creating use cases for single (non-grid) runs;
5. Constructing the job suite – JDL (Job Description Language) files, executables, auxiliary scripts and input/output data files;
6. Submitting the job to the Grid as small-scale pilot application;
Basic tasks while Porting applications to the Grid (contd.)
7. Executing, Testing and Debugging the pilot application;
8. IF something goes wrong THEN GOTO 4;
9. IF everything seems to work THEN increase the scale of the application (increase problem size, amount of used resources);
10. Optimizing the grid application;
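As an illustration of step 5 (constructing the job suite), the sketch below renders a job description from a Python dictionary. The attribute names (`Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`) are the ones commonly used in gLite JDL files; the values borrow the MatrixDemo example from the practical session and are assumptions, since the contents of the prepared JDL file are not shown in these slides. `jdl_value` and `render_jdl` are hypothetical helper names.

```python
def jdl_value(value):
    """Render a Python value as a JDL literal (strings quoted, lists in braces)."""
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(v) for v in value) + "}"
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    return '"' + str(value).replace('"', '\\"') + '"'


def render_jdl(attributes):
    """Render an attribute dict as one 'Name = value;' line per attribute."""
    return "\n".join(f"{name} = {jdl_value(value)};"
                     for name, value in attributes.items())


# Assumed job suite for a small pilot run; file names are illustrative.
job = {
    "Executable": "MatrixDemo",
    "Arguments": "I V",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "InputSandbox": ["MatrixDemo", "INPUT1"],
    "OutputSandbox": ["std.out", "std.err", "OUTPUT"],
}
print(render_jdl(job))
```

Generating the description from data rather than editing it by hand makes it easier to scale the pilot up in step 9, because only the dictionary changes between runs.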
Porting legacy code applications
Code from the past, maintained because it works
Often supports business critical functions
Not Grid enabled
What to do with legacy codes in service Grids?
• Bin them and reimplement them as grid services
• Reengineer them → who knows the source code?
• Port them onto the Grid with minimum user effort
Porting legacy code applications with GEMLCA
GEMLCA – Grid Execution Management for Legacy Code Architecture
Objectives
• To deploy legacy code applications as Grid services without reengineering the original code and with minimal user effort
• To support the development and execution of “legacy code” grid service workflows
• To make these functions available from a Web Portal
[Diagram: GEMLCA and the P-GRADE Portal]
The GEMLCA-view of a legacy code
Any code that corresponds to the following model can be exposed as a Grid service by GEMLCA:
Input data (files, command line params, env. vars) → Legacy code → Output data (files)
The legacy code may:
• Perform computation
• Communicate over the network
• Query databases
• Call shared libraries
• ...
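The black-box model above (files, command line parameters and environment variables in; files out) can be sketched with the Python standard library. This is not GEMLCA's interface; `run_legacy` and its parameters are hypothetical, and the "legacy code" here is a stand-in one-line Python program so that the example is self-contained.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_legacy(command, input_files, env_vars, output_names, timeout=60):
    """Run an unmodified program in a scratch directory and collect its outputs.

    input_files:  mapping of file name -> bytes, written before the run
    env_vars:     extra environment variables for the process
    output_names: file names expected in the working directory afterwards
    """
    with tempfile.TemporaryDirectory() as workdir:
        for name, data in input_files.items():
            Path(workdir, name).write_bytes(data)
        result = subprocess.run(
            command,
            cwd=workdir,
            env={**os.environ, **env_vars},
            capture_output=True,
            timeout=timeout,
        )
        # Read the outputs before the scratch directory is deleted.
        outputs = {
            name: Path(workdir, name).read_bytes()
            for name in output_names
            if Path(workdir, name).exists()
        }
    return result.returncode, outputs


# Stand-in "legacy code": upper-cases in.txt and appends an env. variable.
legacy = [sys.executable, "-c",
          "import os; t = open('in.txt').read(); "
          "open('out.txt', 'w').write(t.upper() + os.environ['SUFFIX'])"]
code, files = run_legacy(legacy, {"in.txt": b"grid"}, {"SUFFIX": "!"}, ["out.txt"])
print(code, files)  # 0 {'out.txt': b'GRID!'}
```

Because the wrapper only touches the program's inputs and outputs, the program itself needs no source changes, which is the property the slide attributes to the GEMLCA model.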
Legacy Application example
Workflow to analyse road traffic
[Workflow diagram: Manhattan road network generator → Traffic simulators → Analyser]
Another Legacy Application example
Molecular Dynamics Study of Water Penetration in Staphylococcal Nuclease using CHARMm
• Analysis of several production runs with different parameters following a common heating and equilibrium phase
Practical tools @ gridification
On-Line Monitoring
Mercury monitor
- to debug/optimize and visualize parallel jobs.
- both at the workflow and job levels
Practical tools @ gridification (contd.)
Hiding Grid remote storage system from a legacy application
Parrot
• a handy tool for attaching old programs to new storage systems
• EGEE (gLite module) Data Access: GFAL, LFN, GUID, SRM, RFIO, DCAP, and LFC
• does not require any special privileges or any recompiling of existing programs
• For example, an anonymous FTP service is made available to vi like so:
parrot vi /anonftp/ftp.cs.wisc.edu/RoadMap
Or an example with gsiftp:
$./parrot app_name /gsiftp/<gsiftp URL without gsiftp://>
(More info: http://www.cse.nd.edu/~ccl/software/manuals/parrot.html)
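The two command lines above follow the same pattern: the remote service name becomes the first path component and the rest of the URL follows. A minimal sketch of that mapping is given below; `to_parrot_path` is a hypothetical helper written for this illustration (it is not part of Parrot), and `se.example.org` is a made-up host name.

```python
from urllib.parse import urlsplit


def to_parrot_path(url, prefix=None):
    """Turn 'scheme://host/path' into '/<prefix>/host/path'.

    prefix defaults to the URL scheme; pass e.g. 'anonftp' to override it.
    """
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {url!r}")
    return f"/{prefix or parts.scheme}/{parts.netloc}{parts.path}"


print(to_parrot_path("ftp://ftp.cs.wisc.edu/RoadMap", prefix="anonftp"))
# /anonftp/ftp.cs.wisc.edu/RoadMap
print(to_parrot_path("gsiftp://se.example.org/data/run1.dat"))
# /gsiftp/se.example.org/data/run1.dat
```

The resulting string is what would be passed to the application as an ordinary file name when it is started under `parrot`.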
Common problems and obstacles
• Candidate applications for porting are usually huge and complex.
• Some of them use low-level network functions and/or parallel execution features of a specific non-grid environment.
• Some use non-standard or proprietary communication protocols.
• The complete source code might not be available, might not be well documented, or its “out-of-host” usage might be restricted by a license agreement.
Common obstacles (continued)
• The application might be written in many different programming languages – C, C++, C#, Java, FORTRAN etc. – or even a mixture of them.
• Applications may depend on third-party libraries or executables which are not available by default on some Grid worker nodes.
• Some application features could cause unintentional violation of Grid Acceptable Use Policies (Grid AUP).
• Furthermore, the application can have hidden security weaknesses, which can be very dangerous in case of remote Grid job execution.
Common obstacles (continued)
• Some applications are pre-compiled or optimized for use on a machine with particular processor(s) only – Intel, AMD, in 32-bit or 64-bit mode, etc. But the Grid is heterogeneous!
• The application may contain serious bugs which have never been detected while running in a non-grid environment.
• Finally, the formal procedure for accepting a new application to be ported to a Grid for production or even experimental purposes is not simple.
Therefore, porting an arbitrary application to the Grid can be a very long, difficult and expensive process!
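Two of the obstacles above (missing third-party executables and processor-specific builds) can be detected early by a small pre-flight script that the job runs on the worker node before the real application. The sketch below uses only the Python standard library; `preflight`, the tool names and the list of allowed machine types are hypothetical and would need to be adapted to the application.

```python
import platform
import shutil


def preflight(required_tools, allowed_machines=None):
    """Return a list of human-readable problems found on this node."""
    problems = []
    for tool in required_tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"missing executable on PATH: {tool}")
    machine = platform.machine()  # e.g. 'x86_64' on many Linux nodes
    if allowed_machines is not None and machine not in allowed_machines:
        problems.append(f"unsupported machine type: {machine}")
    return problems


# Example: one tool that may exist and one that certainly does not.
for issue in preflight(["gcc", "no-such-tool-xyz"]):
    print(issue)
```

Failing fast with a clear message is cheaper than debugging an obscure crash after the job has waited in a queue, which matters given how hard debugging in the Grid environment is.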
GASUC
Grid Application Support Centre
Aim of the Centre
• Provides assistance to current and future grid users and grid application developers to port legacy algorithms and applications onto large-scale grid infrastructures.
• Creates a bridge between Academia/Industry and Grid
• Supports and develops Grid applications for Grid communities
• Disseminates knowledge of concepts and tools used on Globus Toolkit 2, Globus Toolkit 4, LCG-2 and gLite grid environments.
Established in January 2007
Full EGEE support
• Announced officially in EGEE
• Follow up in EGEE3
Cost model
• For non-commercial applications: free
http://www.lpds.sztaki.hu/gasuc/
Grid Application Development Support Model
8-step support model
• Contact phase
• Pre-selection phase
• Analysis phase
• Planning phase
• Prototyping phase
• Testing phase
• Execution phase
• Dissemination and feedback phase
In the practical session
The application called MatrixDemo will be ported and executed in the SGDEMO grid environment. (The program is borrowed from the “EGEE summer school” at MTA SZTAKI, 2006.)
MatrixDemo is written in the C programming language
The SGDEMO environment (gLite based) supports C, so porting C or C++ programs is easy … hopefully
MatrixDemo program
The MatrixDemo program performs some matrix operations – inverting, multiplying, etc.
Usage:
MatrixDemo has a command line interface which accepts several arguments. Starting the program without any argument will display a short help.
Example:
MatrixDemo I V
This will Invert (I) the matrix defined in the file named INPUT1 and will store the result in the file OUTPUT with verbose details (V).
MatrixDemo program (continued)
Prerequisites:
• File MatrixDemo.c – the source code of the program.
• Files INPUT1 and INPUT2 – they contain matrix data in the following text format:
  rows, columns, cell1, cell2, cell3 …
  where rows is an integer representing the number of rows, columns represents the number of columns, and cell1, cell2 etc. are the cells of the matrix, floating point numbers separated by commas (,).
• A standard C compiler and linker. In this case we will use GNU C (gcc), already installed.
• File MatrixDemo.jdl – a prepared JDL (Job Description Language) file.
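The input format described above is simple enough to check before a job is submitted. The sketch below parses and validates one matrix in that format; it is an independent illustration, not code taken from MatrixDemo.c. Two points are assumptions: that cells are listed row by row, and that line breaks between values are tolerated.

```python
def parse_matrix(text):
    """Parse 'rows, columns, cell1, cell2, ...' into a list of row lists."""
    tokens = [t.strip() for t in text.replace("\n", " ").split(",") if t.strip()]
    if len(tokens) < 2:
        raise ValueError("expected at least 'rows, columns'")
    rows, cols = int(tokens[0]), int(tokens[1])
    cells = [float(t) for t in tokens[2:]]
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and columns must be positive")
    if len(cells) != rows * cols:
        raise ValueError(f"expected {rows * cols} cells, found {len(cells)}")
    # Assumption: cells are listed row by row (row-major order).
    return [cells[r * cols:(r + 1) * cols] for r in range(rows)]


print(parse_matrix("2, 3, 1, 2, 3, 4.5, 5, 6"))
# [[1.0, 2.0, 3.0], [4.5, 5.0, 6.0]]
```

Validating INPUT1 and INPUT2 locally in this way catches malformed data during the non-grid testing step, before the files are shipped with the job.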