An Overview of Computational Grid Technologies
Marlon Pierce
Community Grids Laboratory
Indiana University
[email protected]

Grids in I533 Context
[Figure labels: Workflow, Information, Sharing, Ontology Services; Gaussian, PubChem, etc; Data Mining; Logical File Systems; General Exec Services; General Data Services; General File Services; Web Service Core Specifications: Security, Reliability, etc; Client Environments: Portals, Taverna, etc. (Verbal description on next slide)]

Grids in I533 Context
- I533 covers a diverse set of topics.
- (Web) Services are the core abstraction.
  - Execution Services: computational chemistry, data mining, text processing
  - Data Services: PubChem, OGSA-DAI
  - Information and metadata services: ontologies, information discovery and sharing
  - Orchestration services (workflow): Taverna, BPEL, etc.
- Grids are collections of services with some glue.
  - Decentralized security, information system agreements (from monitoring to metadata), abstract execution protocols, etc.
  - Service Oriented Architecture

Brief History of Grids
- The term “Grid Computing” was coined by Dr. Larry Smarr, then director of NCSA, back in 1992.
  - The original concept: computing power should be available on demand, for a fee, just like the electrical power grid.
- Today, Grids are thought of as federations of services that span organizations.
- Grids are usually driven by science applications.
  - Most core funding comes from the DOE, NSF, UK e-Science, and other scientific agencies in the EU, Japan, China, Korea, etc.
  - These agencies all cooperate to some degree.
  - DOD has its own version of things, the Global Information Grid, that is currently unrelated.
- IBM, MS, Oracle, Sun, etc. have varying degrees of interest.
  ...

Grid Computing Research
- Historically, grid computing has been targeted at simplifying access to high performance computing and giant scientific data sets.
- Example: NSF TeraGrid
  - Includes both hardware and software along with a common administration infrastructure.
  - www.teragrid.org
  - IU is one of the partners.
- There are many overviews of Grid computing.
  - See, for example, Globus World presentations from 2004 and 2005.
  - They show lots of “gee whiz” pictures of big science problems using the Grid.
  - They usually mention seti@home and, more recently, Google and BitTorrent.
    - These annoy me. Seti@home has nothing to do with Grid computing.

Grid Computing Research
- Grid computing is large scale distributed computing research.
  - “Middleware”
- It’s not the pervasive computing power Grid originally envisioned.
  - As long as it’s research, we get to keep working on it.
- I’ll examine some key technologies for building a Grid installation, but not “the” Grid.
  - There is no Grid!
  - Dr. Dave Semeraro has his doubts.

Some Desirable Grid Characteristics
- Grids are collections of services.
  - Accessing computational facilities to run codes.
  - Accessing remote databases, data warehouses and file systems.
  - Transferring large data sets.
  - Accessing remote instruments and sensors.
- Collections are created from multiple partners: Virtual Organizations
- Must support decentralized management.
- Common security abstraction layer
- Common information infrastructure
  - Monitoring hardware and networks: required and solved
  - Finding resources (i.e. “Semantic Grid”): Research 4Ever!
- Ex: TeraGrid combines NCSA, SDSC, IU, TACC, ORNL, Purdue, ...
- Generations
  - Generation 1: UNIX daemons, command-line clients, protocol-based.
  - Generation 2: Based on Web Service standards
- Authentication: required and solved.
- Authorization: Research 4Ever!

Virtual Organization View of Deployment
[Figure labels: Physical Organisation (several); Virtual Organisation (several)]
I. Foster, www.usipv6.com/ppt/fosteripv6andGridJune2003.ppt

Grid Computing Software Examples
- Globus Toolkit (ANL, ISI): Job managers for science applications, Grid security frameworks, file management tools, etc.
- Condor (UW): A job scheduler and cycle scavenger optimally running applications on available resources.
  - “High throughput computing”
- Storage Resource Broker (SDSC): Middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing replicated data sets.
- OMII: UK e-Science program’s software arm.
- OGSA-DAI (U. Edinburgh): From the UK e-Science program. Wraps XML and relational databases as Grid services and provides a workflow client library for query processing.

Making Interoperable Tools
- There are a large number of Grid-related research projects and tools.
- They need some common protocols.
  - Not just wire protocols but also security procedure protocols.
- Two most important:
  - GSI: a global security system
  - GRAM: a global method for executing remote operations.
- Grid standards and would-be standards are defined through the Global Grid Forum.
- We will concentrate on the Globus Toolkit in these lectures, but GSI and GRAM are important to several other projects.
  - Condor, SRB, Sun Grid Engine, etc.

Globus Services Landscape
- We’ll start here.
www.griphyn.org/documents/document_server/uploaded_documents/doc--1515--GT4_GriPhyN.ppt

Grid Security Infrastructure
An overview

Grid Security Infrastructure Keywords
- Public Key Infrastructure (PKI)
  - Most Grids use asymmetric encryption keys.
  - Based on OpenSSL but with GSSAPI extensions.
  - Users have a public key and a private key.
  - I encrypt with your public key and sign with my private key.
  - Public keys can decrypt messages encrypted by private keys and vice versa.
    - Public key: encrypts a message.
    - Private key: signs a message. Only you have the private key, so only you can generate that specific signature.
  - Only you can decrypt, and you know it came from me.
  - PKI tools are part of Java’s SDK, so try them out.
- Certificate Authorities: establishing trust.
  - Can you trust a public key? Yes, if you trust the signer.
  - Large Grids have CAs. You can run your own with SimpleCA.
  - CAs can be hierarchical.

More Keywords: GSS API
- Generic Security Service API (GSSAPI)
  - PKI is slow and symmetric keys are much faster.
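As an aside on the key roles from the PKI keywords ("I encrypt with your public key and sign with my private key"), here is a minimal Python sketch using textbook RSA with tiny primes. It is only meant to show which key does what. It is not how GSI or OpenSSL implement anything, and the helper names `make_keypair` and `apply_key`, the primes, and the message value are all assumptions made for illustration. Real systems use large keys, padding, and hashing before signing.

```python
# Toy textbook RSA: illustration only, never use numbers this small for real security.

def make_keypair(p, q, e):
    """Build a (public, private) key pair from two small primes (hypothetical helper)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e; needs Python 3.8+
    return (e, n), (d, n)

def apply_key(value, key):
    """Raise value to the key's exponent modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

# "Me" (the sender) and "you" (the receiver) each own a key pair.
sender_public, sender_private = make_keypair(61, 53, 17)
receiver_public, receiver_private = make_keypair(89, 97, 5)

message = 65  # must be smaller than both moduli in this toy setup

# I encrypt with your public key and sign with my private key.
ciphertext = apply_key(message, receiver_public)
signature = apply_key(message, sender_private)

# Only you can decrypt (with your private key) ...
recovered = apply_key(ciphertext, receiver_private)
# ... and you know it came from me (my public key undoes my signature).
signature_ok = apply_key(signature, sender_public) == recovered

print(recovered == message, signature_ok)
```

The same `apply_key` function serves all four steps because, in textbook RSA, encrypting, decrypting, signing, and verifying are all modular exponentiation with one half of a key pair.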
  - GSSAPI establishes a “context” between two communicators by sharing a secret symmetric session key.
  - Very similar protocol to WS-SecureConversation.
  - A Java implementation is part of the standard SDK release.
    - Try it out, but it requires Kerberos.
- GSI uses the GSSAPI to establish security contexts.
  - We will see how to program clients in the next lecture.

Single Sign On and Delegation
- Single Sign On
  - A “Grid” implies that you can access lots of machines, but not necessarily anonymously.
    - Charged for usage: supercomputer centers issue allocations.
  - SSO is the ability to log in once, get a ticket, and access many machines without constantly providing username and password.
  - GSI is very similar to a somewhat older system called Kerberos, which you can still get.
- Delegation is the security concept that supports this.
  - In practice, GSI handles delegation by re-signing credentials.
  - Take advantage of hierarchical CA organization for trust.

Credential Delegation in GSI
Butler et al, http://www.globus.org/alliance/publications/papers/butler.pdf

A Public Key
rainier.extreme.indiana.edu% more usercert.pem
Bag Attributes
    localKeyID: 01 00 00 00
subject=/DC=org/DC=doegrids/OU=People/CN=Marlon Pierce 64229
issuer= /DC=org/DC=DOEGrids/OU=Certificate Authorities/CN=DOEGrids CA 1
-----BEGIN CERTIFICATE-----
MIIDJjCCAg6gAwIBAgICFMYwDQYJKoZIhvcNAQEFBQAwa TETMBEGCgmSJomT8ixk
----------------------[Stuff deleted]---------------------------------
rlCbtrvQjT79qYIutfFSxwre52OV7p7f/3Uufj0wO4f4hq5Jt05uof QU
-----END CERTIFICATE-----

A Private Key
rainier.extreme.indiana.edu% more userkey.pem
Bag Attributes
    localKeyID: 01 00 00 00
    1.3.6.1.4.1.311.17.1: Microsoft Enhanced Cryptographic Provider v1.0
    friendlyName: 6f50c542f27d23ca349e371673b2ff8d_2586cc29-aa584f69-b023-bbcac12e129e
Key Attributes
    X509v3 Key Usage: 10
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,42533BEF0D5016EB
xxQ8IF5UL1rFeWm4hbZBNYNB5TpHl8FqeRPOJk03fltcHyETdndP4GJqLNx HMcxk
fy9As9v49HDSpHde/3jMu9L9q8LXSkG6WmFZgI35nsqjCTcstMdNnZ2P+jxp 9sk7
-----------------------[Stuff Deleted]-----------------------------
1rts6i6ZDYFzsCpnu+rOsa0kolp+r0zRI0uiiIbOxU9jOtVTiHPsUg==
-----END RSA PRIVATE KEY-----

MyProxy Credential Repository
- Private keys are troublesome and dangerous.
  - You need to put one on every machine that you may use for initial login.
  - This increases the chance it will get stolen.
  - Can be placed on expensive smart cards.
- Solution: MyProxy Server
  - On-line credential repository.
  - Issues short-term keys to any client that knows the username and password.
  - Very convenient for Web portal applications.
J. Basney, http://grid.ncsa.uiuc.edu/myproxy/talks.html

Grid as a Virtual Organization
- Now that we have an SSO, we can set this up across many different partner sites.
  - Use one super-CA or at least mutually trust our partner CAs.
  - That is, my org will trust messages signed by your CA.
- This is the beginning of a “Virtual Organization”.
  - Real organizations contribute resources to the VO.
- VOs can be long-lived.
  - TeraGrid, Open Science Grid
- Ad-hoc Grids are more of a research issue.

GSI in Action: GridFTP
- GSI is not a service itself. You use it to build secure services.
- These services inherit several capabilities:
  - They can authenticate to each other.
  - Messages are secure.
  - You can delegate to remote services to take an action on your behalf.
- GridFTP is an example of a GSI enabled service.
  - Encrypted, non-repudiated, tamper-proof, replay-proof, etc.
  - File operations and transfers, based on the standard IETF FTP protocol.
  - Supports parallel TCP.
  - Supports striping: several GridFTP servers can act as a logical GridFTP server, each working on a different data subset.
  - A nice summary: www.nesc.ac.uk/talks/563/Day2_1020_GridFTP.ppt

GridFTP Third Party Transfer Cartoon
[Figure labels: GridFTP Client; Credential; “Move File X to Host B.”; Delegated Credential; Host A: GridFTP Source Server; Host B: GridFTP Destination Server]

GridFTP Clients
- Command line clients
  - globus-url-copy
  - uberftp
- Programming interfaces: build your own client.
  - Java and Python CoG Kits
  - Java CoG reviewed next lecture.

Grid Resource Allocation Management (GRAM)

What Is GRAM?
- GRAM is a protocol for mapping generic user requests to specific actions.
- Heritage: must execute jobs on supercomputers.
  - Interactive: use Unix fork.
  - Queue Systems: PBS, LSF, Condor, Sun Grid Engine, etc.
- This must take place as the user.
  - Allocation accounting, logging, general peace of mind at stodgy HPC centers.
- Note this is very different from e-Business.
  - You don’t need a database account to buy something from Amazon.

Pre-Web Service GRAM Components
[Figure labels: Client; MDS client API calls to locate resources; MDS: Grid Index Info Server; Site boundary; MDS client API calls to get resource info; MDS: Grid Resource Info Server; GRAM client API calls to request resource allocation and process creation; GRAM client API state change callbacks; Globus Security Infrastructure; Gatekeeper; Request; Create; Job Manager; Parse; RSL Library; Query current status of resource; Local Resource Manager; Monitor & control; Allocate & create processes; Process (three). Annotation: “Yikes...”]

GRAM Job Specifications
- The major purpose of GRAM is to execute one or more remote commands on the user’s behalf.
  - Abstract UNIX shell, PBS, Condor, etc.
- So how do you specify the command?
- Pre-Web Service Grids (i.e. based on Globus 2) use the Resource Specification Language (RSL).
- Web Service Grids (i.e. based on Globus 4) use the XML Job Description Language.

GRAM Client Tools
- You can execute remote commands using client tools.
  - We will develop Java clients next time.
- GT 2 command line examples (with RSL)
  - globusrun: all purpose client
  - globus-job-run: interactive jobs
  - globus-job-submit: batch jobs
  - globus-job-cancel: stop batch jobs
- GT 4 command line examples (with JDL)
  - globusrun-ws: all purpose client
  - globus-job-run-ws: interactive job submission
  - globus-job-submit-ws: batch job submission
  - globus-job-clean-ws: stop batch jobs

Sample RSL String
- This is an argument to globusrun.
- Use this to execute “echo” and “mpi-hello”.

(* Multijob Request *)
+(&(executable = /bin/echo)
   (arguments = Hello, Grid From Subjob 1)
   (resource_manager_name = resource-manager-1.globus.org)
   (count = 1)
 )
 (&(executable = mpi-hello)
   (arguments = Hello, Grid From Subjob 2)
   (resource_manager_name = resource-manager-2.globus.org)
   (count = 2)
   (jobtype = mpi)
 )

A Very Simple Job Description
<job>
  <executable>/bin/echo</executable>
  <directory>/tmp</directory>
  <argument>12</argument>
  <argument>abc</argument>
  <argument>this is an example string </argument>
  <environment>
    <name>PI</name>
    <value>3.141</value>
  </environment>
  <stdin>/dev/null</stdin>
  <stdout>stdout</stdout>
  <stderr>stderr</stderr>
</job>
http://www.globus.org/toolkit/docs/4.0/execution/wsgram/user-index.html#s-wsgram-user-commandline

More Details on Job Submission
- The full Job Description Schema is here: http://www.globus.org/toolkit/docs/4.0/execution/wsgram/schemas/gram_job_description.html#SchemaProperties
- You can do much more complicated things.
  - Run sequences of jobs.
  - Stage files with GridFTP.
  - Delegate jobs to other GRAMs.
- But this is controversial.
  - Lots of people have worked on job management workflow systems.
  - Several are based on Apache Ant, for example.
  - BPEL is the Web Service standard.

Grids and Web Services

Globus Services Landscape
- Now we are up here.
www.griphyn.org/documents/document_server/uploaded_documents/doc--1515--GT4_GriPhyN.ppt

Grids and Web Services
- The requirements of Grids are very similar to those of Service Oriented Architecture-based systems.
- Grid and Web Service integration began in 2002.
  - Open Grid Services Architecture: “Physiology of the Grid” paper by Foster et al.
  - Aborted start in Globus Toolkit 3, OGSI.
  - Current Globus Toolkit 4 is much more successful.
- OGSA-DAI, Condor, and SRB all have Web Service interfaces.
- Many UK e-Science projects also follow a similar approach.
  - Sometimes referred to as the “WS-I+” approach to distinguish it from the Globus/IBM approach.
  - See http://grids.ucs.indiana.edu/ptliupages/publications/WebServiceGrids.pdf
  - See OMII releases.

GT4 GRAM Structure: WSRF/WSN Poster Child
[Figure labels: Service host(s) and compute element(s); Client; Delegate; Delegation; Transfer request; RFT File Transfer; Compute element; Local job control; sudo; GT4 Java Container; GRAM services (two); GRAM adapter; GridFTP (two); FTP control; FTP data; Local scheduler; User job; Remote storage element(s)]
www.griphyn.org/documents/document_server/uploaded_documents/doc--150VDS_1.4_Plans.2005.0429.ppt

Reliable File Transfer: Third Party Transfer
- Fire-and-forget transfer
- Web services interface
- Many files & directories
- Integrated failure recovery
[Figure labels: RFT Client; SOAP Messages; Notifications (Optional); RFT Service; GridFTP Server (two), each with Protocol Interpreter, Master DSI, Slave DSI, IPC Link, IPC Receiver, and Data Channels]
www.griphyn.org/documents/document_server/uploaded_documents/doc--150VDS_1.4_Plans.2005.0429.ppt

Grid Web Service Extensions
- WSDL and SOAP form the core of Grid services.
  - WS-Addressing and the WS-Security family are important.
- Globus and friends are working to extend core Web Service standards through OASIS.
  - WS-Resource Framework (WSRF): modeling stateful resources.
  - WS-Notification: Web Service version of one-to-many messaging.

Stateful Resources and Grids
- Web Service Architectures, and thus Grids, are really message oriented, not RPC based.
  - All state should be in the SOAP message.
  - This allows messages to go through many SOAP intermediaries.
- Request/response does not really map to Grid requirements.
  - Services may take hours or days to complete, so they need callbacks.
  - Services may need to push information to listeners.
  - Ex: computational chemistry codes on TeraGrid, RFT for many TB of data.
    - “Big file 1 is done, now move big file 2”
- Grid resources may also come and go.
  - Instruments typically generate data at scheduled times.
  - Down for maintenance, upgrades, reconfiguration, etc.
- WSRF and WS-Notification attempt to solve these Grid requirements.

Web Service Resource Framework
- WSRF is a collection of WSDL specifications and associated messages.
  - WS-Resource
  - WS-ResourceProperties
  - WS-ResourceLifetime
  - WS-ServiceGroup
  - WS-BaseFault
- See http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsrf

WS-Resource
- The WS-Resource decouples a (stateful) resource from the Web Service that accesses it.
  - For example, a database is a resource that may be accessed through a Web Service.
- The resource may be defined by metadata.
  - Our database needs to provide clues to the type of data it contains.
  - Need this for discovery.
- This metadata is contained in WS-ResourceProperties.

Goals of WS-ResourceProperties
- Provide a metadata property framework for describing resources.
- Provide a Web Service interface for performing operations on these properties.
  - Query and retrieve properties.
  - Update values on a resource (controversial).
  - Subscribe to property changes.
- Use XML Schemas to hold WSDL message definitions that define the resource properties.
  - Associate these messages with WSDL portTypes.
- The actual values of the Schema are in an XML document.
  - Store it in memory, put it in a database, derive it at query time, ...
- This requires some understanding of WSDL and SOAP.
  - An upcoming lecture will cover this.

Goals of WS-ResourceLifetime
- Resources may have lifetimes.
  - For example, your quantum chemistry calculation may take a few hours.
  - This may be associated with a WS-Resource.
- WS-ResourceLifetime defines methods for
  - Destroying a resource at some future time (t=0 is allowed).
  - Learning the lifetime of a resource.
  - Extending the lifetime of a resource.

WS-Notification Core Specs
- WS-BaseNotification
  - Specs for controlling publications and subscriptions of events (i.e. resource property changes).
  - Subscribers subscribe directly to publishers.
- WS-Topics
  - Topics are used to organize messages.
  - You may publish or subscribe to a topic rather than a specific resource endpoint.
- WS-BrokeredNotification
  - Brokers decouple publishers from subscribers.

WS-Notification
- Stateful resources will need to notify one or more listeners when their state changes.
- For example, a Web lecture has many events.
  - Beginning and end of the lecture.
  - Changes in slides.
  - To my knowledge, no one has tried this.
- Real examples are based on WS-GRAM, RFT.

A Skeptical View of WSRF
- WSRF has several independent implementations.
  - WSRF.NET (UV), Python (LBL), Perl (UK), C/C++ (ANL), ...
  - But is this critical mass? What about MS, Oracle, and other big Web Service players?
- OASIS specification approval is glacial.
  - Many specs, even if approved, have died on the vine for lack of backing.
  - Many more are a mess because of complicated dependencies.
  - WS-Addressing has released many versions, screwing up many dependent specs.
- Competing specs exist.
  - MS’s WS-Eventing, for example.
- “Semantic Grid” uses an entirely different approach for metadata.
  - RDF and OWL provide more natural modeling of metadata than tree-based XML Schemas.
- Ignores UDDI as an information system.
- I ran out of room.

Future Challenges
- Real time interaction
- Joy of use
- Intuitive user interface
- Global scalability
- 1000s of simultaneous users
- Addictive
(Observation courtesy Prof. Fran Berman)
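
Looking back at the WS-ResourceLifetime goals (destroy a resource at a future time, learn its lifetime, extend its lifetime), the three operations can be sketched as a small in-memory model in Python. This is only a conceptual illustration under stated assumptions: the class name `LeasedResource` and its methods are hypothetical, times are plain seconds passed in by the caller, and nothing here reflects the actual WSRF message formats or any Globus API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LeasedResource:
    """Hypothetical stand-in for a stateful resource with a managed lifetime."""
    name: str
    termination_time: Optional[float] = None  # None: no scheduled destruction
    destroyed: bool = False

    def set_termination_time(self, when: float) -> None:
        """Schedule destruction at time `when`; calling again later extends the lifetime."""
        self.termination_time = when

    def destroy(self) -> None:
        """Destroy immediately (the t=0 case)."""
        self.destroyed = True

    def remaining(self, now: float) -> Optional[float]:
        """Learn how much lifetime is left, or None if nothing is scheduled."""
        if self.termination_time is None:
            return None
        return max(0.0, self.termination_time - now)

    def is_alive(self, now: float) -> bool:
        """A resource is alive until it is destroyed or its termination time passes."""
        if self.destroyed:
            return False
        return self.termination_time is None or now < self.termination_time

# A calculation expected to take a few hours, measured in seconds from submission.
job = LeasedResource("quantum-chemistry-run")
job.set_termination_time(3 * 3600.0)   # destroy at some future time
print(job.remaining(now=3600.0))       # learn the lifetime
job.set_termination_time(6 * 3600.0)   # extend the lifetime
print(job.is_alive(now=4 * 3600.0))
job.destroy()                          # immediate destruction
print(job.is_alive(now=4 * 3600.0))
```

Passing `now` explicitly instead of reading the system clock keeps the sketch deterministic. A real service would compare against its own clock and would have to clean up expired resources itself.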