Transcript Précis
Application Implementation on the Cell B.E. Processor: Techniques Employed
John Freeman, Diane Brassaw, Rich Besler, Brian Few, Shelby Davis, Ben Buley
Black River Systems Company Inc.
162 Genesee St., Utica, NY 13501

IBM Cell BE Processor
• Cell BE processor boasts nine processors on a single die
  • 1 Power® processor
  • 8 vector processors
• Computational Performance
  • 205 GFLOPS @ 3.2 GHz
  • 410 GOPS @ 3.2 GHz
• A high-speed data ring connects everything
  • 205 GB/s maximum sustained bandwidth
• High performance chip interfaces
  • 25.6 GB/s XDR main memory bandwidth
Excellent Single Precision Floating Point Performance

Experience, Performance, Tools & Techniques
• Share Impressions and Experience From the Past ~2 Years
• Development Tools and SDKs
• Parallelization Techniques
• Approaches for Loop Unrolling
• Use of C++ templates
• SPE Assembly Programming
• SPE Memory Management
• Performance Metrics and Tools
  • IBM’s ASMVis Tool
What Worked, What Didn’t and What Level of Performance was Achieved
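
Note on the 205 GFLOPS figure: the slide does not show how the number is derived. A minimal sketch of one plausible back-of-envelope calculation follows. It assumes (these are not stated in the slides) that only the 8 vector processors are counted, that each one works on 4 single-precision values per 128-bit SIMD register, and that each can issue one multiply-add (counted as 2 floating-point operations per value) per clock cycle. All names in the sketch are hypothetical and used for illustration only.

```cpp
// Back-of-envelope peak single-precision rate for the 8 vector processors.
// Values marked "slide" come from the deck; values marked "assumption" do not.
constexpr int    kVectorProcessors = 8;    // slide: 8 vector processors
constexpr double kClockGHz         = 3.2;  // slide: 3.2 GHz
constexpr int    kLanesPerVector   = 4;    // assumption: 128-bit register / 32-bit float
constexpr int    kFlopsPerLane     = 2;    // assumption: multiply-add counted as 2 flops

// Peak GFLOPS = processors x lanes x flops per lane per cycle x clock (GHz).
constexpr double peak_single_precision_gflops() {
    return kVectorProcessors * kLanesPerVector * kFlopsPerLane * kClockGHz;
}
```

Under these assumptions the product is 8 x 4 x 2 x 3.2 = 204.8 GFLOPS, which is consistent with the rounded 205 GFLOPS quoted on the slide. This is a theoretical ceiling, not a measured application result.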