Transcript paper.ppt
Slide 1: Parallel Apps
  November 6, 2000
  Hyang-Ah Kim, Brenda Liu, SoYoung Park

Slide 2: Outline
  Introduction
  Barnes background
  Barnes optimizations
  Ocean background
  Ocean optimizations
  Conclusion

Slide 3: Introduction
  Minimum problem size
  Scale application performance
  Programming models
  SAS CC-NUMA
  Parallel efficiency? (speedup over uniprocessor) / p

Slide 4: Barnes Background
  N-body galaxy simulation
  [Figure labels: Star on which forces are being computed; Small group far enough away to approximate to center of mass; Star too close to approximate; Large group far enough away to approximate]
  Communication pattern?
    Irregular
    Hierarchical

Slide 5: Barnes Problem Size
  Optimizations visited:
    Data placement
    Dynamic partitioning
    Prefetching
  Work needed to scale is algorithmic

Slide 6: Scaling Performance
  Performance change from 32 to 128 processors?
  Degradation: communication-to-computation ratio, communication pattern, load balance, locality, synchronization
  How can they be overcome?
    Increase problem size
    Application restructuring

Slide 7: General Findings
  Scaling to 128 processors without any change

Slide 8: Scaling Barnes
  Memory bottleneck: building shared tree (31% in 128-proc vs. 2% in uniprocessor)
  Original algorithm: globally shared tree

Slide 9: Scaling Barnes

Slide 10: Scaling Barnes
  New algorithm: MergeTree

Slide 11: Ocean Background
  Ocean simulation using multigrid solver
  Communication pattern?
    Nearest-neighbor iterative
    Hierarchical

Slide 12: Ocean Problem Size
  Optimizations visited:
    Processor-centric array data structures
    Data placement
    Prefetching
  Work needed to scale is difficult

Slide 13: Programming Models
  Options
    Shared Address Space
    Message Passing
    SHMEM
  Motivation if application is regular / predictable?
  If we can use similar algorithms and partitions across the models?

Slide 14: Ocean Discussions

Slide 15: Ocean Discussions

Slide 16: Conclusion
  Some guidelines
    Load balancing for moderate systems, communication for large systems
    Data partition & placement
  Very application dependent
    Optimization
    Programming model
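
Note on parallel efficiency: the Introduction slide defines parallel efficiency as (speedup over uniprocessor) / p. A minimal Python sketch of that arithmetic follows. The function name, argument names, and the timings in the usage lines are hypothetical and chosen only for illustration; they are not measurements from the talk.

```python
def parallel_efficiency(t_uniprocessor, t_parallel, p):
    """Return (speedup over uniprocessor) / p.

    t_uniprocessor: run time on one processor (any time unit)
    t_parallel:     run time on p processors (same unit)
    p:              number of processors
    All three names are illustrative assumptions, not from the slides.
    """
    if p <= 0 or t_parallel <= 0:
        raise ValueError("p and t_parallel must be positive")
    speedup = t_uniprocessor / t_parallel  # speedup over the uniprocessor run
    return speedup / p                     # efficiency: 1.0 means ideal scaling


# Hypothetical timings: 100 s on one processor, 2 s on 64 processors.
eff = parallel_efficiency(100.0, 2.0, 64)
print(f"efficiency = {eff:.3f}")
```

An efficiency near 1.0 means the run scales almost ideally. Lower values would reflect the degradation sources listed on the Scaling Performance slide, such as communication-to-computation ratio, load balance, and synchronization.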