Accession Number ADA584726
Title Communication Lower Bounds and Optimal Algorithms for Programs that Reference Arrays - Part 1.
Publication Date May 2013
Media Count 48p
Personal Author J. Demmel; K. A. Yelick; M. Christ; N. Knight; T. Scanlon
Abstract Communication, i.e., moving data between levels of a memory hierarchy or between parallel processors on a network, can greatly dominate the cost of computation, so algorithms that minimize communication can run much faster (and use less energy) than algorithms that do not. Motivated by this, attainable communication lower bounds were established for a variety of algorithms, including matrix computations. The lower bound approach used initially for Theta(N^3) matrix multiplication, and later for many other linear algebra algorithms, depended on a geometric result by Loomis and Whitney: this result bounded the volume of a 3D set (representing multiply-adds done in the inner loop of the algorithm) using the product of the areas of certain 2D projections of this set (representing the matrix entries available locally, i.e., without communication). Using a recent generalization of Loomis and Whitney's result, we generalize this lower bound approach to a much larger class of algorithms, which may have arbitrary numbers of loops and arrays with arbitrary dimensions, as long as the index expressions are affine combinations of loop variables. In other words, the algorithm can do arbitrary operations on any number of variables like A(i_1, i_2, i_2 - 2i_1, 3 - 4i_3 + 7i_4, ...). Moreover, the result applies to recursive programs, irregular iteration spaces, sparse matrices, and other data structures, as long as the computation can be logically mapped to loops and indexed data structure accesses. We also discuss when optimal algorithms exist that attain the lower bounds; this leads to new asymptotically faster algorithms for several problems.
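The Loomis and Whitney bound mentioned in the abstract can be checked numerically on small examples. The sketch below is illustrative only and is not taken from the report: the function name `loomis_whitney_bound` and the block size `b` are assumptions. It treats a finite set of (i, j, k) points as the multiply-adds of C[i,j] += A[i,k] * B[k,j] and compares the number of points with the square root of the product of the sizes of the three 2D projections.

```python
from itertools import product
from math import sqrt

def loomis_whitney_bound(points):
    """Return (|V|, sqrt(|P_ij| * |P_ik| * |P_jk|)) for a finite set V of (i, j, k) points.

    Hypothetical helper for illustration; the name is not from the report.
    """
    points = set(points)
    proj_ij = {(i, j) for i, j, k in points}  # entries of C touched
    proj_ik = {(i, k) for i, j, k in points}  # entries of A touched
    proj_jk = {(j, k) for i, j, k in points}  # entries of B touched
    return len(points), sqrt(len(proj_ij) * len(proj_ik) * len(proj_jk))

# A b x b x b block of matrix-multiply iterations (b is an assumed example size).
b = 4
cube = list(product(range(b), repeat=3))
cube_size, cube_bound = loomis_whitney_bound(cube)

# An irregular iteration space: only the points with i <= j <= k.
wedge = [(i, j, k) for i, j, k in cube if i <= j <= k]
wedge_size, wedge_bound = loomis_whitney_bound(wedge)
```

For the cubic block the point count equals the bound, which is why blocked matrix multiplication can attain the lower bound; for the irregular set the point count stays below the bound.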
Keywords Algorithms
Communication lower bounds
Computation science
Data structures
Data transfer
Differential equations
Geometric model
HBL theory (Hölder-Brascamp-Lieb theory)
Linear algebra
Matrix multiplication
Parallel processing
Parallel processors

Source Agency Non Paid ADAS
NTIS Subject Category 72B - Algebra, Analysis, Geometry, & Mathematical Logic
Corporate Author California Univ., Berkeley. Dept. of Electrical Engineering and Computer Science.
Document Type Technical report
Title Note Technical rept.
NTIS Issue Number 1403
Contract Number HR0011-12-2-0016
