Accession Number ADA584739
Title Algorithms for Large-Scale Astronomical Problems.
Publication Date Aug 2013
Media Count 124p
Personal Author B. Fu
Abstract Modern astronomical datasets keep growing: they already include billions of celestial objects and take up terabytes of disk space. Meanwhile, many astronomical applications do not scale well to such large amounts of data, which raises the following question: how can we use modern computer science techniques to help astronomers better analyze large datasets? To answer this question, we applied various computer science techniques to provide fast, scalable solutions to astronomical problems. We developed algorithms that work better with big data. We found that for some astronomical problems, the information that users require in each query covers only a small proportion of the input dataset. We therefore carefully organized the data layout on disk to answer user queries quickly, and the resulting technique needs only one desktop computer to handle datasets with billions of data entries. We made use of database techniques to store and retrieve data. We designed table schemas and query processing functions to maximize their performance on large datasets. Database features such as indexing and sorting further reduce the processing time of user queries. We processed large datasets using modern distributed computing frameworks. We considered frameworks widely used in the astronomy world, such as Message Passing Interface (MPI), as well as emerging frameworks such as MapReduce. The developed implementations scale well to tens of billions of objects on hundreds of compute cores. During our research, we noticed that modern computer hardware helps to solve some of the sub-problems we encountered. One example is the use of Solid-State Drives (SSDs), whose random access time is shorter than that of regular hard disk drives.
Keywords Access time
Astronomical bodies
Astronomy applications
Data bases
Distributed computing
Distributed data processing
Information retrieval
Message processing
Parallel processing
Processing equipment
Random access computer storage
Scaling factor
Solid state electronics

Source Agency Non Paid ADAS
NTIS Subject Category 54B - Astronomy & Celestial Mechanics
72B - Algebra, Analysis, Geometry, & Mathematical Logic
62 - Computers, Control & Information Theory
Corporate Author Carnegie-Mellon Univ., Pittsburgh, PA. Dept. of Computer Science.
Document Type Thesis
Title Note Doctoral thesis.
NTIS Issue Number 1403
Contract Number N/A
