Some Challenges in Building Petabyte Data Stores
Jim Gray, Ph.D.
Microsoft Research
Seminar Hosts:
UCSF Medical Information Sciences Program and
UCSF Molecular Design Institute
3:00PM
March 6, 2000
HSW 302
Abstract
The talk begins with a short survey of
some BIG databases we will face in the next decade. The web is about a terabyte
of HTML. Satellite image databases grow that much in a few hours. The Sloan
Sky Survey will be 40 terabytes; EOS/DIS will be 15 petabytes. Then there
is the global digital library that will record everything everywhere. After
this motivation and a short demonstration of the Russian/American satellite
image TerraServer we are building, the discussion shifts to simple storage
metrics: MAPS and SCANS are better storage performance metrics than KAPS
(KB objects accessed per second). When combined with $/MAPS and $/SCAN,
they show why tape is doomed as a near-line storage device. The talk revisits
the 5 minute rule for trading off DRAM for disk accesses. Then it pops
back to the global level and assesses our progress in building reliable storage
systems (good) and HSMs (abysmal).
Biographical Summary:
Dr. Gray is a specialist in database and
transaction processing computer systems. At Microsoft his research focuses
on scalable computing: building super-servers and workgroup systems from
commodity software and hardware. Prior to joining Microsoft, he worked
at Digital, Tandem, IBM and AT&T on database and transaction processing
systems. He is editor of the Performance Handbook for Database and Transaction
Processing Systems, and co-author of Transaction Processing: Concepts and
Techniques. He is a member of the National Academy of Engineering, a Fellow
of the ACM, a member of the President's Information Technology Advisory
Committee (PITAC), and editor of the Morgan Kaufmann series on Data Management.
He received the 1998 Turing Award for his work on transaction processing and
database systems.
Jim Gray, Microsoft Research, 301 Howard St #830, SF CA 94105
tel: 415-778-8222 fax -8210 Gray@Microsoft.com http://research.microsoft.com/~gray