A checkpointing algorithm for an SCI based distributed shared memory system

Kalaiselvi, S.; Rajaraman, V. (1999) A checkpointing algorithm for an SCI based distributed shared memory system. Microprocessors and Microsystems, 22 (9). pp. 515-522. ISSN 0141-9331

Full text not available from this repository.

Official URL: http://linkinghub.elsevier.com/retrieve/pii/S01419...

Related URL: http://dx.doi.org/10.1016/S0141-9331(98)00116-1

Abstract

Distributed Shared Memory (DSM) systems combine the ease of programming of shared memory parallel computers with the scalability of message passing multicomputers. The IEEE has proposed an interface standard, known as the SCI standard, for constructing DSM systems. As the number of processors in a parallel computer increases, it becomes imperative to build in fault tolerance. This article presents an algorithm for checkpointing and rollback recovery of an SCI based DSM system using the provisions of the standard. It is shown that this checkpointing and rollback recovery procedure judiciously combines the features of both shared memory and message passing distributed memory systems.

Item Type: Article
Source: Copyright of this article belongs to Elsevier Science.
Keywords: Fault Tolerant Computing; Checkpointing; Rollback Recovery; SCI Standard
ID Code: 38370
Deposited On: 29 Apr 2011 08:43
Last Modified: 29 Apr 2011 08:43
