
Saturday, November 17, 2007

Grid computing:-

Grid computing is a term in distributed computing which can have several meanings:

1. A local computer cluster which is like a "grid" because it is composed of multiple nodes.

2. Offering online computation or storage as a metered commercial service, known as utility computing, computing on demand, or cloud computing.

3. The creation of a "virtual supercomputer" by using spare computing resources within an organization.

4. The creation of a "virtual supercomputer" by using a network of geographically dispersed computers. Volunteer computing, which generally focuses on scientific, mathematical, and academic problems, is the most common application of this technology.

These varying definitions cover the spectrum of "distributed computing", and sometimes the two terms are used as synonyms. This article focuses on distributed computing technologies which are not found in traditional dedicated clusters; otherwise, see computer cluster.

Functionally, one can also speak of several types of grids:

Computational grids (including CPU scavenging grids), which focus primarily on computationally intensive operations.

Data grids, or the controlled sharing and management of large amounts of distributed data.

Equipment grids, which are built around a primary piece of equipment, e.g. a telescope, where the surrounding grid is used to control the equipment remotely and to analyze the data it produces.

Benefits:-

1. Flexibility to meet changing business needs
2. High quality of service at low cost
3. Faster computing for better information
4. Investment protection and rapid ROI
5. A shared infrastructure environment


Grids versus conventional supercomputers:-

"Distributed" or "grid" computing in general is a special type of parallel computing which relies on complete computers (with onboard CPU, storage, power supply, network interface, etc.) connected to a network (private, public or the Internet) by a conventional network interface, such as Ethernet. This is in contrast to the traditional notion of a supercomputer, which has many processors connected by a local high-speed computer bus.

The primary advantage of distributed computing is that each node can be purchased as commodity hardware which, when combined, can produce computing resources similar to those of a multiprocessor supercomputer, but at lower cost. This is due to the economies of scale of producing commodity hardware, compared with the lower efficiency of designing and constructing a small number of custom supercomputers. The primary performance disadvantage is that the various processors and local storage areas do not have high-speed connections. This arrangement is thus well-suited to applications in which multiple parallel computations can take place independently, without the need to communicate intermediate results between processors.
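To make this concrete, here is a minimal single-machine sketch in Python of the kind of workload that suits a grid: each worker estimates pi from its own random samples and communicates nothing until the final tally. The worker count and sample size are arbitrary choices for illustration, and a local process pool stands in for the grid's nodes.

    import random
    from multiprocessing import Pool

    def count_hits(args):
        """Count random points that fall inside the unit quarter-circle.
        Each work unit is fully independent: no intermediate results
        are exchanged between workers."""
        seed, samples = args
        rng = random.Random(seed)
        hits = 0
        for _ in range(samples):
            x, y = rng.random(), rng.random()
            if x * x + y * y <= 1.0:
                hits += 1
        return hits

    if __name__ == "__main__":
        samples_per_unit = 100_000
        work_units = [(seed, samples_per_unit) for seed in range(8)]
        # On a real grid each tuple would be shipped to a different node;
        # a process pool stands in for those nodes here.
        with Pool() as pool:
            hits = pool.map(count_hits, work_units)
        total = samples_per_unit * len(work_units)
        print("pi ~", 4.0 * sum(hits) / total)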

The high-end scalability of geographically dispersed grids is generally favorable, because nodes need relatively little connectivity compared with the capacity of the public Internet. Conventional supercomputers also create physical challenges in supplying sufficient electricity and cooling capacity in a single location. Both supercomputers and grids can be used to run multiple parallel computations at the same time, which might be different simulations for the same project or computations for completely different applications. The infrastructure and programming considerations needed to do this on each type of platform are different, however.

There are also differences in programming and deployment. It can be costly and difficult to write programs so that they can be run in the environment of a supercomputer, which may have a custom operating system, or require the program to address concurrency issues. If a problem can be adequately parallelized, a "thin" layer of "grid" infrastructure can allow conventional, standalone programs to run on multiple machines (but each given a different part of the same problem). This makes it possible to write and debug programs on a single conventional machine, and eliminates complications due to multiple instances of the same program running in the same shared memory and storage space at the same time.
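As a sketch of such a "thin" layer, the hypothetical standalone script below (its file name, argument names, and round-robin chunking scheme are all invented for this example) processes one slice of a shared input; a grid layer would only need to launch it on many machines with a different --part value each time.

    # worker.py -- a hypothetical standalone program; the grid layer's only
    # job is to start it on many machines with a different --part each time.
    import argparse

    def process(line):
        # Stand-in for the real per-record computation.
        return line.strip().upper()

    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument("--input", required=True)
        parser.add_argument("--part", type=int, required=True)
        parser.add_argument("--parts", type=int, required=True)
        args = parser.parse_args()
        with open(args.input) as f:
            for i, line in enumerate(f):
                # Round-robin split: this instance handles only every
                # args.parts-th line, so instances never overlap.
                if i % args.parts == args.part:
                    print(process(line))

Because each instance is an ordinary program, it can be written and debugged on a single conventional machine, e.g. with python worker.py --input data.txt --part 0 --parts 1, before being fanned out across the grid.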

Design considerations and variations:-

One feature of distributed grids is that they can be formed from computing resources belonging to multiple individuals or organizations (known as multiple administrative domains). This can facilitate commercial transactions, as in utility computing, or make it easier to assemble volunteer computing networks.

One disadvantage of this feature is that the computers which are actually performing the calculations might not be entirely trustworthy. The designers of the system must thus introduce measures to prevent malfunctions or malicious participants from producing false, misleading, or erroneous results, and from using the system as an attack vector. This often involves assigning work randomly to different nodes (presumably with different owners) and checking that at least two different nodes report the same answer for a given work unit. Discrepancies would identify malfunctioning and malicious nodes.
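A minimal sketch of that redundancy check, assuming a two-way replication factor and a synchronous run_on_node call (both invented here for illustration):

    import random

    def verify_by_redundancy(work_units, nodes, run_on_node):
        """Assign each work unit to two distinct random nodes and
        accept a result only when both report the same answer."""
        accepted, suspect = {}, []
        for unit in work_units:
            a, b = random.sample(nodes, 2)  # presumably different owners
            ra, rb = run_on_node(a, unit), run_on_node(b, unit)
            if ra == rb:
                accepted[unit] = ra
            else:
                # A mismatch identifies a malfunctioning or malicious
                # node, but not which one; re-run or escalate.
                suspect.append((unit, a, b))
        return accepted, suspect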

Due to the lack of central control over the hardware, there is no way to guarantee that nodes will not drop out of the network at random times. Some nodes (like laptops or dialup Internet customers) may also be available for computation but not network communications for unpredictable periods. These variations can be accommodated by assigning large work units (thus reducing the need for continuous network connectivity) and reassigning work units when a given node fails to report its results as expected.
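One way to picture that policy is a coordinator that stamps each handed-out unit with a deadline and silently re-queues anything unreported by then; the queue structure and one-hour timeout below are illustrative assumptions.

    import time
    from collections import deque

    class Coordinator:
        """Hands out large work units and re-queues any unit whose
        node misses its reporting deadline."""

        def __init__(self, units, timeout_s=3600.0):
            self.pending = deque(units)   # not yet handed out
            self.in_flight = {}           # unit -> deadline
            self.timeout_s = timeout_s

        def next_unit(self):
            self.reclaim_expired()
            if not self.pending:
                return None
            unit = self.pending.popleft()
            self.in_flight[unit] = time.monotonic() + self.timeout_s
            return unit

        def report(self, unit, result):
            # A late result for a reassigned unit is simply ignored.
            self.in_flight.pop(unit, None)
            return result

        def reclaim_expired(self):
            now = time.monotonic()
            for unit, deadline in list(self.in_flight.items()):
                if now > deadline:
                    del self.in_flight[unit]
                    self.pending.append(unit)  # hand to another node later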

The impacts of trust and availability on performance and development difficulty can influence the choice of whether to deploy onto a dedicated computer cluster, to idle machines internal to the developing organization, or to an open external network of volunteers or contractors.
