MCA 20 201 CA, MODULE 1

INTRODUCTION TO PARALLEL PROCESSING

Parallel processing is a method to improve computer system performance by executing two or more instructions simultaneously. The system may have two or more ALUs so that it can execute two or more instructions at the same time, or it may have two or more processors operating concurrently.

The purpose of parallel processing is to speed up the computer's processing capability and increase its throughput.

Processor with multiple functional units:

1. Evolution of Computer Systems (assignment topic)
2. Parallelism in Uniprocessor Systems

A uniprocessor (one-CPU) system can perform two or more tasks simultaneously; it is possible to achieve parallelism with a uniprocessor system.

Example: Instruction pipeline. An instruction pipeline reads instructions from memory while previous instructions are being executed in other segments of the pipeline. Thus we can execute multiple instructions simultaneously.

Note that a system that performs different operations on the same instruction is not considered parallel. Only if the system processes two different instructions simultaneously can it be considered parallel.

Basic Uniprocessor Architecture:

A typical uniprocessor computer consists of three major components:

1. The main memory.
2. The central processing unit (CPU), containing:
   - a set of general-purpose registers along with a program counter (PC);
   - special-purpose CPU status registers for storing the current state of the CPU and of the program under execution;
   - one ALU;
   - one local cache memory.
3. The input-output (I/O) subsystem.

A common synchronous bus architecture provides communication between the CPU, the main memory, and the I/O subsystem.
Example 1: supermini VAX-11/780 uniprocessor system

The CPU contains the master controller of the VAX system. There are sixteen 32-bit general-purpose registers, one of which is the program counter (PC). There is also a special CPU status register containing information about the current state of the process being executed. The CPU contains an ALU with an optional floating-point accelerator, and some local cache memory with an optional diagnostic memory. The operator can intervene in the CPU through the console, which is connected to a floppy disk.
The CPU, the main memory (2^32 words of 32 bits each) and the I/O subsystem are all connected to a common bus, the synchronous backplane interconnect (SBI). Through this bus, all I/O devices can communicate with each other, with the CPU, or with the memory. I/O devices can be connected directly to the SBI through the Unibus and its controller, or through a Massbus and its controller.

Example 2: mainframe IBM System 370/Model 168 uniprocessor computer

The CPU contains the instruction decoding and execution units as well as a cache.

Main memory is divided into four units, referred to as logical storage units (LSUs), that are four-way interleaved. The storage controller provides multiport connections between the CPU and the four LSUs.

Peripherals are connected to the system via high-speed I/O channels which operate asynchronously with the CPU.
Parallel Processing Mechanisms:

A number of parallel processing mechanisms have been developed in uniprocessor computers. We identify them in the following six categories:

1. Multiplicity of functional units
2. Parallelism and pipelining within the CPU
3. Overlapped CPU and I/O operations
4. Use of a hierarchical memory system
5. Balancing of subsystem bandwidth
6. Multiprogramming and time sharing

1. Multiplicity of functional units:

Early computers had only one ALU in the CPU, so performing a long sequence of ALU instructions took a large amount of time.

The CDC-6600 has 10 functional units built into its CPU. These 10 units are independent of each other and may operate simultaneously. A scoreboard is used to keep track of the availability of the functional units and of the registers being demanded. With 10 functional units and 24 registers available, the instruction issue rate can be significantly increased.
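A minimal sketch of the scoreboard idea described above: tracking which functional units are free so that independent instructions can be issued to them simultaneously. The unit names are illustrative only, and a real CDC-6600 scoreboard also tracks register dependencies, which this sketch omits.

```python
# Minimal sketch of a scoreboard tracking functional-unit availability.
# Unit names are illustrative, not the exact CDC-6600 unit list.
class Scoreboard:
    def __init__(self, units):
        self.busy = {u: False for u in units}

    def issue(self, unit):
        """Issue an instruction to `unit` if it is free; return True on success."""
        if self.busy[unit]:
            return False          # structural hazard: unit already in use, must stall
        self.busy[unit] = True
        return True

    def release(self, unit):
        self.busy[unit] = False   # unit finished, available again

sb = Scoreboard(["add", "multiply1", "multiply2", "divide"])
print(sb.issue("multiply1"))  # True: unit free, instruction issued
print(sb.issue("multiply1"))  # False: unit busy, issue must wait
sb.release("multiply1")
print(sb.issue("multiply1"))  # True again after release
```

With 10 such independent units, up to 10 instructions can be in execution at once, which is what raises the issue rate.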
2. Parallelism and pipelining within the CPU:

Parallel adders, using techniques such as carry-lookahead and carry-save, are now built into almost all ALUs. High-speed multiplier recoding and convergence division are techniques for exploiting parallelism and for sharing hardware resources between the multiply and divide functions. The use of multiple functional units is a form of parallelism within the CPU.

The various phases of instruction execution are now pipelined, including instruction fetch, decode, operand fetch, arithmetic logic execution, and storing of the result.

3. Overlapped CPU and I/O operations:

I/O operations can be performed simultaneously with CPU computations by using separate I/O controllers, channels, or I/O processors.
The direct memory access (DMA) channel can be used to provide direct information transfer between the I/O devices and the main memory. DMA is conducted on a cycle-stealing basis, which is transparent to the CPU.

4. Use of a hierarchical memory system:

The CPU is roughly 1000 times faster than a main memory access. A hierarchical memory system can be used to close the speed gap. The hierarchy, from innermost to outermost, is:

registers, cache, main memory, magnetic disk, magnetic tape

The innermost level is the register file, directly addressable by the ALU. Cache memory serves as a buffer between the CPU and the main memory. Virtual memory space can be established with the use of disks and tapes at the outer levels.
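How much the hierarchy closes the speed gap can be quantified with the standard average memory access time formula for a two-level hierarchy (a textbook formula, not stated explicitly in these notes; the numbers below are illustrative):

```python
# Average memory access time (AMAT) for a cache in front of main memory:
# AMAT = hit time + miss rate * miss penalty (standard textbook formula).
def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    return hit_time_ns + miss_rate * miss_penalty_ns

# Illustrative numbers: a 1 ns cache in front of 100 ns main memory.
# Even a 5% miss rate keeps the average access close to cache speed.
print(amat(1.0, 0.05, 100.0))  # ~6 ns on average, vs 100 ns with no cache
```

This is why a small fast cache, holding the immediately needed instructions and data, makes the memory subsystem appear nearly as fast as the CPU for most accesses.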
5. Balancing of subsystem bandwidth:

The CPU is the fastest unit in the computer. The bandwidth of a system is defined as the number of operations performed per unit time. In the case of main memory, the memory bandwidth is measured by the number of words that can be accessed per unit time.

Bandwidth balancing between CPU and memory: The speed gap between the CPU and the main memory can be closed by using fast cache memory between them. A block of memory words is moved from the main memory into the cache so that the immediately needed instructions are available from the cache most of the time.

Bandwidth balancing between memory and I/O devices: Input-output channels with different speeds can be used between the slow I/O devices and the main memory. The I/O channels perform buffering and multiplexing functions to transfer data from multiple disks into the main memory by stealing cycles from the CPU.

6. Multiprogramming and time sharing:

These are software approaches to achieving concurrency in a uniprocessor system.

Multiprogramming: The interleaving of CPU and I/O operations among several programs is called multiprogramming. For example, whenever a process P1 is tied up with the I/O processor performing an input-output operation, the CPU can at the same moment be assigned to another process P2. This allows the apparently simultaneous execution of programs.

Time sharing: Multiprogramming mainly deals with the sharing of the CPU by many programs. Sometimes a high-priority program may occupy the CPU for a long time while other programs wait in a queue. This problem can be overcome by time sharing, in which every process is allotted a time slice of CPU time; after its
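The time-sharing policy described above can be sketched as a round-robin scheduler. This is a minimal simulation under simplifying assumptions (fixed CPU bursts, no priorities, no I/O blocking); real schedulers handle far more.

```python
from collections import deque

def round_robin(burst_times, quantum):
    """Simulate time sharing: each process gets `quantum` cycles of CPU,
    then rejoins the back of the queue if it still has work left."""
    queue = deque(burst_times.items())   # (name, remaining cycles) pairs
    finish_order = []
    while queue:
        name, remaining = queue.popleft()
        remaining -= min(quantum, remaining)   # run for one time slice
        if remaining > 0:
            queue.append((name, remaining))    # not done: wait for next turn
        else:
            finish_order.append(name)          # done: leaves the system
    return finish_order

# P1 needs 5 cycles, P2 needs 2, P3 needs 9; time slice of 3 cycles.
print(round_robin({"P1": 5, "P2": 2, "P3": 9}, quantum=3))  # ['P2', 'P1', 'P3']
```

Note that the long-running P3 can no longer monopolize the CPU: the short process P2 finishes first, and every process receives CPU time in turn.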
respective time slice is over, the CPU is allotted to the next program; if a process is not completed, it waits in the queue for another chance to receive CPU time.

3. Parallel Computer Structures

Parallel computers are systems that emphasize parallel processing.

Divisions of parallel computers:

1. Pipeline computers
2. Array processors
3. Multiprocessor systems

1. Pipeline computers:

A pipeline computer performs overlapped computations to exploit temporal parallelism.

Normally, the process of executing an instruction in a digital computer involves four major steps:

a) instruction fetch (IF) from the main memory;
b) instruction decoding (ID), identifying the operation to be performed;
c) operand fetch (OF), if needed in the execution; and
d) execution (EX) of the decoded arithmetic logic operation.

In a nonpipelined computer, these four steps must be completed before the next instruction can be issued.
In a pipelined computer, successive instructions are executed in an overlapped fashion. A pipeline has two ends, the input end and the output end. Between these ends there are multiple stages (segments) such that the output of one stage is connected to the input of the next stage, and each stage performs a specific operation. Interface registers are used to hold the intermediate output between two stages. These interface registers are also called latches or buffers.

All the stages in the pipeline, along with the interface registers, are controlled by a common clock. If each stage takes one clock cycle, then a single instruction may take several cycles to complete.

The execution sequence of instructions in a pipelined processor can be visualized using a space-time diagram. For example, consider a processor having 4 stages, and let there be 2 instructions to be executed. We can visualize the execution sequence through the following space-time diagrams.

Nonoverlapped execution (space-time diagram for a nonpipelined processor):
Total time = 8 cycles

Overlapped execution (space-time diagram for a pipelined processor):

Total time = 5 cycles
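The cycle counts in the two diagrams follow from a simple formula: with k one-cycle stages and n instructions, a nonpipelined processor needs n * k cycles, while a pipelined one needs k + (n - 1).

```python
def nonpipelined_cycles(n_instructions, k_stages):
    # Each instruction must finish all k stages before the next one starts.
    return n_instructions * k_stages

def pipelined_cycles(n_instructions, k_stages):
    # The first instruction takes k cycles to fill the pipeline; each
    # subsequent instruction completes one cycle after its predecessor.
    return k_stages + (n_instructions - 1)

# The 4-stage (IF, ID, OF, EX), 2-instruction example from the diagrams:
print(nonpipelined_cycles(2, 4))  # 8 cycles
print(pipelined_cycles(2, 4))     # 5 cycles
```

For a long instruction stream the pipelined count approaches n cycles, i.e. one instruction completed per clock.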
Performance of Pipelined Execution:

The following parameters serve as criteria to estimate the performance of pipelined execution:

1. Speedup:
   It gives an idea of how much faster the pipelined execution is compared to non-pipelined execution.
   Speedup (S) = non-pipelined execution time / pipelined execution time

2. Efficiency:
   Efficiency (η) = speedup / number of stages in the pipelined architecture

3. Throughput:
   Throughput is defined as the number of instructions executed per unit time.
   Throughput = number of instructions executed / total time taken

Types of pipeline:

a) Uniform delay pipeline:
   In this type of pipeline, all the stages take the same time to complete an operation. In a uniform delay pipeline,
   Cycle time (Tp) = stage delay
   If buffers are included between the stages, then
   Cycle time (Tp) = stage delay + buffer delay
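Applying these three formulas to the 4-stage, 2-instruction example above (8 cycles nonpipelined, 5 cycles pipelined):

```python
nonpipelined_time = 8   # cycles, from the nonoverlapped space-time diagram
pipelined_time = 5      # cycles, from the overlapped space-time diagram
stages = 4
instructions = 2

speedup = nonpipelined_time / pipelined_time   # S = 8/5 = 1.6
efficiency = speedup / stages                  # eta = 1.6/4 = 0.4, i.e. 40%
throughput = instructions / pipelined_time     # 2/5 = 0.4 instructions/cycle

print(speedup, efficiency, throughput)  # 1.6 0.4 0.4
```

With only 2 instructions the pipeline is poorly utilized; as the number of instructions grows, the speedup approaches the number of stages (here 4) and the efficiency approaches 100%.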
b) Non-uniform delay pipeline:
   In this type of pipeline, different stages take different times to complete an operation. In this type of pipeline,
   Cycle time (Tp) = max(stage delay)
   If buffers are included between the stages,
   Tp = max(stage delay + buffer delay)

2. Array processors:

An array processor uses multiple synchronized arithmetic logic units to achieve spatial parallelism. An array processor is a synchronous parallel computer with multiple arithmetic logic units, called processing elements (PEs), that can operate in parallel in lockstep fashion. By replication of ALUs, one can achieve spatial parallelism. The PEs are synchronized to perform the same function at the same time. An appropriate data-routing mechanism must be established among the PEs.

An array processor performs computations on large arrays of data. Consider the simple task of adding two groups of 10 numbers together.
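A sketch of how this 10-number addition task maps onto an array processor: one PE per element pair, all performing the same add in lockstep. Plain Python stands in for the hardware broadcast here, so the parallelism is only conceptual.

```python
# Each index i plays the role of processing element PE_i: the control unit
# broadcasts one "add" instruction, and every PE applies it simultaneously
# to its own pair of operands held in its local memory.
group_a = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
group_b = [2, 7, 1, 8, 2, 8, 1, 8, 2, 8]

sums = [a + b for a, b in zip(group_a, group_b)]  # one lockstep step
print(sums)  # [5, 8, 5, 9, 7, 17, 3, 14, 7, 11]
```

A uniprocessor would need 10 separate add instructions executed one after another; the array processor needs a single broadcast instruction, since the 10 PEs add in parallel.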
Types of Array Processors:

There are two types of array processors:

a) Attached array processor
b) SIMD array processor

a) Attached array processor:

It is an auxiliary processor attached to a general-purpose computer. Its intent is to improve the performance of the host computer in specific numeric calculation tasks.

b) SIMD array processor:

It is an array processor that has a single-instruction, multiple-data organization. SIMD is the organization of a single computer containing multiple processors operating in parallel. The processing units are made to
operate under the control of a common control unit, thus providing a single instruction stream and multiple data streams.

Block diagram:

It contains a set of identical processing elements (PEs), each of which has a local memory M. Each processing element includes an ALU and registers.

The master control unit controls all the operations of the processing elements. It also decodes the instructions and determines how each instruction is to be executed.

The main memory is used for storing the program, and the control unit is responsible for fetching the instructions. Vector instructions are sent to all PEs simultaneously, and results are returned to the memory.
Why use an array processor?

1. Array processors increase the overall instruction processing speed.
2. As most array processors operate asynchronously with respect to the host CPU, they improve the overall capacity of the system.
3. An array processor has its own local memory, providing extra memory to systems with low memory.

3. Multiprocessor systems:

A multiprocessor system is defined as "a system with more than one processor", and, more precisely, "a number of central processing units linked together to enable parallel processing to take place". The key objective of a multiprocessor is to boost a system's execution speed. These systems have multiple processors working in parallel that share the computer clock, memory, bus, peripheral devices, etc.
Types of Multiprocessors:

1. Symmetric multiprocessors: In these systems, each processor runs an identical copy of the operating system and they all communicate with each other. All the processors are in a peer-to-peer relationship, i.e. no master-slave relationship exists between them.

2. Asymmetric multiprocessors: In asymmetric systems, each processor is given a predefined task. There is a master processor that gives instructions to all the other processors, so an asymmetric multiprocessor system contains a master-slave relationship. The asymmetric multiprocessor was the only type of multiprocessor available before symmetric multiprocessors were created.

Characteristics of multiprocessors:

1. A multiprocessor system is an interconnection of two or more CPUs with memory and input-output equipment.
2. Multiprocessors are classified as multiple instruction stream, multiple data stream (MIMD) systems.

3. Multiprocessing improves the reliability of the system.

4. Multiprocessing can improve performance by decomposing a program into parallel executable tasks.

5. Multiprocessors are classified by the way their memory is organized.
   A multiprocessor system with common shared memory is classified as a shared-memory or tightly coupled multiprocessor.
   A system in which each processor element has its own private local memory is classified as a distributed-memory or loosely coupled system.

There are several physical forms available for establishing an interconnection network.

a) Time-shared common bus: A common-bus multiprocessor system consists of a number of processors connected through a common path to a memory unit.
b) Multiport memory: A multiport memory system employs separate buses between each memory module and each CPU. The module must have internal control logic to determine which port will have access to memory at any given time. Memory access conflicts are resolved by assigning fixed priorities to each memory port.

c) Crossbar switch: Consists of a number of crosspoints placed at the intersections between processor buses and memory module paths. The small square at each crosspoint is a switch that determines the path from a processor to a memory module.
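The fixed-priority conflict resolution used by a multiport memory (and, per memory column, by a crossbar) can be sketched as follows. The port numbering is an illustrative assumption: lower port numbers are taken to have higher priority.

```python
def arbitrate(requests):
    """requests: (cpu_port, memory_module) pairs listed in fixed-priority
    order (index 0 = highest-priority port). Each memory module can serve
    at most one port per cycle; losing ports must retry next cycle."""
    granted, busy_modules = [], set()
    for port, module in requests:
        if module not in busy_modules:   # module still free this cycle
            busy_modules.add(module)
            granted.append((port, module))
    return granted

# CPUs 0 and 1 both request module 2; CPU 2 requests module 0.
print(arbitrate([(0, 2), (1, 2), (2, 0)]))
# [(0, 2), (2, 0)] -- CPU 1 loses the conflict to the higher-priority port
```

Requests to different modules proceed in parallel; only requests colliding on the same module are serialized, which is exactly the behavior the crossbar's per-module switches implement in hardware.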
Performance of Parallel Computers:

The metrics are:

1. Speedup: the ratio of the execution time on a single processor to the execution time on n processors.

2. Efficiency: the speedup divided by the number of processors n.
3. Redundancy: the ratio of the total number of unit operations performed by the n-processor system to the number performed by a single-processor system.

4. Utilization: the product of redundancy and efficiency; it indicates how effectively the processors are used during execution.

Data Flow Computers:

Data flow computer architecture is the study of special- and general-purpose computer designs in which the performance of an operation on data is triggered by the presence of the data items themselves.

Example: z = x + y * 2
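The data-driven firing rule for z = x + y * 2 can be sketched with a tiny interpreter: each operation node fires as soon as all of its operand tokens are present, with no program counter ordering the operations. The node and token names below are illustrative.

```python
# Minimal data-driven evaluation of z = x + y * 2.
# A node fires when all of its input tokens have arrived.
nodes = {
    "mul": {"op": lambda a, b: a * b, "inputs": ["y", "const2"], "output": "t"},
    "add": {"op": lambda a, b: a + b, "inputs": ["x", "t"], "output": "z"},
}

def run_dataflow(x, y):
    tokens = {"x": x, "y": y, "const2": 2}   # initial data tokens
    fired = set()
    while "z" not in tokens:
        for name, node in nodes.items():
            ready = name not in fired and all(i in tokens for i in node["inputs"])
            if ready:                         # firing rule: all operands present
                args = [tokens[i] for i in node["inputs"]]
                tokens[node["output"]] = node["op"](*args)
                fired.add(name)
    return tokens["z"]

print(run_dataflow(x=3, y=4))  # 11  (t = 4 * 2 = 8, then z = 3 + 8)
```

Note that the token x arrives before the multiply has fired; it simply waits at the add node until the token t is also available. This availability-driven scheduling is what distinguishes a data flow machine from a conventional control-flow machine.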
Activity templates are stored in the activity store. Each activity template has a unique address, which is entered in the instruction queue when the instruction is ready for execution. Instruction fetch and data access are handled by the fetch and update units. The operation unit performs the specified operation and generates the result to be delivered to each destination field in the template.

4. Architectural Classification Schemes

Three computer architectural classification schemes are:

1. Flynn's classification (1966)
2. Feng's scheme (1972)
3. Handler's classification (1977)

1. Flynn's classification:

It is based on the multiplicity of instruction streams and data streams in a computer system. This most popular taxonomy of computer architecture was defined by Flynn in 1966. Flynn's classification scheme is based on the notion of a stream of information: the term stream denotes a sequence of items (instructions or data) as executed or operated upon by a single processor.

Two types of information flow into a processor: instructions and data.

An instruction stream is a sequence of instructions as executed by the machine.

A data stream is a sequence of data, including input and partial or temporary results, called for by the instruction stream.
M. J. Flynn classifies computers on the basis of the number of instruction and data items processed simultaneously:

I.   Single Instruction Stream, Single Data Stream (SISD)
II.  Single Instruction Stream, Multiple Data Stream (SIMD)
III. Multiple Instruction Stream, Single Data Stream (MISD)
IV.  Multiple Instruction Stream, Multiple Data Stream (MIMD)

Main components:

Both instructions and data are fetched from the memory modules. Instructions are decoded by the control unit, which sends the decoded instruction stream to the processor units for execution. Data streams flow bidirectionally between the processors and the memory. Multiple memory modules may be used in the shared memory subsystem. Each instruction stream is generated by an independent control unit, and multiple data streams originate from the subsystem of shared memory modules.

I. Single Instruction Stream, Single Data Stream (SISD):

It represents the organization containing a single control unit, a processor unit and a memory unit. Instructions are executed sequentially, and the system may or may not have internal parallel processing capabilities.

A single control unit (CU) fetches a single instruction stream (IS) from memory. The CU then generates the appropriate control signals to direct a single processing element (PE) to operate on a single data stream (DS), i.e. one operation at a time.

Instructions are executed sequentially but may be overlapped in their execution stages (pipelining). Most SISD uniprocessor systems are pipelined.
An SISD computer may have more than one functional unit, all under the supervision of one control unit.

Examples: older-generation mainframes, minicomputers, workstations, and single-processor/core PCs.

Block diagram:

Example:
II. Single Instruction Stream, Multiple Data Stream (SIMD):

It represents an organization that includes many processing units under the supervision of a common control unit. All PEs receive the same instruction broadcast from the control unit but operate on different data sets from distinct data streams. The shared memory subsystem may contain multiple modules.

Examples:

Processor arrays: Thinking Machines CM-2, MasPar MP-1 and MP-2, ILLIAC IV

Vector pipelines: IBM 9000, Cray X-MP, Y-MP and C90, Fujitsu VP, NEC SX-2, Hitachi S820, ETA10

Most modern computers, particularly those with graphics processing units (GPUs), employ SIMD instructions and execution units.

Block diagram:

Example:
III. Multiple Instruction Stream, Single Data Stream (MISD):

Multiple instructions operate on a single data stream. This is an uncommon architecture, generally used for fault tolerance: heterogeneous systems operate on the same data stream and must agree on the result. Examples include the Space Shuttle flight control computer.

Block diagram:
Example:
IV. Multiple Instruction Stream, Multiple Data Stream (MIMD):

It refers to a computer system capable of processing several programs at the same time. Multiple-instruction, multiple-data-stream (MIMD) parallel architectures are made of multiple processors and multiple memory modules connected together via some interconnection network. They fall into two broad categories: shared memory and message passing.

Processors exchange information through their central shared memory in shared memory systems, and through their interconnection network in message passing systems. Examples: most current supercomputers, networked parallel computer clusters and "grids", multiprocessor SMP computers, and multi-core PCs.

Block diagram:

Example:
2. Feng's scheme (1972):

Feng's classification is based mainly on the degree of parallelism.

The maximum number of binary digits that can be processed per unit time is called the maximum parallelism degree P.

The average parallelism degree Pa is

    Pa = (P1 + P2 + ... + PT) / T

where T is the total number of processor cycles and Pi is the number of bits processed in cycle i.
The utilization of the computer system within T cycles is given by

    mu = Pa / P

When Pa = P, the utilization of the computer system is 100%. The utilization rate depends on the application program being executed.

A bit slice is a string of bits, one from each of the words at the same vertical bit position.

The maximum parallelism degree P(C) of a given computer system C is represented by the product of the word length n and the bit-slice length m; that is,

    P(C) = n * m

The pair (n, m) corresponds to a point in the computer space.

There are four types of processing methods:

i.   Word-serial and bit-serial (WSBS): has been called bit-serial processing because one bit is processed at a time.

ii.  Word-parallel and bit-serial (WPBS): has been called bit-slice processing because an m-bit slice is processed at a time.

iii. Word-serial and bit-parallel (WSBP): is found in most existing computers and has been called word-slice processing because one word of n bits is processed at a time.

iv.  Word-parallel and bit-parallel (WPBP): is known as fully parallel processing, in which an array of n x m bits is processed at one time.
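Feng's quantities can be computed directly from a trace of bits processed per cycle. The trace below is an illustrative assumption: a (32, 1) word-slice (WSBP) machine that processes a full 32-bit word in three of four observed cycles and is idle in one.

```python
def max_parallelism(n_word_bits, m_bitslice):
    # P(C) = n * m: bits processed per cycle when fully utilized
    return n_word_bits * m_bitslice

def average_parallelism(bits_per_cycle):
    # Pa: mean number of bits actually processed over the T observed cycles
    return sum(bits_per_cycle) / len(bits_per_cycle)

def utilization(bits_per_cycle, n_word_bits, m_bitslice):
    # mu = Pa / P; equals 1.0 only if every cycle processes all n*m bits
    return average_parallelism(bits_per_cycle) / max_parallelism(n_word_bits, m_bitslice)

trace = [32, 32, 0, 32]               # bits processed in each of T = 4 cycles
print(max_parallelism(32, 1))         # P = 32
print(utilization(trace, 32, 1))      # 0.75, i.e. 75% utilization
```

As the notes state, the utilization depends entirely on the application program: a different trace over the same hardware gives a different mu, while P(C) is fixed by the architecture.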
Feng's computer system classification:

3. Handler's Classification:

Handler proposed an elaborate notation for expressing the pipelining and parallelism of computers. He divided the computer into three levels:

- Processor Control Unit (PCU)
- Arithmetic Logic Unit (ALU)
- Bit-Level Circuit (BLC)

The PCU corresponds to a CPU, the ALU corresponds to a functional unit or to the PEs of an array processor, and the BLC corresponds to the logic needed for performing an operation in the ALU.
He uses three pairs of integers to describe a computer:

    Computer = (k * k', d * d', w * w')

where

    k  = number of PCUs
    k' = number of PCUs that are pipelined
    d  = number of ALUs controlled by each PCU
    d' = number of ALUs that can be pipelined
    w  = number of bits or processing elements in an ALU
    w' = number of pipeline segments

The "*" operator is used to show that units are pipelined.
The "+" operator is used to show that units are not pipelined.
The "v" operator is used to show that the computer hardware can work in one of several modes.
The "~" operator is used to show a range for a parameter.

Consider the following models and observe how Handler differentiates them on the basis of degree of parallelism and pipelining:

a) CDC 6600
b) Cray-1
c) Illiac-IV
a) CDC 6600:

This model consists of a single main processor supported by 10 I/O processors. One control unit controls one ALU with a 60-bit word length. The ALU has 10 functional units which are arranged in a pipelined manner. The 10 I/O processors work in parallel with each other and with the CPU, and each I/O processor contains a 12-bit ALU.

The description for the 10 I/O processors:

    CDC 6600 I/O = (10, 1, 12)

The description for the main processor:

    CDC 6600 main = (1, 1*10, 60)

In this model the main and I/O processors are pipelined, so the * operator is used to combine them:

    CDC 6600 = (I/O processors) * (main processor)

b) Cray-1:

This is a 64-bit single-processor computer. The ALU has 12 functional units, 8 of which can be pipelined together, and the functional units have from 1 to 14 pipeline segments.

The description for the Cray-1:

    Cray-1 = (1, 12*8, 64*(1~14))
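As a cross-check on the notation, the parallelism implied by a Handler triple can be read off as the product of all six numbers, treating each pair (x, x') as x * x'. The helper below is a hypothetical illustration of that reading, not a formula given by Handler in these notes.

```python
def degree_of_parallelism(k, k_pipe, d, d_pipe, w, w_pipe):
    """Bits in flight for a Handler description (k*k', d*d', w*w'):
    control units, times ALUs per control unit, times bits per ALU,
    each multiplied by its pipelining depth."""
    return (k * k_pipe) * (d * d_pipe) * (w * w_pipe)

# CDC 6600 main processor, (1, 1*10, 60): one PCU driving one ALU
# built from 10 pipelined functional units of 60 bits.
print(degree_of_parallelism(1, 1, 1, 10, 60, 1))   # 600

# CDC 6600 I/O subsystem, (10, 1, 12): ten non-pipelined 12-bit processors.
print(degree_of_parallelism(10, 1, 1, 1, 12, 1))   # 120
```

Under this reading the main processor carries roughly five times the bit-level parallelism of the entire I/O subsystem, which matches the intuition that the 60-bit pipelined CPU dominates the machine.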
c) Illiac-IV:

It is made up of an array of 64 processing elements connected in a mesh, each with a 64-bit ALU. It has two DEC PDP-10s as the front end, and the Illiac-IV accepts data from one PDP-10 at a time:

    PDP-10 Illiac-IV = (2, 1, 36) * (1, 64, 64)

This model can also work in a half-word mode, in which it has 128 processors of 32 bits each instead of the normal 64 processors of 64 bits each:

    PDP-10 Illiac-IV/2 = (2, 1, 36) * (1, 128, 32)

Combining this with the above, we get:

    PDP-10 Illiac-IV = (2, 1, 36) * [(1, 64, 64) v (1, 128, 32)]

5. Parallel Processing Applications

1. Predictive modeling and simulations
   a) Numeric weather prediction
   b) Oceanography and astrophysics
   c) Socioeconomics and government use
2. Engineering design and automation
   a) Finite element analysis
   b) Artificial intelligence and automation
   c) Remote sensing applications
3. Energy resources exploration
   a) Seismic exploration
4. Medical, military and basic research
   a) Computer-assisted tomography
   b) Genetic engineering
   c) Weapon research and defense
   d) Basic research problems
1. Predictive modeling and simulations:

Multidimensional modeling of the atmosphere, the earth environment, outer space and the world economy has become a major concern of world scientists. Predictive modeling is done through extensive computer simulation experiments, which often involve large-scale computations to achieve the desired accuracy and turnaround time. Such numerical modeling requires state-of-the-art computing at speeds approaching 1000 million floating-point operations per second (1000 megaflops) and beyond.

a) Numeric weather prediction (NWP): NWP uses mathematical models of the atmosphere and oceans, taking current observations of the weather and processing these data with computer models to forecast the future state of the weather. It uses data assimilation to produce its outputs.

b) Oceanography and astrophysics: Multiprocessors with large computational power and low power requirements are used to study the wealth of the oceans. ROMS was used originally, but now MPI programming methods are used. Computational astrophysics refers to the methods and computing tools developed and used in astrophysics research; PIC, PM and n-body simulations are important techniques in this field.

c) Socioeconomics and government use: Large computers are in great demand in the areas of econometrics, social engineering, government census, crime control and the modeling of the world economy. Parallel processing is used for modeling the economy of a nation or of the world. Program systems that use cluster computing to implement parallel algorithms for scenario calculation and optimization are used in such economic models; such systems serve for conducting multi-scenario calculations to design a suitable development strategy for a region.
2. Engineering design and automation:

Fast supercomputers have been in high demand for solving many engineering design problems, such as the finite element analysis needed for structural design and the wind tunnel experiments needed for aerodynamic studies. Industrial development also demands the use of computers to advance automation, artificial intelligence and remote sensing of earth resources.

a) Finite element analysis: FEA is a numerical method commonly used for multiphysics problems. It is used in the design of huge structures such as ships, dams and supersonic jets. In FEA, extremely large systems of partial differential equations must be solved concurrently, and hence parallel processing elements are used.

b) Artificial intelligence and automation: AI is intelligence exhibited by machines or software. AI systems require large amounts of parallel computing, for which parallel processors are used. Types:
   1. Image processing
   2. Expert systems
   3. Natural language processing (NLP)
   4. Pattern recognition

c) Remote sensing applications: A remote sensing application is a software application that processes remote sensing data. Such applications read specialized file formats that contain sensor image data, georeferencing information, and sensor metadata. Computer analysis of remotely sensed earth-resource data has many applications in agriculture, forestry, etc. Explosive amounts of pictorial information need to be processed in this area.
3. Energy resources exploration:

Computers can play an important role in the discovery of oil and gas and the management of their recovery, in the development of workable plasma fusion energy, and in ensuring nuclear reactor safety. Using computers in the energy area results in lower production costs and higher safety.

4. Medical, military and basic research:

In the medical area, fast computers are needed in computer-assisted tomography, artificial heart design, liver diagnosis, brain damage estimation and genetic engineering studies.

Military defense needs supercomputers for weapon design, effects simulation and other electronic warfare applications.

Almost all basic research areas demand fast computers to advance their studies.