Monday 29 October 2012

The Illiac IV, the first supercomputer

History

1946 UI faculty attempt to build a computer that can play checkers. ENIAC (Electronic Numerical Integrator And Computer) is built at the University of Pennsylvania by J. P. Eckert and John Mauchly. A year later, the President's Science Advisory Committee Panel on Computers in Higher Education states: "After growing wildly for years, the field of computing now appears to be approaching its infancy."
1948 John von Neumann, a pioneer in computer design at the Institute for Advanced Study in Princeton, New Jersey, suggests that the Illinois research group build a computer. J. Robert Oppenheimer gives Illinois permission to build a copy of von Neumann's proposed machine. John Bardeen co-invents the transistor at Bell Telephone Laboratories, for which he wins the Nobel Prize in 1956. (He wins another in 1972 for co-developing the theory of superconductivity.) He would become professor of physics and electrical engineering at the University in 1951.
1949 The U.S. Army and the University of Illinois jointly fund the construction of two computers, ORDVAC and ILLIAC. The Digital Computer Lab is organized. Ralph Meagher, a physicist and chief engineer for ORDVAC, is head.
1951 ORDVAC (Ordnance Discrete Variable Automatic Computer), one of the fastest in existence, is completed. It was ten feet long, two feet wide, eight and one-half feet high, contained 2,800 vacuum tubes, and weighed five tons.
1952 ORDVAC moves to the Army Ballistic Research Laboratory in Aberdeen, Maryland. It is used remotely from Illinois via a teletype circuit up to eight hours each night until the ILLIAC computer is completed.
ILLIAC, the first computer built and owned entirely by an educational institution, becomes operational. It was used by Lajaren Hiller, director of the Experimental Music Studio, to compose and play the Illiac Suite, the first computer-composed composition. UI faculty publish what is believed to be the first journal article in behavioral and social sciences involving a computer.
1955 A four-bit prototype transistorized computer is constructed at UI's Digital Computer Laboratory.
1957 UI faculty demonstrate a flip-flop 10 times faster than any other design in use. The Digital Computer Laboratory becomes a department in the Graduate College. Studies are underway of advances such as transistors, parallel operation, high-speed circuitry, and improved logic to better the usefulness, speed, and reliability of computers.
1958 UI establishes an experimental music studio where digital computers are used to generate music for the first time. Professor James E. Robertson, an electrical engineer who was an expert in error-checking systems, pioneers basic techniques of efficient binary division. The SRT division algorithm, now found both in hardware and software implementations of the divide instruction and widely used in the most powerful microprocessors, is named after D. Sweeney, Robertson, and K. D. Tocher, who independently invented the method at about the same time. The campus got an IBM 650, which was used in the design of research instruments like high-energy particle accelerators and radio telescopes.
1961 UI faculty demonstrate advanced "virtual load" circuits with one nanosecond rise and fall times. Using the ILLIAC as a computational engine, UI faculty introduce PLATO, ...
1962 ILLIAC II, a transistorized computer 100 times faster than the original ILLIAC, becomes operational. ACM Computing Reviews says of the machine, "...ILLIAC II, at its conception in the mid-1950s, represents, together with some other independent design projects of the same period, the spearhead and breakthrough into a new generation of machines." Researchers from the ALCOR group in Europe join UI faculty in the design of an ALGOL compiler. ILLIAC I was retired.
1963 A pattern recognition computer, being designed at Illinois since 1960, becomes the ILLIAC III project. The machine was to analyze bubble chamber photographs of high energy particle events. (Due to a building fire, it was never finished.) Professor Donald B. Gillies discovered three Mersenne prime numbers in the course of checking out ILLIAC II, including the largest then known prime number, 2^11213 - 1, which is over 3,000 digits, putting him in the Guinness Book of Records for a time.
1965 The University of Illinois and Burroughs collaborate on the development of the ILLIAC IV, the largest and fastest computer in the world. The ILLIAC IV project, headed by Professor Daniel Slotnick, pioneers the new concept of parallel computation. Slotnick had worked under John von Neumann at Princeton. ILLIAC IV was a SIMD computer (single instruction, multiple data) and it marked the first use of circuit card design automation outside IBM. It was also the first to employ ECL (Emitter-Coupled Logic) integrated circuits and multilayer (up to twelve layers) circuit boards on a large scale. Most notable was its use of semiconductor memory. Undergraduate degree program in math/computer science is established in the College of Liberal Arts and Sciences.
1966 The Department of Defense Advanced Research Projects Agency contracts the University of Illinois to build a large parallel processing computer, the ILLIAC IV, which did not operate until 1972 at NASA's Ames Research Center. The first large-scale array computer, the ILLIAC IV achieved a computation speed of 200 million instructions per second, about 300 million operations per second, and 1 billion bits per second of I/O transfer via a unique combination of parallel architecture and the overlapping or "pipe-lining" structure of its 64 processing elements.
1967 ILLIAC II is retired, the second addition to DCL was completed, and the department installed its first IBM 360, which was incorporated into the ILLINET, one of the earliest computer networks.
1972 UI Professor John Bardeen shares the Nobel Prize in physics for developing the theory of superconductivity. It is Bardeen's second; the first was for inventing the transistor. Undergraduate degree program in computer science is established in the College of Engineering.
1974 ILLIAC IV becomes operational at the Institute for Advanced Computation, Moffett Field, California. The Office of Telecommunications and Computer Services Office merged to form CCSO, the Computing and Communications Services Office.
1975 Illinois is awarded UNIX license number one by Bell Laboratories. Graduate student Greg Chesson becomes the third person to contribute to the Bell Labs UNIX kernel.
1976 Illinois researchers use computers to prove the four-color theorem, a long standing conjecture in graph theory.
1978 University of Illinois Library, the largest public university library in the country, is among the first to provide public on-line access to a major collection. Today, the catalog accesses more than four million records in UI's collection.

Illiac IV 
  • Major speedup alternatives:
    • Overlap (pipelining & buffering)
    • Multiprocessors
    • SIMD (duplicate the PE, not the CU)
      • Vector processor
  • Three earlier designs (vacuum tubes and transistors) culminating in the Illiac IV design, all at the University of Illinois
    • Logical organization similar to the Solomon (prototyped by Westinghouse)
    • Sponsored by DARPA, built by various companies, assembled by Burroughs
    • Plan was for 256 PEs, in 4 quadrants of 64 PEs, but only one quadrant was built
    • Used at NASA Ames Research Center in mid-1970s
Illiac IV Architectural Overview
  • One CU (control unit), 64 PEs (processing elements), each PE has a PEM (PE memory)
  • CU operates on scalars, PEs on vectors
    • All PEs execute the instruction broadcast by the CU, if they are in active mode
    • Each PE can perform various arithmetic and logical instructions
    • Each PE has a 2048-word 64-bit memory, can be accessed in less than 350 ns
    • PEs can operate on data in 64-bit, 32-bit, and 8-bit formats
  • Data can be routed between PEs in various ways
  • I/O is handled by a separate Burroughs B6500 computer (stack architecture)
The Illiac IV Array
  • Array = CU + processor array
  • CU (Control Unit)
    • Controls the 64 PEs (vector operations)
    • Can also execute instructions (scalar ops)
    • 64 64-bit scratchpad registers
    • 4 64-bit accumulators
  • PE (Processing Element)
    • 64 PEs, numbered 0 through 63
    • RGA = accumulator
    • RGB = for second operand
    • RGR = routing register, for communication
    • RGS = temporary storage
    • RGX = index register for instruction addrs.
    • RGD = indicates active or inactive state
  • PEM (PE Memory)
    • Each PE has a 2048-word 64-bit local random-access memory
    • PE 0 can only access PEM 0, etc.
  • PU (Processing Unit) = PE + PEM
  • Data paths
    • CU bus: 8 words of instructions or data can be fetched from a PEM and sent to the CU (instructions are distributed in the PEMs)
    • CDB (Common Data Bus): broadcasts information from the CU to all PEs
    • Routing network: PE i is connected to PE i-1, PE i+1, PE i-8, and PE i+8
    • Wraps around, data may require multiple transfers to reach its destination
    • Mode bit line: single line from RGD of each PE to the CU
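The routing rule above can be sketched in a few lines of C. This is only an illustrative model, not Illiac IV code: the function names (`wrap_pe`, `pe_neighbors`, `route_steps`) are made up for this example, and the breadth-first search simply counts how many single transfers (steps of ±1 or ±8, with wrap-around modulo 64) are needed between two PEs.

```c
#define NUM_PES 64  /* one quadrant, PEs numbered 0..63 */

/* Wrap a possibly negative PE index into the range 0..NUM_PES-1. */
static int wrap_pe(int i)
{
    return ((i % NUM_PES) + NUM_PES) % NUM_PES;
}

/* Fill out[0..3] with the four routing neighbours of PE i:
   i-1, i+1, i-8 and i+8, all taken modulo 64 (wrap-around). */
static void pe_neighbors(int i, int out[4])
{
    out[0] = wrap_pe(i - 1);
    out[1] = wrap_pe(i + 1);
    out[2] = wrap_pe(i - 8);
    out[3] = wrap_pe(i + 8);
}

/* Minimum number of single routing transfers needed to move a word
   from PE src to PE dst (both in 0..63), found by breadth-first search. */
static int route_steps(int src, int dst)
{
    int dist[NUM_PES];
    int queue[NUM_PES];          /* each PE is enqueued at most once */
    int head = 0, tail = 0;

    for (int i = 0; i < NUM_PES; i++)
        dist[i] = -1;            /* -1 means "not reached yet" */
    dist[src] = 0;
    queue[tail++] = src;

    while (head < tail) {
        int cur = queue[head++];
        int nb[4];
        if (cur == dst)
            return dist[cur];
        pe_neighbors(cur, nb);
        for (int k = 0; k < 4; k++) {
            if (dist[nb[k]] < 0) {
                dist[nb[k]] = dist[cur] + 1;
                queue[tail++] = nb[k];
            }
        }
    }
    return -1;                   /* not reached for valid inputs */
}
```

For example, `route_steps(0, 63)` is a single transfer because of the wrap-around, while `route_steps(0, 32)` needs four transfers of distance 8.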
Programming Issues
  • Consider the following FORTRAN code:
DO 10 I = 1, 64
10 A(I)=B(I)+C(I)
    • Put A(1), B(1), C(1) on PU 1, etc.
      • Each PE loads RGA from base+1, adds base+2, stores into base, where "base" is base of data in PEM
      • Each PE does this simultaneously, giving a speed up of 64
    • For less than 64 array elements, some processors will sit idle
    • For more than 64 array elements, some processors might have to do more work
  • For some algorithms, it may be desirable to turn off PEs
    • 64 PEs compute, then one half passes data to other half, then 32 PEs compute, etc.
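The points above can be mimicked with a small C model. It is a sketch under simplifying assumptions, not a description of the real instruction set: the `PE` struct, `simd_add`, `simd_sum` and `passes_needed` are hypothetical names, the loop over PEs is a sequential stand-in for 64 units working in lockstep, and the reduction ignores the fact that on the real routing network a long-distance transfer may take several steps.

```c
#define NUM_PES 64

/* Toy model of one processing unit. */
typedef struct {
    double a, b, c;  /* the A, B and C elements held in this PE's memory */
    int active;      /* mode bit: nonzero if the PE obeys broadcasts */
} PE;

/* Broadcast "A = B + C": every active PE applies the same operation
   to its own local data; inactive PEs ignore the instruction. */
static void simd_add(PE pes[NUM_PES])
{
    for (int i = 0; i < NUM_PES; i++) {
        if (pes[i].active)
            pes[i].a = pes[i].b + pes[i].c;
    }
}

/* Number of passes needed for n array elements: fewer than 64 leave
   some PEs idle, more than 64 require extra passes. */
static int passes_needed(int n)
{
    return (n + NUM_PES - 1) / NUM_PES;
}

/* Sum the 64 'a' values in log2(64) = 6 steps. At each step the upper
   half passes its data to the lower half and is then switched off, so
   32, 16, 8, 4, 2 and finally 1 PE does the adding. */
static double simd_sum(PE pes[NUM_PES])
{
    for (int width = NUM_PES / 2; width >= 1; width /= 2) {
        for (int i = 0; i < NUM_PES; i++)
            pes[i].active = (i < width);      /* turn off the upper half */
        for (int i = 0; i < width; i++)
            pes[i].a += pes[i + width].a;     /* receive and accumulate */
    }
    return pes[0].a;
}
```

In this model a 64-element add takes one pass (`passes_needed(64)` is 1), while 65 elements already need two.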
Illiac IV I/O System
  • I/O system = I/O subsystem, DFS, and a B6500 control computer
  • I/O subsystem
    • CDC (Control Descriptor Controller) interrupts the B6500 upon request by the CU, also loads programs and data from the DFS into the PEM array
    • BIOM (Buffer I/O Memory) buffers (much faster) data from DFS to CPU
    • IOS (I/O Switch) selects input from DFS vs. real-time data
  • DFS (Disk File System?)
    • 1 Gbit, 128 heads (one per track)
    • 2 channels, each of which can transmit or receive data at 0.5 Gb/s over a 256-bit bus (1 Gb/s using both channels)
  • B6500 control computer
    • CPU, memory, peripherals (card reader, card punch, line printer, 4 magnetic tape units, 2 disk files, console printer, and keyboard)
    • Manages requests for system resources
    • OS, compilers, and assemblers
    • Laser memory
      • 1 Tbit write-once read-only laser memory
      • Thin film of metal on a polyester sheet, on a rotating drum
      • 5 seconds to access random data
    • ARPA network link
      • High-speed network (50 Kbps)
      • Illiac IV system was a network resource available to other members of the ARPA network
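To put the I/O figures above in perspective, here is a rough back-of-the-envelope calculation in C. It assumes that "Gb/s" means 10^9 bits per second and "Kbps" means 10^3 bits per second, and it ignores seek time, latency and protocol overhead; the helper names are invented for this example.

```c
/* Total size of the PE memory array in bits:
   64 PEs x 2048 words x 64 bits per word. */
static unsigned long pem_array_bits(void)
{
    return 64UL * 2048UL * 64UL;
}

/* Idealised time in seconds to move `bits` at `bits_per_second`. */
static double transfer_seconds(double bits, double bits_per_second)
{
    return bits / bits_per_second;
}
```

Under these assumptions the whole PEM array is 8,388,608 bits, so `transfer_seconds(8388608.0, 1e9)` gives roughly 8.4 ms to load it from the DFS over both channels, whereas the same data over the 50 Kbps ARPA network link (`transfer_seconds(8388608.0, 50e3)`) would take close to 168 seconds.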
Illiac IV Software
  • Illiac IV delivered to NASA Ames Research Center in 1972, operational sometime (?) after mid-1975
    • Eventually operated M-F, 60-80 hours of uptime, 44 hours of maintenance / downtime
  • No real OS, no shared use of Illiac IV, one user at a time
    • An OS and two languages (TRANQUIL & GLYPNIR) were written at Illinois
    • At NASA Ames, since PDP-10 and PDP-11 computers were used in place of the B6500, new software was needed, and a new language called CFD was written for solving differential equations


Friday 26 October 2012

MEMORY SYSTEM DESIGN (part 1)

Today I am going to share something interesting about memory system design. The development of high-density, high-speed semiconductor memory has been the prime cause of the accelerated use of microprocessors. In this section we will discuss the details of designing the memory subsystem and interfacing it to the microprocessor's bus lines.
From the system designer's point of view, the following questions are of interest:

1) Word length and capacity: How does one organize memory chips so as to realize a target memory of some desired word length and capacity?
2) Arranging chips on the memory board: How are these chips arranged on a memory board as an array of rows and columns?
3) Memory allocation: What area of the microprocessor's address space should this larger memory cover?
4) Memory interface: How is the target memory interfaced to the microprocessor's external bus lines in a way that meets these assignments?
5) Memory system speed: How can one design faster memory chips?
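The first three questions can be illustrated with a small C sketch. It is not taken from any particular design: the `ChipArray` struct and the functions `chip_array` and `select_row` are hypothetical names, and the model assumes identical chips laid out as a simple rows-by-columns array that starts at a base address.

```c
/* Shape of the chip array needed to build the target memory. */
typedef struct {
    unsigned rows;   /* rows of chips: extend the capacity (word count) */
    unsigned cols;   /* chips per row: extend the word length */
    unsigned total;  /* rows * cols */
} ChipArray;

/* How many chips of chip_words x chip_bits are needed to build a
   memory of target_words x target_bits (both divisions round up). */
static ChipArray chip_array(unsigned long target_words, unsigned target_bits,
                            unsigned long chip_words, unsigned chip_bits)
{
    ChipArray a;
    a.rows = (unsigned)((target_words + chip_words - 1) / chip_words);
    a.cols = (target_bits + chip_bits - 1) / chip_bits;
    a.total = a.rows * a.cols;
    return a;
}

/* Which row of chips responds to `addr` when the memory is allocated
   starting at `base`; returns -1 if the address is outside the memory. */
static int select_row(unsigned long addr, unsigned long base,
                      unsigned long chip_words, unsigned rows)
{
    unsigned long row;
    if (addr < base)
        return -1;
    row = (addr - base) / chip_words;
    return row < rows ? (int)row : -1;
}
```

For instance, building a 64K x 16 memory from 16K x 4 chips gives `chip_array(65536, 16, 16384, 4)` = 4 rows of 4 chips, 16 chips in total; the row index returned by `select_row` is what the address decoder would turn into a chip-select signal.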

Before discussing the details, it is necessary to clarify that increasing the processor's input clock frequency alone does not guarantee an increase in overall system performance. A fast processor will be forced to slow down if the data it receives comes from a slower memory. Properly matching the processor and memory bandwidths is therefore the best way of eliminating this performance degradation. The simplest way to achieve this is to use faster memory chips. We can also use other techniques, such as the new access modes available on recent large DRAMs, or increase the memory bandwidth by designing interleaved memories.
For detailed information on the different memory and bandwidth parameters, click here.
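As a rough illustration of the interleaving idea mentioned above, the following C sketch shows low-order interleaving, where consecutive addresses fall in different banks so that their accesses can overlap. The names (`BankAddr`, `interleave`, `stream_time_ns`) are invented for this example, and the timing formula is an idealised assumption: the first word costs one full memory cycle and each further word of a sequential stream arrives after cycle/banks.

```c
/* Where an address lands in a low-order interleaved memory. */
typedef struct {
    unsigned bank;         /* which bank holds the word */
    unsigned long offset;  /* word position inside that bank */
} BankAddr;

/* Low-order interleaving: the low part of the address selects the bank,
   so addresses 0, 1, 2, ... rotate through the banks. */
static BankAddr interleave(unsigned long addr, unsigned banks)
{
    BankAddr r;
    r.bank = (unsigned)(addr % banks);
    r.offset = addr / banks;
    return r;
}

/* Idealised time to read n consecutive words: a full cycle for the
   first word, then one word every cycle_ns / banks. */
static double stream_time_ns(unsigned long n, double cycle_ns, unsigned banks)
{
    if (n == 0)
        return 0.0;
    return cycle_ns + (double)(n - 1) * cycle_ns / banks;
}
```

With a 200 ns memory cycle, reading 8 consecutive words takes 1600 ns from a single bank in this model but only 550 ns from 4 interleaved banks, which is the bandwidth gain interleaving is meant to provide.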
Here we focus on the following topics:
a) Read-Only Memory (alternatives):
   1. PROM
   2. EPROM
   3. EEPROM
b) Read and Write Memory (types):
   1. Static RAM
   2. Dynamic RAM

ROMs: The basic type of permanent storage is the ROM, also called masked ROM, which permits only reading of its contents and not writing new information into it. Technologies used for ROM manufacture include bipolar TTL and MOS (PMOS, NMOS, CMOS and FAMOS). ROM memories are programmed or "burned" initially during manufacture by using special "masks". They are very fast.

PROMs: A variation of ROM is the "programmable ROM" or "write-once memory". The time required to write information in a PROM is measured in milliseconds. PROM memories can be programmed by the customer only once and are usually manufactured in bipolar technology. PROMs are programmed using special equipment called "PROM burners". Each PROM usually has a pin-compatible counterpart in the form of a ROM. However, correcting errors is not possible with a PROM.

EPROMs: The last disadvantage of PROMs can be alleviated by using another type of ROM: the EPROM (erasable PROM). These memories are programmable more than once. The basic difference from RAM memories is that their write time is significantly longer. The contents of an EPROM memory can be erased by exposing the chip to ultraviolet light, and the memory can then be programmed again. EPROMs are more expensive and slower than bipolar PROMs.

EEPROMs: The electrically erasable PROM is a type of PROM which can have selected parts of its memory electrically changed. Electrical erasability implies the potential for programming the device without removing it from the circuit and without the need for ultraviolet light. There are two types of EEPROMs:
   1) metal nitride oxide semiconductor EEPROMs
   2) floating gate EEPROMs
The memory part can be erased in a few milliseconds and rewritten at the rate of a few hundred microseconds. Erasing and writing require threshold voltages of a few volts DC. These electrically erasable PROMs are generally too slow to be used as memories in the final product and are expensive.


.................. to be continued

Monday 15 October 2012

Different Sorting in C


Sorting means arranging a set of data in some order. There are different methods that are used to sort data in ascending order. Some of the methods are:
  1. Selection sort
  2. Bubble sort
  3. Insertion sort