The Microprocessor Behind The Personal Computer


Very basically, a microprocessor combines the functions of a CPU (Central Processing Unit) within one chip. It includes an ALU (Arithmetic Logic Unit), internal registers and a CU (Control Unit) for sequencing the system. The processor has three buses: a bi-directional data bus, a uni-directional address bus and a control bus. The data bus carries data between the various components of the system, typically between memory and the processor or an input/output controller. The address bus carries an address generated by the processor, which selects one location (for example an internal register within one of the chips attached to the system) and so specifies the source or destination of the data travelling along the data bus. The control bus carries various synchronization signals. The processor also needs a clock to provide precise timing references for the system. The 8086 processor model is still intact in the latest Pentium processors. The original design includes a Bus Unit, an ALU, an Execution Unit (EU) and an instruction queue. Later Pentium designs add cache, a paging unit, a Floating Point Unit, a branch target buffer and RISC (Reduced Instruction Set Computer) concepts in the execution unit.


To understand the PC's capabilities and performance, a brief history of the microprocessors follows.

Intel Microprocessors

Intel had introduced the 8086 processor three years before the announcement of the IBM PC. However, because of the cost of designing the personal computer around this microprocessor, IBM chose the 8088 microprocessor, also released by Intel. The 8088 has a 16 bit internal bus but only supports an 8 bit external bus, which made it easier to use the standard 8 bit peripheral chips that were already around and allowed a smaller entry level of system memory. Using the 8088 also paved the way for easy migration to the 8086 and 286 microprocessors that were to follow. The 8088 accomplished 1 MB addressing by using a technique known as segmentation, a two step process for addressing memory. First a segment register was loaded with a pointer to a 64 KB block of memory, then normal 16 bit registers could address data within that 64 KB segment. The segment register needed to be loaded with a new value to access memory outside the 64 KB. (It was not until the 80386 processor that a memory addressing mode allowed full linear mapping.)
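
The two step segmented addressing described above can be sketched in a few lines of Python. This is only an illustration: the function name physical_address is made up for this guide, and the rule that the segment value is shifted left by four bits before the offset is added comes from Intel's real mode addressing scheme rather than from the text above.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16 bit segment and a 16 bit offset into a 20 bit address.

    Hypothetical helper for illustration. The segment register holds the
    start of a 64 KB block divided by 16, so it is shifted left 4 bits
    before the offset is added.
    """
    segment &= 0xFFFF            # segment registers are 16 bits wide
    offset &= 0xFFFF             # offsets are 16 bits, giving a 64 KB window
    # The 8088 has 20 address lines, so the result wraps at 1 MB.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    # Segment 0xB800 (the colour text screen on a PC) with offset 0.
    print(hex(physical_address(0xB800, 0x0000)))
    # Different segment:offset pairs can name the same physical byte.
    print(physical_address(0x1234, 0x0010) == physical_address(0x1235, 0x0000))
```

Because the offset is only 16 bits, reaching memory outside the current 64 KB window means loading a new value into the segment register, exactly as the paragraph above describes.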
The PC AT was announced by IBM in 1984 and used the Intel 286, a 16 bit processor which supported 16 bit bus transfers, 24 bit memory addressing and protected mode memory management. Protected mode allows programs to be written so that one cannot affect another's portion of memory, which is one requirement of multitasking. Increasing the expansion bus to 16 bits while remaining backward compatible allowed existing expansion cards to work in the new architecture. All PC designs still support the 16 bit (AT) bus, so cards that were designed for the original IBM PC should still work properly in a modern motherboard (some manufacturers will stop supporting this bus soon).
The 80386 microprocessor was announced by Intel in 1985. The processor could process 32 bit data and access memory over a 32 bit bus. Chips were added to the motherboard to allow the AT bus to run asynchronously to the processor's clock, so that the two speeds were independent. The memory was also moved from the external bus to the microprocessor's local bus, so it no longer depended on the speed of the external bus. Adding cache memory (much faster, smaller and more expensive than main memory) to the local bus sped up the whole system and relieved a bottleneck by freeing the external bus from some of its constraints. Windows applications also pushed the PC to its limits, and soon graphics card performance became the bottleneck in PC system performance. The 386 introduced linear addressing along with demand paging, which provides virtual addressing: the processor automatically detects when a block of memory is not in system memory and needs to be retrieved from the hard disk. The 386 also allowed Virtual 8086 mode, in which each user or task could operate as though it had the entire system to itself. 32 bit operating systems can use the full features of the 386 protected modes and offer 32 bit support. Bank interleaving sped up memory access by partitioning memory into multiple blocks that could be accessed simultaneously. The 386 was later shipped in a version with a 16 bit external data bus, which lowered system costs in a competitive market, and was named the 386SX. The original 386 was renamed the 386DX.
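
Demand paging is easier to picture with a toy model. The Python sketch below is an illustration only: the class DemandPager and its attributes are invented for this guide, the "hard disk" is just a dictionary, and nothing is ever paged back out. The 4 KB page size is the one used by the 386.

```python
PAGE_SIZE = 4096  # the 386 uses 4 KB pages


class DemandPager:
    """Toy model of demand paging (hypothetical, not how any real OS does it)."""

    def __init__(self):
        self.resident = {}  # page number -> page contents held in system memory
        self.disk = {}      # page number -> page contents saved on the hard disk
        self.faults = 0     # how many times a page had to be fetched

    def read(self, linear_address: int) -> int:
        # Split the linear address into a page number and an offset in the page.
        page, offset = divmod(linear_address, PAGE_SIZE)
        if page not in self.resident:
            # Page fault: the page is not in system memory, so fetch it
            # from disk (or start with a zero filled page if it is new).
            self.faults += 1
            self.resident[page] = self.disk.get(page, bytearray(PAGE_SIZE))
        return self.resident[page][offset]


if __name__ == "__main__":
    pager = DemandPager()
    pager.disk[2] = bytearray([7]) * PAGE_SIZE
    pager.read(2 * PAGE_SIZE + 5)   # first touch causes a page fault
    pager.read(2 * PAGE_SIZE + 6)   # same page, already resident
    print("page faults:", pager.faults)
```

The program doing the reading never sees the difference between a resident page and one that had to be fetched, apart from the delay, which is what lets each task behave as though it has the memory to itself.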
In April 1989 Intel announced the 486 microprocessor. Apart from performance gains, not much changed in the architecture; however, the new chip took advantage of advances in transistor size by adding a math coprocessor and a small amount of cache on the chip. The processor bus had changed somewhat from the 386 to allow burst transfers: only one pointer, the start address, needed to be loaded to transfer a block of memory. When the graphics card became the major bottleneck in the PC system, VESA (Video Electronics Standards Association) took the 486 local bus and extended it to VESA local bus slots, placed behind the AT bus slots so that one card could use both buses. This provided high speed peripheral performance without replacing the function of the AT bus. Every time Intel introduced a new microprocessor they changed the processor bus, so the chipset had to change. To solve this problem Intel introduced a new bus called PCI (Peripheral Component Interconnect) that attaches to the microprocessor bus via a local bus to PCI bridge chipset. Only the bridge chip needs to change if the microprocessor and local bus design change, which Intel does to improve functionality and speed and to take advantage of new technology. External buses were reaching a limit. Incorporating on chip cache meant it was possible to run the processor at much higher clock rates internally while the external bus ran at lower speeds. A PLL (Phase Locked Loop) accepts a reference clock as input and multiplies or divides it, so that the processor's internal clock can run faster than the external bus. The 486 family also offered a System Management Mode, which is totally hidden from the other modes but can be entered from any of them. It was developed for notebook technology, to allow power management functions to be performed transparently to the operating system and applications.
The 486 processor was eventually released in an SX version, which had no math coprocessor. The original 486 was renamed the 486DX.
The Intel Pentium processor (P5) was introduced in March 1993. The design greatly enhanced the math coprocessor performance and increased the size of the on chip cache. The Pentium local bus is 64 bits wide and, under certain conditions, the processor can execute two instructions in a single clock cycle. The Pentium also includes advanced system integrity features, such as parity checking on each byte of data transferred on the external bus and parity generated for the address bus. Internal parity checking is done on the instruction and data caches, on nearly all internal registers and on the internal ROM instructions and data. The Pentium will also shut down if internal errors are detected. Did you notice that Intel stopped using x86 numbers to describe the model of processor and favored a name instead?
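
The per byte parity checking mentioned above can be sketched as follows. This is a simplified illustration, not Intel's circuit: the function names are invented for this guide, and even parity (the parity bit makes the total count of 1 bits even) is assumed here.

```python
def parity_bit(byte: int) -> int:
    """Return the even parity bit for one byte (assumed scheme, see above)."""
    # If the byte already holds an odd number of 1 bits, the parity bit
    # must be 1 so that the nine bits together hold an even number.
    return bin(byte & 0xFF).count("1") % 2


def bus_parity(word: int, width_bytes: int = 8) -> list:
    """One parity bit for each byte of a 64 bit bus word, low byte first."""
    return [parity_bit((word >> (8 * i)) & 0xFF) for i in range(width_bytes)]


def byte_is_valid(byte: int, received_parity: int) -> bool:
    """Check a received byte against the parity bit that came with it."""
    return parity_bit(byte) == received_parity


if __name__ == "__main__":
    sent = 0b10110010
    p = parity_bit(sent)
    corrupted = sent ^ 0b00001000      # a single bit flipped in transit
    print(byte_is_valid(sent, p), byte_is_valid(corrupted, p))
```

Parity of this kind detects any single bit error in a byte, but two flipped bits in the same byte cancel out and go unnoticed.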
The Intel Pentium Pro processor was introduced in November 1995. It is a superpipelined, superscalar processor supporting ECC (Error Correcting Code), Fault Analysis & Recovery, Functional Redundancy Checking, multi-branch prediction and data flow analysis. It supports multiple processors and is supplied with 16 KB of L1 cache and 256 KB or more of L2 cache in the same package, which operates at the full processor core speed. The processor can address 64 GB of main memory through the addition of four more address lines. Internally this is a RISC-like chip with a hardware x86 translator in front of it. Several techniques are used by this chip to produce more performance. A performance increase is achieved by dividing processing into more stages, and three instructions can be decoded in each clock cycle, as opposed to two for the Pentium. In addition, instruction decoding and execution are decoupled, so instructions can still be executed if one pipeline stalls. The Pentium Pro was first aimed at the server market and optimised to run 32 bit code.
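
The link between address lines and memory size is simple arithmetic: each extra address line doubles the number of bytes that can be addressed. The short Python sketch below (the function names are invented for this guide) works through the figures quoted in this history: 20 lines on the 8088, 24 on the 286, 32 on the 386 and 36 on the Pentium Pro.

```python
def addressable_bytes(address_lines: int) -> int:
    """Number of distinct byte addresses that fit on the address bus."""
    return 2 ** address_lines


def describe(address_lines: int) -> str:
    """Format the addressable size using KB, MB and GB (powers of 1024)."""
    size = addressable_bytes(address_lines)
    for unit in ("bytes", "KB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            return f"{size} {unit}"
        size //= 1024


if __name__ == "__main__":
    for lines in (20, 24, 32, 36):
        print(lines, "address lines:", describe(lines))
```

Four extra lines multiply the range by 16, which is how the step from 32 to 36 lines takes the limit from 4 GB to 64 GB.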
The Intel Pentium MMX (P55C) processor was announced (quietly) in January 1997, followed by an uproar from consumers whose pre-Christmas PC purchases did not include the new processor. The MMX (Matrix Math Extensions or Multimedia Extensions) chip incorporates a lot of RISC (Reduced Instruction Set Computer) architecture as opposed to CISC (Complex Instruction Set Computer), and will be the subject of another guide. Multimedia extensions enhance audio, video playback and graphics performance. All later Intel processors support MMX extensions. The MMX processor was also the last in the line to be mounted in a ZIF socket on the motherboard; Intel then moved to a slot design (presumably so that they could patent the design and stop AMD from taking over the market as the most popular desktop processor).
The Pentium II processor from Intel includes MMX instructions (which enhance multimedia performance) and has 32 KB of onboard L1 cache. The L2 cache is mounted alongside the CPU on a riser card (the Single Edge Contact cartridge), interconnected by the DIB (Dual Independent Bus), and the card fits into a slot on the motherboard. The Pentium II includes SMP (Symmetric Multiprocessing) support for 2 CPUs through the GTL+ bus and uses two MMX execution units. The secondary cache is supplied with ECC. Pentium Pro and Pentium II processors contain a bug in the FPU (Floating Point Unit): the conversion of certain large negative numbers into integers sometimes fails to detect an overflow. Software solutions are available.
In February 1999, Intel unveiled its latest processor, the Pentium III. In addition to being faster than the Pentium Pro and Pentium II processors, the Pentium III has many new features, including a unique processor ID and new processor instructions called Streaming SIMD Extensions, or SSE. SIMD stands for Single Instruction Multiple Data, the capability to process more than one data element in one instruction. Though SSE adds new features, existing applications are not affected. These new instructions do for the Pentium II what MMX did for the Pentium.
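
The SIMD idea can be illustrated in Python, although real SSE code is written in assembler or with compiler support. In this sketch the function names pack4 and simd_add are invented for this guide; the 16 byte value stands in for one 128 bit SSE register holding four single precision numbers, and the loop inside simd_add only imitates what the hardware does in a single instruction.

```python
import struct


def pack4(values) -> bytes:
    """Pack four numbers into 16 bytes, like one 128 bit SSE register."""
    return struct.pack("<4f", *values)


def simd_add(a: bytes, b: bytes) -> bytes:
    """Add two packed registers element by element.

    Illustration only: the processor performs all four additions with
    one instruction, whereas this Python model loops over the elements.
    """
    xs = struct.unpack("<4f", a)
    ys = struct.unpack("<4f", b)
    return struct.pack("<4f", *(x + y for x, y in zip(xs, ys)))


if __name__ == "__main__":
    result = simd_add(pack4([1, 2, 3, 4]), pack4([10, 20, 30, 40]))
    print(struct.unpack("<4f", result))
```

Without SIMD the same work would take four separate add instructions, one for each pair of numbers.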

The competition

To further complicate processor options, several manufacturers introduced clone processors. AMD (Advanced Micro Devices) produced 286 processors under license from Intel and then claimed the license covered 386 and 486 designs. Up until the introduction of the K5 (Pentium equivalent), there were no real performance or functionality gains over Intel's processors. The K5 is not a clone of the Intel Pentium and claims performance gains due to superscalar design features such as dual pipelines, branch prediction and execution in anticipation of a branch. AMD introduced the K6 in 1997; like the Intel Pentium MMX it supported MMX instructions, and it used 64 KB of L1 cache. The chip fits into existing processor sockets on the motherboard, unlike the Intel slot design, which needed a new motherboard. The AMD K6-2 is similar to the K6 except that it offers 3DNow! technology, which is AMD's extension of MMX - but much more powerful. The K6-2 has been reported to outperform a Pentium II machine of an equivalent clock speed. The K6-2 also introduced the 100 MHz FSB (front side bus). The K6-2 should work in any Socket 7 system that takes a K6; however, the K6-2 requires a lower voltage. The K6-2 is the best performing member of the Pentium-compatible family of Socket 7 processors. The K6-III is a higher performance version of the K6-2, due to its tri-level cache design and improved manufacturing process. The K6-III is roughly comparable in performance to the Pentium II. I do not have a lot of details about the AMD processors at the time of typing.
Cyrix designed their processors from the ground up, using none of Intel's technology. The initial designs of 386 and 486 processors are not actual clones of Intel's processors, but hybrids. All the designs use a 486 like processor with a five stage pipeline, which allows several instructions to be in progress at once. However, a smaller amount of data and instruction cache has been added to the chip. Cyrix also licensed its processor design to IBM, SGS-Thomson and Texas Instruments. Texas Instruments also developed its own version of the 486 processor with larger caches and PCI bus interfaces built in.

The Maths CoProcessor 

This processor goes by several names: the coprocessor, the math coprocessor, the floating point processor and the NPX (Numerical Processor Extension). The main processor can only work directly with whole (integer) numbers. Maths functions perform calculations on numbers in non-integer format, so Intel introduced the maths coprocessor, capable of performing numeric operations 20 to 100 times faster than equivalent software routines running on the integer processor. The trend is to have the math coprocessor integrated on the same chip as the integer processor. In the past Intel based computers were slow compared to RISC workstations, but with the release of the Pentium processor Intel redesigned the structure and functions, so performance is 5 to 10 times that of 486 processors and competitive with RISC workstations. The math coprocessor is also capable of handling integers and packed numeric data. The math coprocessor can output data in several formats; internally all data is represented as temporary real numbers, a standard 80 bit format. To software, the coprocessor appears as additional registers, data types and instructions. The coprocessor has a number of embedded constants such as pi, functions such as sine, cosine and tangent, and arithmetic functions in addition to add and subtract.
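
The 80 bit temporary real format can be taken apart with a short Python sketch. The field layout used here is not given in the text above; it is the layout documented by Intel for the coprocessor: 1 sign bit, a 15 bit exponent with a bias of 16383, and a 64 bit significand whose top bit is the whole number part. The function name is invented for this guide, and infinities and NaNs are simply rejected.

```python
from fractions import Fraction

EXPONENT_BIAS = 16383


def decode_temporary_real(raw: int) -> Fraction:
    """Decode an 80 bit temporary real, given as an integer, exactly."""
    sign = (raw >> 79) & 1
    exponent = (raw >> 64) & 0x7FFF
    significand = raw & ((1 << 64) - 1)
    if exponent == 0x7FFF:
        raise ValueError("infinity or NaN is not handled in this sketch")
    if exponent == 0:
        exponent = 1  # denormal numbers use the smallest normal exponent
    # The significand is a fixed point number with 63 bits after the point.
    value = Fraction(significand, 1 << 63) * Fraction(2) ** (exponent - EXPONENT_BIAS)
    return -value if sign else value


if __name__ == "__main__":
    one = (0x3FFF << 64) | (1 << 63)
    print(decode_temporary_real(one))
    # The bit pattern of the coprocessor's built in constant pi.
    print(float(decode_temporary_real(0x4000C90FDAA22168C235)))
```

The 64 bit significand gives more precision than the 32 and 64 bit formats that programs normally store in memory, which is why it is used for intermediate results inside the coprocessor.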

The Future

VLIW (Very Long Instruction Word) processors receive several instructions packed into a single instruction word by the compiler, and execute them on a set of parallel execution units for simultaneous processing. Most programs process blocks of instructions between branches, and these blocks are typically small. If the software is compiled so that instructions and tasks are arranged such that each very long instruction word contains no branching, the pipelines won't stall. This scheduling by the compiler, rather than by the processor, is the idea behind VLIW.
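
What the compiler does for a VLIW processor can be sketched as packing instructions into words. The Python below is a deliberately simplified, hypothetical scheduler invented for this guide: it keeps instructions in program order, ignores branches, and starts a new word whenever the current one is full or the next instruction needs a result produced in the same word.

```python
from typing import List, NamedTuple, Tuple


class Op(NamedTuple):
    """One simple instruction: a destination register and its sources."""
    dest: str
    srcs: Tuple[str, ...]


def pack_words(ops: List[Op], width: int = 3) -> List[List[Op]]:
    """Greedily pack instructions into long instruction words.

    Illustration only; real VLIW compilers reorder code and are far
    more sophisticated than this in-order packing.
    """
    words: List[List[Op]] = []
    current: List[Op] = []
    written = set()  # registers written by instructions in the current word
    for op in ops:
        depends = op.dest in written or any(s in written for s in op.srcs)
        if current and (len(current) == width or depends):
            words.append(current)
            current, written = [], set()
        current.append(op)
        written.add(op.dest)
    if current:
        words.append(current)
    return words


if __name__ == "__main__":
    program = [
        Op("a", ("x", "y")),
        Op("b", ("x", "z")),
        Op("c", ("a", "b")),   # needs a and b, so it starts a new word
        Op("d", ("c",)),
    ]
    for word in pack_words(program):
        print([op.dest for op in word])
```

The first two instructions do not depend on each other, so they can share one word and run on separate execution units at the same time; the last two each have to wait for the word before them.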

To be continued...

 

Send mail to totalsupport.www@virgin.net with questions or comments about this web site.
Copyright © 2000 TotalSupport Computer Workshop
Last modified: December 22, 2000