This article continues a lengthy series. You may be interested in the start of Silicon Valley, Fairchild, the founding of Intel, the start of the x86 architecture, Intel’s pivot to become a processor company, the i960 and i486, the Intel Inside campaign, and the FDIV bug and the Pentium Pro.
On the 8th of January in 1997, Intel introduced the Pentium MMX (P5). Officially, MMX was a meaningless initialism, but at some point during development it meant Matrix Math Extensions. Intel had learned from experience that advertising matters, and this meant that the Pentium MMX had a rather large advertising budget.
The purpose of MMX was to accelerate multimedia, graphics, and certain productivity workloads. The architecture extension did this quite well, but it wasn’t a performance boost that was immediately gained just by purchasing a new chip. Software had to be rewritten to take advantage of the new instructions, and not all software categories were candidates. To get the most advantage out of MMX, an application would need to have small native data types like 8bit pixels or 16bit audio samples, have repeated and computationally intense operations done on such data types, and have quite a bit of parallel execution potential. This is due to MMX working only with integers and being single-instruction, multiple-data (SIMD) in design; that is, a single instruction performed the same action on multiple data elements at the same time. Intel had the kind of expertise required to build this from earlier work on the i860 and i750, but they weren’t the only company using SIMD. This kind of design had been on display in UNIX workstation chips like SPARC with VIS and PA-RISC with MAX-2. This was, however, completely new for a consumer level chip.
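To make the SIMD idea concrete, here is a minimal Python sketch of a packed, saturating byte addition, the kind of operation MMX performs on eight 8bit values at once. It models the concept only, not the hardware: the function names and the sample pixel values are hypothetical, and a loop over lanes stands in for what the silicon does in a single instruction.

```python
LANES = 8        # eight 8-bit lanes in one 64-bit value
LANE_BITS = 8
LANE_MAX = 0xFF


def unpack_bytes(packed: int) -> list[int]:
    """Split a 64-bit integer into eight unsigned bytes, lowest lane first."""
    return [(packed >> (LANE_BITS * i)) & LANE_MAX for i in range(LANES)]


def pack_bytes(lanes: list[int]) -> int:
    """Combine eight unsigned bytes into one 64-bit integer, lowest lane first."""
    packed = 0
    for i, value in enumerate(lanes):
        packed |= (value & LANE_MAX) << (LANE_BITS * i)
    return packed


def packed_add_saturate(a: int, b: int) -> int:
    """Add corresponding byte lanes, clamping each sum at 255 instead of wrapping."""
    sums = [min(x + y, LANE_MAX) for x, y in zip(unpack_bytes(a), unpack_bytes(b))]
    return pack_bytes(sums)


# Brighten eight made-up 8-bit pixels by 100 in one logical operation.
pixels = pack_bytes([10, 50, 100, 150, 200, 250, 0, 255])
boost = pack_bytes([100] * 8)
brightened = unpack_bytes(packed_add_saturate(pixels, boost))
print(brightened)  # [110, 150, 200, 250, 255, 255, 100, 255]
```

Saturation matters for pixels and audio samples: a bright pixel that overflows should stay white rather than wrap around to black, which is why this sketch clamps each lane instead of letting it roll over.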
The greatest magic trick Intel performed with MMX was that it was mostly independent of the rest of the chip design. The primary change to enable the efficient use of MMX on the P5 was in instruction decoding to allow two MMX instructions to be decoded, scheduled, and issued in a single clock cycle. Yet, keeping MMX largely independent meant that Intel could first offer the technology in the P5 which was an iterative development from the earlier i386 and i486 chips, but then later offer the technology in the newer P6 which was a completely new reimplementation of the x86 architecture. In this sense, it was something like an on-die accelerator, not completely unlike having an NPU on-die in CPUs today.
MMX added new registers (MM0 to MM7), four data types (packed byte, packed word, packed doubleword, quadword), and 57 instructions. These instructions and the silicon that made them work offered a speed increase of about 30% to 500% depending upon the given operation. The one catch was in context switching which could take up to 50 CPU cycles depending upon the software in use at the time. In real world testing, the extra $41 that a customer would spend on a Pentium MMX over a regular Pentium was well worth it, but the performance benefit wasn’t usually as high as Intel had claimed. For example, if a user was in Photoshop where the MMX instructions were in use and providing benefit, but that user then switched over to an older version of Word to embed an image in a document, the context switch could cost 50 cycles and slow the machine considerably. Yet, the Pentium MMX did have a larger cache, and this offered performance benefits of 10% to 30% in unmodified code. So, taken as a whole, it was still well worth the spend.
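The four data types listed above are four ways of reading the same 64 bits. The sketch below uses Python's standard `struct` module to view one arbitrary 64bit value as packed bytes, packed words, packed doublewords, and a single quadword. The value and variable names are made up for illustration, and little-endian byte order is used because that is what x86 uses.

```python
import struct

# One arbitrary 64-bit value standing in for the contents of an MMX register.
register = 0x0807060504030201

# Serialize as 8 raw bytes in little-endian order.
raw = struct.pack("<Q", register)

packed_bytes = struct.unpack("<8B", raw)        # 8 lanes of 8 bits
packed_words = struct.unpack("<4H", raw)        # 4 lanes of 16 bits
packed_doublewords = struct.unpack("<2I", raw)  # 2 lanes of 32 bits
quadword = struct.unpack("<Q", raw)             # 1 lane of 64 bits

for name, lanes in [
    ("packed byte", packed_bytes),
    ("packed word", packed_words),
    ("packed doubleword", packed_doublewords),
    ("quadword", quadword),
]:
    print(f"{name:18} {[hex(lane) for lane in lanes]}")
```

The narrower the lane, the more elements a single instruction touches, which is why workloads built on 8bit pixels or 16bit audio samples stood to gain the most.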
At this time, Netscape and Java were hot buzzwords; the internet and Windows 95 were continuing to drive PC sales, and Intel was benefiting greatly from these developments. With the internet’s growth, commodity PCs running Linux and the Apache web server (having surpassed NCSA’s httpd in market share during 1996) were beginning to move Intel’s consumer and workstation hardware into datacenters. This was a huge change, and one that would only become more important over time.
AMD hadn’t been sleeping. The launch of Pentium Pro and the Pentium MMX had required a response, and that response was the AMD K6 launched on the 2nd of April in 1997. Like the Pentium Pro, it moved to an internal RISC architecture. It was built of 8.8 million transistors on a 350nm CMOS process. It featured a 32K data cache, 32K instruction cache, MMX, 66MHz FSB, and a clock rate up to 233MHz. It even slotted into Socket 7 like a Pentium. For a brief moment, AMD held the performance crown.
On the 7th of May in 1997, Intel launched the Pentium II. Just weeks after the K6 was released and while AMD was still struggling to get high-end parts onto store shelves, the Pentium II was out and readily available. To make things worse for AMD, the Pentium II was good; it was really good. Essentially, the Pentium II was a Pentium Pro retrofitted for a lower price point. The slow 16bit code execution was fixed by adding segment register caches and a flag to skip pipeline flushes. Then, Intel moved the L2 cache off chip, increased the L2 cache, added the new MMX technology, and provided the off-chip L2 and the CPU itself in a single package known as Slot 1.
The Pentium II was made of 7.5 million transistors on a 350nm CMOS process. It was available at speeds of 233MHz, 266MHz, and 300MHz, all of which supported a 66MHz FSB. The Pentium II ushered in the world of SDRAM and AGP. Given that the Pentium Pro and Pentium II shared the P6 architecture, SDRAM and AGP advancements could be enjoyed with Pentium Pro via Socket 8 to Slot 1 adapters, but most consumers would have found the Pentium II to be the better microprocessor for home and small office use. Here, the mixture of 16bit and 32bit code so common at the time would shine with peak performance. Whether one were running NT or MS-DOS, the Pentium II had a rather large performance lead in most workloads.
Of course, with the Pentium II, Intel continued its media campaigns. Intel technicians had begun wearing “bunny suits” at Fab3 in Livermore in 1973. Intel took some creative liberty with those early clean room suits, and created the Bunny People campaign. The Bunny People were first shown in television ads for the Pentium MMX that ran during Super Bowl XXXI, and the campaign continued to feature prominently with the Pentium II. The company released bean bag plush dolls of the Bunny People as promotional items at CompUSA in December of 1997, and they were also available in limited numbers at Intel gift stores. Due to their popularity, the company then sold them through America Online for $6.99 each. The dolls were 8 inches tall and came in 5 colors. More Bunny People dolls were launched over the following few years with more colors and three different heights: a 14 inch version, a keychain version at just 4 inches in height, and a massive 36 inch version. The 14 inch dolls even had accessories like clean room air packs and laptops.
Pentium MMX and Pentium II were hits, and the ad campaign was incredibly successful. By Q4, over 90% of the chips Intel shipped had been introduced earlier in that same year. Intel closed 1997 with $6.9 billion in income on revenues of $25.07 billion. These numbers are especially impressive given the company spent $2.3 billion on R&D and another $4.5 billion in capex. Much of the company’s success is owed to their international nature. While both Europe and Southeast Asia had had some economic difficulty in the first half of the year, the USA and China had remained strong markets. For all of the success the company enjoyed, and for all of the transformation their products enabled, Andy Grove was Time magazine’s Man of the Year, IndustryWeek’s Technology Leader of the Year, and Chief Executive’s CEO of the Year in 1997. The leadership of Intel shifted slightly in 1997, with Grove becoming chairman, Moore becoming chairman emeritus, and Craig R. Barrett becoming President and COO.
To start off the year, Intel released new Pentium II CPUs on the 26th of January in 1998. These were made on a 250nm CMOS process, were packaged for Slot 1, had 16K data and instruction caches, 512K L2 cache, and supported bus speeds of either 66MHz or 100MHz. With 66MHz FSB, these Pentium IIs could support clock rates of 266MHz, 300MHz, or 333MHz. For 100MHz FSB parts, clock rates were 350MHz, 400MHz, or 450MHz.
On the 12th of February in 1998, Intel launched the i740 GPU featuring a highly pipelined architecture. It supported perspective-correct texture mapping, Gouraud shading, alpha-blending, antialiasing, Z Buffering, and many more features. It even had some dedicated circuitry for video conferencing support. This GPU came with 2MB to 8MB of SGRAM and used the 64bit AGP bus at 100MHz. Given these advanced features, nearly 100 OEMs had signed up to produce i740 boards. This amounted to Intel having shipped about 4 million i740 chips leading up to the launch. At Computex in 1998, 59 boards were on display. Despite the advanced specifications, the excitement, and the large number of board partners, the i740 was a market failure, and Intel sold the part for as little as $7 to clear inventories. There were two problems. First, the card was meant to be cheap with an initial price around $35 for volume purchases, but AGP was generally only available on more expensive motherboards. Second, while AGP was designed to allow faster access to memory (VRAM was expensive), doing so frequently hurt overall performance. It hurt performance severely enough that 3dfx cards on PCI with larger amounts of slower on-board memory were generally the better option. Jon Peddie did an excellent write-up of the i740 for those interested.
On the 9th of March in 1998, Intel introduced the EtherExpress PRO/100 server adapters for high-performance web servers and large-scale networks. Two variants were available. The EtherExpress PRO/100+ used the Intel 82558 chip and supported PCI hot plug (a technology developed in partnership with Compaq). The EtherExpress PRO/100 Intelligent Server Adapter had an Intel i960 on-board and added adaptive load balancing capabilities. Pricing for the PRO/100+ was $119 and the PRO/100 ISA cost $399. Of course, most buyers of the PRO/100 ISA would be corporate and they’d have had volume discounts.
Mobile Pentium II parts were released on the 2nd of April at clock speeds of 233MHz, 266MHz, and 300MHz. Like the new desktop parts, these featured 16K data and instruction caches, 512K of L2, and were made on 250nm CMOS. Unlike the desktop parts, these had two different package types: MMC-1 (280-pin) and Mini-Cartridge (240-pin). The Mini-Cartridge package was the equivalent of Slot 1 while the MMC-1 package included the northbridge and voltage regulator.
Intel had increasingly become the high-end market favorite. The Pentium II and Pentium Pro were dominating in expensive PCs, workstations, and servers, but AMD and Cyrix had completely consumed the budget market. To remedy this gap in Intel’s offerings, the company introduced the Celeron on the 15th of April in 1998. The first Celeron was a 266MHz P6 chip on Slot 1 without any L2 cache. It was manufactured on a 250nm CMOS process. These chips didn’t perform well, and they therefore didn’t sell well despite their low price.
On the 20th of May in 1998, Barrett was promoted to the position of CEO, Grove remained chairman, and Moore remained chairman emeritus. Craig R. Barrett was born on the 29th of August in 1939. He received his Ph.D. in Materials Science from Stanford in 1964. He then worked for the university’s Materials Science and Engineering department until 1974. During his time at Stanford, he authored more than forty papers on the microstructure of materials, and wrote a textbook, The Principles of Engineering Materials, which was published in 1973. He joined Intel as a manager in technology development upon leaving Stanford, and he did well at the company. He became senior vice president in 1987, executive vice president in 1990, and then President and COO in 1997 as previously mentioned. Upon rising to CEO, he aimed to keep Intel’s leadership position as the importance of networking grew.
On the 29th of June, Intel released their successor to the Pentium Pro, the Pentium II Xeon. While these were architecturally the same as the refreshed Pentium II lineup from earlier that year, these processors had dramatically larger cache sizes and higher clock speeds. The first two released were 400MHz parts with either 512K or 1MB of L2 cache. These larger caches thus necessitated a new slot known as Slot 2. Up to four Pentium II Xeons could be used in a single system. The 512K part was priced at $1124 while the 1MB part was priced at $2836.
On the 24th of August in 1998, Intel released a new Celeron to address the earlier part’s performance issues. The new Celeron could offer nearly double the performance of its predecessor depending upon the workload, and this was mostly enabled by the presence of an on-die L2 cache of 128K. Initial parts were clocked at 300MHz or 333MHz and were built on a 250nm process. Later parts in this lineup would clock as high as 533MHz, and these were shifted to Socket 370 as the Celeron had no need for off-chip cache.
Despite the company having lost a little desktop market share to AMD and Cyrix (for cost, but also AMD’s 3DNow! extension which added vectors and floating point to MMX with K6-2), Intel had gained around 60% of the web server market. The company’s capex continued to be high at $4 billion and R&D spend was $2.7 billion. The company had 64,500 employees, $31.4 billion in assets, and income of $6 billion on revenues of $26.2 billion.
On the 26th of February in 1999, Intel launched the Pentium III built of 9.5 million transistors on a 250nm CMOS process. The Pentium III was a further development of the P6 with Streaming SIMD Extensions (SSE) taking the place of MMX, the FPU getting some optimization, and the L1 cache controller being redesigned. These parts featured clocks of either 450MHz or 500MHz on early SKUs, and they supported a bus speed of 100MHz. These were still Slot 1 parts and they featured a 512K off-chip L2 cache and 16K instruction and data caches on-chip. SSE was essentially the extension of MMX to floating point operations and made everything from image rendering and 3D modeling to video encoding far faster. Pentium III Xeons followed on the 17th of March with cache sizes of 512K, 1MB, and 2MB packaged for Slot 2, clocked at 500MHz, and supporting quad-processor configurations.
The Pentium III was quickly followed by a more powerful version on the 25th of October in 1999. This refresh moved the L2 cache on-die at a size of 256K and increased the cache’s bus to 256bits. The FSB speed for these new Pentium IIIs was bumped to 133MHz allowing for the transfer of 1.06GB per second. To allow the system to make use of this efficiently, the Pentium III gained 6 fill buffers (previously 4), 8 bus queue entries (previously 4), and 4 writeback buffers (previously just 1). All of this was enabled by a new 180nm CMOS process and 29 million transistors. These Pentium IIIs are often referred to as the Pentium III E because, when an original Pentium III and a new Pentium III featured the same clock speed, the newer chip was suffixed with an E. As for those speeds, they were 500MHz, 533MHz, 600MHz, 650MHz, 667MHz, 700MHz and 733MHz. Still, more variants were released starting in December at speeds of 750MHz to 1GHz. Models making use of the 133MHz FSB were referenced with a B suffix. So, the highest possible performing part would have been a Pentium III 1000 EB. Many of these later parts were also released on 370-pin PGA rather than Slot 1 as, once again, with on-chip L2 there was little reason for the more expensive packaging. Pentium III Xeons were released concurrently starting at 600MHz for Slot 2, and these were far cheaper than the earlier Xeons if buyers were willing to have just 256K L2. The largest cache size was 2MB, and that carried quite the price premium at $3692 for a 900MHz part (that particular SKU wasn’t available until the 21st of March in 2001).
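The 1.06GB per second figure quoted above for the 133MHz FSB can be reproduced with simple arithmetic. The sketch below assumes a 64bit wide data bus moving one transfer per clock and uses decimal gigabytes; those assumptions, along with the function name, are not stated in the text and are supplied here for illustration.

```python
def peak_bus_bandwidth(clock_hz: float, bus_width_bits: int = 64) -> float:
    """Return peak bytes per second, assuming one transfer per clock cycle."""
    return clock_hz * bus_width_bits / 8


# Compare the three FSB speeds mentioned for this era of P6 parts.
for mhz in (66, 100, 133):
    bandwidth = peak_bus_bandwidth(mhz * 1_000_000)
    print(f"{mhz}MHz FSB: {bandwidth / 1e9:.2f}GB/s peak")
```

This is a theoretical ceiling; real throughput would be lower once bus arbitration and memory latency are taken into account.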
Rumors of a 64bit Intel chip had started gaining some steam in 1998, and in 1999, Intel’s first working Itanium chips were made. In Intel’s annual report, Itanium was hailed as a “revolutionary new IA-64 architecture designed to meet the needs of powerful Internet servers.” The company was included in the Dow Jones Industrial Average in November of 1999 and was ranked eighth on Fortune’s list of most admired companies. Intel closed the millennium with income of $7.3 billion on revenues of $29.4 billion. For the first time, more than half of Intel’s sales were outside the USA.
ARF now has over 5.25 thousand subscribers and averages around 3.5 thousand readers each day. Thank you. I never really imagined that so many people would have an interest in the history of the computing industry. Many of you work at the companies whose history I cover, and many of you were present for time periods and events I cover. A few of you are mentioned by name in my articles. All corrections to the record are welcome; feel free to leave a comment.