Computer Hardware Reviews at Computer Power User Magazine. Your source for overclocking software guides, building your own computer, pc cooling and computer modding.



Spotlight
November 2003 • Vol.3 Issue 11
Page(s) 50-58 in print issue

AMD Today & Tomorrow
The Follower Leads


An AMD Fab30 employee poses for the camera.
From the early '70s until the late '90s, AMD engineers made a pretty successful business out of reverse-engineering Intel's processors to create their own. Until K7, all AMD processors were compatible with Intel, meaning that they used the same chipsets and motherboards. Although AMD's technology was diverging from Intel's for the few preceding generations, with Athlon, the whole game changed. AMD had to build an infrastructure to support the new pathway its engineers were paving, and along the way, it helped alter the course of adoption for several technologies, most notably DDR memory.

Now taking the technology lead by introducing 64-bit computing on the desktop PC, AMD must capitalize on experience gained as a smaller player. Lacking the muscle to mandate change, AMD has had to really listen to the market and understand customer needs.

Who Needs 64-bit Computing?

"The simple answer is everybody needs 64-bit [computing], and the question really is not who needs it, but when do they need it." That's according to the person often singled out as the originator of AMD64, CTO and VP of AMD's Computation Products Group, Fred Weber. He has led the 64-bit charge, but with a steady eye on immediate customer needs and current market realities. "The 64-bit capability is for some people immediately valuable and for others is valuable later, but first and foremost this is going to be our flagship 32-bit new processor. We've brought a lot of new technology to the machine as part of that."

Are we there yet? Exactly when that 64-bit capability will be fully realized is not entirely in AMD's hands. A finished, stable 64-bit OS is needed for 32-bit software to be ported over to 64-bits. Check out "The 64-Bit Question" on page 63 of this issue for more on the 64-bit application question.

Under The Hood Of Athlon 64

The new technology incorporated into Athlon 64 breaks down roughly into two categories: that which enables 64-bit computing and that which enhances existing 32-bit as well as 64-bit performance. (For more AMD64-related coverage, see our Athlon 64 FX-51 review on page 22.)

32-bit compatibility. In retrospect, AMD's decision to extend the existing 32-bit x86 architecture rather than forge a new pathway to 64-bits was, according to Weber, an obvious choice. "You wake up in the morning and you realize that you need a higher-performance engine in a car; what on earth would make you think that you should get rid of the gas pedal and the brake and the steering wheel while you're at it? All you need is a higher-performance engine." This line of thinking led to an architecture that incorporates the features of a standard 32-bit x86 machine, such as the one the majority of us sit in front of every day. With the ultimate goal of taking every one of us to the 64-bit computing party eventually, AMD felt that Athlon 64 could achieve rapid adoption largely to the extent that it had competitive performance with today's 32-bit applications. That meant using an existing 32-bit OS, such as Windows XP, and direct execution, not something filtered through x86 emulation. This decision was counter to what Intel had done with its Itanium implementation of 64-bits for the server/workstation market.

On-chip memory controller. After 64-bits, the sexiest innovation for Athlon 64 has to be the on-chip memory controller with an integrated on-chip northbridge running at full processor speed. Weber claims that bringing the memory controller onto the processor from its usual place on the chipset northbridge results in up to a 30% reduction in latency to main memory. "In a typical, traditional PC, whether an Athlon or an Intel Pentium 4 system, DRAM is on the order of 100 nanoseconds away from the processor. When you're running at multiple GHz, that means that every time you have to wait to get something out of memory, you're waiting for 200 to 300 cycles for that memory to come in and accomplishing nothing in that time period while you're waiting for data to come back from memory." Weber goes on to say, "There are a lot of other 'tricks' that are used in processors to avoid this problem, out-of-order execution being the most obvious (which is used to expose memory requests much earlier so that you can actually do work while the memory request is outstanding), also prefetching and things like that. But even [with] all of those tricks, in the end, sometimes you just have to wait for memory, and it's well known that the latency to first memory access is a first-order effect on the performance of processors. So we've attacked that problem very directly by putting the memory controller directly on the processor."
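Weber's cycle counts follow directly from the numbers he quotes. The short Python sketch below reproduces the arithmetic; the function name is ours, and applying the "up to 30%" reduction to the same 100ns figure is an assumption made only for illustration.

```python
def stall_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Clock cycles that pass while one memory access is outstanding."""
    # 1GHz is one cycle per nanosecond, so cycles = nanoseconds * GHz.
    return latency_ns * clock_ghz


# The "100 nanoseconds" and "multiple GHz" figures come from the quote.
for clock_ghz in (2.0, 3.0):
    cycles = stall_cycles(100, clock_ghz)
    print(f"{clock_ghz:.1f}GHz: about {cycles:.0f} cycles per 100ns access")

# Assumption: the "up to 30%" latency reduction applies to the full 100ns.
reduced_ns = 100 * (1 - 0.30)
reduced_cycles = stall_cycles(reduced_ns, 2.0)
print(f"2.0GHz at {reduced_ns:.0f}ns: about {reduced_cycles:.0f} cycles")
```

At 2GHz and 3GHz the function returns the 200 and 300 cycles Weber cites, which is why shaving even tens of nanoseconds off main-memory latency matters.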

There may be drawbacks to locking maximum memory speed and type to an integrated northbridge on the CPU instead of a separate chip on the motherboard. This means that you could be stuck if you want to upgrade as faster memory standards come along. At this time, AMD is not saying that future standards such as DDR500 will work with today's 64-bit processors. They may work, but no guarantees. According to AMD, however, when the Athlon 64 chips were designed, the memory controller was cordoned off from the rest of the processor architecture to accommodate the market reality that memory never stands still. In anticipation of newer memory availability, the memory controller is not deeply intertwined and can be reworked separately. This won't help current owners, but it should benefit the platform going forward.

HyperTransport technology ("White Paper: HyperTransport Technology," CPU November 2002, pages 42 to 45) works in concert with the integrated memory controller, according to Weber. "Since the memory controller is not separate from the CPU, we don't have to drag memory data back and forth across this bus, but instead all of the bandwidth of that interconnect can be used for I/O operations getting your graphics data out to your graphics card and moving your data back and forth to your disk and your network. So the HyperTransport bus increases the total amount of bandwidth available to your I/O subsystem and the memory controller onboard reduces the load on that I/O subsystem, which again gives much higher performance in I/O intensive environments."

Where the FX and the standard Athlon 64 differ is in the width of the memory interface. The higher-end FX carries a 128-bit memory interface, while the mainstream non-FX processor sports a 64-bit interface. Total theoretical bandwidth therefore differs by a factor of two: 6.4GBps for the FX and 3.2GBps for the standard Athlon 64. Because the FX design is carried over from the Opteron, FX systems will require registered DIMMs. Non-FX systems won't have this requirement and can use cheaper and easier-to-find unbuffered DIMMs.
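The two bandwidth figures are consistent with a simple width-times-rate calculation. The sketch below assumes DDR400 memory (400 million transfers per second), which the text above does not state; the function name and constant are ours.

```python
def peak_bandwidth_gbps(bus_width_bits: int, transfers_per_sec: float) -> float:
    """Theoretical peak memory bandwidth in GBps (1GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_sec / 1e9


# Assumption: DDR400, i.e. 400 million transfers per second.
DDR400_TRANSFERS_PER_SEC = 400e6

fx = peak_bandwidth_gbps(128, DDR400_TRANSFERS_PER_SEC)
standard = peak_bandwidth_gbps(64, DDR400_TRANSFERS_PER_SEC)
print(f"Athlon 64 FX (128-bit): {fx:.1f}GBps")
print(f"Athlon 64 (64-bit): {standard:.1f}GBps")
```

Under that assumption the 128-bit interface works out to 6.4GBps and the 64-bit interface to 3.2GBps. These are theoretical peaks, not sustained throughput.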

Performance gains. Our testing (see page 22) indicates that, except as noted below, performance gains for 32-bit apps over Athlon XP 3200+ range from 1% to 84%, with an average increase of 22.5%. Some apps ran slower on the FX-51: the Content Creation applications in View Perf 7.1 that ran slower averaged a 20% drop in performance relative to the XP, while SiSoft Sandra 2003 benchmarks averaged a 9% drop.

The 64-bit Gamble

What does AMD have to say about the possibility that all this great technology may not gel in the marketplace? Weber is very confident in saying that for the server and workstation space, the rollout is long overdue. "Workstations for years have been over-constrained by being only 32-bit capable, and it's amazing that the 32-bit x86 processor has done as well as it has in the workstation and server space because it really is a race car with tiny little wheels. So the early size of adoption is going to move very quickly to take advantage of 64-bit capability. In the desktop, we think it will move quickly, but whatever pace 64-bit adoption moves at, we think we're in the right place with our 32-bit processor, with the investment protection that when 64-bits is necessary, the processor is there ready to do it. So in a sense, it can't fail. We're absolutely certain 64 bits will happen and that we took the right approach."



International SEMATECH's "Wafer Sleuth" yield-management system is helping AMD quickly determine where yield limiters occur.
What Else Is Up AMD's Sleeve?

AMD has long been an advocate of automation in manufacturing process control and may well be at the forefront of designing, integrating, and implementing the complex software that is increasingly a requirement for modern large wafer, small die microprocessor fabrication. Although AMD wasn't even talking publicly about these technologies until quite recently, Bob Johnson, principal analyst with Gartner Dataquest, has said, "AMD has a leadership position in process control automation. They have developed a comprehensive suite of software to automate fab operations." Or as one AMD engineer put it, "This industry is kind of a hybrid of electrical engineering, mechanical engineering, chemical engineering, device physics, and a whole bunch of things that come together to create these very complicated circuits."

AMD and Intel buy the same tools, but AMD recognizes that as the big player, Intel has a greater mindshare of the equipment supplier community. So the opportunity to push innovation there is not a big one for AMD. What it can do is innovate on the implementation side and get a competitive advantage through manufacturing science and know-how and applying control software and manufacturing execution software to run a much more efficient operation. Although Intel cannot be out-gunned, as it has virtually unlimited ammo, it may be vulnerable to being out-aimed. So AMD trains its sights on a limited number of specific targets, transistor performance for instance, and focuses deliberately on getting as close as possible to that target each time.

Once a chip is designed and proven, in order to get it into your home PC at a reasonable price, it has to go into mass production. The standard manufacturing model for ICs (integrated circuits) is pretty straightforward: Create a set of design rules for each step in the manufacturing process and have a bunch of equipment set to perform each step exactly as it was performed during the development process to maximize the transfer of technology from R&D to manufacturing. Tweak as needed to achieve mature yield, and once you've reached the set goal of so many acceptable dies per wafer, freeze the process "recipe" so it becomes more or less static. Success is defined in terms of the best replication of all steps in the process in the fastest possible time. This is a good production method, especially if you are making toothbrushes. But the engineers at AMD don't believe it's the best way to cook a batch of modern microprocessors.

Better Electronics Through Chemistry

For this, the engineers use a technique called APM (Automated Precision Manufacturing) that captures the dynamics of a process and adjusts that process's variables to keep it on target. APM is actually derived from the chemical process industry: AMD's engineers looked at how modern oil refineries and chemical plants run and applied traditional control technology to their IC fabrication process. This involves isolating and compensating for variability at every step, to such a degree that the final outcome becomes predictable.

AMD's integrated suite of more than 200 AMD-patented or patent-pending technologies works through things such as feedback and feed-forward control, dynamic targeting, and sampling the process based on the uncertainty of that process (as opposed to having some fixed rule of sampling once every eight hours) to move away from a static operating model where the process is run, gets to a mature yield, and then is never changed. "The problem with that model is you have multiple pieces of equipment that are all slightly different, and each one of these processes in our industry has reactions and things that are very typical to dynamic manufacturing, inherent in them. As a result, if you don't change things, you're going to live with a high degree of variability, and that means your signal-to-noise ratio is going to be very low and you're not going to be able to predict anything," says Thomas Sonderman, AMD's Director of Advanced Process Control, Wafer Fabrication Group, and a chemical engineer himself. "You're just going to have to set it up and start running and hope you get what you want. But it's kind of like peeling back an onion: When you start pulling out more and more variability, your signal to noise goes up, and as a result, you can get much more predictive in what you're able to do."
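To make the contrast with a fixed sampling rule concrete, here is a toy Python sketch of sampling driven by uncertainty. It is not AMD's method: the class, the idea of uncertainty growing by a fixed amount per unmeasured run, and all of the numbers are hypothetical.

```python
class UncertaintySampler:
    """Toy model: measure only when accumulated uncertainty gets too high."""

    def __init__(self, growth_per_run: float, threshold: float):
        self.growth_per_run = growth_per_run  # hypothetical uncertainty added per run
        self.threshold = threshold            # hypothetical trigger level
        self.uncertainty = 0.0

    def should_measure(self) -> bool:
        self.uncertainty += self.growth_per_run
        if self.uncertainty >= self.threshold:
            # A measurement resets what we don't know about the process.
            self.uncertainty = 0.0
            return True
        return False


def count_measurements(sampler: UncertaintySampler, runs: int) -> int:
    return sum(sampler.should_measure() for _ in range(runs))


stable_tool = UncertaintySampler(growth_per_run=0.25, threshold=1.0)
noisy_tool = UncertaintySampler(growth_per_run=0.5, threshold=1.0)
stable_count = count_measurements(stable_tool, runs=8)
noisy_count = count_measurements(noisy_tool, runs=8)
print(f"stable tool: {stable_count} measurements in 8 runs")
print(f"noisy tool: {noisy_count} measurements in 8 runs")
```

In this sketch the less predictable tool is measured twice as often as the stable one, whereas a fixed every-eight-hours rule would treat both the same.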

How far does AMD go to track variables in the manufacturing process? Sonderman reports that for the past 10 years (going back to 150mm wafers) AMD has tracked every die on every wafer going through the fab. Using a universal coordinate system to determine each die's position on each individual wafer allows AMD to very quickly isolate a product-related failure or a wafer-level failure right down to on which wafer, and where on the wafer, the failure is happening. This is obviously essential to defining where problems are originating so they can be quickly identified and tools can be brought back online as soon as possible. For tool performance variations, it's finding the right conditions to run a tool and making sure it runs that way day in and day out, regardless of where it is in its maintenance cycle, while monitoring tool output as a control for consistent performance. We recently had the opportunity to spend a good deal of time querying Sonderman about AMD's unique approach to automated manufacturing. We'll attempt here to pass our gleanings on to you.
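Tracking every die by wafer and position makes it possible to ask two different questions of the same failure data. The Python sketch below is purely illustrative: the record layout, wafer names, and coordinates are invented, and the real coordinate system and analysis are not described in detail here.

```python
from collections import Counter


def summarize_failures(failures):
    """failures: list of (wafer_id, die_x, die_y) for dies that failed test."""
    by_wafer = Counter(wafer for wafer, _, _ in failures)
    by_position = Counter((x, y) for _, x, y in failures)
    return by_wafer, by_position


# Hypothetical failure records.
failures = [
    ("W01", 3, 4), ("W02", 3, 4), ("W03", 3, 4),                 # same spot, different wafers
    ("W07", 0, 1), ("W07", 5, 5), ("W07", 2, 6), ("W07", 3, 4),  # one bad wafer
]

by_wafer, by_position = summarize_failures(failures)
worst_wafer, wafer_count = by_wafer.most_common(1)[0]
worst_position, position_count = by_position.most_common(1)[0]
print(f"most failures on one wafer: {worst_wafer} ({wafer_count})")
print(f"most failures at one position: {worst_position} ({position_count})")
```

A position that fails across many wafers points toward something that repeats on every wafer, while a single wafer with scattered failures points toward that wafer's own history.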



"The idea of open cassettes and making the environment as clean as possible has gone away. Now there are minienvironments with a class-one clean room inside a wafer transport mechanism, and the actual environment outside that can be less stringent," says Thomas Sonderman of AMD.
Automated Precision Manufacturing

AMD has designed APM to encompass several interacting component processes that all start with designing a product to meet customer needs and end with selling that customer the product at a price that keeps you and me in Athlons, and AMD in business. In between is everything that must occur to turn component materials into processors. There are several areas of technology deployment happening simultaneously in the fab. Integrated Production Scheduling is primarily a cost-saving automation that involves allocation and movement of materials to keep the manufacturing process operating efficiently. Equipment Performance Optimization monitors and maintains the tools used for fabrication; APC (Advanced Process Control) tracks, updates, tweaks, and oversees the individual recipe for each step in the fabrication process; and Yield Management Systems keep score on abnormal dies to isolate and identify where things went wrong, which could be anything from a faulty tool to defective materials. By definition, this cannot be a static process. Without continuous, customer-centric product improvements, AMD risks being overwhelmed by its much larger competitor.

Integrated Production Scheduling. As companies move to 300mm, they will have to implement more automated factory controls. Having already integrated "place & go" materials processing at 200mm, AMD plans on incorporating a "revolutionary" technology into its 300mm operations: ABS (Agent-Based Scheduling). Through the crafty use of software, ABS turns manufacturing components into anthropomorphic, well, ants, capable of making decisions and negotiating their own deals. Sonderman spilled some details, and this is what we picked up: There are lots (a lot is 25 wafers), wafers, and pieces of equipment in a fab. Wafers are trying to maximize their value at the lowest possible cost. Equipment is trying to maximize its utilization, uptime, and efficiency. So they each have different business rules or goals that they want to achieve. ABS enables each piece of material or machinery to advocate for its goals.

He gave this example: "A wafer comes out of a given lot and says, 'I'm going to predict what my output is going to be based on everything that's happened to me, based on my pedigree.' Then it says, 'Now I need to go get another thing done to me. I need another film put on me or I need another masking operation done to me.' But now it says, 'Well, what's my priority? Am I a standard process? Am I a hot lot? Do I have a customer that needs me in two weeks vs. four weeks?' Then the wafer goes out and negotiates with different tools. Maybe there are five different tools that could achieve his objective function, and he is going to go out and negotiate with them and say, 'Which of you tools is going to allow me to get done to me what I need done at the lowest possible cost?' And each tool obviously has its own business conditions trying to stay running. A tool may be coming up for routine preventative maintenance, so it may be a lot harder for that tool to meet the objective function vs. the other tools, and so it may choose to go down for maintenance and have that lot go to a tool that is more capable of achieving whatever it is that particular wafer needs. So when you start doing those kinds of things, then you are talking about this fully integrated, highly controlled fab environment where wafers are negotiating with tools in order to maximize the profitability of the fab. And that is analogous to how chemical plants are run today."
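Sonderman's negotiation can be pictured as a simple bidding round. The Python sketch below is a loose illustration, not ABS itself: the cost formula, the maintenance penalty, the tool names, and every number are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    run_cost: float              # hypothetical cost of processing on this tool
    queue_hours: float           # hypothetical wait before the tool is free
    runs_until_maintenance: int

    def bid(self, lot_priority: float) -> float:
        """Cost this tool quotes to a lot; lower is better."""
        # Waiting is more expensive for urgent ("hot") lots.
        cost = self.run_cost + self.queue_hours * lot_priority
        # A tool about to go down for maintenance quotes a penalty.
        if self.runs_until_maintenance <= 1:
            cost += 10.0
        return cost


def negotiate(tools, lot_priority: float) -> Tool:
    """The lot collects a bid from every capable tool and takes the cheapest."""
    return min(tools, key=lambda tool: tool.bid(lot_priority))


tools = [
    Tool("etch-A", run_cost=1.0, queue_hours=4.0, runs_until_maintenance=20),
    Tool("etch-B", run_cost=5.0, queue_hours=0.5, runs_until_maintenance=20),
    Tool("etch-C", run_cost=1.0, queue_hours=0.0, runs_until_maintenance=1),
]

standard_choice = negotiate(tools, lot_priority=1.0)
hot_choice = negotiate(tools, lot_priority=4.0)
print(f"standard lot goes to {standard_choice.name}")
print(f"hot lot goes to {hot_choice.name}")
```

With these made-up numbers, the standard lot accepts the cheap tool with the long queue, the hot lot pays more to skip the wait, and the tool due for maintenance prices itself out of both deals.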

This is not yet how AMD operates. "Today we're doing it with real-time dispatching so it's still somewhat reactive and doesn't have the ability to learn, which is what ABS technology delivers." 300mm is getting busier by the minute.

Advanced Process Control. Today AMD has about 70% of its processes under APC-based control, which means that variables such as temperature, pressure, flow rates, and time can be slightly modified to ensure that the process will deliver what the customer wants (the "objective function of the target volume," as AMD calls it). An important way of thinking about semiconductor manufacturing goes back to the chemical industry's use of "recipes." Sonderman explains, "Instead of just putting your cookies in at 400 degrees and hoping that in 12 minutes they come out exactly like you want them, what we're doing is adjusting the time a little bit and adjusting the temperature a little bit, and maybe even moving the cookies around a little bit in the oven to ensure that every cookie on the cookie sheet comes out exactly the way we want them and they all look and feel and taste the same way. So the recipe is the same and the mean temperature that they all see may be X, but we may do some subtle things during the cooling sequence to ensure that they all come out the exact same way."

In addition to wafer-level recipe control, APC encompasses fault detection and E-Diagnostics, next-generation SPC (Statistical Process Control), and eventually, integrated fab-wide control. These features have the ongoing task of producing rapid product performance improvement. Designing manufacturing processes and controls with this in mind is central to Sonderman's approach. "One of the things that we are certainly driving at AMD is putting more and more of the design for controllability and the design for manufacturability into the whole process development mindset so that when processes are delivered to manufacturing, they automatically have the ability to be controlled in a very precise manner. . . . A lot of people talk about advanced process control, but there's always some human, a technician, [who's] verifying the decision. What we've decided is that if people can control very complicated chemical plants that can blow up and do very nasty things if they don't run right, we believe we can take that similar technology and do it in an equivalent fashion in our industry, and that's what the genesis of the whole thing has been."

Yield Management Systems. For every new product, AMD performs an analysis based on the complexity of the product, the number of fabrication steps, the number of metal layers, and the critical dimensions that are going to be used to define the operating geometries. Engineers then come up with a yield entitlement, which is what the yield needs to be for a particular process to be profitable at the optimal manufacturing margin. The ability to ramp as quickly as possible to this yield entitlement or mature yield is especially important for AMD because it only has one fab. To start using up material for a new product on a new process that has subpar yield would be to sacrifice profit margin.

These are the business realities behind rumblings from this or that manufacturer saying that a product has been delayed due to low yield. It may have little to do with whether a given product is performing as expected, but if the percentage of unusable products coming off the line is higher than the factored allotment, the manufacturer cannot make a profit.

According to Sonderman, the Athlon processor ramp rate was the best for a new product in AMD's history, and ramping to mature yield for Opteron actually reduced that timeframe by two-thirds. International SEMATECH (sematech.org; a global consortium of semiconductor manufacturers) benchmarks AMD's fab as consistently best in class in key efficiency areas. For the fab to be accelerating at its best previous ramp rate by that percentage and still be a year later to market than expected really paints the picture of how hard this is. In researching the 64-bit Processor Timeline for this article, it became obvious that delays of a year or even multiple years are so common as to be predictable.

Equipment Performance Optimization. Sonderman outlined how various data-mining software is used to take a huge, routinely updated database and cycle that data through analysis routines, doing commonality analysis, correlation analysis, and similar types of operations. The software then spits out which particular process steps are in jeopardy. The APC technology then advises engineers to look at a particular tool to see which sensors, pressure gauges, or mass flow controllers (the things that actually drive the process) are operating abnormally. For this the software does not need to have a vast knowledge of every possible problem with a piece of equipment. It just uses fault detection and E-diagnostics to ascertain the health of the tool to decide if the current state is different from a known good state. If the tool is processing abnormally, it can be shut down to prevent producing bad product. As larger wafers are introduced, this becomes even more critical.
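The "different from a known good state" test can be sketched without any knowledge of specific failure modes. The Python example below uses a plain three-sigma check against healthy runs; that statistical choice, the sensor names, and the readings are all our assumptions, not a description of AMD's software.

```python
from statistics import mean, stdev


def build_baseline(known_good_runs):
    """Per-sensor (mean, standard deviation) from runs known to be healthy."""
    baseline = {}
    for sensor in known_good_runs[0]:
        values = [run[sensor] for run in known_good_runs]
        baseline[sensor] = (mean(values), stdev(values))
    return baseline


def abnormal_sensors(baseline, reading, limit=3.0):
    """Sensors whose reading is more than `limit` standard deviations off."""
    flagged = []
    for sensor, (mu, sigma) in baseline.items():
        if abs(reading[sensor] - mu) > limit * sigma:
            flagged.append(sensor)
    return flagged


# Hypothetical sensor data from five healthy runs of one tool.
known_good_runs = [
    {"pressure_torr": 10.0, "flow_sccm": 50.0},
    {"pressure_torr": 10.2, "flow_sccm": 51.0},
    {"pressure_torr": 9.8, "flow_sccm": 49.0},
    {"pressure_torr": 10.1, "flow_sccm": 50.0},
    {"pressure_torr": 9.9, "flow_sccm": 50.0},
]
baseline = build_baseline(known_good_runs)

healthy = abnormal_sensors(baseline, {"pressure_torr": 10.1, "flow_sccm": 50.5})
suspect = abnormal_sensors(baseline, {"pressure_torr": 10.1, "flow_sccm": 55.0})
print("healthy run flags:", healthy)
print("suspect run flags:", suspect)
```

The check never needs to know why the mass flow reading drifted, only that the tool no longer looks like its healthy self, which is enough to stop it before it produces bad product.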

Automating for 300mm. Sonderman paints the picture for us. "So what we see for 300mm is that everything is automated. You're doing wafer-level, fully automated control; all the material movement is fully automated. You are going to a much more die-based and wafer-based analysis versus analyzing lots and material within lots. You're really doing everything at a wafer-level, and you really get to the chemical processing facility model where you have a control room environment. You really don't have people in the fab, other than people who are handling unique operating conditions or exceptions, and the fab is pretty much being run through software. Obviously you have a lot of very intelligent people creating the software, and people who are required to fix the tools when they go offline, but it's not like in a traditional fab where you have a lot of people moving material around, and you have a lot of people standing in front of tools making decisions."

This contrasts with AMD's current state of automation. At this point, while a process is running, the software is not actually manipulating the process within the run. That is all being done based on the inherent tool software. Between runs and on a run-to-run basis, the software can modify what the recipe looks like, in many cases, without actually having metrology (measured results) available. Every lot is controlled within the fab, and then certain lots are marked for metrology. That measurement then feeds information back into the controllers to ensure the right process adjustments are being made. Run-to-run control is about defining the process settings for a given lot of 25 wafers. In the future, and to a small extent today, there will be wafer-to-wafer control, which is the ability to change the recipe for each given wafer.
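Run-to-run control with occasional metrology can be sketched in a few lines. The Python example below uses an exponentially weighted moving average of the estimated tool offset, a common textbook approach that we are substituting for illustration. The linear process model, the gain, the weight, and the every-other-lot metrology schedule are all hypothetical.

```python
class RunToRunController:
    """Adjusts a recipe setting between lots; nothing changes within a run."""

    def __init__(self, target: float, gain: float, weight: float = 0.5):
        self.target = target
        self.gain = gain      # assumed output change per unit of recipe setting
        self.weight = weight  # how strongly the newest measurement counts
        self.offset = 0.0     # running estimate of the tool's offset

    def next_setting(self) -> float:
        return (self.target - self.offset) / self.gain

    def update(self, setting: float, measured: float) -> None:
        observed_offset = measured - self.gain * setting
        self.offset = (self.weight * observed_offset
                       + (1 - self.weight) * self.offset)


def true_process(setting: float) -> float:
    # Hypothetical tool: responds linearly but sits 5 units above the model.
    return 2.0 * setting + 5.0


controller = RunToRunController(target=100.0, gain=2.0)
errors = []
for lot in range(6):
    setting = controller.next_setting()
    output = true_process(setting)
    errors.append(output - controller.target)
    if lot % 2 == 0:  # only every other lot is marked for metrology
        controller.update(setting, output)

print("error per lot:", errors)
```

Lots without metrology still run on the latest corrected recipe, and each measured lot pulls the result closer to target. Wafer-to-wafer control would apply the same idea at a finer grain.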

AMD has done benchmarking in preparation for the migration to 300mm. Sonderman is confident: "We have a capability that is second to none in terms of how we run our manufacturing operation."

Why isn't everyone doing this? Well, it's very difficult for one thing. Understanding all the variables for even a single step in the manufacturing process is a challenging task. Multiply that by 600 or so steps, and you are in the area of formidable. Tying that into all the tools, machinery, and tasks involved in running a modern fab that are not directly tied to any particular step; converting that human know-how into software tools that can factor in all variables, interactions, and potential outcomes; and determining which variables to tweak to achieve the specified outcome takes a very special kind of software designer; actually, a lot of very intelligent software designers.

Also, it isn't absolutely essential (yet). Intel appears to be working at 90nm without full automation. The company's Copy EXACTLY! model of duplicating the manufacturing process across many fabs over vast distances seems to have worked for them. Perhaps it is because, with the volume of product Intel sells, it is more cost-efficient to throw away a higher percentage of bad dies than it would be to fully automate as AMD is doing. Other top manufacturers have implemented some form of APC, but AMD's methodical approach is more widely deployable because it has created the framework to apply these principles anywhere in the system. So rather than focus only on certain process steps that have the highest degree of variability, AMD can apply them to any step in the fab and chain them together. Eventually, the process will be too small to not be fully automated. At that point, whoever hasn't built the necessary know-how to automate processing will have to acquire it.

Richard Heye On Building Infrastructure

Richard Heye shares his job title as vice president/general manager of AMD's Microprocessor Business Unit with Marty Seyer as one of "two in a box," overseeing all marketing, platform engineering, infrastructure development, and program management functions for AMD's Computation Products Group. Before coming to AMD in 1997, Heye was at Apple Computer and Digital Equipment Corporation, so he knows a thing or two about building infrastructure and 64-bit computing products. Heye recently sat down with us (twice) to discuss the challenges of building the necessary infrastructure to support AMD64. Here are highlights from our discussions.

Not the Intel model. "When AMD started to go and design its own infrastructure, we explicitly made a conscious business decision not to get into the motherboard business. One of the reasons we decided not to do that is we wanted to be able to really work with our partners and they should never feel threatened. They should never feel like we could take away their business. Because we didn't want to compete with Taiwan, we wanted to work with them collaboratively. . . . In our chipset business, for example, we do make chipsets, but unlike Intel, when we do a chipset any unique intellectual property that's needed to communicate, for example, to our microprocessor, we give that IP royalty-free to anyone who wants it. . . . We're not going to tell NVIDIA or ATI or VIA or any of those guys how to do a graphics engine. They do that better than we do. We will say, 'If you want to talk to the microprocessor and you want to use the HyperTransport bus, we'll give you as much technical support as you need to be successful in order to bring your product to market as quickly as possible.' And that is not the Intel model. . . . So, by having collaboration, by having good business cases, we have a very robust infrastructure. Case in point is: We're announcing our eighth-generation microprocessor, Athlon 64, and we're going to have chipsets from NVIDIA, VIA, SiS, and ULI (formerly ALI) at launch. And that's pretty darn good. We're going to also have a wide variety of motherboards from all the major motherboard vendors. . . . The reason they're doing that is that they have faith that AMD is going to be able to bring Athlon 64 to market and we're going to be able to drive the industry. Because at the end of the day, if they can't sell motherboards or chipsets, they're not going to do it."

We did our own chipset. "When I arrived, there was no infrastructure. Or I can phrase it differently: When I arrived, there was a beautiful infrastructure; they didn't even need me. Because when I arrived, they were just announcing K6, and K6, along with all the previous microprocessors that AMD had built and brought to market, were all Intel-compatible interfaces. So I could take a K6, buy any motherboard in the world that worked with an Intel processor, [and] plug it in and it just worked. So to some extent for K6, I was just incremental head count. . . . Now when Athlon came onboard, that was a challenge because that was the first time in the history of AMD in the microprocessor division where the interface was no longer compatible with Intel's. . . . So the first thing we did, quite frankly, was we did our own chipset because at that time we had no credibility. VIA, ALI, and SiS were the three major chipset vendors for that timeframe, and there's no way they were going to go and embark on a brand-new chipset for AMD with no track record. . . . So we did our own chipset and we did our own reference design. We designed internally a standard Taiwanese-class motherboard that worked with the microprocessor."



As AMD's Automated Precision Manufacturing becomes fully implemented, seeing real humans in the cleanroom environment will become more unusual.
'OK, we'll do one motherboard for you.' ". . . The first challenge we faced was going to Taiwan and saying, 'Listen, we would like you to take this reference design, do what you do best (make necessary modifications to meet your specific needs) and ship that board to work with AMD Athlon.' . . . We actually had to show them a working motherboard and say, 'Hey, this is for real; we're not making this up. We have a technically viable part, and it actually works really well.' Athlon was a fine, fine part. To the motherboard vendors' credit, they actually said, 'OK, we'll do one motherboard for you.' It was sort of a test case from their point of view. When I say 'give them credit' you have to understand it was not to their advantage to do an AMD motherboard because they already had a whole line of Intel motherboards—that was the major part of the market—and obviously they didn't want to gratuitously annoy Intel because you don't want to annoy a major vendor, and yet they actually did it. And two things happened: They started making money on it and they started growing our market share. Because Intel does their own motherboards and we don't, they were able to grow their share pretty quick. The wonderful thing about infrastructure in business is if there's a market and you can meet the market needs, you can make money . . . it was really hard to kickstart that momentum in the beginning. I can remember literally week in and week out tracking exactly the number of motherboards that were produced in Taiwan, down to the single digit, and just tracking it and getting that infrastructure ramping up. . . . In the old days it was trying to get one motherboard. Now some of these top vendors have three, four, five, six motherboards for AMD in the works, using different chipsets, going after different targeted markets and segments."

16 time zones later. ". . . I learned early on that you always have to be honest to get their trust. In any big engineering projects, you have good days and bad days. On the bad days, you just tell them, 'Hey, we got these problems but we're working through them. Stay with us and as soon as we fix the problems, we'll pass it on to you and we'll keep going.' I think that sort of really open relationship with third-party vendors actually got us a lot of respect. . . . We've built this fairly large Taiwanese lab where now we have a lot of support for the motherboard vendors locally. Because the reality is that if you are producing a motherboard in Taiwan, you want pretty quick access to technical support. You don't want to wait 16 time zones later to call Austin, Texas, you want to be able to just pick up the phone and talk to someone in your own language and in your own time zone."

From an art to a science. "The proof in all this is that in the history since Athlon shipped, you can search for all the stories you want, and you won't find a story that says the AMD infrastructure is melting down, [that] it has quality problems. We had availability [issues] the first two or three quarters of the Athlon ramp because we were growing and we had fits and starts for a while, but once we got over that knothole of figuring out how to work with Taiwan, set up processes and procedures, got the lab in place, it's been working really well. I'm not saying it's easy, but it went from an art to a science. . . . That's where we are on infrastructure right now; it's running pretty good."

Back On Planet CPU

If only things went as smoothly on the street where we live. Our big plans for a spicy Athlon 64 motherboard roundup picnic (see the article on page 60) were rained out by availability issues for all but six of the boards (with a few ants walking around on some of the boards, as performance was mixed). But we'll revisit those issues in the future. Until then, we've had a blast bringing you more of the bigger picture . . . INSIDE AMD.

by Joan Wood


Factory Wide Control (graphic; the large version is a PDF file)


Intel's Two Cents

Because Intel is the world's largest producer of desktop processors and AMD's primary competitor in that market, we were curious about its take on AMD's introduction of 64-bit computing for the desktop PC. We turned to Intel spokesperson George Alfs, who expressed Intel's position this way:

"Intel and the industry are focused on bringing benefits that PC users can take advantage of now—areas like wireless connectivity; greater multitasking performance via Hyper-Threading Technology; and easier/faster connectivity for PCs and servers with technologies such as USB 2.0, PCI Express, and Gigabit Ethernet.

"Adding 64-bit addressability means little without the necessary software, related tools, utilities, and technologies to ensure the PC performs at its best and can work with applications and peripherals. With just 5% of servers using 64-bit memory addressability, there is little need for 64-bits on the desktop today.

"The Pentium 4 processor 3.20GHz with groundbreaking HT Technology provides up to 25% higher performance in some cases than an equivalent non-HT enabled system and enables a better user experience in multitasking environments and with multithreaded applications."

P4 Extreme Edition. Not that Intel is nervous or anything, but it looks like it also wants to play in the high-end gamer's market. Just before this issue went to press, we received a spiffy new CPU from Intel called the Pentium 4 Extreme Edition (watch for benchmarks in the next issue), which comes with an additional 2MB of on-die L3 cache. Add that to the P4's current 512KB of L2 cache, and we're talking about a whopping 2.5MB of total cache. As a result, the transistor count shoots up from 55 million for the Northwood P4 to 169 million for this P4 Extreme. It will debut at 3.2GHz with the standard 800MHz frontside bus. The CPU itself sounds much like a Xeon on crack for gamers, and we can hardly wait. Just to whet your appetite, here are a couple of teaser benchmarks: Quake III turned in a score of 463fps, and 3DMark03 came in at 6023. Expect a full review and comparison to the Athlon 64 FX-51 next month.


Apple's G5: The Other 64-bit Consumer Processor

Athlon 64 isn't the first 64-bit processor to show up in desktop computers. Apple recently released its G5 system, based on IBM's 64-bit PowerPC 970 processor, which Apple CEO Steve Jobs called "the world's fastest personal computer." We don't know about that, but just like the Athlon 64, the G5's CPU is built on a 0.13-micron SOI (silicon-on-insulator) process and is a 64-bit chip that can run 32-bit applications natively.

Running at 2GHz with only 58 million transistors, the G5 is clocked lower than the 2.2GHz FX-51, and it packs far fewer transistors than the Athlon 64's 105.4 million. In fact, the G5's transistor count is closer to the Athlon XP's 54.9 million. It's a bit short in the L2 cache department, too, with only 512KB compared to the Athlon 64's 1,024KB. G5 systems resemble the FX-51 in memory design: both use 400MHz DDR memory across a dual-channel, 128-bit, 6.4GBps bus. Apple has shown performance numbers that reflect well upon a G5 vs. a dual Intel Xeon 3GHz system, but funnily enough, the Opteron wasn't in the comparison.
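For the curious, the 6.4GBps figure falls straight out of the bus numbers quoted above. Here is a minimal Python sketch of that arithmetic; the function and constant names are our own invention for illustration, and it assumes that "400MHz DDR" means 400 million transfers per second and that 1GB is counted as 10^9 bytes.

```python
def peak_bandwidth_gbps(transfers_per_second: float, bus_width_bits: int) -> float:
    """Peak theoretical bandwidth in GBps, counting 1GB as 10**9 bytes."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# Assumption: "400MHz DDR" is treated as 400 million transfers per second.
DDR400_TRANSFERS_PER_SECOND = 400e6
# A dual-channel bus is two 64-bit channels side by side.
DUAL_CHANNEL_WIDTH_BITS = 128
SINGLE_CHANNEL_WIDTH_BITS = 64

dual_channel = peak_bandwidth_gbps(DDR400_TRANSFERS_PER_SECOND, DUAL_CHANNEL_WIDTH_BITS)
single_channel = peak_bandwidth_gbps(DDR400_TRANSFERS_PER_SECOND, SINGLE_CHANNEL_WIDTH_BITS)

print(f"dual-channel: {dual_channel}GBps, single-channel: {single_channel}GBps")
```

These are peak theoretical numbers only; real-world throughput depends on the memory controller, latency, and access patterns.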

According to Jobs, the G5 was designed to be SMP (Symmetric Multi-Processing) from the ground up, and even though a dual G5 is available, we'll be waiting for the cows to come home before it touches an eight-way Opteron system. A 3GHz G5 is apparently due out by Q3 of next year. We're sure you can hardly wait. . . .


AMD64 In A Nutshell

Just in case you don't have a knack for memorizing long lists of processor specs but still want to be trendy, here's a little AMD64 crib sheet. Enjoy.

Legacy mode. Legacy mode is fully backward-compatible: The CPU directly executes native 16-bit or 32-bit code on a 16-bit or 32-bit OS, so no recompile is required. The processor functions just like an Athlon or Athlon XP, but with a 20% to 25% performance improvement for most apps. This is how most of us would be using AMD64 if it arrived on our doorstep today.

32-bit compatibility mode. A new 64-bit OS is required for 32-bit compatibility mode, which supports x86 protected mode only. The CPU directly executes 16-bit and 32-bit code with no requirement to recompile. This mode doubles the process size of 32-bit Windows (currently limited to 2GB) to enable larger (up to 4GB) apps. The new 64-bit OS provides 32-bit libraries and a "thunking" layer for translating 32-bit system calls to 64-bit calls; AMD64's additional and wider registers are not accessible in this mode. Performance is similar to legacy mode, with potential benefits for disk I/O-dependent apps and some potential penalty for "thunking."

64-bit mode. Naturally, a new 64-bit OS is required for this mode. To run in it, 32-bit apps must be ported/recompiled to 64-bit, along with all kernel-level programs, drivers, and anything linked or plugged in to the app, even hardware drivers that the app installs. A 64-bit app cannot load 32-bit libraries in this mode. Users will see a performance increase over 32-bit apps thanks to the additional registers, and apps can grow more complex because memory is addressable beyond 4GB.
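If the 2GB and 4GB figures in the mode descriptions above seem arbitrary, they are just powers of two. The following Python sketch works through the arithmetic under stated assumptions: the names are ours, the 2GB and 4GB limits are the ones quoted in the text, and the 64-bit number is only the theoretical ceiling of a 64-bit pointer, not what any particular CPU or OS actually exposes.

```python
GB = 2**30  # 1GB counted as 2**30 bytes here


def addressable_bytes(pointer_bits: int) -> int:
    """Bytes reachable with a flat pointer of the given width."""
    return 2**pointer_bits


flat_32 = addressable_bytes(32)

# Per the text: a process on 32-bit Windows is limited to 2GB,
# which is half of the flat 32-bit address space.
win32_process_limit = flat_32 // 2

# Per the text: 32-bit compatibility mode under a 64-bit OS allows up to 4GB.
compat_process_limit = flat_32

# Assumption: theoretical ceiling only; shipping hardware exposes fewer bits.
flat_64 = addressable_bytes(64)

print(f"32-bit Windows process limit:     {win32_process_limit // GB}GB")
print(f"Compatibility mode process limit: {compat_process_limit // GB}GB")
print(f"64-bit pointer ceiling:           {flat_64 // GB:,}GB")
```

The jump from 32 to 64 address bits multiplies the ceiling by 2^32, which is why the 4GB wall stops being a practical concern in 64-bit mode.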

Registers. AMD64 (x86-64) contains a 32-bit x86 machine within the 64-bit machine. Additional functionality for 64-bit mode includes eight more 64-bit GPRs (general-purpose registers) as well as 64-bit versions of the original eight 32-bit x86 GPRs, doubling both the number and the width of the GPRs. SSE support includes eight new SSE registers and the addition of SSE2 support. Legacy x87 technology is included for now but will be phased out and replaced by SSE over time.
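Doubling both the count and the width of the GPRs adds up to more than it sounds like. This small Python sketch tallies the register state from the figures above; the RegisterFile class is a bookkeeping device we made up for illustration, not anything from AMD's documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterFile:
    """Hypothetical tally of a set of identical registers."""
    count: int
    width_bits: int

    @property
    def total_bits(self) -> int:
        return self.count * self.width_bits


# Figures from the text: eight 32-bit GPRs in classic x86.
x86_gprs = RegisterFile(count=8, width_bits=32)

# 64-bit mode doubles both the number and the width of the GPRs.
amd64_gprs = RegisterFile(
    count=x86_gprs.count * 2,
    width_bits=x86_gprs.width_bits * 2,
)

growth = amd64_gprs.total_bits // x86_gprs.total_bits
print(f"x86:   {x86_gprs.count} x {x86_gprs.width_bits}-bit = {x86_gprs.total_bits} bits")
print(f"AMD64: {amd64_gprs.count} x {amd64_gprs.width_bits}-bit = {amd64_gprs.total_bits} bits")
print(f"GPR state grows {growth}x")
```

More register state means a compiler can keep more values on the chip instead of spilling them to memory, which is one plausible source of the 64-bit mode speedup described above.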

AMD64 core improvements. Athlon was great, but AMD64 includes bigger and better branch prediction; deeper pipelining (two additional pipeline stages), which allows higher clock speeds but decreases IPC (instructions per clock); higher bandwidth for both L1 and L2 cache (effectively doubling bandwidth); and lower latency for L2 cache. The increased number of entries in the TLBs (translation lookaside buffers) is aimed at the server market but benefits 3D rendering on the desktop, too. AMD64 also doubles the number of SSE registers, adds SSE2 support, and doubles both the number and the width of the GPRs (general-purpose registers). The integrated on-chip memory controller provides lower-latency memory accesses and is aided by processor-level transistor technology. The downside of an on-chip memory controller is that support for new memory standards as they emerge requires a change to the processor itself.


64-Bit Processor Timeline

1991

  • MIPS: First real 64-bit processor, R4000 RISC, named "Microprocessor of the Year"
1992
  • DEC: Alpha EV4 64-bit RISC processor (0.75-micron, 150MHz, superscalar and superpipelined, 1.7 million transistors)
  • SGI: Acquires MIPS Technologies
1993
  • DEC: Alpha EV4 reaches 200MHz
  • SGI MIPS: R4400 ships (150MHz)
1994
  • DEC: Alpha AXP (2X 16KB cache, $1,083), Alpha EV5 1 BIPS
  • HP & Intel: Announce joint effort that eventually becomes Merced/Itanium
  • IBM: A10/Cobra & A30/Muckie 64-bit PowerPC
  • SGI MIPS: R8000 optimized for floating-point operation

1995

  • DEC: AlphaServer 8400 supports up to 12 EV5s & 14GB memory.
  • Fujitsu: HAL SPARC64, first 64-bit workstation
  • Sun: UltraSPARC

1996

  • DEC: EV5 500MHz, 2 BIPS peak execution
  • Microsoft: Promises 64-bit Windows NT for Intel's Merced launch
  • Motorola: PowerPC 620 processor
  • Nintendo: Nintendo64 game system with MIPS R4300 processor
  • SGI MIPS: R10000 (R10K) premieres out-of-order execution & multiple FPUs

1997

  • HP & Intel: Announce IA-64 architecture; say 64-bit will be server-only for five years after shipping
  • IBM: RS64
  • Intel: 0.18-micron Merced for 1999 will run all software currently operating on 32-bit Intel processor-based machines.
  • SGI MIPS: Embedded R4700 named "Microprocessor of the Year"; ships R12K, cancels H1 & H2 cores, and will adopt IA-64

1998

  • Compaq: Acquires DEC, announces Compaq Alpha EV6
  • Intel: Acquires 64-bit Alpha chip operations from DEC
  • Microsoft: Ships prebeta 64-bit development kits

1999

  • AMD: Discloses details of x86 64-bit SledgeHammer
  • IBM: POWER3, RS64-III
  • Intel: Demonstrates first computation cluster using Itanium
  • Microsoft: Ships Windows NT
  • Compaq DEC: Alpha EV67

2000
  • AMD: Releases SledgeHammer & x86-64 spec
  • IBM: Announces it will build Alpha processors for Compaq
  • Intel: Ships limited quantities of Itanium
  • Microsoft: Demonstrates 64-bit Windows
  • Sun: UltraSPARC II
  • HP: PA-8600

2001

  • HP: Acquires Compaq, including DEC
  • IBM: POWER4
  • Microsoft: Plans Windows XP 64-bit Edition for Itanium
  • SGI MIPS: MIPS-derived first 64-bit processor designed for use in space
  • Sun: UltraSPARC III

2002

  • AMD: Demos to journalists 64-bit Hammer systems running a 64-bit Linux kernel and both 32-bit x86 and 64-bit x86-64 applications
  • Apple: Announces it will be using IBM 64-bit processor
  • Fujitsu: SPARC64 V

2003

  • AMD: Opteron for servers, 940-pin Athlon 64 FX-51, 754-pin Athlon 64 processors for desktop
  • Apple: Launches first 64-bit desktop computer, the IBM PowerPC G5-based Power Mac
  • IBM: PowerPC 970
  • HP/Compaq/DEC: Alpha EV-7 supports "switchless/glueless" 64-way multiprocessing
  • Microsoft: Windows XP Professional for AMD64 beta



Copyright © 2010 Sandhills Publishing Company U.S.A. All rights reserved.