Multicore CPUs like Intel's Core 2 Extreme and AMD's dual-core Athlon 64 have brought about better performance, better power management, and a way for the industry to free itself from a slavish devotion to sheer clock speed.
But multicore CPU architectures are creating a nightmare for programmers, particularly those who want to take full advantage of the new chips' power. The upshot? Much of your brand-new CPU's potential, like an uneducated brain, is going to waste.
That's quite a change from the days when the industry was fixated on clock speed. In those days, developers got a "free ride," says Jerry Bautista, director of technology management for Intel's Microprocessor Technology Lab.
"Even if (programmers) did nothing and the clock speed doubled, their software would run significantly faster," Bautista says of the days when the megahertz wars raged. "When we go down the path of parallelism, that free ride is over."
Help is finally on the way. The chip industry is sending out the cavalry ... in the form of new development tools.
In general, "multicore" chips include two or more cores -- the central processing units of a chip -- on a single piece of silicon. This allows properly coded software to break computing tasks down into separate pieces, known as "threads," and process the threads simultaneously, in parallel, instead of sequentially, as older single-core chips require.
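To make the idea of splitting a job into threads concrete, here is a minimal Python sketch, not taken from any vendor's tools. The function name `parallel_sum` and the choice of four threads are assumptions made for illustration. One caveat: in standard CPython builds the global interpreter lock makes these threads take turns rather than run truly simultaneously, so the sketch shows the structure of the decomposition rather than a real speedup.

```python
import threading


def parallel_sum(values, num_threads=4):
    """Sum `values` by handing each thread its own slice of the list.

    `parallel_sum` and `num_threads` are illustrative names, not part of any
    chipmaker's toolkit.
    """
    # Never start more threads than there are items (and always at least one).
    num_threads = max(1, min(num_threads, len(values) or 1))
    chunk = -(-len(values) // num_threads)  # ceiling division
    partial = [0] * num_threads  # one slot per thread, so no two threads share a cell

    def worker(index):
        start = index * chunk
        partial[index] = sum(values[start:start + chunk])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every piece before combining the results

    return sum(partial)


if __name__ == "__main__":
    print(parallel_sum(list(range(1, 101))))
```

Each thread writes only to its own slot in `partial`, which is why this particular example needs no locking. The sequential version would simply be `sum(values)`.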
Although multicore platforms have been around for some time in academia and research, it's been just over two years since the chips were commercially introduced by the likes of Sun Microsystems, IBM, Intel and AMD. Now, as core counts are poised to take off with eight, 32 and even 64 cores, the software that will run on them is seriously lagging. With the exception of the gaming industry, the vast majority of software publishers aren't programming for multicore chips.
Indeed, the potential benefits of multicore chips go unrealized if the software itself isn't coded to take advantage of their primary selling point: namely, parallelism.
Put another way, for software to run at maximum speed on these chips, programmers will have to develop multithreaded applications that take advantage of the extra cores. As Alan Zeichick, president and principal analyst for Camden Associates, notes, that's hard -- hard, the way earning a Ph.D. in computer science is hard.
Typical 9-to-5 programmers, who are used to writing single-threaded apps, are ill-equipped to handle things such as memory locks or calculation delays. What's more, there are added complications like scalability: Code written for eight-, 16- or 32-core systems won't necessarily scale up to work on systems with 64 cores -- like Tilera's recently announced Tile64 -- or even more.
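The "memory locks" mentioned above can be illustrated with a short Python sketch. It assumes the phrase refers to mutual-exclusion locks that guard shared data; the `Counter` class and `count_in_parallel` function are hypothetical names invented for this example. Several threads update one counter, and the lock ensures that only one of them touches it at a time.

```python
import threading


class Counter:
    """A shared counter guarded by a lock (illustrative, not from the article)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1


def count_in_parallel(num_threads=8, increments_per_thread=10_000):
    """Have several threads bump the same counter and return the final total."""
    counter = Counter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(count_in_parallel())
```

Without the lock, two threads could read the same old value and each write back that value plus one, losing an update. The lock also hints at the scaling problem: while one thread holds it, the others wait, so adding cores does not automatically add speed.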
Now the two largest chipmakers in the world are redoubling their efforts to help programmers catch up with the hardware.
To date, the measures taken by AMD and Intel range from hybrid software-hardware solutions to new tools, benchmarks and compilers that programmers can use to scale and troubleshoot the code they produce for multicore systems. For instance, AMD's "Hardware Extensions for Software Parallelism" initiative aims to integrate software and hardware more tightly so that applications can exploit parallelism.
For Intel, the renewed push toward both promoting and simplifying development for multicore platforms is most readily evident on the company's research blog. Over the past few weeks Intel researchers have published a number of papers detailing how the company is aiming to narrow the formidable gap in application development for multicore systems.
The company also recently released its Threading Building Blocks C++ library, an open-source project to provide programmers with generic code that implements "tasks" instead of "threads." Intel claims the library is future-proof, with compatibility for its four-, eight- and (someday) even 100-core processors.
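The distinction between "tasks" and "threads" can be sketched with Python's standard library. This is an analogy only: it is not Intel's library and does not use its API. The `word_count` and `total_words` functions are hypothetical names chosen for illustration. The programmer describes independent units of work, and a pool decides how to spread them over however many workers the machine offers.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def word_count(line):
    """One small, independent task: count the words in a single line."""
    return len(line.split())


def total_words(lines, workers=None):
    """Run one task per line and add up the results.

    The code never creates or joins a thread by hand. The pool size defaults
    to the machine's reported core count, so the same code runs unchanged
    on two cores or 64.
    """
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(word_count, lines))


if __name__ == "__main__":
    print(total_words(["the free ride is over", "help is on the way"]))
```

Because the tasks say nothing about how many workers exist, the result is the same whether `workers` is 1 or 64. Only the scheduling changes, which is roughly the property a task-based library aims to provide.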
The overriding philosophy, according to Intel's Bautista, is to tackle the problem from both a hardware and software perspective and to use grants and incentives to cultivate more academic research on parallelism, something he says has been lacking in recent years.
Because these massively multicore chips don't exist yet, Bautista freely admits that Intel doesn't have all the answers.
To be fair, neither does Intel's smaller rival, AMD, even with its new initiative, Hardware Extensions for Software Parallelism. Like Intel's, AMD's philosophy is to cast as wide a net as possible, seeding the developer community at large with the tools necessary for parallel programming, while also tackling the problem from both the hardware and software sides.
In that vein, the company last week released Light-Weight Profiling, or LWP, a tool that provides real-time analysis of how to improve application performance for multicore platforms.
In essence, LWP lets the code itself make on-the-fly decisions -- while it is running -- about how best to boost the performance in a parallel environment.
"We want to enable the tuning processes to be done in an automated way at runtime," says Earl Stahl, AMD's vice president of software engineering, adding that such additions will be particularly beneficial to multithreaded runtime environments such as Sun Microsystems' Java Virtual Machine and Microsoft's .Net Framework.
These kinds of self-optimizing environments may indeed be where the solution lies, according to Zeichick. He says both AMD and Intel are trying to encourage operating-system developers -- like Microsoft, Apple and Linux vendors -- to incorporate these technologies into the operating systems themselves, a move that would save the chipmakers millions in development costs.
"Ultimately, I think they will be successful. If you take Microsoft or Red Hat, they need to have their systems scale up. It's as simple as that. Platform vendors have a vested interest in this … a bigger interest really than anyone else," he says.
Whether this happens or not, AMD and Intel certainly have to figure out a way to close the substantial gap between hardware and software that has emerged over the past two years. If not, both companies risk releasing quad-, octa- and many-core platforms to a public that really couldn't care less.
As Bautista reminds us: "It's not the hardware that's really compelling; it's what you can do with the hardware that really is the interesting stuff."