Advanced Micro Devices has revealed numerous details about the Bulldozer micro-architecture, which will power the company's next-generation central processing units (CPUs) for desktops, servers and workstations. Apparently, the main goal of AMD's designers with Bulldozer was to maximize the sharing of resources within multi-core microprocessors in order to deliver high performance while keeping power consumption and die size moderate.
The traditional approach to building a multi-core microprocessor is very straightforward: each core acts independently and shares only the most obvious resources with the others, such as the L3 cache, memory controller and processor bus. In Bulldozer designs, cores will be able to dynamically share fetch and decode blocks, caches and other units. At least in initial designs, multi-core chips will consist of several major blocks, each of which will have two independent integer cores with dedicated schedulers (sharing fetch, decode and L2 functionality) plus two 128-bit FMAC pipes with one FP scheduler. This means that each major block (a Bulldozer Module) is, according to AMD, essentially a tightly linked dual-core microprocessor with shared fetch, decode and floating-point units.
Such a "dual-core" major block will not perform as well as two traditional cores, but it will consume less power and use less die space, which in effect means that more of these major building blocks can be installed without increasing thermal design power or die size to unacceptable levels. Moreover, AMD reasonably claims that the approach is more efficient than simultaneous multi-threading or chip-level multi-threading. In fact, according to AMD, each major block can provide 80% of the performance of a dual-core microprocessor.
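To put the 80% figure in perspective, here is a rough back-of-the-envelope sketch in Python. It is not AMD's model: the function names, the normalization (one traditional core = 1.0 unit of throughput) and the assumption of perfectly linear scaling across modules are all illustrative assumptions.

```python
# Rough sketch only: assumes throughput scales linearly with module count,
# which real workloads rarely do. One traditional core is normalized to 1.0.

MODULE_SCALING = 0.80  # AMD's claim: a module delivers ~80% of a true dual-core


def module_throughput(single_core: float = 1.0) -> float:
    """Throughput of one two-core module, in units of a traditional core."""
    return 2 * single_core * MODULE_SCALING


def chip_throughput(modules: int, single_core: float = 1.0) -> float:
    """Throughput of a chip built from `modules` identical modules (hypothetical)."""
    return modules * module_throughput(single_core)


if __name__ == "__main__":
    # Four modules (eight integer cores) versus eight fully independent cores.
    print(f"4 modules:           {chip_throughput(4):.1f}")
    print(f"8 traditional cores: {8 * 1.0:.1f}")
```

Under these assumptions a module is worth about 1.6 traditional cores, so the design only pays off if the saved power and die area allow enough extra modules to make up the difference.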
AMD also implied that Bulldozer will feature a new prediction-directed instruction prefetch mechanism that overlaps instruction miss requests to cache or memory and thus improves execution efficiency. In particular, this will help AMD to maximize utilization of the "dual-core" major blocks of Bulldozer microprocessors.
Given the fact that AMD's Bulldozer architecture seems to be very modular, we can expect AMD to tailor designs in accordance with performance and/or power requirements.
The first Bulldozer processors will be made using a 32nm SOI fabrication process in 2011. According to AMD's internal simulations, a 33% increase in the number of cores may yield up to 50% additional performance in server applications.
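The two percentages imply a per-core improvement as well, which a quick calculation makes visible. The sketch below is only arithmetic on AMD's quoted figures; the function name is hypothetical, and it assumes the 50% gain is total throughput measured against the same baseline as the 33% core increase.

```python
# Arithmetic on AMD's quoted figures only; not a benchmark or an AMD formula.

def per_core_gain(core_increase: float, perf_increase: float) -> float:
    """Implied throughput gain per core, given fractional increases
    in core count and in total performance (e.g. 0.33 and 0.50)."""
    return (1 + perf_increase) / (1 + core_increase) - 1


if __name__ == "__main__":
    gain = per_core_gain(0.33, 0.50)
    print(f"Implied per-core gain: {gain:.1%}")
```

With 33% more cores and up to 50% more performance, each core would be delivering roughly 13% more throughput than before, at least in the server workloads AMD simulated.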
More info: TechReport - AMD's Bulldozer architecture revealed