Originally posted at http://cs.ucsb.edu/~cgb/stateOfTheCell.html, and thus, looks much better there.
The Cell Broadband Engine Architecture (which we shall refer to as simply the Cell architecture) was designed as a compromise between the general-purpose but slower CPU and the specific-purpose and faster GPU. It is a heterogeneous architecture: it contains processing units that specialize in different tasks. However, critics (and even some fans) of the Cell architecture claim that it is incredibly difficult to produce good, fast code on it. Having spent the last quarter working with the Cell architecture, we agree with this sentiment. But why?