Evan Cheng 1606e8e4cd Fix some significant problems with constant pools that resulted in unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues.
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.


Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@66875 91177308-0d34-0410-b5e6-96231b3b80d8
2009-03-13 07:51:59 +00:00
..
2009-01-21 04:58:48 +00:00
2009-01-26 03:31:40 +00:00
2009-01-21 04:58:48 +00:00
2009-01-06 23:10:38 +00:00
2009-01-26 22:33:37 +00:00
2009-01-26 22:33:37 +00:00
2009-01-26 03:37:41 +00:00
2009-01-06 23:10:38 +00:00
2009-01-06 23:10:38 +00:00
2009-01-26 22:33:37 +00:00

//===- README.txt - Notes for improving CellSPU-specific code gen ---------===//

This code was contributed by a team from the Computer Systems Research
Department in The Aerospace Corporation:

- Scott Michel (head bottle washer and much of the non-floating point
  instructions)
- Mark Thomas (floating point instructions)
- Michael AuYeung (intrinsics)
- Chandler Carruth (LLVM expertise)
- Nehal Desai (debugging, i32 operations, RoadRunner SPU expertise)

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR
OTHERWISE.  IN NO EVENT SHALL THE AEROSPACE CORPORATION BE LIABLE FOR DAMAGES
OF ANY KIND OR NATURE WHETHER BASED IN CONTRACT, TORT, OR OTHERWISE ARISING
OUT OF OR IN CONNECTION WITH THE USE OF THE SOFTWARE INCLUDING, WITHOUT
LIMITATION, DAMAGES RESULTING FROM LOST OR CONTAMINATED DATA, LOST PROFITS OR
REVENUE, COMPUTER MALFUNCTION, OR FOR ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL,
OR PUNITIVE  DAMAGES, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES OR
SUCH DAMAGES ARE FORESEEABLE.

---------------------------------------------------------------------------
--WARNING--:
--WARNING--: The CellSPU work is work-in-progress and "alpha" quality code.
--WARNING--:

If you are brave enough to try this code or help to hack on it, be sure
to add 'spu' to configure's --enable-targets option, e.g.:

        ./configure <your_configure_flags_here> \
           --enable-targets=x86,x86_64,powerpc,spu

---------------------------------------------------------------------------

TODO:
* Create a machine pass for performing dual-pipeline scheduling specifically
  for CellSPU, and insert branch prediction instructions as needed.

* i32 instructions:

  * i32 division (work-in-progress)

* i64 support (see i64operations.c test harness):

  * shifts and comparison operators: done
  * sign and zero extension: done
  * addition: done
  * subtraction: needed
  * multiplication: done

* i128 support:

  * zero extension, any extension: done
  * sign extension: needed
  * arithmetic operators (add, sub, mul, div): needed
  * logical operations (and, or, shl, srl, sra, xor, nor, nand): needed

    * or: done

* f64 support

  * Comparison operators:
    SETOEQ              unimplemented
    SETOGT              unimplemented
    SETOGE              unimplemented
    SETOLT              unimplemented
    SETOLE              unimplemented
    SETONE              unimplemented
    SETO                done (lowered)
    SETUO               done (lowered)
    SETUEQ              unimplemented
    SETUGT              unimplemented
    SETUGE              unimplemented
    SETULT              unimplemented
    SETULE              unimplemented
    SETUNE              unimplemented

* LLVM vector suport

  * VSETCC needs to be implemented. It's pretty straightforward to code, but
    needs implementation.

* Intrinsics

  * spu.h instrinsics added but not tested. Need to have an operational
    llvm-spu-gcc in order to write a unit test harness.

===-------------------------------------------------------------------------===