Commit Graph

3238 Commits

Author SHA1 Message Date
Adhemerval Zanella
18560fae0b This patch fixes the MC object emission of 'nop' for external function calls
and also fixes the R_PPC64_TOC16 and R_PPC64_TOC16_DS relocation offset.
The 'nop' is needed so a restore TOC instruction (ld r2,40(r1)) can be placed
by the linker to correct restore the TOC of previous function.

Current code has two issues: it defines in PPCInstr64Bit.td file a LDinto_toc
and LDtoc_restore as a DSForm_1 with DS_RA=0 where it should be
DS=2 (the 8 bytes displacement of the TOC saving). It also wrongly emits a
MC intruction using an uint32_t value while the PPC::BL8_NOP_ELF
and PPC::BLA8_NOP_ELF are both uint64_t (because of the following 'nop').

This patch corrects the remaining ExecutionEngine using MCJIT:

ExecutionEngine/2002-12-16-ArgTest.ll
ExecutionEngine/2003-05-07-ArgumentTest.ll
ExecutionEngine/2005-12-02-TailCallBug.ll
ExecutionEngine/hello.ll
ExecutionEngine/hello2.ll
ExecutionEngine/test-call.ll



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166682 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 14:29:13 +00:00
Bill Schmidt
37900c5dcb This patch addresses a PPC64 ELF issue with passing parameters consisting of
structs having size 3, 5, 6, or 7.  Such a struct must be passed and received
as right-justified within its register or memory slot.  The problem is only
present for structs that are passed in registers.

Previously, as part of a patch handling all structs of size less than 8, I
added logic to rotate the incoming register so that the struct was left-
justified prior to storing the whole register.  This was incorrect because
the address of the parameter had already been adjusted earlier to point to
the right-adjusted value in the storage slot.  Essentially I had accidentally
accounted for the right-adjustment twice.

In this patch, I removed the incorrect logic and reorganized the code to make
the flow clearer.

The removal of the rotates changes the expected code generation, so test case
structsinregs.ll has been modified to reflect this.  I also added a new test
case, jaggedstructs.ll, to demonstrate that structs of these sizes can now
be properly received and passed.

I've built and tested the code on powerpc64-unknown-linux-gnu with no new
regressions.  I also ran the GCC compatibility test suite and verified that
earlier problems with these structs are now resolved, with no new regressions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166680 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 13:38:09 +00:00
Adhemerval Zanella
aa71428378 Initial TOC support for PowerPC64 object creation
This patch adds initial PPC64 TOC MC object creation using the small mcmodel
(a single 64K TOC) adding the some TOC relocations (R_PPC64_TOC,
R_PPC64_TOC16, and R_PPC64_TOC16DS).

The addition of 'undefinedExplicitRelSym' hook on 'MCELFObjectTargetWriter'
is meant to avoid the creation of an unreferenced ".TOC." symbol (used in
the .odp creation) as well to set the R_PPC64_TOC relocation target as the
temporary ".TOC." symbol. On PPC64 ABI, the R_PPC64_TOC relocation should
not point to any symbol.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166677 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 12:27:42 +00:00
Nadav Rotem
2704834661 Implement a basic VectorTargetTransformInfo interface to be used by the loop and bb vectorizers for modeling the cost of instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166593 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 17:22:41 +00:00
Micah Villmow
aa76e9e2cf Add in support for getIntPtrType to get the pointer type based on the address space.
This checkin also adds in some tests that utilize these paths and updates some of the
clients.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166578 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-24 15:52:52 +00:00
Bill Schmidt
726c23705c This is another TLC patch for separating code for the Darwin and ELF ABIs
for the PowerPC target, and factoring the results.  This will ease future
maintenance of both subtargets.

PPCTargetLowering::LowerCall_Darwin_Or_64SVR4() has grown a lot of special-case
code for the different ABIs, making maintenance difficult.  This is getting
worse as we repair errors in the 64-bit ELF ABI implementation, while avoiding
changes to the Darwin ABI logic.  This patch splits the routine into
LowerCall_Darwin() and LowerCall_64SVR4(), allowing both versions to be
significantly simplified.  I've factored out chunks of similar code where it
made sense to do so.  I also performed similar factoring on
LowerFormalArguments_Darwin() and LowerFormalArguments_64SVR4().

There are no functional changes in this patch, and therefore no new test
cases have been developed.

Built and tested on powerpc64-unknown-linux-gnu with no new regressions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166480 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-23 15:51:16 +00:00
Nadav Rotem
cbd9a19b5d Reapply the TargerTransformInfo changes, minus the changes to LSR and Lowerinvoke.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166248 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 23:22:48 +00:00
Ulrich Weigand
6c28a7eec8 This patch fixes failures in the SingleSource/Regression/C/uint64_to_float
test case on PowerPC caused by rounding errors when converting from a 64-bit
integer to a single-precision floating point. The reason for this are
double-rounding effects, since on PowerPC we have to convert to an
intermediate double-precision value first, which gets rounded to the
final single-precision result.

The patch fixes the problem by preparing the 64-bit integer so that the
first conversion step to double-precision will always be exact, and the
final rounding step will result in the correctly-rounded single-precision
result.  The generated code sequence is equivalent to what GCC would generate.

When -enable-unsafe-fp-math is in effect, that extra effort is omitted
and we accept possible rounding errors (just like GCC does as well).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166178 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 13:16:11 +00:00
Bob Wilson
3b9a911efc Temporarily revert the TargetTransform changes.
The TargetTransform changes are breaking LTO bootstraps of clang.  I am
working with Nadav to figure out the problem, but I am reverting it for now
to get our buildbots working.

This reverts svn commits: 165665 165669 165670 165786 165787 165997
and I have also reverted clang svn 165741

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166168 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-18 05:43:52 +00:00
Bill Schmidt
7a6cb15a92 This patch addresses PR13949.
For the PowerPC 64-bit ELF Linux ABI, aggregates of size less than 8
bytes are to be passed in the low-order bits ("right-adjusted") of the
doubleword register or memory slot assigned to them.  A previous patch
addressed this for aggregates passed in registers.  However, small
aggregates passed in the overflow portion of the parameter save area are
still being passed left-adjusted.

The fix is made in PPCTargetLowering::LowerCall_Darwin_Or_64SVR4 on the
caller side, and in PPCTargetLowering::LowerFormalArguments_64SVR4 on
the callee side.  The main fix on the callee side simply extends
existing logic for 1- and 2-byte objects to 1- through 7-byte objects,
and correcting a constant left over from 32-bit code.  There is also a
fix to a bogus calculation of the offset to the following argument in
the parameter save area.

On the caller side, again a constant left over from 32-bit code is
fixed.  Additionally, some code for 1, 2, and 4-byte objects is
duplicated to handle the 3, 5, 6, and 7-byte objects for SVR4 only.  The
LowerCall_Darwin_Or_64SVR4 logic is getting fairly convoluted trying to
handle both ABIs, and I propose to separate this into two functions in a
future patch, at which time the duplication can be removed.

The patch adds a new test (structsinmem.ll) to demonstrate correct
passing of structures of all seven sizes.  Eight dummy parameters are
used to force these structures to be in the overflow portion of the
parameter save area.

As a side effect, this corrects the case when aggregates passed in
registers are saved into the first eight doublewords of the parameter
save area:  Previously they were stored left-justified, and now are
properly stored right-justified.  This requires changing the expected
output of existing test case structsinregs.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166022 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-16 13:30:53 +00:00
Micah Villmow
2c39b15073 Resubmit the changes to llvm core to update the functions to support different pointer sizes on a per address space basis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165941 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 16:24:29 +00:00
Adhemerval Zanella
f35c62bf02 PowerPC: add EmitTCEntry class for TOC creation
This patch replaces the EmitRawText by a EmitTCEntry class (specialized for
each Streamer) in PowerPC64 TOC entry creation.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165940 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-15 15:43:14 +00:00
Micah Villmow
fb384d61c7 Revert 165732 for further review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165747 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-11 21:27:41 +00:00
Micah Villmow
f3840d2c16 Add in the first iteration of support for llvm/clang/lldb to allow variable per address space pointer sizes to be optimized correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165726 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-11 17:21:41 +00:00
Bill Schmidt
a867f37897 This patch addresses PR13947.
For function calls on the 64-bit PowerPC SVR4 target, each parameter
is mapped to as many doublewords in the parameter save area as
necessary to hold the parameter.  The first 13 non-varargs
floating-point values are passed in registers; any additional
floating-point parameters are passed in the parameter save area.  A
single-precision floating-point parameter (32 bits) must be mapped to
the second (rightmost, low-order) word of its assigned doubleword
slot.

Currently LLVM violates this ABI requirement by mapping such a
parameter to the first (leftmost, high-order) word of its assigned
doubleword slot.  This is internally self-consistent but will not
interoperate correctly with libraries compiled with an ABI-compliant
compiler.

This patch corrects the problem by adjusting the parameter addressing
on both sides of the calling convention.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165714 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-11 15:38:20 +00:00
Nadav Rotem
e3d0e86919 Add a new interface to allow IR-level passes to access codegen-specific information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165665 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 22:04:55 +00:00
Bill Schmidt
26160f4e64 When generating spill and reload code for vector registers on PowerPC,
the compiler makes use of GPR0.  However, there are two flavors of
GPR0 defined by the target:  the 32-bit GPR0 (R0) and the 64-bit GPR0
(X0).  The spill/reload code makes use of R0 regardless of whether we
are generating 32- or 64-bit code.

This patch corrects the problem in the obvious manner, using X0 and
ADDI8 for 64-bit and R0 and ADDI for 32-bit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165658 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 21:25:01 +00:00
Bill Schmidt
a5d0ab5553 The PowerPC VRSAVE register has been somewhat of an odd beast since
the Altivec extensions were introduced.  Its use is optional, and
allows the compiler to communicate to the operating system which
vector registers should be saved and restored during a context switch.
In practice, this information is ignored by the various operating
systems using the SVR4 ABI; the kernel saves and restores the entire
register state.  Setting the VRSAVE register is no longer performed by
the AIX XL compilers, the IBM i compilers, or by GCC on Power Linux
systems.  It seems best to avoid this logic within LLVM as well.

This patch avoids generating code to update and restore VRSAVE for the
PowerPC SVR4 ABIs (32- and 64-bit).  The code remains in place for the
Darwin ABI.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165656 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-10 20:54:15 +00:00
Bill Wendling
6765834754 Create enums for the different attributes.
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165488 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-09 07:45:08 +00:00
Adhemerval Zanella
1c7d69bbe2 PR12716: PPC crashes on vector compare
Vector compare using altivec 'vcmpxxx' instructions have as third argument
a vector register instead of CR one, different from integer and float-point
compares. This leads to a failure in code generation, where 'SelectSETCC'
expects a DAG with a CR register and gets vector register instead.

This patch changes the behavior by just returning a DAG with the 
vector compare instruction based on the type. The patch also adds a testcase
for all vector types llvm defines.

It also included a fix on signed 5-bits predicates printing, where
signed values were not handled correctly as signed (char are unsigned by
default for PowerPC). This generates 'vspltisw' (vector splat)
instruction with SIM out of range.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165419 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-08 18:59:53 +00:00
Adhemerval Zanella
cd585084e5 PowerPC: Fix object creation with PPC::MTCRF8 instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165411 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-08 18:25:11 +00:00
Adhemerval Zanella
51aaadb7bd Add floating-point to and from integer conversion
This patch add altivec support for v4i32 to v4f32 and for v4f32 to
v4i32 vector rounding conversion.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165409 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-08 17:27:24 +00:00
Micah Villmow
3574eca1b0 Move TargetData to DataLayout.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165402 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-08 16:38:25 +00:00
Bill Schmidt
b2544ece59 This patch splits apart PPCISelLowering::LowerFormalArguments_Darwin_Or_64SVR4
into separate versions for the Darwin and 64-bit SVR4 ABIs.  This will
facilitate doing more major surgery on the 64-bit SVR4 ABI in the near future.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165336 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-05 21:27:08 +00:00
Will Schmidt
d875533b7f - Mark the BCC and BLR defs as isCodeGenOnly per error output from
llvm-tblgen -gen-asm-matcher.

 PPCInstrInfo.td |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)





git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165315 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-05 15:16:11 +00:00
Will Schmidt
916381569a - add tokens to PPCInstrInfo.td and PPCInstr64Bit.td to resolve
"Instruction 'foo' has no tokens" errors during llvm-tblgen
-gen-asm-matcher attempts.  At this time, the added
tokens are "#comment" style rather than the actual mnemonic.  This will
be revisited once the rest of the base asmparser bits get straightened
out for ppc64-elf-linux.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165237 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-04 18:14:28 +00:00
Will Schmidt
e37091931c test commit / whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165233 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-04 16:20:24 +00:00
Sylvestre Ledru
94c22716d6 Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164768 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-27 10:14:43 +00:00
Sylvestre Ledru
7e2c793a2b Fix a typo 'iff' => 'if'
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164767 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-27 09:59:43 +00:00
Bill Wendling
2c18906118 Remove the `hasFnAttr' method from Function.
The hasFnAttr method has been replaced by querying the Attributes explicitly. No
intended functionality change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164725 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-26 21:48:26 +00:00
Roman Divacky
5236ab3fdd Specify MachinePointerInfo as refering to the argument value and offset of the
store when handling byval arguments. Thus preventing reordering of the store
with load with post-RA scheduler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164553 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-24 20:47:19 +00:00
Bill Schmidt
419f376564 Small structs for PPC64 SVR4 must be passed right-justified in registers.
lib/Target/PowerPC/PPCISelLowering.{h,cpp}
 Rename LowerFormalArguments_Darwin to LowerFormalArguments_Darwin_Or_64SVR4.
 Rename LowerFormalArguments_SVR4 to LowerFormalArguments_32SVR4.
 Receive small structs right-justified in LowerFormalArguments_Darwin_Or_64SVR4.
 Rename LowerCall_Darwin to LowerCall_Darwin_Or_64SVR4.
 Rename LowerCall_SVR4 to LowerCall_32SVR4.
 Pass small structs right-justified in LowerCall_Darwin_Or_64SVR4.

test/CodeGen/PowerPC/structsinregs.ll
 New test.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164228 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-19 15:42:13 +00:00
Roman Divacky
6fc3ea2f99 Fix the isLocalCall() by checking for linker weakness as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164155 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 18:27:49 +00:00
Roman Divacky
f145c135f3 Avoid symbol name clash when filling TOC.
Patch by Adhemerval Zanella.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164141 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 17:10:37 +00:00
Roman Divacky
4cd56014ae On PPC64 emit the environment pointer. Patch by Adhemerval Zanella.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164139 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 16:55:29 +00:00
Roman Divacky
eb8b7dc536 Optimize local func calls to not emit nop for TOC restoration.
Patch by Adhemerval Zanella.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164138 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 16:47:58 +00:00
Roman Divacky
536a88ad5b When creating MCAsmBackend pass the CPU string as well. In X86AsmBackend
store this and use it to not emit long nops when the CPU is geode which
doesnt support them.

Fixes PR11212.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164132 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-18 16:08:49 +00:00
Craig Topper
6ffb4024d8 Change unsigned to uint32_t to match base class declaration and other targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@164001 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-16 18:10:23 +00:00
Craig Topper
86a1c32e67 Use LLVM_DELETED_FUNCTION in place of 'DO NOT IMPLEMENT' comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163974 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-15 17:09:36 +00:00
Michael Liao
6c7ccaa3fd Fix PR11985
- BlockAddress has no support of BA + offset form and there is no way to
  propagate that offset into machine operand;
- Add BA + offset support and a new interface 'getTargetBlockAddress' to
  simplify target block address forming;
- All targets are modified to use new interface and X86 backend is enhanced to
  support BA + offset addressing.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163743 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-12 21:43:09 +00:00
Roman Divacky
ba9d069d79 Enable exceptions handling on PPC64 now that cr misaligned spilling
was fixed in r163713.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163715 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-12 15:29:32 +00:00
Roman Divacky
9d760ae5c6 This patch corrects logic in PPCFrameLowering for save and restore of
nonvolatile condition register fields across calls under the SVR4 ABIs.                                            
                                                                                                                   
 * With the 64-bit ABI, the save location is at a fixed offset of 8 from                                           
the stack pointer.  The frame pointer cannot be used to access this                                                
portion of the stack frame since the distance from the frame pointer may                                           
change with alloca calls.                                                                                          
                                                                                                                   
 * With the 32-bit ABI, the save location is just below the general
register save area, and is accessed via the frame pointer like the rest
of the save areas.  This is an optional slot, so it must only be created                                           
if any of CR2, CR3, and CR4 were modified.                                                                      
                                                                                                                   
 * For both ABIs, save/restore logic is generated only if one of the     
nonvolatile CR fields were modified.                                   

I also took this opportunity to clean up an extra FIXME in
PPCFrameLowering.h.  Save area offsets for 32-bit GPRs are meaningless
for the 64-bit ABI, so I removed them for correctness and efficiency.


Fixes PR13708 and partially also PR13623. It lets us enable exception handling
on PPC64.

Patch by William J. Schmidt!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163713 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-12 14:47:47 +00:00
Benjamin Kramer
39646d96e7 MC: Overhaul handling of .lcomm
- Darwin lied about not supporting .lcomm and turned it into zerofill in the
  asm parser. Push the zerofill-conversion down into macho-specific code.
- This makes the tri-state LCOMMType enum superfluous, there are no targets
  without .lcomm.
- Do proper error reporting when trying to use .lcomm with alignment on a target
  that doesn't support it.
- .comm and .lcomm alignment was parsed in bytes on COFF, should be power of 2.
- Fixes PR13755 (.lcomm crashes on ELF).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163395 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-07 17:25:13 +00:00
Hal Finkel
c10d5e9dae Move the PPC TOC defs into the PPC64 InstrInfo file.
Since TOC is just defined for PPC64, move its definition to PPC64 td file.

Patch by Adhemerval Zanella.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163234 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 19:22:27 +00:00
Roman Divacky
94b17f334b Remove always true checks. Noticed by Adhemerval Zanella.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163117 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-03 16:55:42 +00:00
NAKAMURA Takumi
d2a35f2937 PPCISelLowering.cpp: Fix r162725.
[Tobias von Koch] What's happening here is that the CR6SET/CR6UNSET is breaking the chain of register copies glued to the function call (BL_SVR4 node). The scheduler then moves other instructions in between those and the function call, which isn't good!

Right. That's the case where there is no chain of register copies before the call, so InFlag == 0... Attached is a new revision of the patch which should fix this for good.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162916 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 15:52:29 +00:00
NAKAMURA Takumi
25f6b5a554 PPCISelLowering.cpp: Whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162915 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-30 15:52:23 +00:00
Hal Finkel
bbd169b1d9 Reserve space for the mandatory traceback fields on PPC64.
We need to reserve space for the mandatory traceback fields,
though leaving them as zero is appropriate for now.

Although the ABI calls for these fields to be filled in fully, no
compiler on Linux currently does this, and GDB does not read these
fields.  GDB uses the first word of zeroes during exception handling to
find the end of the function and the size field, allowing it to compute
the beginning of the function.  DWARF information is used for everything
else.  We need the extra 8 bytes of pad so the size field is found in
the right place.

As a comparison, GCC fills in a few of the fields -- language, number
of saved registers -- but ignores the rest.  IBM's proprietary OSes do
make use of the full traceback table facility.

Patch by Bill Schmidt.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162854 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-29 20:22:24 +00:00
Roman Divacky
c6c2ced384 Emit word of zeroes after the last instruction as a start of the mandatory
traceback table on PowerPC64. This helps gdb handle exceptions. The other
mandatory fields are ignored by gdb and harder to implement so just add
there a FIXME.

Patch by Bill Schmidt. PR13641.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162778 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 19:06:55 +00:00
Hal Finkel
621b77ade2 Add PPC Freescale e500mc and e5500 subtargets.
Add subtargets for Freescale e500mc (32-bit) and e5500 (64-bit) to
the PowerPC backend.

Patch by Tobias von Koch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162764 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 16:12:39 +00:00
Hal Finkel
8dc440a46a Split several PPC instruction classes.
Slight reorganisation of PPC instruction classes for scheduling. No
functionality change for existing subtargets.
 - Clearly separate load/store-with-update instructions from regular loads and stores.
 - Split IntRotateD -> IntRotateD and IntRotateDI
 - Split out fsub and fadd from FPGeneral -> FPAddSub
 - Update existing itineraries

Patch by Tobias von Koch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162729 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 02:49:14 +00:00
Hal Finkel
f3c3828e57 Allow remat of LI on PPC.
Allow load-immediates to be rematerialised in the register coalescer for
PPC. This makes test/CodeGen/PowerPC/big-endian-formal-args.ll fail,
because it relies on a register move getting emitted. The immediate load is
equivalent, so change this test case.

Patch by Tobias von Koch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162727 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 02:10:33 +00:00
Hal Finkel
82b3821208 Eliminate redundant CR moves on PPC32.
The 32-bit ABI requires CR bit 6 to be set if the call has fp arguments and
unset if it doesn't. The solution up to now was to insert a MachineNode to
set/unset the CR bit, which produces a CR vreg. This vreg was then copied
into CR bit 6. When the register allocator saw a bunch of these in the same
function, it allocated the set/unset CR bit in some random CR register (1
extra instruction) and then emitted CR moves before every vararg function
call, rather than just setting and unsetting CR bit 6 directly before every
vararg function call. This patch instead inserts a PPCcrset/PPCcrunset
instruction which are then matched by a dedicated instruction pattern.

Patch by Tobias von Koch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162725 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 02:10:27 +00:00
Hal Finkel
97d047dec7 Optimize zext on PPC64.
The zeroextend IR instruction is lowered to an 'and' node with an immediate
mask operand, which in turn gets legalised to a sequence of ori's & ands.
This can be done more efficiently using the rldicl instruction.

Patch by Tobias von Koch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162724 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-28 02:10:15 +00:00
Richard Smith
1144af3c9b Fix integer undefined behavior due to signed left shift overflow in LLVM.
Reviewed offline by chandlerc.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162623 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-24 23:29:28 +00:00
Roman Divacky
9fb8b49380 Lower constant pools and jump tables via TOC on PPC64/SVR4.
In collaboration with Adhemerval Zanella.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162562 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-24 16:26:02 +00:00
Jakob Stoklund Olesen
ea47628cba Add missing SDNPSideEffect flags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162557 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-24 14:43:27 +00:00
Roman Divacky
05b2bc8781 Revert r162034, r162035 and r162037.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162039 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-16 19:07:59 +00:00
Roman Divacky
e88d17de99 Define and handle additional fixup kinds. By Adhemerval Zanella.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162037 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-16 18:37:52 +00:00
Roman Divacky
0016f73ae5 Fix typo and grammar. By Adhemerval Zanella.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@162032 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-16 18:19:29 +00:00
Jakob Stoklund Olesen
0c5f5f4916 Don't use getNextOperandForReg().
This way of using getNextOperandForReg() was unlikely to work as
intended. We don't give any guarantees about the order of operands in
the use-def chains, so looking only at operands following a given
operand in the chain doesn't make sense.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161542 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-08 23:44:04 +00:00
Hal Finkel
8da94ad6e0 Add a comment about mftb vs. mfspr on PPC.
Thanks to Alex Rosenberg for the suggestion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161428 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-07 17:04:20 +00:00
Hal Finkel
f45717e985 MFTB on PPC64 should really be encoded using MFSPR.
The MFTB instruction itself is being phased out, and its functionality
is provided by MFSPR. According to the ISA docs, using MFSPR works on all known
chips except for the 601 (which did not have a timebase register anyway)
and the POWER3.

Thanks to Adhemerval Zanella for pointing this out!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161346 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-06 21:21:44 +00:00
Hal Finkel
8cc3474f72 Add readcyclecounter lowering on PPC64.
On PPC64, this can be done with a simple TableGen pattern.
To enable this, I've added the (otherwise missing) readcyclecounter
SDNode definition to TargetSelectionDAG.td.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161302 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-04 14:10:46 +00:00
Gabor Greif
4ecaedc771 allow 'make CPPFLAGS=<something>' work again
this makes this hack a bit more bearable
for poor souls who need to pass custom
preprocessor flags to the build process

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161240 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-03 13:31:24 +00:00
Jakob Stoklund Olesen
68c10a2ff7 Remove variable_ops from call instructions in most targets.
Call instructions are no longer required to be variadic, and
variable_ops should only be used for instructions that encode a variable
number of arguments, like the ARM stm/ldm instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160189 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-13 20:44:29 +00:00
Evan Cheng
769951f6cc Target option DisableJumpTables is a gross hack. Move it to TargetLowering instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159611 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 22:39:56 +00:00
Bob Wilson
564fbf6aff Add all codegen passes to the PassManager via TargetPassConfig.
This is a preliminary step toward having TargetPassConfig be able to
start and stop the compilation at specified passes for unit testing
and debugging.  No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159567 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 19:48:31 +00:00
Bill Wendling
0bcbd1df7a Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and
include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h.

The reasoning is because the DebugInfo module is simply an interface to the
debug info MDNodes and has nothing to do with analysis.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-28 00:05:13 +00:00
Jack Carter
0518fca843 There are a number of generic inline asm operand modifiers that
up to r158925 were handled as processor specific. Making them 
generic and putting tests for these modifiers in the CodeGen/Generic
directory caused a number of targets to fail. 

This commit addresses that problem by having the targets call 
the generic routine for generic modifiers that they don't currently
have explicit code for.

For now only generic print operands 'c' and 'n' are supported.vi


Affected files:

    test/CodeGen/Generic/asm-large-immediate.ll
    lib/Target/PowerPC/PPCAsmPrinter.cpp
    lib/Target/NVPTX/NVPTXAsmPrinter.cpp
    lib/Target/ARM/ARMAsmPrinter.cpp
    lib/Target/XCore/XCoreAsmPrinter.cpp
    lib/Target/X86/X86AsmPrinter.cpp
    lib/Target/Hexagon/HexagonAsmPrinter.cpp
    lib/Target/CellSPU/SPUAsmPrinter.cpp
    lib/Target/Sparc/SparcAsmPrinter.cpp
    lib/Target/MBlaze/MBlazeAsmPrinter.cpp
    lib/Target/Mips/MipsAsmPrinter.cpp
    
MSP430 isn't represented because it did not even run with
the long existing 'c' modifier and it was not apparent what
needs to be done to get it inline asm ready.

Contributer: Jack Carter



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159203 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 13:49:27 +00:00
NAKAMURA Takumi
d5c407d2d0 llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159112 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-24 13:32:01 +00:00
Craig Topper
0a2f793d6e Silence an unused variable warning on release builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159074 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-23 08:09:30 +00:00
Hal Finkel
009f7afbeb Add support for the PPC isel instruction.
The isel (integer select) instruction is supported on the 440 and A2
embedded cores and on the POWER7.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159045 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 23:10:08 +00:00
Hal Finkel
070b8dba80 Convert the PPC backend to use the new FMA infrastructure.
The existing contraction patterns are replaced with fma/fneg.
Overall functionality should be the same.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158955 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 00:49:52 +00:00
Hal Finkel
2bbc9193b4 Treat TargetGlobalAddress as a constant for the purpose of matching pre-inc stores on PPC.
Thanks to Tobias von Koch for pointing out this problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158932 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 20:10:48 +00:00
Hal Finkel
0fcdd8b2cc Add support for generating reg+reg (indexed) pre-inc loads on PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158823 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 15:43:03 +00:00
Lang Hames
d693cafcfb Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158757 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 22:51:23 +00:00
Jakob Stoklund Olesen
7164288c3e Implement PPCInstrInfo::isCoalescableExtInstr().
The PPC::EXTSW instruction preserves the low 32 bits of its input, just
like some of the x86 instructions. Use it to reduce register pressure
when the low 32 bits have multiple uses.

This requires a small change to PeepholeOptimizer since EXTSW takes a
64-bit input register.

This is related to PR5997.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158743 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 21:14:34 +00:00
Hal Finkel
fe5b65827f Mark most PPC register classes to avoid write-after-write.
For processors with the G5-like instruction-grouping scheme, this helps avoid
early group termination due to a write-after-write dependency within the group.
It should also help on pipelined embedded cores.

On POWER7, over the test suite, this gives an average 0.5% speedup. The largest
speedups are:

SingleSource/Benchmarks/Stanford/Quicksort - 33%
MultiSource/Applications/d/make_dparser - 21%
MultiSource/Benchmarks/FreeBench/analyzer/analyzer - 12%
MultiSource/Benchmarks/MiBench/telecomm-FFT/telecomm-fft - 12%

Largest slowdowns:

SingleSource/Benchmarks/Stanford/Bubblesort - 23%
MultiSource/Benchmarks/Prolangs-C++/city/city - 21%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - 16%
MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode - 13%

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158719 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 13:57:17 +00:00
Hal Finkel
ac81cc3282 Add support for generating reg+reg preinc stores on PPC.
PPC will now generate STWUX and friends.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158698 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 02:34:32 +00:00
Hal Finkel
2741d2cfdf Cleanup trip-count finding for PPC CTR loops (and some bug fixes).
This cleans up the method used to find trip counts in order to form CTR loops on PPC.
This refactoring allows the pass to find loops which have a constant trip count but also
happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different
classes of loops that are currently ignored.

In addition, we now search through all potential induction operations instead of just the first.
Also, we check the predicate code on the conditional branch and abort the transformation if the
code is not EQ or NE, and we then make sure that the branch to be transformed matches the
condition register defined by the comparison (multiple possible comparisons will be considered).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158607 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-16 20:34:07 +00:00
Hal Finkel
79248299f6 Add another missing 64-bit itinerary definition for the PPC A2 core.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158393 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 05:55:09 +00:00
Hal Finkel
04dccea2c3 Add some missing 64-bit itinerary definitions for the PPC A2 core.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158373 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-12 20:32:29 +00:00
Hal Finkel
16803097fb Split out the PPC instruction class IntSimple from IntGeneral.
On the POWER7, adds and logical operations can also be handled
in the load/store pipelines. We'll call these IntSimple.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158366 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-12 19:01:24 +00:00
Hal Finkel
6670c82df5 Fixes for PPC host detection and features.
POWER4 is a 64-bit CPU (better matched to the 970).
The g3 is really the 750 (no altivec), the g4+ is the 74xx (not the 750).

Patch by Andreas Tobler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158363 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-12 16:39:23 +00:00
Hal Finkel
4db738ae98 Reapply r158337, this time properly protect Darwin/PPC host CPU use with __ppc__.
Original commit message:
Move PPC host-CPU detection logic from PPCSubtarget into sys::getHostCPUName().

Both the new Linux functionality and the old Darwin functions have been moved.
This change also allows this information to be queried directly by clang and
other frontends (clang, for example, will now have real -mcpu=native support).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158349 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-12 03:03:13 +00:00
Jakob Stoklund Olesen
138c2b4e8a Revert r158337 "Move PPC host-CPU detection logic from PPCSubtarget into sys::getHostCPUName()."
This commit broke most of the PowerPC unit tests when running on
Intel/Apple.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158345 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-12 00:58:40 +00:00
Hal Finkel
7bb39d8612 Move PPC host-CPU detection logic from PPCSubtarget into sys::getHostCPUName().
Both the new Linux functionality and the old Darwin functions have been moved.
This change also allows this information to be queried directly by clang and
other frontends (clang, for example, will now have real -mcpu=native support).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158337 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 23:14:31 +00:00
Hal Finkel
5a53c6e345 Enable MFOCRF generation on the PPC A2 core.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158324 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 19:57:04 +00:00
Hal Finkel
bd5cafd9bb Rename the PPC target feature gpul to mfocrf.
The PPC target feature gpul (IsGigaProcessor) was only used for one thing:
To enable the generation of the MFOCRF instruction. Furthermore, this
instruction is available on other PPC cores outside of the G5 line. This
feature now corresponds to the HasMFOCRF flag.

No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158323 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 19:57:01 +00:00
Hal Finkel
9770be91de Add A2 to the list of PPC CPUs recognized by Linux host CPU-type detection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158322 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 19:56:57 +00:00
Hal Finkel
0a1852b774 Emit the two-operand form of the PPC mfcr instruction as mfocrf.
This is necessary on Linux and supported on Darwin, see PR2604.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158315 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 15:43:15 +00:00
Hal Finkel
2bd0acd250 Add local CPU detection for Linux PPC.
This functionality mirrors that available on PPC/Darwin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158314 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 15:43:13 +00:00
Hal Finkel
622382fc5e Add POWER6 and POWER7 CPU types to the PPC backend.
No functional change; these will be used by upcoming scheduler enhancements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158313 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 15:43:08 +00:00
Hal Finkel
71ffcfe9f8 Enable ILP scheduling for all nodes by default on PPC.
Over the entire test-suite, this has an insignificantly negative average
performance impact, but reduces some of the worst slowdowns from the
anti-dep. change (r158294).

Largest speedups:
SingleSource/Benchmarks/Stanford/Quicksort - 28%
SingleSource/Benchmarks/Stanford/Towers - 24%
SingleSource/Benchmarks/Shootout-C++/matrix - 23%
MultiSource/Benchmarks/SciMark2-C/scimark2 - 19%
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
(matrix and automotive-bitcount were both in the top-5 slowdown list from the
anti-dep. change)

Largest slowdowns:
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26%
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21%
SingleSource/Benchmarks/CoyoteBench/lpbench - 20%
MultiSource/Applications/d/make_dparser - 16%

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158296 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-10 19:32:29 +00:00
Hal Finkel
01a90f4f8f Use critical anti-dep. breaking on all PPC targets, but also add other register classes.
Using 'all' instead of 'critical' would be better because it would make it easier to
satisfy the bundling constraints, but, as noted in the FIXME, that is currently not
possible with the crs.

This yields an average 1% speedup over the entire test suite (on Power 7). Largest speedups:
SingleSource/Benchmarks/Shootout-C++/moments - 40%
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
SingleSource/Benchmarks/BenchmarkGame/nsieve-bits - 26%
SingleSource/Benchmarks/McGill/misr - 23%
MultiSource/Applications/JM/ldecod/ldecod - 22%

Largest slowdowns:
SingleSource/Benchmarks/Shootout-C++/matrix - -29%
SingleSource/Benchmarks/Shootout-C++/ary3 - -22%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - -18%
SingleSource/Benchmarks/Shootout-C++/ary - -17%
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - -15%

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158294 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-10 11:15:36 +00:00
Hal Finkel
0a3e33b633 Improve ext/trunc patterns on PPC64.
The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that
would leave self-moves in the final assembly. Replacing those patterns with ones
based on the SUBREG builtins yields better-looking code.

Thanks to Jakob and Owen for their suggestions in this matter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 22:10:19 +00:00
Hal Finkel
8bf75ed41c Enable tail merging on PPC.
Tail merging had been disabled on PPC because it would disturb bundling decisions
made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions
are made during post-RA scheduling, and tail merging is generally beneficial (the
average test-suite speedup is insignificantly positive).

Largest test-suite speedups:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23%
SingleSource/Benchmarks/Shootout-C++/ary - 21%
SingleSource/Benchmarks/Stanford/Queens - 17%

Largest slowdowns:
MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24%
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22%
MultiSource/Applications/JM/ldecod/ldecod - 14%
MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9%

This is improved by using full (instead of just critical) anti-dependency breaking,
but doing so still causes miscompiles and so cannot yet be enabled by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158259 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 03:14:50 +00:00
Hal Finkel
16b16ac840 Remove the TODO statement in the PPC README re: CTR loops
As Chris points out, this can now be removed!

TODO: check if the associated section on viterbi's inner loop can also be removed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158224 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 20:02:09 +00:00
Hal Finkel
7255d2a808 Enable PPC CTR loop formation by default.
Thanks to Jakob's help, this now causes no new test suite failures!

Over the entire test suite, this gives an average 1% speedup. The largest speedups are:
SingleSource/Benchmarks/Misc/pi - 108%
SingleSource/Benchmarks/CoyoteBench/lpbench - 54%
MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50%
SingleSource/Benchmarks/Shootout/ary3 - 32%
SingleSource/Benchmarks/Shootout-C++/matrix - 30%

The largest slowdowns are:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30%
MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22%
MultiSource/Applications/d/make_dparser - -14%
SingleSource/Benchmarks/Shootout-C++/ary - -13%

In light of these slowdowns, additional profiling work is obviously needed!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158223 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 19:19:53 +00:00