Commit Graph

7878 Commits

Author SHA1 Message Date
Sumanth Gundapaneni
adaebc8b56 Use ".arch_extension" ARM directive to support hwdiv on krait
In case of "krait" CPU, asm printer doesn't emit any ".cpu" so the
features bits are not computed. This patch lets the asm printer
emit ".cpu cortex-a9" directive for krait and the hwdiv feature is
enabled through ".arch_extension". In short, krait is treated
as "cortex-a9" with hwdiv. We can not emit ".krait" as CPU since
it is not supported bu GNU GAS yet


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230651 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 18:08:41 +00:00
Sumanth Gundapaneni
7c0f2ab3db Use ".arch_extension" ARM directive to specify the additional CPU features
This patch is in response to r223147 where the avaiable features are
computed based on ".cpu" directive. This will work clean for the standard
variants like cortex-a9. For custom variants which rely on standard cpu names
for assembly, the additional features of a CPU should be propagated. This can be
done via ".arch_extension" as long as the assembler supports it. The
implementation for krait along with unit test will be submitted in next patch.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230650 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 18:07:35 +00:00
Eric Christopher
a01bc6a59f Remove an argument-less call to getSubtargetImpl from TargetLoweringBase.
This required plumbing a TargetRegisterInfo through computeRegisterProperties
and into findRepresentativeClass which uses it for register class
iteration. This required passing a subtarget into a few target specific
initializations of TargetLowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230583 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 00:00:24 +00:00
Renato Golin
b451f4e376 Improve handling of stack accesses in Thumb-1
Thumb-1 only allows SP-based LDR and STR to be word-sized, and SP-base LDR,
STR, and ADD only allow offsets that are a multiple of 4. Make some changes
to better make use of these instructions:

* Use word loads for anyext byte and halfword loads from the stack.
* Enforce 4-byte alignment on objects accessed in this way, to ensure that
  the offset is valid.
* Do the same for objects whose frame index is used, in order to avoid having
  to use more than one ADD to generate the frame index.
* Correct how many bits of offset we think AddrModeT1_s has.

Patch by John Brawn.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230496 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 14:41:06 +00:00
Eric Christopher
f8c57a105e Rename UpdateRegAllocHint to match style guidelines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230357 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-24 19:10:57 +00:00
Tim Northover
5530ac99e6 ARM: treat [N x i32] and [N x i64] as AAPCS composite types
The logic is almost there already, with our special homogeneous aggregate
handling. Tweaking it like this allows front-ends to emit AAPCS compliant code
without ever having to count registers or add discarded padding arguments.

Only arrays of i32 and i64 are needed to model AAPCS rules, but I decided to
apply the logic to all integer arrays for more consistency.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230348 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-24 17:22:34 +00:00
Bob Wilson
b664a60ecf Fix handling of negative offsets for AddrModeT2_i8s4 in rewriteT2FrameIndex.
This is a follow up to r230233 to fix something that I noticed by
inspection. The AddrModeT2_i8s4 addressing mode does not support
negative offsets. I spent a good chunk of the day trying to come up with
a testcase for this but was not successful. This addressing mode is used
to spill and restore GPRPair registers in Thumb2 code and that does not
happen often. We also make very limited used of negative offsets when
lowering frame indexes. I am going ahead with the change anyway, because
I am pretty confident that it is correct. I also added a missing assertion
to check that the low bits of the scaled offset are zero.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230297 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-24 01:37:31 +00:00
Eric Christopher
308458a98b Rewrite the global merge pass to be subprogram agnostic for now.
It was previously using the subtarget to get values for the global
offset without actually checking each function as it was generating
code. Go ahead and solidify the current behavior and make the
existing FIXMEs more prominent.

As a note the ARM backend previously had a thumb1 and non-thumb1
set of defaults. Only the former was tested so I've changed the
behavior to only use that for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230245 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-23 19:28:45 +00:00
Bob Wilson
5820b51546 Fix incorrect immediate size for AddrModeT2_i8s4 in rewriteT2FrameIndex.
The natural way to handle this addressing mode would be to say that it has
8 bits and gets scaled by 4, but since the MC layer is expecting the scaling
to be already reflected in the immediate value, we have been setting the
Scale to 1. That's fine, but then NumBits needs to be adjusted to reflect
the effective increase in the range of the immediate. That adjustment was
missing.

The consequence is that the register scavenger can fail.
The estimateRSStackSizeLimit() function in ARMFrameLowering.cpp correctly
assumes that the AddrModeT2_i8s4 address mode can handle scaled offsets up to
1020. Under just the right circumstances, we fail to reserve space for the
scavenger because it thinks that nothing will be needed. However, the overly
pessimistic behavior in rewriteT2FrameIndex causes some frame indexes to be
out of range and require scavenged registers, and so the scavenger asserts.

Unfortunately I have not been able to come up with a testcase for this. I
can only reproduce it on an internal branch where the frame layout and
register allocation is slightly different than trunk. We really need a
way to serialize MachineInstr-level IR to write reasonable tests for things
like this.

rdar://problem/19909005

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230233 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-23 16:57:19 +00:00
Tim Northover
ca7e0787f0 CodeGen: convert CCState interface to using ArrayRefs
Everyone except R600 was manually passing the length of a static array
at each callsite, calculated in a variety of interesting ways. Far
easier to let ArrayRef handle that.

There should be no functional change, but out of tree targets may have
to tweak their calls as with these examples.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230118 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-21 02:11:17 +00:00
Eric Christopher
f179b3f1d9 Get the cached subtarget off the MachineFunction rather than
inquiring for a new one from the TargetMachine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229999 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 08:24:37 +00:00
Eric Christopher
c9d0715997 Make the TargetMachine::getSubtarget that takes a Function argument
take a reference to match the getSubtargetImpl that takes a Function
argument.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229994 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-20 07:32:59 +00:00
Ahmed Bougacha
5898fc70ec [ARM] Re-re-apply VLD1/VST1 base-update combine.
This re-applies r223862, r224198, r224203, and r224754, which were
reverted in r228129 because they exposed Clang misalignment problems
when self-hosting.

The combine caused the crashes because we turned ISD::LOAD/STORE nodes
to ARMISD::VLD1/VST1_UPD nodes.  When selecting addressing modes, we
were very lax for the former, and only emitted the alignment operand
(as in "[r1:128]") when it was larger than the standard alignment of
the memory type.

However, for ARMISD nodes, we just used the MMO alignment, no matter
what.  In our case, we turned ISD nodes to ARMISD nodes, and this
caused the alignment operands to start being emitted.

And that's how we exposed alignment problems that were ignored before
(but I believe would have been caught with SCTRL.A==1?).

To fix this, we can just mirror the hack done for ISD nodes:  only
take into account the MMO alignment when the access is overaligned.

Original commit message:
We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD
when the base pointer is incremented after the load/store.

We can do the same thing for generic load/stores.

Note that we can only combine the first load/store+adds pair in
a sequence (as might be generated for a v16f32 load for instance),
because other combines turn the base pointer addition chain (each
computing the address of the next load, from the address of the last
load) into independent additions (common base pointer + this load's
offset).

rdar://19717869, rdar://14062261.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229932 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 23:52:41 +00:00
Ahmed Bougacha
7f47189b51 [ARM] Minor cleanup to CombineBaseUpdate. NFC.
In preparation for a future patch:
- rename isLoad to isLoadOp: the former is confusing, and can be taken
  to refer to the fact that the node is an ISD::LOAD.  (it isn't, yet.)
- change formatting here and there.
- add some comments.
- const-ify bools.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229929 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 23:30:37 +00:00
Ahmed Bougacha
953c5c9458 [CodeGen] Use ArrayRef instead of std::vector&. NFC.
The former lets us use SmallVectors.  Do so in ARM and AArch64.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229925 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 23:13:10 +00:00
Michael Kuperstein
2b5910a767 Reverting r229831 due to multiple ARM/PPC/MIPS build-bot failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229841 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 11:38:11 +00:00
Michael Kuperstein
23dd089d8f Use std::bitset for SubtargetFeatures
Previously, subtarget features were a bitfield with the underlying type being uint64_t. 
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.

No functional change.

Differential Revision: http://reviews.llvm.org/D7065

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229831 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 09:01:04 +00:00
Peter Collingbourne
d93ca09fe0 MC: Remove NullStreamer hook, as it is redundant with NullTargetStreamer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229799 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 00:45:07 +00:00
Peter Collingbourne
99a5e24d34 Introduce Target::createNullTargetStreamer and use it from IRObjectFile.
A null MCTargetStreamer allows IRObjectFile to ignore target-specific
directives. Previously we were crashing.

Differential Revision: http://reviews.llvm.org/D7711

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229797 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 00:45:02 +00:00
Bradley Smith
4fe6d075d5 [ARM] Add missing M/R class CPUs
Add some of the missing M and R class Cortex CPUs, namely:

Cortex-M0+ (called Cortex-M0plus for GCC compatibility)
Cortex-M1
SC000
SC300
Cortex-R5


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229660 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 10:33:30 +00:00
Eric Christopher
24dded2342 Make the ARM AsmPrinter independent of global subtarget
initialization. Initialize the subtarget once per function and
migrate Emit{Start|End}OfAsmFile to either use attributes on the
TargetMachine or get information from the subtarget we'd use
for assembling. One bit (getISAEncoding) touched the general
AsmPrinter and the debug output. Handle this one by passing
the function for the subprogram down and updating all callers
and users.

The top-level-ness of the ARM attribute output for assembly is,
by nature, contrary to how we'd want to do this for an LTO
situation where we have multiple cpu architectures so this
solution is good enough for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229528 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-17 20:02:32 +00:00
Ahmed Bougacha
d2f8ee7194 [ARM] Remove unused declaration. NFC.
GlobalMerge was moved to lib/CodeGen a while ago, and is no longer
called "ARMGlobalMerge".


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229448 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 22:30:08 +00:00
Matthias Braun
0a7fb6f94a ARM: Transfer kill flag when lowering VSTMQIA to VSTMDIA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229425 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 19:34:30 +00:00
Andrew Trick
4f7d60c1ea AArch64: Safely handle the incoming sret call argument.
This adds a safe interface to the machine independent InputArg struct
for accessing the index of the original (IR-level) argument. When a
non-native return type is lowered, we generate the hidden
machine-level sret argument on-the-fly. Before this fix, we were
representing this argument as OrigArgIndex == 0, which is an outright
lie. In particular this crashed in the AArch64 backend where we
actually try to access the type of the original argument.

Now we use a sentinel value for machine arguments that have no
original argument index. AArch64, ARM, Mips, and PPC now check for this
case before accessing the original argument.

Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229413 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-16 18:10:47 +00:00
Aaron Ballman
66981fe208 Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229340 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-15 22:54:22 +00:00
Duncan P. N. Exon Smith
f38384bcf1 ARM: Canonicalize access to function attributes, NFC
Canonicalize access to function attributes to use the simpler API.

getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
  => getFnAttribute(Kind)

getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
  => hasFnAttribute(Kind)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229220 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-14 02:24:44 +00:00
Chandler Carruth
417c5c172c [PM] Remove the old 'PassManager.h' header file at the top level of
LLVM's include tree and the use of using declarations to hide the
'legacy' namespace for the old pass manager.

This undoes the primary modules-hostile change I made to keep
out-of-tree targets building. I sent an email inquiring about whether
this would be reasonable to do at this phase and people seemed fine with
it, so making it a reality. This should allow us to start bootstrapping
with modules to a certain extent along with making it easier to mix and
match headers in general.

The updates to any code for users of LLVM are very mechanical. Switch
from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h".
Qualify the types which now produce compile errors with "legacy::". The
most common ones are "PassManager", "PassManagerBase", and
"FunctionPassManager".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229094 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 10:01:29 +00:00
Chandler Carruth
02d6288667 Re-sort #include lines using my handy dandy ./utils/sort_includes.py
script. This is in preparation for changes to lots of include lines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229088 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-13 09:09:03 +00:00
Benjamin Kramer
d913d9d2c3 MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros
Update all callers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228930 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-12 15:35:40 +00:00
Asiri Rathnayake
b0d513e1eb ARM: Fix another regression introduced in r223113
The changes in r223113 (ARM modified-immediate syntax) have broken
instructions like:
  mov r0, #~0xffffff00
The problem is that I've added a spurious range check on the immediate
operand to ensure that it lies between INT32_MIN and UINT32_MAX. While
this range check is correct in theory, it causes problems because the
operand is stored in an int64_t (by MC). So valid 32-bit constants like
\#~0xffffff00 become out of range. The solution is to simply remove this
range check. It is not possible to validate the range of the immediate
operand with the current setup because: 1) The operand is stored in an
int64_t by MC, 2) The immediate can be of the forms #imm, #-imm, #~imm
or even #((~imm)) etc. So we just chop the value to 32 bits and use it.

Also noted that the original range check was note tested by any of the
unit tests. I've added a new test to cover #~imm kind of operands.

Change-Id: I411e90d84312a2eff01b732bb238af536c4a7599

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228920 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-12 13:37:28 +00:00
Tim Northover
848d812931 ARM & AArch64: teach LowerVSETCC that output type size may differ from input.
While various DAG combines try to guarantee that a vector SETCC
operation will have the same output size as input, there's nothing
intrinsic to either creation or LegalizeTypes that actually guarantees
it, so the function needs to be ready to handle a mismatch.

Fortunately this is easy enough, just extend or truncate the naturally
compared result.

I couldn't reproduce the failure in other backends that I know have
SIMD, so it's probably only an issue for these two due to shared
heritage.

Should fix PR21645.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228518 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-08 00:50:47 +00:00
Cameron Esfahani
d02540a1d7 Value soft float calls as more expensive in the inliner.
Summary: When evaluating floating point instructions in the inliner, ask the TTI whether it is an expensive operation.  By default, it's not an expensive operation.  This keeps the default behavior the same as before.  The ARM TTI has been updated to return back TCC_Expensive for targets which don't have hardware floating point.

Reviewers: chandlerc, echristo

Reviewed By: echristo

Subscribers: t.p.northover, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D6936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228263 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-05 02:09:33 +00:00
Bradley Smith
960ce2aafa [ARM] Fix subtarget feature set truncation when using .cpu directive
This is a bug that was caused due to storing the feature bitset in a 32-bit
variable when it is a 64-bit mask, discarding the top half of the feature set.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228151 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-04 16:23:24 +00:00
Renato Golin
0966a4e370 Adding support to LLVM for targeting Cortex-A72
Currently, Cortex-A72 is modelled as an Cortex-A57 except the fp
load balancing pass isn't enabled for Cortex-A72 as it's not
profitable to have it enabled for this core.

Patch by Ranjeet Singh.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228140 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-04 13:31:29 +00:00
Renato Golin
ff01f89466 Reverting VLD1/VST1 base-updating/post-incrementing combining
This reverts patches 223862, 224198, 224203, and 224754, which were all
related to the vector load/store combining and were reverted/reaplied
a few times due to the same alignment problems we're seeing now.

Further tests, mainly self-hosting Clang, will be needed to reapply this
patch in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228129 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-04 10:11:59 +00:00
Frederic Riss
1638ca5493 Fix some unnoticed/unwanted behavior change from r222319.
The ARM assembler allows register alias redefinitions as long as it
targets the same register. r222319 broke that. In the AArch64 case
it would just produce a new warning, but in the ARM case it would
error out on previously accepted assembler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228109 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-04 03:10:03 +00:00
Jan Wen Voung
1a63641597 Fix ARM peephole optimizeCompare to avoid optimizing unsigned cmp to 0.
Summary:
Previously it only avoided optimizing signed comparisons to 0.
Sometimes the DAGCombiner will optimize the unsigned comparisons
to 0 before it gets to the peephole pass, but sometimes it doesn't.

Fix for PR22373.

Test Plan: test/CodeGen/ARM/sub-cmp-peephole.ll

Reviewers: jfb, manmanren

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D7274

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227809 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-02 16:56:50 +00:00
Chandler Carruth
b71d385494 [multiversion] Switch the TTI queries from TargetMachine to Subtarget
now that we have a correct and cached subtarget specific to the
function.

Also, finish providing a cached per-function subtarget in the core
LLVMTargetMachine -- that layer hadn't switched over yet.

The only use of the TargetMachine was to re-lookup a subtarget for
a particular function to work around the fact that TTI was immutable.
Now that it is per-function and we haved a cached subtarget, use it.

This still leaves a few interfaces with real warts on them where we were
passing Function objects through the TTI interface. I'll remove these
and clean their usage up in subsequent commits now that this isn't
necessary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227738 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-01 14:22:17 +00:00
Chandler Carruth
d0bfb83efb [multiversion] Remove the cached TargetMachine pointer from the
intermediate TTI implementation template and instead query up to the
derived class for both the TargetMachine and the TargetLowering.

Most of the derived types had a TLI cached already and there is no need
to store a less precisely typed target machine pointer.

This will in turn make it much cleaner to look up the TLI via
a per-function subtarget instead of the generic subtarget, and it will
pave the way toward pulling the subtarget used for unroll preferences
into the same form once we are *always* using the function to look up
the correct subtarget.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227737 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-01 14:01:15 +00:00
Chandler Carruth
6e89e1316a [multiversion] Switch all of the targets over to use the
TargetIRAnalysis access path directly rather than implementing getTTI.

This even removes getTTI from the interface. It's more efficient for
each target to just register a precise callback that creates their
specific TTI.

As part of this, all of the targets which are building their subtargets
individually per-function now build their TTI instance with the function
and thus look up the correct subtarget and cache it. NVPTX, R600, and
XCore currently don't leverage this functionality, but its trivial for
them to add it now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227735 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-01 13:20:00 +00:00
Chandler Carruth
d12af8754e [multiversion] Remove a false freedom to leave the TargetMachine pointer
null.

For some reason some of the original TTI code supported a null target
machine. This seems to have been legacy, and I made matters worse when
refactoring this code by spreading that pattern further through the
various targets.

The TargetMachine can't actually be null, and it doesn't make sense to
support that use case. I've now consistently removed it and removed all
of the code trying to cope with that situation. This is probably good,
as several targets *didn't* cope with it being null despite the null
default argument in their constructors. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227734 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-01 12:38:24 +00:00
Chandler Carruth
685c2add65 [PM] Remove a bunch of stale TTI creation method declarations. I nuked
their definitions, but forgot to clean up all the declarations which are
in different files.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227698 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-01 00:22:15 +00:00
Chandler Carruth
1937233a22 [PM] Switch the TargetMachine interface from accepting a pass manager
base which it adds a single analysis pass to, to instead return the type
erased TargetTransformInfo object constructed for that TargetMachine.

This removes all of the pass variants for TTI. There is now a single TTI
*pass* in the Analysis layer. All of the Analysis <-> Target
communication is through the TTI's type erased interface itself. While
the diff is large here, it is nothing more that code motion to make
types available in a header file for use in a different source file
within each target.

I've tried to keep all the doxygen comments and file boilerplate in line
with this move, but let me know if I missed anything.

With this in place, the next step to making TTI work with the new pass
manager is to introduce a really simple new-style analysis that produces
a TTI object via a callback into this routine on the target machine.
Once we have that, we'll have the building blocks necessary to accept
a function argument as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227685 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-31 11:17:59 +00:00
Saleem Abdulrasool
02897142df ARM: make a table more readable (NFC)
This adds some comments and splits the flag calculation on type boundaries to
make the table more readable.  Addresses some post-commit review comments to SVN
r227603.  NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227670 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-31 04:12:06 +00:00
Chandler Carruth
a6a87b595d [PM] Change the core design of the TTI analysis to use a polymorphic
type erased interface and a single analysis pass rather than an
extremely complex analysis group.

The end result is that the TTI analysis can contain a type erased
implementation that supports the polymorphic TTI interface. We can build
one from a target-specific implementation or from a dummy one in the IR.

I've also factored all of the code into "mix-in"-able base classes,
including CRTP base classes to facilitate calling back up to the most
specialized form when delegating horizontally across the surface. These
aren't as clean as I would like and I'm planning to work on cleaning
some of this up, but I wanted to start by putting into the right form.

There are a number of reasons for this change, and this particular
design. The first and foremost reason is that an analysis group is
complete overkill, and the chaining delegation strategy was so opaque,
confusing, and high overhead that TTI was suffering greatly for it.
Several of the TTI functions had failed to be implemented in all places
because of the chaining-based delegation making there be no checking of
this. A few other functions were implemented with incorrect delegation.
The message to me was very clear working on this -- the delegation and
analysis group structure was too confusing to be useful here.

The other reason of course is that this is *much* more natural fit for
the new pass manager. This will lay the ground work for a type-erased
per-function info object that can look up the correct subtarget and even
cache it.

Yet another benefit is that this will significantly simplify the
interaction of the pass managers and the TargetMachine. See the future
work below.

The downside of this change is that it is very, very verbose. I'm going
to work to improve that, but it is somewhat an implementation necessity
in C++ to do type erasure. =/ I discussed this design really extensively
with Eric and Hal prior to going down this path, and afterward showed
them the result. No one was really thrilled with it, but there doesn't
seem to be a substantially better alternative. Using a base class and
virtual method dispatch would make the code much shorter, but as
discussed in the update to the programmer's manual and elsewhere,
a polymorphic interface feels like the more principled approach even if
this is perhaps the least compelling example of it. ;]

Ultimately, there is still a lot more to be done here, but this was the
huge chunk that I couldn't really split things out of because this was
the interface change to TTI. I've tried to minimize all the other parts
of this. The follow up work should include at least:

1) Improving the TargetMachine interface by having it directly return
   a TTI object. Because we have a non-pass object with value semantics
   and an internal type erasure mechanism, we can narrow the interface
   of the TargetMachine to *just* do what we need: build and return
   a TTI object that we can then insert into the pass pipeline.
2) Make the TTI object be fully specialized for a particular function.
   This will include splitting off a minimal form of it which is
   sufficient for the inliner and the old pass manager.
3) Add a new pass manager analysis which produces TTI objects from the
   target machine for each function. This may actually be done as part
   of #2 in order to use the new analysis to implement #2.
4) Work on narrowing the API between TTI and the targets so that it is
   easier to understand and less verbose to type erase.
5) Work on narrowing the API between TTI and its clients so that it is
   easier to understand and less verbose to forward.
6) Try to improve the CRTP-based delegation. I feel like this code is
   just a bit messy and exacerbating the complexity of implementing
   the TTI in each target.

Many thanks to Eric and Hal for their help here. I ended up blocked on
this somewhat more abruptly than I expected, and so I appreciate getting
it sorted out very quickly.

Differential Revision: http://reviews.llvm.org/D7293

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227669 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-31 03:43:40 +00:00
Saleem Abdulrasool
81674807cc ARM: support stack probe size on Windows on ARM
Now that -mstack-probe-size is piped through to the backend via the function
attribute as on Windows x86, honour the value to permit handling of non-default
values for stack probes.  This is needed /Gs with the clang-cl driver or
-mstack-probe-size with the clang driver when targeting Windows on ARM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227667 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-31 02:26:37 +00:00
David Blaikie
cbae08dc4d Add ARM test for r227489, but XFAIL because this is actually more work than it appeared to be.
Also revert r227489 since it didn't actually fix the thing I thought I
was fixing (since the test case was targeting the wrong architecture
initially). The change might be correct & demonstrated by other test
cases, but it's not a priority for me to find those test cases right
now.

Filed PR22417 for the failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227632 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-30 23:04:39 +00:00
Saleem Abdulrasool
7a3c3f3a96 ARM: further correct .fpu directive handling
If the original FPU specification involved a restricted VFP unit (d16), ensure
that we reset the functionality when we encounter a new FPU type.  In
particular, if the user specified vfpv3-d16, but switched to a VFPv3 (which has
32 double precision registers), we would fail to reset the D16 feature, and
treat it as being equivalent to vfpv3-d16.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227603 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-30 19:35:18 +00:00
Renato Golin
71b1347acb Revert "Revert "Matching ARM change for r227481: DebugInfo: Teach Fast ISel to respect the debug location of comparisons in jumps.""
This reverts commit r227600, since that reverted the wrong comit. Sorry.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227601 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-30 19:25:20 +00:00
Renato Golin
d0a4d9037b Revert "Matching ARM change for r227481: DebugInfo: Teach Fast ISel to respect the debug location of comparisons in jumps."
This reverts commit r227488 as it was failing ARM bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227600 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-30 19:18:58 +00:00