classes. Replace the frame pointer initialization with a static function
that'll look it up via the subtarget on the MachineFunction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232010 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
I don't know why every singled backend had to redeclare its own DataLayout.
There was a virtual getDataLayout() on the common base TargetMachine, the
default implementation returned nullptr. It was not clear from this that
we could assume at call site that a DataLayout will be available with
each Target.
Now getDataLayout() is no longer virtual and return a pointer to the
DataLayout member of the common base TargetMachine. I plan to turn it into
a reference in a future patch.
The only backend that didn't have a DataLayout previsouly was the CPPBackend.
It now initializes the default DataLayout. This commit is NFC for all the
other backends.
Test Plan: clang+llvm ninja check-all
Reviewers: echristo
Subscribers: jfb, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D8243
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231987 91177308-0d34-0410-b5e6-96231b3b80d8
time. The target independent code was passing in one all the
time and targets weren't checking validity before using. Update
a few calls to pass in a MachineFunction where necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231970 91177308-0d34-0410-b5e6-96231b3b80d8
The main issue being fixed here is that APCS targets handling a "byval align N"
parameter with N > 4 were miscounting what objects were where on the stack,
leading to FrameLowering setting the frame pointer incorrectly and clobbering
the stack.
But byval handling had grown over many years, and had multiple layers of cruft
trying to compensate for each other and calculate padding correctly. This only
really needs to be done once, in the HandleByVal function. Elsewhere should
just do what it's told by that call.
I also stripped out unnecessary APCS/AAPCS distinctions (now that Clang emits
byvals with the correct C ABI alignment), which simplified HandleByVal.
rdar://20095672
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231959 91177308-0d34-0410-b5e6-96231b3b80d8
MachineFunction argument so that it can look up the subtarget
rather than using a cached one in some Targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231888 91177308-0d34-0410-b5e6-96231b3b80d8
update all ports accordingly. Required a couple of small rewrites
in handling subtarget features during creation in PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231861 91177308-0d34-0410-b5e6-96231b3b80d8
In theory this allows the compiler to skip materializing the array on
the stack. In practice clang often fails to do that, but that's a
different story. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231571 91177308-0d34-0410-b5e6-96231b3b80d8
to disable lane switching if we don't actually have the instruction
set we want to switch to. Models the earlier check above the
conditional for the pass.
The testcase is one that triggered with the assert that's added
as part of the fix, use it to avoid adding a new testcase as it
highlights the same problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231539 91177308-0d34-0410-b5e6-96231b3b80d8
This commit enables forming vector extloads for ARM.
It only does so for legal types, and when we can't fold the extension
in a wide/long form of the user instruction.
Enabling it for larger types isn't as good an idea on ARM as it is on
X86, because:
- we pretend that extloads are legal, but end up generating vld+vmov
- we have instructions like vld {dN, dM}, which can't be generated
when we "manually expand" extloads to vld+vmov.
For legal types, the combine doesn't fire that often: in the
integration tests only in a big endian testcase, where it removes a
pointless AND.
Related to rdar://19723053
Differential Revision: http://reviews.llvm.org/D7423
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231396 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In PNaCl, most atomic instructions have their own @llvm.nacl.atomic.* function, each one, with a few exceptions, represents a consistent behaviour across all NaCl-supported targets. Unfortunately, the atomic RMW operations nand, [u]min, and [u]max aren't directly represented by any such @llvm.nacl.atomic.* function. This patch refines shouldExpandAtomicRMWInIR in TargetLowering so that a future `Le32TargetLowering` class can selectively inform the caller how the target desires the atomic RMW instruction to be expanded (ie via load-linked/store-conditional for ARM/AArch64, via cmpxchg for X86/others?, or not at all for Mips) if at all.
This does not represent a behavioural change and as such no tests were added.
Patch by: Richard Diamond.
Reviewers: jfb
Reviewed By: jfb
Subscribers: jfb, aemerson, t.p.northover, llvm-commits
Differential Revision: http://reviews.llvm.org/D7713
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231250 91177308-0d34-0410-b5e6-96231b3b80d8
a lookup, pass that in rather than use a naked call to getSubtargetImpl.
This involved passing down and around either a TargetMachine or
TargetRegisterInfo. Update all callers/definitions around the targets
and SelectionDAG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230699 91177308-0d34-0410-b5e6-96231b3b80d8
In case of "krait" CPU, asm printer doesn't emit any ".cpu" so the
features bits are not computed. This patch lets the asm printer
emit ".cpu cortex-a9" directive for krait and the hwdiv feature is
enabled through ".arch_extension". In short, krait is treated
as "cortex-a9" with hwdiv. We can not emit ".krait" as CPU since
it is not supported bu GNU GAS yet
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230651 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is in response to r223147 where the avaiable features are
computed based on ".cpu" directive. This will work clean for the standard
variants like cortex-a9. For custom variants which rely on standard cpu names
for assembly, the additional features of a CPU should be propagated. This can be
done via ".arch_extension" as long as the assembler supports it. The
implementation for krait along with unit test will be submitted in next patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230650 91177308-0d34-0410-b5e6-96231b3b80d8
This required plumbing a TargetRegisterInfo through computeRegisterProperties
and into findRepresentativeClass which uses it for register class
iteration. This required passing a subtarget into a few target specific
initializations of TargetLowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230583 91177308-0d34-0410-b5e6-96231b3b80d8
Thumb-1 only allows SP-based LDR and STR to be word-sized, and SP-base LDR,
STR, and ADD only allow offsets that are a multiple of 4. Make some changes
to better make use of these instructions:
* Use word loads for anyext byte and halfword loads from the stack.
* Enforce 4-byte alignment on objects accessed in this way, to ensure that
the offset is valid.
* Do the same for objects whose frame index is used, in order to avoid having
to use more than one ADD to generate the frame index.
* Correct how many bits of offset we think AddrModeT1_s has.
Patch by John Brawn.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230496 91177308-0d34-0410-b5e6-96231b3b80d8
The logic is almost there already, with our special homogeneous aggregate
handling. Tweaking it like this allows front-ends to emit AAPCS compliant code
without ever having to count registers or add discarded padding arguments.
Only arrays of i32 and i64 are needed to model AAPCS rules, but I decided to
apply the logic to all integer arrays for more consistency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230348 91177308-0d34-0410-b5e6-96231b3b80d8
This is a follow up to r230233 to fix something that I noticed by
inspection. The AddrModeT2_i8s4 addressing mode does not support
negative offsets. I spent a good chunk of the day trying to come up with
a testcase for this but was not successful. This addressing mode is used
to spill and restore GPRPair registers in Thumb2 code and that does not
happen often. We also make very limited used of negative offsets when
lowering frame indexes. I am going ahead with the change anyway, because
I am pretty confident that it is correct. I also added a missing assertion
to check that the low bits of the scaled offset are zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230297 91177308-0d34-0410-b5e6-96231b3b80d8
It was previously using the subtarget to get values for the global
offset without actually checking each function as it was generating
code. Go ahead and solidify the current behavior and make the
existing FIXMEs more prominent.
As a note the ARM backend previously had a thumb1 and non-thumb1
set of defaults. Only the former was tested so I've changed the
behavior to only use that for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230245 91177308-0d34-0410-b5e6-96231b3b80d8
The natural way to handle this addressing mode would be to say that it has
8 bits and gets scaled by 4, but since the MC layer is expecting the scaling
to be already reflected in the immediate value, we have been setting the
Scale to 1. That's fine, but then NumBits needs to be adjusted to reflect
the effective increase in the range of the immediate. That adjustment was
missing.
The consequence is that the register scavenger can fail.
The estimateRSStackSizeLimit() function in ARMFrameLowering.cpp correctly
assumes that the AddrModeT2_i8s4 address mode can handle scaled offsets up to
1020. Under just the right circumstances, we fail to reserve space for the
scavenger because it thinks that nothing will be needed. However, the overly
pessimistic behavior in rewriteT2FrameIndex causes some frame indexes to be
out of range and require scavenged registers, and so the scavenger asserts.
Unfortunately I have not been able to come up with a testcase for this. I
can only reproduce it on an internal branch where the frame layout and
register allocation is slightly different than trunk. We really need a
way to serialize MachineInstr-level IR to write reasonable tests for things
like this.
rdar://problem/19909005
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230233 91177308-0d34-0410-b5e6-96231b3b80d8
Everyone except R600 was manually passing the length of a static array
at each callsite, calculated in a variety of interesting ways. Far
easier to let ArrayRef handle that.
There should be no functional change, but out of tree targets may have
to tweak their calls as with these examples.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230118 91177308-0d34-0410-b5e6-96231b3b80d8
This re-applies r223862, r224198, r224203, and r224754, which were
reverted in r228129 because they exposed Clang misalignment problems
when self-hosting.
The combine caused the crashes because we turned ISD::LOAD/STORE nodes
to ARMISD::VLD1/VST1_UPD nodes. When selecting addressing modes, we
were very lax for the former, and only emitted the alignment operand
(as in "[r1:128]") when it was larger than the standard alignment of
the memory type.
However, for ARMISD nodes, we just used the MMO alignment, no matter
what. In our case, we turned ISD nodes to ARMISD nodes, and this
caused the alignment operands to start being emitted.
And that's how we exposed alignment problems that were ignored before
(but I believe would have been caught with SCTRL.A==1?).
To fix this, we can just mirror the hack done for ISD nodes: only
take into account the MMO alignment when the access is overaligned.
Original commit message:
We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD
when the base pointer is incremented after the load/store.
We can do the same thing for generic load/stores.
Note that we can only combine the first load/store+adds pair in
a sequence (as might be generated for a v16f32 load for instance),
because other combines turn the base pointer addition chain (each
computing the address of the next load, from the address of the last
load) into independent additions (common base pointer + this load's
offset).
rdar://19717869, rdar://14062261.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229932 91177308-0d34-0410-b5e6-96231b3b80d8
In preparation for a future patch:
- rename isLoad to isLoadOp: the former is confusing, and can be taken
to refer to the fact that the node is an ISD::LOAD. (it isn't, yet.)
- change formatting here and there.
- add some comments.
- const-ify bools.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229929 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, subtarget features were a bitfield with the underlying type being uint64_t.
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.
Differential Revision: http://reviews.llvm.org/D7065
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229831 91177308-0d34-0410-b5e6-96231b3b80d8
Add some of the missing M and R class Cortex CPUs, namely:
Cortex-M0+ (called Cortex-M0plus for GCC compatibility)
Cortex-M1
SC000
SC300
Cortex-R5
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229660 91177308-0d34-0410-b5e6-96231b3b80d8
initialization. Initialize the subtarget once per function and
migrate Emit{Start|End}OfAsmFile to either use attributes on the
TargetMachine or get information from the subtarget we'd use
for assembling. One bit (getISAEncoding) touched the general
AsmPrinter and the debug output. Handle this one by passing
the function for the subprogram down and updating all callers
and users.
The top-level-ness of the ARM attribute output for assembly is,
by nature, contrary to how we'd want to do this for an LTO
situation where we have multiple cpu architectures so this
solution is good enough for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229528 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a safe interface to the machine independent InputArg struct
for accessing the index of the original (IR-level) argument. When a
non-native return type is lowered, we generate the hidden
machine-level sret argument on-the-fly. Before this fix, we were
representing this argument as OrigArgIndex == 0, which is an outright
lie. In particular this crashed in the AArch64 backend where we
actually try to access the type of the original argument.
Now we use a sentinel value for machine arguments that have no
original argument index. AArch64, ARM, Mips, and PPC now check for this
case before accessing the original argument.
Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229413 91177308-0d34-0410-b5e6-96231b3b80d8
Canonicalize access to function attributes to use the simpler API.
getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
=> getFnAttribute(Kind)
getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
=> hasFnAttribute(Kind)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229220 91177308-0d34-0410-b5e6-96231b3b80d8
LLVM's include tree and the use of using declarations to hide the
'legacy' namespace for the old pass manager.
This undoes the primary modules-hostile change I made to keep
out-of-tree targets building. I sent an email inquiring about whether
this would be reasonable to do at this phase and people seemed fine with
it, so making it a reality. This should allow us to start bootstrapping
with modules to a certain extent along with making it easier to mix and
match headers in general.
The updates to any code for users of LLVM are very mechanical. Switch
from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h".
Qualify the types which now produce compile errors with "legacy::". The
most common ones are "PassManager", "PassManagerBase", and
"FunctionPassManager".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229094 91177308-0d34-0410-b5e6-96231b3b80d8
The changes in r223113 (ARM modified-immediate syntax) have broken
instructions like:
mov r0, #~0xffffff00
The problem is that I've added a spurious range check on the immediate
operand to ensure that it lies between INT32_MIN and UINT32_MAX. While
this range check is correct in theory, it causes problems because the
operand is stored in an int64_t (by MC). So valid 32-bit constants like
\#~0xffffff00 become out of range. The solution is to simply remove this
range check. It is not possible to validate the range of the immediate
operand with the current setup because: 1) The operand is stored in an
int64_t by MC, 2) The immediate can be of the forms #imm, #-imm, #~imm
or even #((~imm)) etc. So we just chop the value to 32 bits and use it.
Also noted that the original range check was note tested by any of the
unit tests. I've added a new test to cover #~imm kind of operands.
Change-Id: I411e90d84312a2eff01b732bb238af536c4a7599
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228920 91177308-0d34-0410-b5e6-96231b3b80d8