Commit Graph

30532 Commits

Author SHA1 Message Date
Kai Nacke
5672e68951 [MIPS] Add aliases for sync instruction used by Octeon CPU
This commit adds aliases for the sync instruction (synciobdma,
syncs, syncw, syncws) which are used by the Octeon CPU.

Reviewed by D. Sanders

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217477 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-10 06:10:24 +00:00
Craig Topper
c4e394a333 Use cast to MVT instead of EVT on a couple calls to getSizeInBits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217473 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-10 04:51:36 +00:00
Sanjay Patel
a9d7398280 Add a scheduling model for AMD 16H Jaguar (btver2).
This is a first pass at a scheduling model for Jaguar.
It's structured largely on the existing SandyBridge and SLM sched models.

Using this model, in addition to turning on the PostRA scheduler, results in 
some perf wins on internal and 3rd party benchmarks. There's not much difference 
in LLVM's test-suite benchmarking subset of tests.

Differential Revision: http://reviews.llvm.org/D5229



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217457 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-09 20:07:07 +00:00
Toma Tabacu
b3fa7e412b [mips] Add assembler support for .set mips0 directive.
Summary:
This directive is used to reset the assembler options to their initial values.
Assembly programmers use it in conjunction with the ".set mipsX" directives.

This patch depends on the .set push/pop directive (http://reviews.llvm.org/D4821).

Contains work done by Matheus Almeida.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4957

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217438 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-09 12:52:14 +00:00
Daniel Sanders
052538124f [mips] Move MipsTargetLowering::MipsCC::regSize() to MipsSubtarget::getGPRSizeInBytes()
Summary:
The GPR size is more a property of the subtarget than that of the ABI so move
this information to the MipsSubtarget.

No functional change.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5009


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217436 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-09 12:11:16 +00:00
Pavel Chupin
586994a74e [x32] Emit callq for CALLpcrel32
Summary:
In AT&T annotation for both x86_64 and x32 calls should be printed as
callq in assembly. It's only a matter of correct mnemonic, object output
is ok.

Test Plan: trivial test added

Reviewers: nadav, dschuff, craig.topper

Subscribers: llvm-commits, zinovy.nis

Differential Revision: http://reviews.llvm.org/D5213

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217435 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-09 11:54:12 +00:00
Daniel Sanders
9242b13a4a [mips] Don't cache IsO32 and IsFP64 in MipsTargetLowering::MipsCC
Summary:
Use a MipsSubtarget reference instead.

No functional change.

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5008

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217434 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-09 10:46:48 +00:00
Toma Tabacu
f29c5818bf [mips] Add assembler support for .set push/pop directive.
Summary:
These directives are used to save the current assembler options (in the case of ".set push") and restore the previously saved options (in the case of ".set pop").

Contains work done by Matheus Almeida.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4821

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217432 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-09 10:15:38 +00:00
Renato Golin
ccfbbaca3f ARM: Negative offset support problem
This patch is to permit a negative offset usage for a non frame access.

Patch by Igor Oblakov.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217431 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-09 09:57:59 +00:00
Bob Wilson
086832979b Set trunc store action to Expand for all X86 targets.
When compiling without SSE2, isTruncStoreLegal(F64, F32) would return Legal, whereas with SSE2 it would return Expand. And since the Target doesn't seem to actually handle a truncstore for double -> float, it would just output a store of a full double in the space for a float hence overwriting other bits on the stack.

Patch by Luqman Aden!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217410 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-09 01:13:36 +00:00
Chad Rosier
c3c0c6df2a [AArch64] Enabled AA support for Cortex-A57.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217381 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-08 15:34:16 +00:00
Matt Arsenault
13ea374e79 R600/SI: Fix assertion from copying a TargetGlobalAddress
Assert in scheduler from an inserted copy_to_regclass from
a constant.

This only seems to break sometimes when a constant initializer
address is forced into VGPRs in a non-entry block. No test
since the only case I've managed to hit only happens with a future
patch, and that case will also not be a problem once scalar instructions
are used in non-entry blocks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217380 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-08 15:07:33 +00:00
Matt Arsenault
ef4bb30475 R600/SI: Replace LDS atomics with no return versions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217379 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-08 15:07:31 +00:00
Matt Arsenault
f1cd7ce098 R600/SI: Add InstrMapping for noret atomics.
Only handles LDS atomics for now, and will be used
to replace atomics with no uses with the no return
versions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217378 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-08 15:07:27 +00:00
Chad Rosier
b30d031de4 [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.
Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!
Phabricator Review: http://reviews.llvm.org/D5103

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217371 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-08 14:43:48 +00:00
Chad Rosier
1ef487d463 [AArch64] Enabled AA support for Cortex-A53.
Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!
Phabricator Review: http://reviews.llvm.org/D5103

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217370 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-08 14:31:49 +00:00
Sid Manning
27ebc7c2f5 Spelling correction
Another trivial spelling change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217364 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-08 13:05:23 +00:00
Chandler Carruth
8ceea90956 [x86] Revert my over-eager commit in r217332.
I hadn't actually run all the tests yet and these combines have somewhat
surprisingly far reaching effects.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217333 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-07 12:37:11 +00:00
Chandler Carruth
e328c5ea83 [x86] Tweak the rules surrounding 0,0 and 1,1 v2f64 shuffles and add
support for MOVDDUP which is really important for matrix multiply style
operations that do lots of non-vector-aligned load and splats.

The original motivation was to add support for MOVDDUP as the lack of it
regresses matmul_f64_4x4 by 5% or so. However, all of the rules here
were somewhat suspicious.

First, we should always be using the floating point domain shuffles,
regardless of how many copies we have to make as a movapd is *crazy*
faster than the domain switching cost on some chips. (Mostly because
movapd is crazy cheap.) Because SHUFPD can't do the copy-for-free trick
of the PSHUF instructions, there is no need to avoid canonicalizing on
UNPCK variants, so do that canonicalizing. This also ensures we have the
chance to form MOVDDUP. =]

Second, we assume SSE2 support when doing any vector lowering, and given
that we should just use UNPCKLPD and UNPCKHPD as they can operate on
registers or memory. If vectors get spilled or come from memory at all
this is going to allow the load to be folded into the operation. If we
want to optimize for encoding size (the only difference, and only
a 2 byte difference) it should be done *much* later, likely after RA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217332 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-07 12:02:14 +00:00
Matt Arsenault
324a7cd8be R600/SI: Fix register class for some 64-bit atomics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217323 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-07 00:46:20 +00:00
Chandler Carruth
7cd7154421 [x86] Fix a pretty horrible bug and inconsistency in the x86 asm
parsing (and latent bug in the instruction definitions).

This is effectively a revert of r136287 which tried to address
a specific and narrow case of immediate operands failing to be accepted
by x86 instructions with a pretty heavy hammer: it introduced a new kind
of operand that behaved differently. All of that is removed with this
commit, but the test cases are both preserved and enhanced.

The core problem that r136287 and this commit are trying to handle is
that gas accepts both of the following instructions:

  insertps $192, %xmm0, %xmm1
  insertps $-64, %xmm0, %xmm1

These will encode to the same byte sequence, with the immediate
occupying an 8-bit entry. The first form was fixed by r136287 but that
broke the prior handling of the second form! =[ Ironically, we would
still emit the second form in some cases and then be unable to
re-assemble the output.

The reason why the first instruction failed to be handled is because
prior to r136287 the operands ere marked 'i32i8imm' which forces them to
be sign-extenable. Clearly, that won't work for 192 in a single byte.
However, making thim zero-extended or "unsigned" doesn't really address
the core issue either because it breaks negative immediates. The correct
fix is to make these operands 'i8imm' reflecting that they can be either
signed or unsigned but must be 8-bit immediates. This patch backs out
r136287 and then changes those places as well as some others to use
'i8imm' rather than one of the extended variants.

Naturally, this broke something else. The custom DAG nodes had to be
updated to have a much more accurate type constraint of an i8 node, and
a bunch of Pat immediates needed to be specified as i8 values.

The fallout didn't end there though. We also then ceased to be able to
match the instruction-specific intrinsics to the instructions so
modified. Digging, this is because they too used i32 rather than i8 in
their signature. So I've also switched those intrinsics to i8 arguments
in line with the instructions.

In order to make the intrinsic adjustments of course, I also had to add
auto upgrading for the intrinsics.

I suspect that the intrinsic argument types may have led everything down
this rabbit hole. Pretty happy with the result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217310 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-06 10:00:01 +00:00
Chandler Carruth
469c73bc27 [x86] Fix an embarressing bug in the INSERTPS formation code. The mask
computation was totally wrong, but somehow it didn't really show up with
llc.

I've added an assert that triggers on multiple existing test cases and
updated one of them to show the correct value.

There appear to still be more bugs lurking around insertps's mask. =/
However, note that this only really impacts the new vector shuffle
lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217289 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 23:19:45 +00:00
Toma Tabacu
babb45124c [mips] Change Feature-related types from unsigned to uint64_t in MipsAsmParser. No functional changes.
Summary: Found a couple of cases where unsigned was still being used. These two should be the last ones in the (entire) Mips backend.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D5028

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217257 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 16:32:09 +00:00
Matt Arsenault
89a7e3ec3e R600/SI: Use same complex patterns for DS atomics
This fixes hitting the same negative base offset problem
that was already fixed for regular loads and stores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217256 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 16:24:58 +00:00
Daniel Sanders
353cf20b9b [mips] Marked the Trap-on-Condition instructions as Mips II
Patch by Vasileios Kalintiris.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D5173


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217255 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 15:50:13 +00:00
Toma Tabacu
f47b55160c [mips] Rename data members and member functions in MipsAssemblerOptions.
Summary: Use the naming convention from the LLVM Coding Standards.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4972

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217254 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 15:43:21 +00:00
Jan Vesely
286f644bce R600: Fix FROUND
round halfway cases away from zero

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217250 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 14:26:54 +00:00
Tom Stellard
eb1fef0ec1 R600/SI: Fix bug in SIInstrInfo::legalizeOpWithMove()
We must constrain the destination register class of legalized operands
to a VGPR class or else the illegal operand may be folded back into
the instruction by the register coalescer.

This fixes a bug in add.ll that will be uncovered by future commits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217249 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 14:08:01 +00:00
Tom Stellard
7cda2d0666 R600/SI: Use S_ADD_U32 and S_SUB_U32 for low half of 64-bit operations
https://bugs.freedesktop.org/show_bug.cgi?id=83416

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217248 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 14:07:59 +00:00
Chandler Carruth
c1c5dcf069 [x86] Factor out the zero vector insertion logic in the new vector
shuffle lowering for integer vectors and share it from v4i32, v8i16, and
v16i8 code paths.

Ironically, the SSE2 v16i8 code for this is now better than the SSSE3!
=] Will have to fix the SSSE3 code next to just using a single pshufb.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217240 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 10:36:31 +00:00
Tim Northover
4b5f105a71 ARM: cover all sub-architecture enumerators to keep compiler happy.
No change in behaviour (hopefully).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217233 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 07:56:46 +00:00
Jiangning Liu
b20b9bf9fd [AArch64] Add pass to enable additional comparison optimizations by CSE.
Patched by Sergey Dmitrouk.

This pass tries to make consecutive compares of values use same operands to
allow CSE pass to remove duplicated instructions. For this it analyzes
branches and adjusts comparisons with immediate values by converting:

GE -> GT
GT -> GE
LT -> LE
LE -> LT

and adjusting immediate values appropriately. It basically corrects two
immediate values towards each other to make them equal.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217220 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 02:55:24 +00:00
Reid Kleckner
f2cdc0b1e9 X86: cpuid and xgetbv write to 32-bit registers, not 64-bit
This fixes an issue where MS inline assembly containing xgetbv wouldn't
be marked as clobbering EAX:EDX. Test for that forthcoming on the Clang
side.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217173 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 16:58:25 +00:00
Tim Northover
8dcac5d77a AArch64: fix vector-immediate BIC/ORR on big-endian devices.
Follow up to r217138, extending the logic to other NEON-immediate instructions.
As before, the instruction already performs the correct operation and we're
just using a different type for convenience, so we want a true nop-cast.

Patch by Asiri Rathnayake.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217159 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 15:05:24 +00:00
Toma Tabacu
0f8b5790d6 [mips] Rename MipsAsmParser functions to conform to the LLVM Coding Standards. No functional changes.
Summary: There are still some functions which should be renamed, but they are inherited from the generic MC classes.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D5068

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217145 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 13:23:44 +00:00
Aaron Ballman
fa9120f514 Silencing a usually-helpful-but-braindead-silly-in-this-case sign mismatch warning with MSVC. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217143 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 11:52:24 +00:00
Tim Northover
dfe4e3e706 AArch64: fix big-endian immediate materialisation
We were materialising big-endian constants using DAG nodes with types different
from what was requested, followed by a bitcast. This is fine on little-endian
machines where bitcasting is a nop, but we need a slightly different
representation for big-endian. This adds a new set of NVCAST (natural-vector
cast) operations which are always nops.

Patch by Asiri Rathnayake.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217138 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 09:46:14 +00:00
Chandler Carruth
ae98867126 [x86] Teach the new v4i32 shuffle lowering some more tricks to recognize
vzext patterns and insert-element patterns that for SSE4 have dedicated
instructions.

With this we can enable the experimental mode in a regression test that
happens to cover some of the past set of issues. You can see that the
new logic does significantly better here on the floating point cases.

A follow-up to this change and the previous ones will hoist the logic
into helpers so it can be shared across element type sizes as in this
particular case it generalizes cleanly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217136 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 09:26:30 +00:00
Elena Demikhovsky
a91600713d Fixed compilation problem on Windows (initialization of non-aggregate type).
After commit 217131.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217134 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 07:20:39 +00:00
Elena Demikhovsky
df1bc5a200 X86 Intrinsics table - changed to a static table sorted by intrinsic id.
Used binary search over the tables.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217131 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 06:34:34 +00:00
Juergen Ributzka
319b5d0e8e [FastISel][AArch64] Cleanup and simplify 'fastSelectInstruction'. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217119 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 01:29:21 +00:00
Juergen Ributzka
68a4ab08b3 [FastISel][AArch64] Add target-specific lowering for logical operations.
This change adds support for immediate and shift-left folding into logical
operations.

This fixes rdar://problem/18223183.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217118 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 01:29:18 +00:00
Chandler Carruth
fa2dfaedf2 [x86] Teach the new vector shuffle lowering about the zero masking
abilities of INSERTPS which are really powerful and come up in very
important contexts such as forming diagonal matrices, etc.

With this I ended up being able to remove the somewhat weird helper
I added for INSERTPS because we can collapse the entire state to a no-op
mask. Added a bunch of tests for inserting into a zero-ish vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217117 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 01:13:48 +00:00
Matt Arsenault
fa2e31c394 R600/SI: Un-move pattern I forgot to remove in last commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217109 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 23:28:57 +00:00
Matt Arsenault
c9cc488dfe R600/SI: Try to keep i32 mul on SALU
Also fix bug this exposed where when legalizing an immediate
operand, a v_mov_b32 would be created with a VSrc dest register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217108 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 23:24:35 +00:00
Chandler Carruth
699fd1909e [x86] Teach the new vector shuffle lowering about the simplest of
'insertps' patterns.

This replaces two shuffles with a single insertps in very common cases.
My next patch will extend this to leverage the zeroing capabilities of
insertps which will allow it to be used in a much wider set of cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217100 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 22:48:34 +00:00
Chandler Carruth
5f209637c4 [x86] Teach the asm comment printing to only print the clarification of
an immediate operand when we don't have instruction-specific comments.

This ensures that instruction-specific comments are attached to the same
line as the instruction which is important for using them to write
readable and maintainable tests. My next commit will just such a test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217099 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 22:46:44 +00:00
Robin Morisset
1ad925ccf8 Refactor AtomicExpandPass and add a generic isAtomic() method to Instruction
Summary:
Split shouldExpandAtomicInIR() into different versions for Stores/Loads/RMWs/CmpXchgs.
Makes runOnFunction cleaner (no more redundant checking/casting), and will help moving
the X86 backend to this pass.

This requires a way of easily detecting which instructions are atomic.
I followed the pattern of mayReadFromMemory, mayWriteOrReadMemory, etc.. in making
isAtomic() a method of Instruction implemented by a switch on the opcodes.

Test Plan: make check

Reviewers: jfb

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D5035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217080 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 21:29:59 +00:00
Benjamin Kramer
f0644b42a7 Make some helpers static or move into the llvm namespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217077 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 21:04:12 +00:00
Robin Morisset
4b2698cf19 Use target-dependent emitLeading/TrailingFence instead of the target-independent insertLeading/TrailingFence (in AtomicExpandPass)
Fixes two latent bugs:
- There was no fence inserted before expanded seq_cst load (unsound on Power)
- There was only a fence release before seq_cst stores (again unsound, in particular on Power)
    It is not even clear if this is correct on ARM swift processors (where release fences are
    DMB ishst instead of DMB ish). This behaviour is currently preserved on ARM Swift
    as it is not clear whether it is incorrect. I would love to get documentation stating
    whether it is correct or not.
These two bugs were not triggered because Power is not (yet) using this pass, and these
behaviours happen to be (mostly?) working on ARM
(although they completely butchered the semantics of the llvm IR).

See:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075821.html
for an example of the problems that can be caused by the second of these bugs.

I couldn't see a way of fixing these in a completely target-independent way without
adding lots of unnecessary fences on ARM, hence the target-dependent parts of this
patch.

This patch implements the new target-dependent parts only for ARM (the default
of not doing anything is enough for AArch64), other architectures will use this
infrastructure in later patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217076 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 21:01:03 +00:00