Commit Graph

30204 Commits

Author SHA1 Message Date
Chandler Carruth
60dbe0fd0d [x86] Update the order of instructions after I switched to a bitcast
helper that skips creating a cast when it isn't necessary.

It's really somewhat concerning that this was caused by the the presence
of a no-op bitcast, but...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238642 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 06:02:37 +00:00
Chandler Carruth
d8018eeac9 [x86] Restore the bitcasts I removed when refactoring this to avoid
shifting vectors of bytes as x86 doesn't have direct support for that.

This removes a bunch of redundant masking in the generated code for SSE2
and SSE3.

In order to avoid the really significant code size growth this would
have triggered, I also factored the completely repeatative logic for
shifting and masking into two lambdas which in turn makes all of this
much easier to read IMO.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238637 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 04:05:11 +00:00
Chandler Carruth
828f5b807c [x86] Implement a faster vector population count based on the PSHUFB
in-register LUT technique.

Summary:
A description of this technique can be found here:
http://wm.ite.pl/articles/sse-popcount.html

The core of the idea is to use an in-register lookup table and the
PSHUFB instruction to compute the population count for the low and high
nibbles of each byte, and then to use horizontal sums to aggregate these
into vector population counts with wider element types.

On x86 there is an instruction that will directly compute the horizontal
sum for the low 8 and high 8 bytes, giving vNi64 popcount very easily.
Various tricks are used to get vNi32 and vNi16 from the vNi8 that the
LUT computes.

The base implemantion of this, and most of the work, was done by Bruno
in a follow up to D6531. See Bruno's detailed post there for lots of
timing information about these changes.

I have extended Bruno's patch in the following ways:

0) I committed the new tests with baseline sequences so this shows
   a diff, and regenerated the tests using the update scripts.

1) Bruno had noticed and mentioned in IRC a redundant mask that
   I removed.

2) I introduced a particular optimization for the i32 vector cases where
   we use PSHL + PSADBW to compute the the low i32 popcounts, and PSHUFD
   + PSADBW to compute doubled high i32 popcounts. This takes advantage
   of the fact that to line up the high i32 popcounts we have to shift
   them anyways, and we can shift them by one fewer bit to effectively
   divide the count by two. While the PSHUFD based horizontal add is no
   faster, it doesn't require registers or load traffic the way a mask
   would, and provides more ILP as it happens on different ports with
   high throughput.

3) I did some code cleanups throughout to simplify the implementation
   logic.

4) I refactored it to continue to use the parallel bitmath lowering when
   SSSE3 is not available to preserve the performance of that version on
   SSE2 targets where it is still much better than scalarizing as we'll
   still do a bitmath implementation of popcount even in scalar code
   there.

With #1 and #2 above, I analyzed the result in IACA for sandybridge,
ivybridge, and haswell. In every case I measured, the throughput is the
same or better using the LUT lowering, even v2i64 and v4i64, and even
compared with using the native popcnt instruction! The latency of the
LUT lowering is often higher than the latency of the scalarized popcnt
instruction sequence, but I think those latency measurements are deeply
misleading. Keeping the operation fully in the vector unit and having
many chances for increased throughput seems much more likely to win.

With this, we can lower every integer vector popcount implementation
using the LUT strategy if we have SSSE3 or better (and thus have
PSHUFB). I've updated the operation lowering to reflect this. This also
fixes an issue where we were scalarizing horribly some AVX lowerings.

Finally, there are some remaining cleanups. There is duplication between
the two techniques in how they perform the horizontal sum once the byte
population count is computed. I'm going to factor and merge those two in
a separate follow-up commit.

Differential Revision: http://reviews.llvm.org/D10084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238636 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 03:20:59 +00:00
Chandler Carruth
43d1e87d73 [x86] Restructure the parallel bitmath lowering of popcount into
a separate routine, generalize it to work for all the integer vector
sizes, and do general code cleanups.

This dramatically improves lowerings of byte and short element vector
popcount, but more importantly it will make the introduction of the
LUT-approach much cleaner.

The biggest cleanup I've done is to just force the legalizer to do the
bitcasting we need. We run these iteratively now and it makes the code
much simpler IMO. Other changes were minor, and mostly naming and
splitting things up in a way that makes it more clear what is going on.

The other significant change is to use a different final horizontal sum
approach. This is the same number of instructions as the old method, but
shifts left instead of right so that we can clear everything but the
final sum with a single shift right. This seems likely better than
a mask which will usually have to read the mask from memory. It is
certaily fewer u-ops. Also, this will be temporary. This and the LUT
approach share the need of horizontal adds to finish the computation,
and we have more clever approaches than this one that I'll switch over
to.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238635 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 03:20:55 +00:00
Filipe Cabecinhas
3b821159da [BitcodeReader] Change an assert to a call to a call to Error()
It's reachable from user input.

Bug found with AFL fuzz.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238633 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-30 00:17:20 +00:00
Reid Kleckner
bfa311df8c [WinEH] Adjust the 32-bit SEH prologue to better match reality
It turns out that _except_handler3 and _except_handler4 really use the
same stack allocation layout, at least today. They just make different
choices about encoding the LSDA.

This is in preparation for lowering the llvm.eh.exceptioninfo().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238627 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 22:57:46 +00:00
Reid Kleckner
f0e3e4cd84 Disable FP elimination in funcs using 32-bit MSVC EH personalities
The value in 'ebp' acts as an implicit argument to the outlined
handlers, and is recovered with frameaddress(1).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238619 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 21:58:11 +00:00
Matthias Braun
3bd732d1ee MachineCopyPropagation: Remove the copies instead of using KILL instructions.
For some history here see the commit messages of r199797 and r169060.

The original intent was to fix cases like:

%EAX<def> = COPY %ECX<kill>, %RAX<imp-def>
%RCX<def> = COPY %RAX<kill>

where simply removing the copies would have RCX undefined as in terms of
machine operands only the ECX part of it is defined. The machine
verifier would complain about this so 169060 changed such COPY
instructions into KILL instructions so some super-register imp-defs
would be preserved. In r199797 it was finally decided to always do this
regardless of super-register defs.

But this is wrong, consider:
R1 = COPY R0
...
R0 = COPY R1
getting changed to:
R1 = KILL R0
...
R0 = KILL R1

It now looks like R0 dies at the first KILL and won't be alive until the
second KILL, while in reality R0 is alive and must not change in this
part of the program.

As this only happens after register allocation there is not much code
still performing liveness queries so the issue was not noticed.  In fact
I didn't manage to create a testcase for this, without unrelated changes
I am working on at the moment.

The fix is simple: As of r223896 the MachineVerifier allows reads from
partially defined registers, so the whole transforming COPY->KILL thing
is not necessary anymore. This patch also changes a similar (but more
benign case as the def and src are the same register) case in the
VirtRegRewriter.

Differential Revision: http://reviews.llvm.org/D10117

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238588 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 18:19:25 +00:00
Nemanja Ivanovic
8493722975 Add support for VSX FMA single-precision instructions to the PPC back end
This patch corresponds to review:
http://reviews.llvm.org/D9941

It adds the various FMA instructions introduced in the version 2.07 of
the ISA along with the testing for them. These are operations on single
precision scalar values in VSX registers.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238578 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 17:13:25 +00:00
Alex Lorenz
83d291ae8b MIR Serialization: use correct line and column numbers for LLVM IR errors.
This commit translates the line and column numbers for LLVM IR
errors from the numbers in the YAML block scalar to the numbers 
in the MIR file so that the MIRParser users can report LLVM IR 
errors with the correct line and column numbers.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10108


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238576 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 17:05:41 +00:00
Reid Kleckner
16e4a624c4 [WinEH] Emit EH tables for __CxxFrameHandler3 on 32-bit x86
Small (really small!) C++ exception handling examples work on 32-bit x86
now.

This change disables the use of .seh_* directives in WinException when
CFI is not in use. It also uses absolute symbol references in the tables
instead of imagerel32 relocations.

Also fixes a cache invalidation bug in MMI personality classification.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238575 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 17:00:57 +00:00
Jingyue Wu
ef056a9111 [NVPTXFavorNonGenericAddrSpaces] recursively trace into GEP and BitCast
Summary:
This patch allows NVPTXFavorNonGenericAddrSpaces to remove addrspacecast
from longer chains consisting of GEPs and BitCasts. For example, it can
now optimize

  %0 = addrspacecast [10 x float] addrspace(3)* @a to [10 x float]*
  %1 = gep [10 x float]* %0, i64 0, i64 %i
  %2 = bitcast float* %1 to i32*
  %3 = load i32* %2 ; emits ld.u32

to

  %0 = gep [10 x float] addrspace(3)* @a, i64 0, i64 %i
  %1 = bitcast float addrspace(3)* %0 to i32 addrspace(3)*
  %3 = load i32 addrspace(3)* %1 ; emits ld.shared.f32

Test Plan: @ld_int_from_global_float in access-non-generic.ll

Reviewers: broune, eliben, jholewinski, meheff

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10074

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238574 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 17:00:27 +00:00
Jingyue Wu
ed0d841f59 [DependenceAnalysis] Extend unifySubscriptType for handling coupled subscript groups.
Summary:
In continuation to an earlier commit to DependenceAnalysis.cpp by jingyue (r222100), the type for all subscripts in a coupled group need to be the same since constraints from one subscript may be propagated to another during testing. During testing, new SCEVs may be created and the operands for these need to be the same.
This patch extends unifySubscriptType() to work on lists of subscript pairs, ensuring a common extended type for all of them.

Test Plan:
Added a test case to NonCanonicalizedSubscript.ll which causes dependence analysis to crash without this fix.

All regression tests pass.

Reviewers: spop, sebpop, jingyue

Reviewed By: jingyue

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9698

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238573 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 16:58:08 +00:00
Colin LeMahieu
0ace3c01f7 [Hexagon] Disassembling, printing, and emitting instructions a whole-bundle at a time which is the semantic unit for Hexagon. Fixing tests to use the new format. Disabling tests in the direct object emission path for a followup patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238556 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 14:44:13 +00:00
Quentin Colombet
7e31fe7e20 Add a test for the MachineCopyPropagation change landed in r238518.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238537 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 01:40:00 +00:00
Ahmed Bougacha
d4b59dcdba [TableGen][AsmMatcherEmitter] Only parse isolated tokens as registers.
Fixes PR23455, where, when TableGen generates the matcher from the
AsmString, it splits "cmp${cc}ss" into tokens, and the "ss" suffix
is recognized as the SS register.

I can't think of a situation where that's a feature, not a bug, hence:
when a token is "isolated", i.e., it is followed and preceded by
separators, it shouldn't be parsed as a register.

Differential Revision: http://reviews.llvm.org/D9844


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238536 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 01:03:37 +00:00
Ahmed Bougacha
c0d1ce55ac [IR] fptrunc-of-fptrunc isn't an EliminableCastPair.
Double and single rounding can produce different results.
This is the IR counterpart to r228911.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238531 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-29 00:04:30 +00:00
Chandler Carruth
b78a66659b [x86] Move the vector popcount tests into non-ISA files, and instead
organize them by the width of vector.

This makes it a lot easier to see that we're covering all of the vector
types but not doing so excessively. This also adds tests across the
spectrum of SSE versions in addition to the AVX versions.

If you're really tired of seeing the *massive* sprawl of scalarized code
for this, don't worry, I'm just about to land Bruno's patch that
dramatically improve the situation for SSSE3 and newer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238520 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 22:46:48 +00:00
Alex Lorenz
3682046086 MIR Serialization: print and parse machine function names.
This commit introduces a serializable structure called
'llvm::yaml::MachineFunction' that stores the machine
function's name. This structure will mirror the machine 
function's state in the future.

This commit prints machine functions as YAML documents
containing a YAML mapping that stores the state of a machine
function. This commit also parses the YAML documents
that contain the machine functions.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D9841


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238519 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 22:41:12 +00:00
David Majnemer
47b5a3cbea Add testcase for r238503.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238515 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 22:12:27 +00:00
Reid Kleckner
5f50442d79 [WinEH] Start inserting state number stores for C++ EH
This moves all the state numbering code for C++ EH to WinEHPrepare so
that we can call it from the X86 state numbering IR pass that runs
before isel.

Now we just call the same state numbering machinery and insert a bunch
of stores. It also populates MachineModuleInfo with information about
the current function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238514 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 22:00:24 +00:00
Rafael Espindola
fb48de619e Don't special case undefined symbol when deciding the symbol order.
ELF has no restrictions on where undefined symbols go relative to other defined
symbols. In fact, gas just sorts them together. Do the same.

This was there since r111174 probably just because the MachO writer has it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238513 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 21:59:34 +00:00
Andy Ayers
588f5f0cbb Revise test to run llc and llvm-mc separately.
Differential Revision: http://reviews.llvm.org/D10066

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238508 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 21:49:50 +00:00
Wei Mi
61897e8564 Enable exitValue rewrite only when the cost of expansion is low.
The patch evaluates the expansion cost of exitValue in indVarSimplify pass, and only does the rewriting when the expansion cost is low or loop can be deleted with the rewriting. It provides an option "-replexitval=" to control the default aggressiveness of the exitvalue rewriting. It also fixes some missing cases in SCEVExpander::isHighCostExpansionHelper to enhance the evaluation of SCEV expansion cost.

Differential Revision: http://reviews.llvm.org/D9800


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238507 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 21:49:07 +00:00
Reid Kleckner
7738ecd62b Disable x86 tail call optimizations that jump through GOT
For x86 targets, do not do sibling call optimization when materializing
the callee's address would require a GOT relocation. We can still do
tail calls to internal functions, hidden functions, and protected
functions, because they do not require this kind of relocation. It is
still possible to get GOT relocations when the user explicitly asks for
it with musttail or -tailcallopt, both of which are supposed to
guarantee TCO.

Based on a patch by Chih-hung Hsieh.

Reviewers: srhines, timmurray, danalbert, enh, void, nadav, rnk

Subscribers: joerg, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D9799

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238487 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 20:44:28 +00:00
Daniel Sanders
1348f57925 Revert r238427 - [mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only.
It caused a smaller number of failures than the previous attempt at committing but still caused a couple on the llvm-linux-mips builder. Reverting while I investigate the remainder.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238483 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 20:30:32 +00:00
Alexey Samsonov
8ecf661ef1 Object, ELF: Use error code instead of calling report_fatal_error()
Make createELFObjectFile() return object_error::parse_failed on
encountering invalid ELF file, instead of crashing the program.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238481 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 20:25:42 +00:00
Peter Collingbourne
27565d6185 Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM.
We were previously codegen'ing these as regular load/store operations and
hoping that the register allocator would allocate registers in ascending order
so that we could apply an LDM/STM combine after register allocation. According
to the commit that first introduced this code (r37179), we planned to teach
the register allocator to allocate the registers in ascending order. This
never got implemented, and up to now we've been stuck with very poor codegen.

A much simpler approach for achiveing better codegen is to create LDM/STM
instructions with identical sets of virtual registers, let the register
allocator pick arbitrary registers and order register lists when printing an
MCInst. This approach also avoids the need to repeatedly calculate offsets
which ultimately ought to be eliminated pre-RA in order to decrease register
pressure.

This is implemented by lowering the memcpy intrinsic to a series of SD-only
MCOPY pseudo-instructions which performs a memory copy using a given number
of registers. During SD->MI lowering, we lower MCOPY to LDM/STM. This is a
little unusual, but it avoids the need to encode register lists in the SD,
and we can take advantage of SD use lists to decide whether to use the _UPD
variant of the instructions.

Fixes PR9199.

Differential Revision: http://reviews.llvm.org/D9508

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238473 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 20:02:45 +00:00
David Majnemer
967f6ad3e1 [InstCombine] Fold IntToPtr and PtrToInt into preceding loads.
Currently we only fold a BitCast into a Load when the BitCast is its
only user.

Do the same for any no-op cast.

Differential Revision: http://reviews.llvm.org/D9152

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238452 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 18:39:17 +00:00
Kai Nacke
95fa1db8f5 [mips] Add new format for dmtc2/dmfc2 for Octeon CPUs.
Octeon CPUs use dmtc2 rt,imm16 and dmfcp2 rt,imm16 for the crypto coprocessor.
E.g. dmtc2 rt,0x4057 starts calculation of sha-1.

I had to introduce a new deconding namespace to avoid a decoding conflict.

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D10083


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238439 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 16:23:16 +00:00
Ed Maste
5d25204af9 DebugInfo: .debug_line DWARF64 support
This adds support for the 64-bit DWARF format, but is still limited to
less than 4GB of debug data by the DataExtractor class.  Some versions
of the GNU MIPS toolchain generate 64-Bit DWARF even though it isn't
actually necessary.

Differential Revision: http://reviews.llvm.org/D1988


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238434 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 15:38:17 +00:00
Rafael Espindola
c4e38f605e Don't create an unused _GLOBAL_OFFSET_TABLE_.
This was a bug for bug compatibility with gas that is completely unnecessary.
If a _GLOBAL_OFFSET_TABLE_ symbol is used, it will already be created by
the time we get to the ELF writer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238432 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 15:20:00 +00:00
Daniel Sanders
8bf191d139 [mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only.
Summary:
Following on from r209907 which made personality encodings indirect, do the
same for TType encodings. This fixes the case where a try/catch block needs
to generate references to, for example, std::exception in the
.gcc_except_table.

Reviewers: petarj

Reviewed By: petarj

Subscribers: srhines, joerg, tberghammer, llvm-commits

Differential Revision: http://reviews.llvm.org/D9669



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238427 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 14:52:15 +00:00
Petar Jovanovic
a703f676f7 [Mips64] Add support for MCJIT for MIPS64r2 and MIPS64r6
Add support for resolving MIPS64r2 and MIPS64r6 relocations in MCJIT.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D9667


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238424 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 13:48:41 +00:00
Yury Gribov
08e5ec43f4 [ASan] New approach to dynamic allocas unpoisoning. Patch by Max Ostapenko!
Differential Revision: http://reviews.llvm.org/D7098


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238402 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 07:51:49 +00:00
David Majnemer
48e2671cb6 [Reassociate] Canonicalizing 'x [+-] (-Constant * y)' isn't always a win
Canonicalizing 'x [+-] (-Constant * y)' is not a win if we don't *know*
we will open up CSE opportunities.

If the multiply was 'nsw', then negating 'y' requires us to clear the
'nsw' flag.  If this is actually worth pursuing, it is probably more
appropriate to do so in GVN or EarlyCSE.

This fixes PR23675.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238397 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 06:16:39 +00:00
Jingyue Wu
4977e92629 [NaryReassociate] Run EarlyCSE after NaryReassociate
Summary:
This patch made two improvements to NaryReassociate and the NVPTX pipeline

1. Run EarlyCSE/GVN after NaryReassociate to get rid of redundant common
expressions.

2. When adding an instruction to SeenExprs, maps both the SCEV before and after
reassociation to that instruction.

Test Plan: updated @reassociate_gep_nsw in nary-gep.ll

Reviewers: meheff, broune

Reviewed By: broune

Subscribers: dberlin, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9947

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238396 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 04:56:52 +00:00
Chandler Carruth
9dacaea1a1 [x86] Refactor the tests for popcnt.
Extracted from the D6531 patch by Bruno Cardoso Lopes, and re-generated
to reflect the current state of the world. This should let Bruno's D6531
actually show the delta between the approaches by running the x86 test
case update script after re-building.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238391 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-28 02:40:15 +00:00
Lang Hames
8cc212a07e [RuntimeDyld] Fix MachO i386 SECTDIFF relocation to support non-zero addends.
Previously, relocations of the form 'A - B + C' would fail on i386 when C was
non-zero.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238356 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 20:50:01 +00:00
Diego Novillo
b72f3e0d32 Final fix for PR 23499 and IR test case.
This fixes a bit I forgot in r238335. In addition to the data record and
the counter, we can also move the name of the counter to the comdat for
the associated function.

I'm also adding an IR test case to check that these three elements are
placed in the proper comdat.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238351 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 19:34:01 +00:00
Renato Golin
a052a77187 ARMTargetParser: Normalising build attributes
Now that most of the methods in Clang and LLVM that were parsing arch/cpu/fpu
strings are using ARMTargetParser, it's time to make it a bit more conforming
with what the ABI says.

This commit adds some clarification on what build attributes are accepted and
which are "non-standard". It also makes clear that the "defaultCPU" and
"defaultArch" methods were really just build attribute getters.

It also diverges from GCC's behaviour to say that armv2/armv3 are really an
ARMv4 in the build attributes, when the ABI has a clear state for that: Pre-v4.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238344 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 18:15:37 +00:00
Alex Lorenz
61aecc8c23 Resubmit r237954 (MIR Serialization: print and parse LLVM IR using MIR format).
This commit a 3rd attempt at comitting the initial MIR serialization patch.
The first commit (r237708) was reverted in 237730. Then the second commit
(r237954) was reverted in r238007, as the MIR library under CodeGen caused
a circular dependency where the CodeGen library depended on MIR and MIR
library depended on CodeGen.

This commit has fixed the dependencies between CodeGen and MIR by
reorganizing the MIR serialization code - the code that prints out
MIR has been moved to CodeGen, and the MIR library has been renamed
to MIRParser. Now the CodeGen library doesn't depend on the
MIRParser library, thus the circular dependency no longer exists.

--Original Commit Message--

MIR Serialization: print and parse LLVM IR using MIR format.

This commit is the initial commit for the MIR serialization project.
It creates a new library under CodeGen called 'MIR'. This new
library adds a new machine function pass that prints out the LLVM IR
using the MIR format. This pass is then added as a last pass when a
'stop-after' option is used in llc. The new library adds the initial
functionality for parsing of MIR files as well. This commit also
extends the llc tool so that it can recognize and parse MIR input files.

Reviewers: Duncan P. N. Exon Smith, Matthias Braun, Philip Reames

Differential Revision: http://reviews.llvm.org/D9616 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238341 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 18:02:19 +00:00
Zoran Jovanovic
dbab00a2d4 [mips][microMIPSr6] Implement SEB and SEH instructions
Differential Revision: http://reviews.llvm.org/D9739


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238333 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 15:39:47 +00:00
Jozef Kolek
b8124ac882 [mips][microMIPSr6] Implement BEQZALC, BGEZALC, BGTZALC, BLEZALC, BLTZALC and BNEZALC instructions
This patch implements microMIPS32r6 BEQZALC, BGEZALC, BGTZALC, BLEZALC, BLTZALC
and BNEZALC instructions using mapping.

Differential Revision: http://reviews.llvm.org/D10031


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238325 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 14:19:22 +00:00
Elena Demikhovsky
d56dcc4243 AVX-512: Fixed a bug in extracting subvector from v64i1
By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238322 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 14:09:33 +00:00
Daniel Sanders
3a9cbffdcb Revert r238190 and r238197: [mips] Make TTypeEncoding indirect to allow .eh_frame to be read-only.
This broke the llvm-mips-linux builder and several of our out-of-tree builders.
Initial investigations show that the commit probably isn't the problem but
reverting anyway while I investigate.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238302 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 08:44:01 +00:00
Elena Demikhovsky
078088b790 AVX-512: Implemented all forms of sign-extend and zero-extend instructions for KNL and SKX
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

By Igor Breger (igor.breger@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238301 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 08:15:19 +00:00
Quentin Colombet
60c91c28e4 [X86] Implement the support for shrink-wrapping.
With this patch the x86 backend is now shrink-wrapping capable
and this functionality can be tested by using the
-enable-shrink-wrap switch.

The next step is to make more test and enable shrink-wrapping by
default for x86.

Related to <rdar://problem/20821487>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238293 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 06:28:41 +00:00
Chandler Carruth
34afa06cf2 [inliner] Fix the early-exit of the inline cost analysis to correctly
model the dense vector instruction bonuses.

Previously, this code really didn't effectively compute the density of
inlined vector instructions and apply the intended inliner bonus. It
would try to compute it repeatedly while analyzing the function and
didn't handle the case where future vector instructions would tip the
scales back towards the bonus.

Instead, speculatively apply all possible bonuses to the threshold
initially. Once we *know* that a certain bonus can not be applied,
subtract it. This should delay early bailout enough to get much more
consistent results without actually causing us to analyze huge swaths of
code. I expect some (hopefully mild) compile time hit here, and some
swings in performance, but this was definitely the intended behavior of
these bonuses.

This also dramatically simplifies the computation of the bonuses to not
interact with each other in confusing ways. The previous code didn't do
a good job of this and the values for bonuses may be surprising but are
at least now clearly written in the code.

Finally, fix code to be in line with comments and use zero as the
bailout condition.

Patch by Easwaran Raman, with some comment tweaks by me to try and
further clarify what is going on with this code.

http://reviews.llvm.org/D8267

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238276 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 02:49:05 +00:00
Filipe Cabecinhas
1c0b496636 [BitcodeReader] Change assert to report_fatal_error
It can be triggered by user input.

Bug found with AFL fuzz.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238272 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-27 01:05:40 +00:00