This change adds XCoreMCInstLower to do the lowering to MCInst and
XCoreInstPrinter to print the MCInsts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170288 91177308-0d34-0410-b5e6-96231b3b80d8
Mips16 is really a processor decoding mode (ala thumb 1) and in the same
program, mips16 and mips32 functions can exist and can call each other.
If a jal type instruction encounters an address with the lower bit set, then
the processor switches to mips16 mode (if it is not already in it). If the
lower bit is not set, then it switches to mips32 mode.
The linker knows which functions are mips16 and which are mips32.
When relocation is performed on code labels, this lower order bit is
set if the code label is a mips16 code label.
In general this works just fine, however when creating exception handling
tables and dwarf, there are cases where you don't want this lower order
bit added in.
This has been traditionally distinguished in gas assembly source by using a
different syntax for the label.
lab1: ; this will cause the lower order bit to be added
lab2=. ; this will not cause the lower order bit to be added
In some cases, it does not matter because in dwarf and debug tables
the difference of two labels is used and in that case the lower order
bits subtract each other out.
To fix this, I have added to mcstreamer the notion of a debuglabel.
The default is for label and debug label to be the same. So calling
EmitLabel and EmitDebugLabel produce the same result.
For various reasons, there is only one set of labels that needs to be
modified for the mips exceptions to work. These are the "$eh_func_beginXXX"
labels.
Mips overrides the debug label suffix from ":" to "=." .
This initial patch fixes exceptions. More changes most likely
will be needed to DwarfCFException to make all of this work
for actual debugging. These changes will be to emit debug labels in some
places where a simple label is emitted now.
Some historical discussion on this from gcc can be found at:
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00623.htmlhttp://gcc.gnu.org/ml/gcc-patches/2008-11/msg01273.html
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170279 91177308-0d34-0410-b5e6-96231b3b80d8
We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases
if y is a constant. DAGCombiner canonicalizes those so we first have to undo the
canonicalization for those cases. The pattern occurs in gzip when the loop
vectorizer is enabled. Part of PR14613.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170273 91177308-0d34-0410-b5e6-96231b3b80d8
Not all chips targeted by x86_64 have this feature, but a dramatically
increasing number do. Specifying a chip-specific tuning parameter will
continue to turn the feature on or off as appropriate for that
particular chip, but the generic flag should try to achieve the best
performance on the most widely available hardware. Today, the number of
chips with fast UA access dwarfs those without in the x86-64 space.
Note that this also brings LLVM's code generation for this '-march' flag
more in line with that of modern GCCs. Reviewed by Dan Gohman.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170269 91177308-0d34-0410-b5e6-96231b3b80d8
In this case, essentially it is soft float with different library routines.
The next step will be to make this fully interoperational with mips32 floating
point and that requires creating stubs for functions with signatures that
contain floating point types.
I have a more sophisticated design for mips16 hardfloat which I hope to
implement at a later time that directly does floating point without the need
for function calls.
The mips16 encoding has no floating point instructions so one needs to
switch to mips32 mode to execute floating point instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170259 91177308-0d34-0410-b5e6-96231b3b80d8
immediate generates the narrow version. Needed when doing round-trip
assemble/disassemble testing using the alternate syntax that specifies
'pc' directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170255 91177308-0d34-0410-b5e6-96231b3b80d8
for TLS dynamic models on 64-bit PowerPC ELF. The default sort routine
for relocations only sorts on the r_offset field; but with TLS, there
can be two relocations with the same r_offset. For PowerPC, this patch
sorts secondarily on descending r_type, which matches the behavior
expected by the linker.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170237 91177308-0d34-0410-b5e6-96231b3b80d8
for a wider range of GOT entries that can hold thread-relative offsets.
This matches the behavior of GCC, which was not documented in the PPC64 TLS
ABI. The ABI will be updated with the new code sequence.
Former sequence:
ld 9,x@got@tprel(2)
add 9,9,x@tls
New sequence:
addis 9,2,x@got@tprel@ha
ld 9,x@got@tprel@l(9)
add 9,9,x@tls
Note that a linker optimization exists to transform the new sequence into
the shorter sequence when appropriate, by replacing the addis with a nop
and modifying the base register and relocation type of the ld.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170209 91177308-0d34-0410-b5e6-96231b3b80d8
This change moves the code for default shadow propagaition (handleShadowOr)
and origin tracking (setOriginForNaryOp) into a new builder-like class. Also
gets rid of handleShadowOrBinary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170192 91177308-0d34-0410-b5e6-96231b3b80d8
The new API is higher level than just manipulating the bundle flags
directly, and the setIsInsideBundle() function will disappear soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170159 91177308-0d34-0410-b5e6-96231b3b80d8
some hackery in place that hid my poor use of TblGen, which I've now sorted
out and cleaned up. No change in observable behavior, so no new test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170149 91177308-0d34-0410-b5e6-96231b3b80d8
This assumes (1 << n) is always not zero. Consider n is greater than word size.
Although I know it is undefined, this transforms undefined behavior hidden.
This led clang unexpected behavior with some failures. I will investigate to fix undefined shl in clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170128 91177308-0d34-0410-b5e6-96231b3b80d8
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.
This is the first, in a series of patches.
This is the second attempt. In the first attempt (r169837), a few
getSimpleVT() were hoisted too far, detected by bootstrap failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170104 91177308-0d34-0410-b5e6-96231b3b80d8
On MachO, sections also have segment names. When a tool looking at a .o file
prints a segment name, this is what they mean. In reality, a .o has only one,
anonymous, segment.
This patch adds a MachO only function to fetch that segment name. I named it
getSectionFinalSegmentName since the main use for the name seems to be informing
the linker with segment this section should go to.
The patch also changes MachOObjectFile::getSectionName to return just the
section name instead of computing SegmentName,SectionName.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170095 91177308-0d34-0410-b5e6-96231b3b80d8
In a previous thread it was pointed out that isPowerOfTwo is not a very precise
name since it can return false for powers of two if it is unable to show that
they are powers of two.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170093 91177308-0d34-0410-b5e6-96231b3b80d8
Provides m_Argument that allows matching against a CallSite's specified argument. Provides m_Intrinsic pattern that can be templatized over the intrinsic id and bind/match arguments similarly to other pattern matchers. Implementations provided for 0 to 4 arguments, though it's very simple to extend for more. Also provides example template specialization for bswap (m_BSwap) and example of code cleanup for its use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170091 91177308-0d34-0410-b5e6-96231b3b80d8
Better controls the inlining of functions when the caller function has MinSize attribute.
Basically, when the caller function has this attribute, we do not "force" the inlining
of callee functions carrying the InlineHint attribute (i.e., functions defined with
inline keyword)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170065 91177308-0d34-0410-b5e6-96231b3b80d8
FFR1_W_M and FFR1P_M. The new instruction definitions have one-to-one
correspondence with the instructions in the ISA manual.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170053 91177308-0d34-0410-b5e6-96231b3b80d8
should only occur on invalid input. Instruction matching errors aren't
unexpected, so we can't rely on the AsmParsers HadError variable directly.
rdar://12840278
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170037 91177308-0d34-0410-b5e6-96231b3b80d8
load / store pair. It's not legal to use a wider load than the size of
the remaining bytes if it's the first pair of load / store.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170018 91177308-0d34-0410-b5e6-96231b3b80d8
PowerPC target. This is the last of the four models, so we now have
full TLS support.
This is mostly a straightforward extension of the general dynamic model.
I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the
register copy following ADDI_TLSLD_L; otherwise everything above the
ADDIS_DTPREL_HA appeared dead and was removed.
As before, there are new test cases to test the assembly generation, and
the relocations output during integrated assembly. The expected code
gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll.
There are a couple of things I think can be done more efficiently in the
overall TLS code, so there will likely be a clean-up patch forthcoming;
but for now I want to be sure the functionality is in place.
Bill
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170003 91177308-0d34-0410-b5e6-96231b3b80d8
been used in the first place. It simply was passed to the function and to the
recursive invocations. Simply drop the parameter and update the callers for the
new signature.
Patch by Saleem Abdulrasool!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169988 91177308-0d34-0410-b5e6-96231b3b80d8
When ASan replaces <alloca instruction> with
<offset into a common large alloca>, it should also patch
llvm.dbg.declare calls and replace debug info descriptors to mark
that we've replaced alloca with a value that stores an address
of the user variable, not the user variable itself.
See PR11818 for more context.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169984 91177308-0d34-0410-b5e6-96231b3b80d8
Add R_ARM_NONE and R_ARM_PREL31 relocation types
to MCExpr. Both of them will be used while
generating .ARM.extab and .ARM.exidx sections.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169965 91177308-0d34-0410-b5e6-96231b3b80d8
mention the inline memcpy / memset expansion code is a mess?
This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset.
The first indicates whether it is expanding a memset or a memcpy / memmove.
The later is whether the memset is a memset of zero. It's totally possible
(likely even) that targets may want to do different things for memcpy and
memset of zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169959 91177308-0d34-0410-b5e6-96231b3b80d8
Also added more comments to explain why it is generally ok to return true.
- Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to
be true for loaded source (memcpy) or zero constants (memset). The poor name
choice is probably some kind of legacy issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169954 91177308-0d34-0410-b5e6-96231b3b80d8
fsub X, +0 ==> X
fsub X, -0 ==> X, when we know X is not -0
fsub +/-0.0, (fsub -0.0, X) ==> X
fsub nsz +/-0.0, (fsub +/-0.0, X) ==> X
fsub nnan ninf X, X ==> 0.0
fadd nsz X, 0 ==> X
fadd [nnan ninf] X, (fsub [nnan ninf] 0, X) ==> 0
where nnan and ninf have to occur at least once somewhere in this expression
fmul X, 1.0 ==> X
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169940 91177308-0d34-0410-b5e6-96231b3b80d8
Pre-regalloc frame allocation and referencing has been on by default
for ages. No need for the testing option that disables it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169931 91177308-0d34-0410-b5e6-96231b3b80d8
ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined
term for something like integer immediate materialization. It is always possible
to materialize an integer immediate. Whether to use it for memcpy expansion is
more a "cost" conceern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169929 91177308-0d34-0410-b5e6-96231b3b80d8
Given a thread-local symbol x with global-dynamic access, the generated
code to obtain x's address is:
Instruction Relocation Symbol
addis ra,r2,x@got@tlsgd@ha R_PPC64_GOT_TLSGD16_HA x
addi r3,ra,x@got@tlsgd@l R_PPC64_GOT_TLSGD16_L x
bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x
R_PPC64_REL24 __tls_get_addr
nop
<use address in r3>
The implementation borrows from the medium code model work for introducing
special forms of ADDIS and ADDI into the DAG representation. This is made
slightly more complicated by having to introduce a call to the external
function __tls_get_addr. Using the full call machinery is overkill and,
more importantly, makes it difficult to add a special relocation. So I've
introduced another opcode GET_TLS_ADDR to represent the function call, and
surrounded it with register copies to set up the parameter and return value.
Most of the code is pretty straightforward. I ran into one peculiarity
when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like
BL8_NOP_ELF except that it takes another parameter to represent the symbol
("x" above) that requires a relocation on the call. Something in the
TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated
identically during the emit phase, so this second operand was never
visited to generate relocations. This is the reason for the slightly
messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding().
Two new tests are included to demonstrate correct external assembly and
correct generation of relocations using the integrated assembler.
Comments welcome!
Thanks,
Bill
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169910 91177308-0d34-0410-b5e6-96231b3b80d8
because that method is only getting called for MCInstFragment. These
fragments aren't even generated when RelaxAll is set, which is why the
flag reference here is superfluous. Removing it simplifies the code
with no harmful effects.
An assertion is added higher up to make sure this path is never
reached.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169886 91177308-0d34-0410-b5e6-96231b3b80d8
Use explicitely aligned store and load instructions to deal with argument and
retval shadow. This matters when an argument's alignment is higher than
__msan_param_tls alignment (which is the case with __m128i).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169859 91177308-0d34-0410-b5e6-96231b3b80d8
instead of the instruction. I've left a forwarding wrapper for the
instruction so users with the instruction don't need to create
a GEPOperator themselves.
This lets us remove the copy of this code in instsimplify.
I've looked at most of the other copies of similar code, and this is the
only one I've found that is actually exactly the same. The one in
InlineCost is very close, but it requires re-mapping non-constant
indices through the cost analysis value simplification map. I could add
direct support for this to the generic routine, but it seems overly
specific.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169853 91177308-0d34-0410-b5e6-96231b3b80d8
the GEP instruction class.
This is part of the continued refactoring and cleaning of the
infrastructure used by SROA. This particular operation is also done in
a few other places which I'll try to refactor to share this
implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169852 91177308-0d34-0410-b5e6-96231b3b80d8
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.
This is the first, in a series of patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169837 91177308-0d34-0410-b5e6-96231b3b80d8
try to reduce the width of this load, and would end up transforming:
(truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
to
(truncate (zextload i32 <ptr+4> as i64) to i32)
We lost the sext attached to the load while building the narrower i32
load, and replaced it with a zext because lshr always zext's the
results. Instead, bail out of this combine when there is a conflict
between a sextload and a zext narrowing. The rest of the DAG combiner
still optimize the code down to the proper single instruction:
movswl 6(...),%eax
Which is exactly what we wanted. Previously we read past the end *and*
missed the sign extension:
movl 6(...), %eax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169802 91177308-0d34-0410-b5e6-96231b3b80d8
This shouldn't affect codegen for -O0 compiles as tail call markers are not
emitted in unoptimized compiles. Testing with the external/internal nightly
test suite reveals no change in compile time performance. Testing with -O1,
-O2 and -O3 with fast-isel enabled did not cause any compile-time or
execution-time failures. All tests were performed on my x86 machine.
I'll monitor our arm testers to ensure no regressions occur there.
In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue
and objc_retainAutoreleaseReturnValue as tail calls unconditionally. While
it's theoretically true that this is just an optimization, it's an
optimization that we very much want to happen even at -O0, or else ARC
applications become substantially harder to debug.
Part of rdar://12553082
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169796 91177308-0d34-0410-b5e6-96231b3b80d8
controls each of the abbreviation sets (only a single one at the
moment) and computes offsets separately as well for each set
of DIEs.
No real function change, ordering of abbreviations for the skeleton
CU changed but only because we're computing in a separate order. Fix
the testcase not to care.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169793 91177308-0d34-0410-b5e6-96231b3b80d8
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
if it's not possible to materialize an integer immediate with a single
instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
Also increase the threshold to something reasonable (8 for memset, 4 pairs
for memcpy).
This significantly improves Dhrystone, up to 50% on ARM iOS devices.
rdar://12760078
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8
Analyse Phis under the starting assumption that they are NoAlias. Recursively
look at their inputs.
If they MayAlias/MustAlias there must be an input that makes them so.
Addresses bug 14351.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169788 91177308-0d34-0410-b5e6-96231b3b80d8
InitSections is called before the MCContext is initialized it could cause
duplicate temporary symbols to be emitted later (after context initialization
resets the temporary label counter).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169785 91177308-0d34-0410-b5e6-96231b3b80d8
The `-mno-red-zone' flag wasn't being propagated to the functions that code
coverage generates. This allowed some of them to use the red zone when that
wasn't allowed.
<rdar://problem/12843084>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169754 91177308-0d34-0410-b5e6-96231b3b80d8
the assembler. This is useful in order to know how the numbers add up,
since in particular the Align fragments account for a non-trivial
portion of the emitted fragments (especially on -O0 which sets
relax-all).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169747 91177308-0d34-0410-b5e6-96231b3b80d8
misched used GetUnderlyingObject in order to break false load/store
dependencies, and the -enable-aa-sched-mi feature similarly relied on
GetUnderlyingObject in order to ensure it is safe to use the aliasing analysis.
Unfortunately, GetUnderlyingObject does not recurse through phi nodes, and so
(especially due to LSR) all of these mechanisms failed for
induction-variable-dependent loads and stores inside loops.
This change replaces uses of GetUnderlyingObject with GetUnderlyingObjects
(which will recurse through phi and select instructions) in misched.
Andy reviewed, tested and simplified this patch; Thanks!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169744 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Not all chips targeted by x86_64 have this feature, but a dramatically
increasing number do. Specifying a chip-specific tuning parameter will
continue to turn the feature on or off as appropriate for that
particular chip, but the generic flag should try to achieve the best
performance on the most widely available hardware. Today, the number of
chips with fast UA access dwarfs those without in the x86-64 space.
Note that this also brings LLVM's code generation for this '-march' flag
more in line with that of modern GCCs.
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D195
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169740 91177308-0d34-0410-b5e6-96231b3b80d8
Intel chips.
The model number rules were determined by inspecting Intel's
documentation for their newer chip model numbers. My understanding is
that all of the newer Intel chips have fast unaligned memory access, but
if anyone is concerned about a particular chip, just shout.
No tests updated; it's not clear we have dedicated tests for the chips'
various features, but if anyone would like tests (or can point me at
some existing ones), I'm happy to oblige.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169730 91177308-0d34-0410-b5e6-96231b3b80d8
This visitor provides infrastructure for recursively traversing the
use-graph of a pointer-producing instruction like an alloca or a malloc.
It maintains a worklist of uses to visit, so it can handle very deep
recursions. It automatically looks through instructions which simply
translate one pointer to another (bitcasts and GEPs). It tracks the
offset relative to the original pointer as long as that offset remains
constant and exposes it during the visit as an APInt offset. Finally, it
performs conservative escape analysis.
However, currently it has some limitations that should be addressed
going forward:
1) It doesn't handle vectors of pointers.
2) It doesn't provide a cheaper visitor when the constant offset
tracking isn't needed.
3) It doesn't support non-instruction pointer values.
The current functionality is exactly what is required to implement the
SROA pointer-use visitors in terms of this one, rather than in terms of
their own ad-hoc base visitor, which was always very poorly specified.
SROA has been converted to use this, and the code there deleted which
this utility now provides.
Technically speaking, using this new visitor allows SROA to handle a few
more cases than it previously did. It is now more aggressive in ignoring
chains of instructions which look like they would defeat SROA, but in
fact do not because they never result in a read or write of memory.
While this is "neat", it shouldn't be interesting for real programs as
any such chains should have been removed by others passes long before we
get to SROA. As a consequence, I've not added any tests for these
features -- it shouldn't be part of SROA's contract to perform such
heroics.
The goal is to extend the functionality of this visitor going forward,
and re-use it from passes like ASan that can benefit from doing
a detailed walk of the uses of a pointer.
Thanks to Ben Kramer for the code review rounds and lots of help
reviewing and debugging this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169728 91177308-0d34-0410-b5e6-96231b3b80d8
When SROA was evaluating a mixture of i1 and i8 loads and stores, in
just a particular case, it would tickle a latent bug where we compared
bits to bytes rather than bits to bits. As a consequence of the latent
bug, we would allow integers through which were not byte-size multiples,
a situation the later rewriting code was never intended to handle.
In release builds this could trigger all manner of oddities, but the
reported issue in PR14548 was forming invalid bitcast instructions.
The only downside of this fix is that it makes it more clear that SROA
in its current form is not capable of handling mixed i1 and i8 loads and
stores. Sometimes with the previous code this would work by luck, but
usually it would crash, so I'm not terribly worried. I'll watch the LNT
numbers just to be sure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169719 91177308-0d34-0410-b5e6-96231b3b80d8
- added function to VectorTargetTransformInfo to query cost of intrinsics
- vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.
Reviewed by: Nadav
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169711 91177308-0d34-0410-b5e6-96231b3b80d8
This will more closely match the behavior of the new PtrUseVisitor that
I am adding. Hopefully this will not change the actual behavior in any
way, but by making the processing order more similar help in debugging.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169697 91177308-0d34-0410-b5e6-96231b3b80d8
There are still bugs in this pass, as well as other issues that are
being worked on, but the bugs are crashers that occur pretty easily in
the wild. Test cases have been sent to the original commit's review
thread.
This reverts the commits:
r169671: Fix a logic error.
r169604: Move the popcnt tests to an X86 subdirectory.
r168931: Initial commit adding the pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169683 91177308-0d34-0410-b5e6-96231b3b80d8
It was a nasty oversight that we didn't include this when we added this
API in the first place. Blech.
rdar://12839439
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169653 91177308-0d34-0410-b5e6-96231b3b80d8
SmallString. This makes it possible to use the length-erased SmallVectorImpl
in the interface without imposing buffer size. Thus, the size of MCInstFragment
is back down since a preallocated 8-byte contents buffer is enough.
It would be generally a good idea to rid all the fragments of SmallString as
contents, because a vector just makes more sense.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169644 91177308-0d34-0410-b5e6-96231b3b80d8
the VSRI instruction before it since it does not affect the MSB.
Thanks Craig Topper for suggesting this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169638 91177308-0d34-0410-b5e6-96231b3b80d8
Before this patch, when you objdump an LLVM-compiled file, objdump tried to
decode data-in-code sections as if they were code. This patch adds the missing
Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D).
Patch based on work by Greg Fitzgerald.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169609 91177308-0d34-0410-b5e6-96231b3b80d8
MSan uses a TLS slot to pass shadow for function arguments and return values.
This makes all instrumented functions not readonly, and at the same time
requires that all callees of an instrumented function that may be
MSan-instrumented do not have readonly attribute (otherwise some of the
instrumentation may be optimized out).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169591 91177308-0d34-0410-b5e6-96231b3b80d8
This is still a work in progress. The purpose is to make bundling and
unbundling operations explicit, and to catch errors where bundles are
broken or created inadvertently.
The old IsInsideBundle flag is replaced by two MI flags: BundledPred
which has the same meaning as IsInsideBundle, and BundledSucc which is
set on instructions that are bundled with a successor. Having two flags
provdes redundancy to detect when a bundle is inadvertently torn by a
splice() or insert(), and it makes it possible to write bundle iterators
that don't need to peek at adjacent instructions.
The new flags can't be manipulated directly (once setIsInsideBundle is
gone). Instead there are MI functions to make and break bundle bonds.
The setIsInsideBundle function will be removed in a future commit. It
should be replaced by bundleWithPred().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169583 91177308-0d34-0410-b5e6-96231b3b80d8