CodeModel: It's now possible to create an MCJIT instance with any CodeModel you like. Previously it was only possible to
create an MCJIT that used CodeModel::JITDefault.
EnableFastISel: It's now possible to turn on the fast instruction selector.
The CodeModel option required some trickery. The problem is that previously, we were ensuring future binary compatibility in
the MCJITCompilerOptions by mandating that the user bzero's the options struct and passes the sizeof() that he saw; the
bindings then bzero the remaining bits. This works great but assumes that the bitwise zero equivalent of any field is a
sensible default value.
But this is not the case for LLVMCodeModel, or its internal equivalent, llvm::CodeModel::Model. In both of those, the default
for a JIT is CodeModel::JITDefault (or LLVMCodeModelJITDefault), which is not bitwise zero.
Hence this change introduces LLVMInitializeMCJITCompilerOptions(), which will initialize the user's options struct with
defaults. The user will use this in the same way that they would have previously used memset() or bzero(). MCJITCAPITest.cpp
illustrates the change, as does the comment in ExecutionEngine.h.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180893 91177308-0d34-0410-b5e6-96231b3b80d8
the things, and renames it to CBindingWrapping.h. I also moved
CBindingWrapping.h into Support/.
This new file just contains the macros for defining different wrap/unwrap
methods.
The calls to those macros, as well as any custom wrap/unwrap definitions
(like for array of Values for example), are put into corresponding C++
headers.
Doing this required some #include surgery, since some .cpp files relied
on the fact that including Wrap.h implicitly caused the inclusion of a
bunch of other things.
This also now means that the C++ headers will include their corresponding
C API headers; for example Value.h must include llvm-c/Core.h. I think
this is harmless, since the C API headers contain just external function
declarations and some C types, so I don't believe there should be any
nasty dependency issues here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180881 91177308-0d34-0410-b5e6-96231b3b80d8
Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180875 91177308-0d34-0410-b5e6-96231b3b80d8
report a fatal error. This allows us to continue processing the translation
unit. Test case to come on the clang side because we need an inline asm
diagnostics handler in place.
rdar://13446483
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180873 91177308-0d34-0410-b5e6-96231b3b80d8
Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector.
rdar://13402653
PR15866
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180871 91177308-0d34-0410-b5e6-96231b3b80d8
The cause of the windows failures was fixed by r180791. Revert to the state
after Sabre's original revert.
Original message:
revert r179735, it has no testcases, and doesn't really make sense.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180844 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r180802
There's ongoing discussion about whether this is the right place to make
this transformation. Reverting for now while we figure it out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180834 91177308-0d34-0410-b5e6-96231b3b80d8
Expand copy instructions between two accumulator registers before callee-saved
scan is done. Handle copies between integer GPR and hi/lo registers in
MipsSEInstrInfo::copyPhysReg. Delete pseudo-copy instructions that are not
needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180827 91177308-0d34-0410-b5e6-96231b3b80d8
register-indirect address with an offset of 0.
It used to be that a DBG_VALUE is a register-indirect value if the offset
(operand 1) is nonzero. The new convention is that a DBG_VALUE is
register-indirect if the first operand is a register and the second
operand is an immediate. For plain registers use the combination reg, reg.
rdar://problem/13658587
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180816 91177308-0d34-0410-b5e6-96231b3b80d8
Always fold a shuffle-of-shuffle into a single shuffle when there's only one
input vector in the first place. Continue to be more conservative when there's
multiple inputs.
rdar://13402653
PR15866
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180802 91177308-0d34-0410-b5e6-96231b3b80d8
First, taking advantage of the fact that the virtual base registers are allocated in order of the local frame offsets, remove the quadratic register-searching behavior. Because of the ordering, we only need to check the last virtual base register created.
Second, store the frame index in the FrameRef structure, and get the frame index and the local offset from this structure at the top of the loop iteration. This allows us to de-nest the loops in insertFrameReferenceRegisters (and I think makes the code cleaner). I also moved the needsFrameBaseReg check into the first loop over instructions so that we don't bother pushing FrameRefs for instructions that don't want a virtual base register anyway.
Lastly, and this is the only functionality change, avoid the creation of single-use virtual base registers. These are currently not useful because, in general, they end up replacing what would be one r+r instruction with an add and a r+i instruction. Committing this removes the XFAIL in CodeGen/PowerPC/2007-09-07-LoadStoreIdxForms.ll
Jim has okayed this off-list.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180799 91177308-0d34-0410-b5e6-96231b3b80d8
The actual storage was already using unsigned, but the interface was using
uint64_t. This is wasteful on 32 bits and looks to be the root causes of
a miscompilation on Windows where a value was being sign extended to 64bits
to compare with the result of getSlotIndex.
Patch by Pasi Parviainen!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180791 91177308-0d34-0410-b5e6-96231b3b80d8
Differences in bitwidth between X and Y could exist even if C1 and C2 have
the same Log2 representation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180779 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes the optimization introduced in r179748 and reverted in r179750.
While the optimization was sound, it did not properly respect differences in
bit-width.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180777 91177308-0d34-0410-b5e6-96231b3b80d8
1. VarArgStyleRegisters: functionality that emits "store" instructions for byval regs moved out into separated method "StoreByValRegs". Before this patch VarArgStyleRegisters had confused use-cases. It was used for both variadic functions and for regular functions with byval parameters. In last case it created new stack-frame and registered it as VarArg frame, that is wrong.
This patch replaces VarArgsStyleRegisters usage for byval parameters with StoreByValRegs method.
2. In ARMMachineFunctionInfo, "get/setVarArgsRegSaveSize" was renamed to "get/setArgRegsSaveSize". By the same reason. Sometimes it was used for variadic functions, and sometimes for byval parameters in regular functions. Actually, this property means the size of registers, that keeps arguments, and thats why it was renamed.
3. In ARMISelLowering.cpp, ARMTargetLowering class, in methods computeRegArea and StoreByValRegs, VARegXXXXXX was renamed to ArgRegsXXXXXX still by the same reasons.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180774 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes 2013-04-04-RelocAddend.ll. We don't have a testcase for non external
relocs with an Addend. I will try to write one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180767 91177308-0d34-0410-b5e6-96231b3b80d8
The `llvm.tls_init_funcs' (created by the front-end) holds pointers to the TLS
initialization functions. These need to be placed into the correct section so
that they are run before `main()'.
<rdar://problem/13733006>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180737 91177308-0d34-0410-b5e6-96231b3b80d8
For regular object files this is only meaningful for common symbols. An object
file format with direct support for atoms should be able to provide alignment
information for all symbols.
This replaces getCommonSymbolAlignment and fixes
test-common-symbols-alignment.ll on darwin. This also includes a fix to
MachOObjectFile::getSymbolFlags. It was marking undefined symbols as common
(already tested by existing mcjit tests now that it is used).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180736 91177308-0d34-0410-b5e6-96231b3b80d8
The implemented RuntimeDyldImpl interface is public. Everything else is private.
Since these classes are not inherited from (yet), there is no need to have
protected members.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180733 91177308-0d34-0410-b5e6-96231b3b80d8
This resurrects r179957, but adds code that makes sure we don't touch
atomic/volatile stores:
This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:
a[i] =
may-alias with a[i] load
if (cond)
a[i] = Y
into an unconditional store.
a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp
We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case where the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.
hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180731 91177308-0d34-0410-b5e6-96231b3b80d8
This un-reverts r179735 and reverts commit r180574.
This fixes assertion failures for me locally and should fix the failures
on Windows reported widely on llvm-dev. We should check if the bots
caught this and if so why not.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180722 91177308-0d34-0410-b5e6-96231b3b80d8
Re-submitting with fix for OCaml dependency problems (removing dependency on SectionMemoryManager when it isn't used).
Patch by Fili Pizlo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180720 91177308-0d34-0410-b5e6-96231b3b80d8
For MachO we need information that is not represented in ObjRelocationInfo.
Instead of copying the bits we think are needed from a relocation_iterator,
just pass the relocation_iterator down to the format specific functions.
No functionality change yet as we still drop the information once
processRelocationRef returns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180711 91177308-0d34-0410-b5e6-96231b3b80d8
Turning retains into retainRV calls disrupts the data flow analysis in
ObjCARCOpts. Thus we move it as late as we can by moving it into
ObjCARCContract.
We leave in the conversion from retainRV -> retain in ObjCARCOpt since
it enables the dataflow analysis.
rdar://10813093
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180698 91177308-0d34-0410-b5e6-96231b3b80d8
When Reassociator optimize "(x | C1)" ^ "(X & C2)", it may swap the two
subexpressions, however, it forgot to swap cached constants (of C1 and C2)
accordingly.
rdar://13739160
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180676 91177308-0d34-0410-b5e6-96231b3b80d8
to determine whether or not we're on a darwin platform for debug code
emitting.
Solves the problem of a module with no triple on the command line
and no triple in the module using non-gdb ok features on darwin. Fix
up the member-pointers test to check the correct things for cross
platform (DW_FORM_flag is a good prefix).
Unfortunately no testcase because I have no ideas how to test something
without a triple and without a triple in the module yet check
precisely on two platforms. Ideas welcome.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180660 91177308-0d34-0410-b5e6-96231b3b80d8
We switch the order of offset and field type to make TBAAStructType node
(name, parent node, offset) similar to scalar TBAA node (name, parent node).
TypeIsImmutable is added to TBAAStructTag node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180654 91177308-0d34-0410-b5e6-96231b3b80d8
Clarify documentation and API to make the difference between register and
register-indirect addressed locations more explicit. Put in a comment
to point out that with the current implementation we cannot specify
a register-indirect location with offset 0 (a breg 0 in DWARF).
No functionality change intended.
rdar://problem/13658587
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180641 91177308-0d34-0410-b5e6-96231b3b80d8
TLVs probably won't be as common as the other types of variables. Check for them
last before defaulting to "DATA".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180631 91177308-0d34-0410-b5e6-96231b3b80d8
For Mach-O there were 2 implementations for parsing object files. A
standalone llvm/Object/MachOObject.h and llvm/Object/MachO.h which
implements the generic interface in llvm/Object/ObjectFile.h.
This patch adds the missing features to MachO.h, moves macho-dump to
use MachO.h and removes ObjectFile.h.
In addition to making sure that check-all is clean, I checked that the
new version produces exactly the same output in all Mach-O files in a
llvm+clang build directory (including executables and shared
libraries).
To test the performance, I ran macho-dump over all the files in a
llvm+clang build directory again, but this time redirecting the output
to /dev/null. Both the old and new versions take about 4.6 seconds
(2.5 user) to finish.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180624 91177308-0d34-0410-b5e6-96231b3b80d8
We need to intialize this to something and since clang does not set
the shader type attribute and clang is used only for compute shaders,
initializing it to COMPUTE seems like the best choice.
Reviewed-by: Christian König <christian.koenig@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180620 91177308-0d34-0410-b5e6-96231b3b80d8
"hint" space for Thumb actually overlaps the encoding space of the CPS
instruction. In actuality, hints can be defined as CPS instructions where imod
and M bits are all nil.
Handle decoding of permitted nop-compatible hints (i.e. nop, yield, wfi, wfe,
sev) in DecodeT2CPSInstruction.
This commit adds a proper diagnostic message for Imm0_4 and updates all tests.
Patch by Mihail Popa <Mihail.Popa@arm.com>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180617 91177308-0d34-0410-b5e6-96231b3b80d8
Since we can't guarantee that the original dbg.declare instrinsic
is removed by LowerDbgDeclare(), we need to make sure that we are
not inserting the same dbg.value intrinsic over and over.
This removes tons of redundant DIEs when compiling optimized code.
rdar://problem/13056109
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180615 91177308-0d34-0410-b5e6-96231b3b80d8
In the default PowerPC assembler syntax, registers are specified simply
by number, so they cannot be distinguished from immediate values (without
looking at the opcode). This means that the default operand matching logic
for the asm parser does not work, and we need to specify custom matchers.
Since those can only be specified with RegisterOperand classes and not
directly on the RegisterClass, all instructions patterns used by the asm
parser need to use a RegisterOperand (instead of a RegisterClass) for
all their register operands.
This patch adds one RegisterOperand for each RegisterClass, using the
same name as the class, just in lower case, and updates all instruction
patterns to use RegisterOperand instead of RegisterClass operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180611 91177308-0d34-0410-b5e6-96231b3b80d8
When testing the asm parser, I noticed wrong encodings for the
above instructions (wrong sub-opcodes).
Tests will be added together with the asm parser.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180608 91177308-0d34-0410-b5e6-96231b3b80d8
When testing the asm parser, I noticed wrong encodings for the
above instructions (wrong sub-opcodes). Note that apparently
the compiler currently never generates pre-inc instructions
for floating point types for some reason ...
Tests will be added together with the asm parser.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180607 91177308-0d34-0410-b5e6-96231b3b80d8
When testing the asm parser, I noticed wrong encodings for the
above instructions (wrong operand name in rldimi, wrong form
and sub-opcode for rldcl).
Tests will be added together with the asm parser.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180606 91177308-0d34-0410-b5e6-96231b3b80d8
When testing the asm parser, I ran into an error when using a conditional
branch to an external symbol (this doesn't occur in compiler-generated
code) due to missing support in PPCELFObjectWriter::getRelocTypeInner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180605 91177308-0d34-0410-b5e6-96231b3b80d8
This already helps SSE2 x86 a lot because it lacks an efficient way to
represent a vector select. The long term goal is to enable the backend to match
a canonicalized pattern into a single instruction (e.g. vabs or pabs).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180597 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r180222.
I think this might tie in with a different problem which will require a
different approach potentially. I am reverting this in the case I need to go
down that second path.
My apologies for the noise. = /.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180590 91177308-0d34-0410-b5e6-96231b3b80d8
Mips have delayslots for certain instructions
like jumps and branches. These are instructions
that follow the branch or jump and are executed
before the jump or branch is completed.
Early Mips compilers could not cope with delayslots
and left them up to the assembler. The assembler would
fill the delayslots with the appropriate instruction,
usually just a nop to allow correct runtime behavior.
The default behavior for this is set with .set reorder.
To tell the assembler that you don't want it to mess with
the delayslot one used .set noreorder.
For backwards compatibility we need to support
.set reorder and have it be the default behavior in the
assembler.
Our support for it is to insert a NOP directly after an
instruction with a delayslot when in .set reorder mode.
Contributer: Vladimir Medic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180584 91177308-0d34-0410-b5e6-96231b3b80d8
latency for certain models of the Intel Atom family, by converting
instructions into their equivalent LEA instructions, when it is both
useful and possible to do so.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180573 91177308-0d34-0410-b5e6-96231b3b80d8
Since the relocation iterator walks only the relocations in one section, we
can just use a pointer and avoid fetching information about the section at
every reference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180262 91177308-0d34-0410-b5e6-96231b3b80d8
getRelocationAddress is for dynamic libraries and executables,
getRelocationOffset for relocatable objects.
Mark the getRelocationAddress of COFF and MachO as not implemented yet. Add a
test of ELF's. llvm-readobj -r now prints the same values as readelf -r.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180259 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes PR15838. Need to check for blocks with nothing but dbg.value.
I'm not sure how to force this situation with a unit test. I tried to
reduce the test case in PR15838 (1k lines of metadata) but gave up.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180227 91177308-0d34-0410-b5e6-96231b3b80d8
Due to the semantics of ARC, we must be extremely conservative with autorelease
calls inserted by the frontend since ARC gaurantees that said object will be in
the autorelease pool after that point, an optimization invariant that the
optimizer must respect.
On the other hand, we are allowed significantly more flexibility with
autoreleaseRV instructions.
Often times though this flexibility is disrupted by early transformations which
transform objc_autoreleaseRV => objc_autorelease if said instruction is no
longer being used as part of an RV pair (generally due to inlining). Since we
can not tell the difference in between an autorelease put into place by the
frontend and one created through said ``strength reduction'' we can not perform
these optimizations.
The addition of this set gets around said issues by allowing us to differentiate
in between said two cases.
rdar://problem/13697741.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180222 91177308-0d34-0410-b5e6-96231b3b80d8
While here, don't report a dummy symbol for relocations that don't have symbols.
We used to says such relocations were for the first defined symbol, but now we
return end_symbols(). The llvm-readobj output change agrees with otool.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180214 91177308-0d34-0410-b5e6-96231b3b80d8
This patch disables memory-instruction vectorization for types that need padding
bytes, e.g., x86_fp80 has 10 bytes store size with 6 bytes padding in darwin on
x86_64. Because the load/store vectorization is performed by the bit casting to
a packed vector, which has incompatible memory layout due to the lack of padding
bytes, the present vectorizer produces inconsistent result for memory
instructions of those types.
This patch checks an equality of the AllocSize of a scalar type and allocated
size for each vector element, to ensure that there is no padding bytes and the
array can be read/written using vector operations.
Patch by Daisuke Takahashi!
Fixes PR15758.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180196 91177308-0d34-0410-b5e6-96231b3b80d8
This should bring the ppc bots back. I will try to write a test that would
have found the problem on a little endian system too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180194 91177308-0d34-0410-b5e6-96231b3b80d8
For now, we just reschedule instructions that use the copied vregs and
let regalloc elliminate it. I would really like to eliminate the
copies on-the-fly during scheduling, but we need a complete
implementation of repairIntervalsInRange() first.
The general strategy is for the register coalescer to eliminate as
many global copies as possible and shrink live ranges to be
extended-basic-block local. The coalescer should not have to worry
about resolving local copies (e.g. it shouldn't attemp to reorder
instructions). The scheduler is a much better place to deal with local
interference. The coalescer side of this equation needs work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180193 91177308-0d34-0410-b5e6-96231b3b80d8
When MachineScheduler is enabled, this functionality can be
removed. Until then, provide a way to disable it for test cases and
designing MachineScheduler heuristics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180192 91177308-0d34-0410-b5e6-96231b3b80d8
Since the relocation iterator walks only the relocations in one section, we
can just use a pointer and avoid fetching information about the section at
every reference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180189 91177308-0d34-0410-b5e6-96231b3b80d8
I know what would be cool! We should align the compact unwind section because
aligned data access is faster.
<rdar://problem/13723271>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180171 91177308-0d34-0410-b5e6-96231b3b80d8
debug location. This solves a problem where range of an inlined
subroutine is emitted wrongly.
Patch by Manman Ren.
Fixes rdar://problem/12415623
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180140 91177308-0d34-0410-b5e6-96231b3b80d8
This exposed an issue with PowerPC AltiVec where it appears it was setting the wrong vector boolean contents. The included change
fixes the PowerPC tests, and was OK'd by Hal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180129 91177308-0d34-0410-b5e6-96231b3b80d8
1) Disallow 'returned' on parameter that is also 'sret' (no sensible semantics, as far as I can tell).
2) Conservatively disallow tail calls through 'returned' parameters that also are 'zext' or 'sext' (for consistency with treatment of other zero-extending and sign-extending operations in tail call position detection...can be revised later to handle situations that can be determined to be safe).
This is a new attribute that is not yet used, so there is no impact.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180118 91177308-0d34-0410-b5e6-96231b3b80d8
This makes llvm-dwarfdump and llvm-symbolizer understand
debug info sections compressed by ld.gold linker.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180088 91177308-0d34-0410-b5e6-96231b3b80d8
even if erroneously annotated with the parallel loop metadata.
Fixes Bug 15794:
"Loop Vectorizer: Crashes with the use of llvm.loop.parallel metadata"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180081 91177308-0d34-0410-b5e6-96231b3b80d8
AArch64 always demands a register-scavenger, so the pointer should never be
NULL. However, in the spirit of paranoia, we'll assert it before use just in
case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180080 91177308-0d34-0410-b5e6-96231b3b80d8
The tag is of type TBAANode when flag EnableStructPathTBAA is off.
Move implementation of MDNode::getMostGenericTBAA to TypeBasedAliasAnalysis.cpp
since it depends on how to interprete the MDNodes for scalar TBAA and
struct-path aware TBAA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180068 91177308-0d34-0410-b5e6-96231b3b80d8
now taken care of by the frontend, which allows us to parse arbitrary C/C++
variables.
Part of rdar://13663589
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180037 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is http://llvm.org/PR15802. Backslashes preceding double quotes in
arguments must be escaped. The interesting bit is that all other
backslashes should *not* be escaped, because the un-escaping logic is
only triggered by the presence of a double quote character.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D705
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180035 91177308-0d34-0410-b5e6-96231b3b80d8
Also add a check for llvm.used in the verifier and simplify clients now that
they can assume they have a ConstantArray.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180019 91177308-0d34-0410-b5e6-96231b3b80d8
-- C.4 and C.5 statements, when NSAA is not equal to SP.
-- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a
variadic procedure.
Before this patch "NSAA != 0" means "don't use GPRs anymore ". But there are
some exceptions in AAPCS.
1. For non VA function: allocate all VFP regs for CPRC. When all VFPs are allocated
CPRCs would be sent to stack, while non CPRCs may be still allocated in GRPs.
2. Check that for VA functions all params uses GPRs and then stack.
No exceptions, no CPRCs here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180011 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll
I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even
though the debug info was clearly invalid on all of them, but this ought to fix
it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179996 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than just splitting the input type and hoping for the best, apply
a bit more cleverness. Just splitting the types until the source is
legal often leads to an illegal result time, which is then widened and a
scalarization step is introduced which leads to truly horrible code
generation. With the loop vectorizer, these sorts of operations are much
more common, and so it's worth extra effort to do them well.
Add a legalization hook for the operands of a TRUNCATE node, which will
be encountered after the result type has been legalized, but if the
operand type is still illegal. If simple splitting of both types
ends up with the result type of each half still being legal, just
do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
we can get more clever with power-two vectors. Specifically,
split the input type, but also widen the result element size, then
concatenate the halves and truncate again. For example on ARM,
To perform a "%res = v8i8 trunc v8i32 %in" we transform to:
%inlo = v4i32 extract_subvector %in, 0
%inhi = v4i32 extract_subvector %in, 4
%lo16 = v4i16 trunc v4i32 %inlo
%hi16 = v4i16 trunc v4i32 %inhi
%in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
%res = v8i8 trunc v8i16 %in16
This allows instruction selection to generate three VMOVN instructions
instead of a sequences of moves, stores and loads.
Update the ARMTargetTransformInfo to take this improved legalization
into account.
Consider the simplified IR:
define <16 x i8> @test1(<16 x i32>* %ap) {
%a = load <16 x i32>* %ap
%tmp = trunc <16 x i32> %a to <16 x i8>
ret <16 x i8> %tmp
}
define <8 x i8> @test2(<8 x i32>* %ap) {
%a = load <8 x i32>* %ap
%tmp = trunc <8 x i32> %a to <8 x i8>
ret <8 x i8> %tmp
}
Previously, we would generate the truly hideous:
.syntax unified
.section __TEXT,__text,regular,pure_instructions
.globl _test1
.align 2
_test1: @ @test1
@ BB#0:
push {r7}
mov r7, sp
sub sp, sp, #20
bic sp, sp, #7
add r1, r0, #48
add r2, r0, #32
vld1.64 {d24, d25}, [r0:128]
vld1.64 {d16, d17}, [r1:128]
vld1.64 {d18, d19}, [r2:128]
add r1, r0, #16
vmovn.i32 d22, q8
vld1.64 {d16, d17}, [r1:128]
vmovn.i32 d20, q9
vmovn.i32 d18, q12
vmov.u16 r0, d22[3]
strb r0, [sp, #15]
vmov.u16 r0, d22[2]
strb r0, [sp, #14]
vmov.u16 r0, d22[1]
strb r0, [sp, #13]
vmov.u16 r0, d22[0]
vmovn.i32 d16, q8
strb r0, [sp, #12]
vmov.u16 r0, d20[3]
strb r0, [sp, #11]
vmov.u16 r0, d20[2]
strb r0, [sp, #10]
vmov.u16 r0, d20[1]
strb r0, [sp, #9]
vmov.u16 r0, d20[0]
strb r0, [sp, #8]
vmov.u16 r0, d18[3]
strb r0, [sp, #3]
vmov.u16 r0, d18[2]
strb r0, [sp, #2]
vmov.u16 r0, d18[1]
strb r0, [sp, #1]
vmov.u16 r0, d18[0]
strb r0, [sp]
vmov.u16 r0, d16[3]
strb r0, [sp, #7]
vmov.u16 r0, d16[2]
strb r0, [sp, #6]
vmov.u16 r0, d16[1]
strb r0, [sp, #5]
vmov.u16 r0, d16[0]
strb r0, [sp, #4]
vldmia sp, {d16, d17}
vmov r0, r1, d16
vmov r2, r3, d17
mov sp, r7
pop {r7}
bx lr
.globl _test2
.align 2
_test2: @ @test2
@ BB#0:
push {r7}
mov r7, sp
sub sp, sp, #12
bic sp, sp, #7
vld1.64 {d16, d17}, [r0:128]
add r0, r0, #16
vld1.64 {d20, d21}, [r0:128]
vmovn.i32 d18, q8
vmov.u16 r0, d18[3]
vmovn.i32 d16, q10
strb r0, [sp, #3]
vmov.u16 r0, d18[2]
strb r0, [sp, #2]
vmov.u16 r0, d18[1]
strb r0, [sp, #1]
vmov.u16 r0, d18[0]
strb r0, [sp]
vmov.u16 r0, d16[3]
strb r0, [sp, #7]
vmov.u16 r0, d16[2]
strb r0, [sp, #6]
vmov.u16 r0, d16[1]
strb r0, [sp, #5]
vmov.u16 r0, d16[0]
strb r0, [sp, #4]
ldm sp, {r0, r1}
mov sp, r7
pop {r7}
bx lr
Now, however, we generate the much more straightforward:
.syntax unified
.section __TEXT,__text,regular,pure_instructions
.globl _test1
.align 2
_test1: @ @test1
@ BB#0:
add r1, r0, #48
add r2, r0, #32
vld1.64 {d20, d21}, [r0:128]
vld1.64 {d16, d17}, [r1:128]
add r1, r0, #16
vld1.64 {d18, d19}, [r2:128]
vld1.64 {d22, d23}, [r1:128]
vmovn.i32 d17, q8
vmovn.i32 d16, q9
vmovn.i32 d18, q10
vmovn.i32 d19, q11
vmovn.i16 d17, q8
vmovn.i16 d16, q9
vmov r0, r1, d16
vmov r2, r3, d17
bx lr
.globl _test2
.align 2
_test2: @ @test2
@ BB#0:
vld1.64 {d16, d17}, [r0:128]
add r0, r0, #16
vld1.64 {d18, d19}, [r0:128]
vmovn.i32 d16, q8
vmovn.i32 d17, q9
vmovn.i16 d16, q8
vmov r0, r1, d16
bx lr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179989 91177308-0d34-0410-b5e6-96231b3b80d8
This is an edge case that can happen if we modify a chain of multiple selects.
Update all operands in that case and remove the assert. PR15805.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179982 91177308-0d34-0410-b5e6-96231b3b80d8
There is the temptation to make this tranform dependent on target information as
it is not going to be beneficial on all (sub)targets. Therefore, we should
probably do this in MI Early-Ifconversion.
This reverts commit r179957. Original commit message:
"SimplifyCFG: If convert single conditional stores
This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:
a[i] =
may-alias with a[i] load
if (cond)
a[i] = Y
into an unconditional store.
a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp
We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case were the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.
hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.
I am going to watch performance numbers across the builtbots and will revert
this if anything unexpected comes up."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179980 91177308-0d34-0410-b5e6-96231b3b80d8