e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
A update of langref will occur in a subsequent checkin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99928 91177308-0d34-0410-b5e6-96231b3b80d8
Rewrite the pmulld patterns, and make sure that they fold in loads of
arguments into the instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99910 91177308-0d34-0410-b5e6-96231b3b80d8
create symbols. It is extremely error prone and a source of a lot
of the remaining integrated assembler bugs on x86-64.
This fixes rdar://7807601.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99902 91177308-0d34-0410-b5e6-96231b3b80d8
passing the command-line parameter "-stats" and to print the resulting
statistics without calling llvm_shutdown().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99893 91177308-0d34-0410-b5e6-96231b3b80d8
on all objects it has allocated, if they are all of the same size and alignment.
Use this to destruct all VNInfos allocated in LiveIntervalAnalysis (PR6653).
valnos is not reliable for this purpose, as seen in r99400
(which still leaked, and sometimes caused double frees).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99881 91177308-0d34-0410-b5e6-96231b3b80d8
instead of just a count of them, and refactor the guts of
report printing out of removeTimer into its own method.
Refactor addTimerToPrint away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99872 91177308-0d34-0410-b5e6-96231b3b80d8
transform. I.e., if a clean-up eh.selector call dominates the invoke of an
_Unwind_Resume_or_Rethrow, then we convert the eh.selector into a
catch-all. This patch, however, uses the DominatorTree information, and doesn't
go through the whole rigmarole of starting at the eh.exception call, finding the
corresponding URoR and eh.selector calls, and trying to trace through any number
of instruction types to get to them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99846 91177308-0d34-0410-b5e6-96231b3b80d8
isn't used by anyone and is better exposed as a non-per-timer
thing. Also, stop including System/Mutex.h in Timer.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99841 91177308-0d34-0410-b5e6-96231b3b80d8
eliminate the per-timer lock (timers should be
externally locked if needed), the info-output-stream
can never be dbgs(), so drop the check. Make some
stuff private.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99839 91177308-0d34-0410-b5e6-96231b3b80d8
makes calls a little bit more consistent and allows easy removal of the
specializations in the future. Convert all callers to the templated functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99838 91177308-0d34-0410-b5e6-96231b3b80d8
Most of these were unused, some of them were wrong and unused (isS16Constant<short>,
isS10Constant<short>).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99827 91177308-0d34-0410-b5e6-96231b3b80d8
"the bigstack patch for SPU, with testcase. It is essentially the patch committed as 97091, and reverted as 97099, but with the following additions:
-in vararg handling, registers are marked to be live, to not confuse the register scavenger
-function prologue and epilogue are not emitted, if the stack size is 16. 16 means it is empty - there is only the register scavenger emergency spill slot, which is not used as there is no stack."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99819 91177308-0d34-0410-b5e6-96231b3b80d8
This is same as r99772 (which was reverted) with just one meaningful difference where two source lines exchanged their positions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99816 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions use byte index in a control vector (M:Vm) to lookup byte
values in a table and generate a new vector (D:Vd). The table is specified via
a list of vectors, which can be:
{Dn}
{Dn D<n+1>}
{Dn D<n+1> D<n+2>}
{Dn D<n+1> D<n+2> D<n+3>}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99789 91177308-0d34-0410-b5e6-96231b3b80d8
Otherwise, e.g. in the invocation like clang -DFOO=\"bar\" FOO macro
got the bar value, not "bar".
Patch by Alexander Esilevich!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99763 91177308-0d34-0410-b5e6-96231b3b80d8
and those derived from them. These are obnoxious because
they were written as: PatLeaf<(bitconvert). Not having an
argument was foiling adding better type checking for operand
count matching up with what was required (in this case,
bitconvert always requires an operand!)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99759 91177308-0d34-0410-b5e6-96231b3b80d8
input to be v8i8 or v16i8, which buildvectors get canonicalized to.
This allows the patterns that were previously using a bare 'vnot' to
match, before they couldn't.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99754 91177308-0d34-0410-b5e6-96231b3b80d8
1, 1 cases which are by-far the most frequent. This shrinks the X86
isel table from 77014 -> 74657 bytes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99740 91177308-0d34-0410-b5e6-96231b3b80d8
Type::destroy(), so it got skipped for FunctionTypes, StructTypes, and
UnionTypes. This fixes the resulting leaks in test/Feature/opaquetypes.ll and
test/Integer/opaquetypes_bt.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99732 91177308-0d34-0410-b5e6-96231b3b80d8
pointer. There was also a SmallPtrSet whose settiness wasn't being used, so I
changed it to a SmallVector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99713 91177308-0d34-0410-b5e6-96231b3b80d8
of the previous load - it's usually important. For example, we don't want
to blindly turn an unaligned load into an aligned one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99699 91177308-0d34-0410-b5e6-96231b3b80d8
it as the format for the appropriate N3V*SL*<> classes. These instructions
require special handling of the M:Vm field which encodes the restricted Dm and
the lane index within Dm.
Examples are A8.6.325 VMLA, VMLAL, VMLS, VMLSL (by scalar):
vmlal.s32 q3, d2, d10[0]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99690 91177308-0d34-0410-b5e6-96231b3b80d8
through to the generic version. The generic functions use STR/LDR, but T2
needs the t2STR/t2LDR instead so we get the addressing mode correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99678 91177308-0d34-0410-b5e6-96231b3b80d8
to now take a format argument. N3VDInt<> and N3VQInt<> are modified to take a
format argument as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99676 91177308-0d34-0410-b5e6-96231b3b80d8
'invoke' instruction. You will get a situation like this:
bb:
%ehptr = eh.exception()
%sel = eh.selector(%ehptr, @per, 0);
...
bb2:
invoke _Unwind_Resume_or_Rethrow(%ehptr) %normal unwind to %lpad
lpad:
...
The unwinder will see the %sel call as a clean-up and, if it doesn't have a
catch further up the call stack, it will skip running it. But there *is* another
catch up the stack -- the catch for the %lpad. However, we can't see that. This
is fixed in code-gen, where we detect this situation, and convert the "clean-up"
selector call into a "catch-all" selector call. This gives us the correct
semantics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99671 91177308-0d34-0410-b5e6-96231b3b80d8
to encode the byte location of the extracted result in the concatenation of the
operands, from the least significant end.
Modify VEXTd and VEXTq classes to use the format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99659 91177308-0d34-0410-b5e6-96231b3b80d8
follow the N3RegFrm's operand order of D:Vd N:Vn M:Vm. The operand order of
N3RegVShFrm is D:Vd M:Vm N:Vn (notice that M:Vm is the first src operand).
Add a parent class N3Vf which requires passing a Format argument and which the
N3V class is modified to inherit from. N3V class represents the "normal"
3-Register NEON Instructions with N3RegFrm.
Also add a multiclass N3VSh_QHSD to represent clusters of NEON 3-Register Shift
Instructions and replace 8 invocations with it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99655 91177308-0d34-0410-b5e6-96231b3b80d8
Examples are VABA (Vector Absolute Difference and Accumulate), VABAL (Vector
Absolute Difference and Accumulate Long), and VABD (Vector Absolute Difference).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99628 91177308-0d34-0410-b5e6-96231b3b80d8
dispatch to the appropriate routines to handle the different interpretations of
the shift amount encoded in the imm6 field. The Vd, Vm fields are interpreted
the same between the two, though.
See, for example, A8.6.367 VQSHL, VQSHLU (immediate) for N2RegVShLFrm format and
A8.6.368 VQSHRN, VQSHRUN for N2RegVShRFrm format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99590 91177308-0d34-0410-b5e6-96231b3b80d8
exactly two passes in that case, and don't ever need to recompute any layout,
so this is a nice baseline for relaxation performance.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99563 91177308-0d34-0410-b5e6-96231b3b80d8
- Still O(N^2), just a faster form, and now its the MCAsmLayout's fault.
On the .s I am tuning against (combine.s from 403.gcc):
--
ddunbar@lordcrumb:MC$ diff stats-before.txt stats-after.txt
5,10c5,10
< 1728 assembler - Number of assembler layout and relaxation steps
< 7707 assembler - Number of emitted assembler fragments
< 120588 assembler - Number of emitted object file bytes
< 2233448 assembler - Number of evaluated fixups
< 1727 assembler - Number of relaxed instructions
< 6723845 mcexpr - Number of MCExpr evaluations
---
> 3 assembler - Number of assembler layout and relaxation steps
> 7707 assembler - Number of emitted assembler fragments
> 120588 assembler - Number of emitted object file bytes
> 14796 assembler - Number of evaluated fixups
> 1727 assembler - Number of relaxed instructions
> 67889 mcexpr - Number of MCExpr evaluations
--
Feel free to LOL at the -before numbers, if you like.
I am a little surprised we make more than 2 relaxation passes. It's pretty
trivial for us to do relaxation out-of-order if that would give a speedup.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99543 91177308-0d34-0410-b5e6-96231b3b80d8
Remove much horribleness from X86InstrFormats as a result. Similar
simplifications are probably possible for other targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99539 91177308-0d34-0410-b5e6-96231b3b80d8
the custom insertion hook deletes the instruction, then we try to set dead
flags on it. Neither the code that I added nor the code that was there
before was safe.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99538 91177308-0d34-0410-b5e6-96231b3b80d8
On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a register
in a different domain than where it was defined. Some instructions have
equvivalents for different domains, like por/orps/orpd.
The SSEDomainFix pass tries to minimize the number of domain crossings by
changing between equvivalent opcodes where possible.
This is a work in progress, in particular the pass doesn't do anything yet. SSE
instructions are tagged with their execution domain in TableGen using the last
two bits of TSFlags. Note that not all instructions are tagged correctly. Life
just isn't that simple.
The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline
issue handled by NEONMoveFixPass. This pass may become target independent to
handle both.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99524 91177308-0d34-0410-b5e6-96231b3b80d8
bytes instead of one byte. This is important because
we're running up to too many opcodes to fit in a byte
and it is aggrevated by FIRST_TARGET_MEMORY_OPCODE
making the numbering sparse. This just bites the
bullet and bloats out the table. In practice, this
increases the size of the x86 isel table from 74.5K
to 76K. I think we'll cope :)
This fixes rdar://7791648
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99494 91177308-0d34-0410-b5e6-96231b3b80d8
handles dead implicit results more aggressively. More
to come, I think this is now just a data entry problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99486 91177308-0d34-0410-b5e6-96231b3b80d8
happening.
Enhance scheduling to set the DEAD flag on implicit defs
more aggressively. Before, we'd set an implicit def operand
to dead if it were present in the SDNode corresponding to
the machineinstr but had no use. Now we do it in this case
AND if the implicit def does not exist in the SDNode at all.
This exposes a couple of problems: one is the FIXME, which
causes a live intervals crash on CodeGen/X86/sibcall.ll.
The second is that it makes machinecse and licm more
aggressive (which is a good thing) but also exposes a case
where licm hoists a set0 and then it doesn't get resunk.
Talking to codegen folks about both these issues, but I need
this patch in in the meantime.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99485 91177308-0d34-0410-b5e6-96231b3b80d8
Here is a theoretical example that illustrates why the placement is important.
tmp1 =
store tmp1 -> x
...
tmp2 = add ...
...
call
...
store tmp2 -> x
Now mem2reg comes along:
tmp1 =
dbg_value (tmp1 -> x)
...
tmp2 = add ...
...
call
...
dbg_value (tmp2 -> x)
When the debugger examine the value of x after the add instruction but before the call, it should have the value of tmp1.
Furthermore, for dbg_value's that reference constants, they should not be emitted at the beginning of the block (since they do not have "producers").
This patch also cleans up how SDISel manages DbgValue nodes. It allow a SDNode to be referenced by multiple SDDbgValue nodes. When a SDNode is deleted, it uses the information to find the SDDbgValues and invalidate them. They are not deleted until the corresponding SelectionDAG is destroyed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99469 91177308-0d34-0410-b5e6-96231b3b80d8
addl $12, %esp
popl %esi
popl %edi
popl %ebx
popl %ebp
jmpl *__Block_deallocator-L1$pb(%esi) # TAILCALL
The problem is the global base register is assigned GR32 register class. TCRETURNmi needs the registers making up the address mode to have the GR32_TC register class.
The *proper* fix is for X86DAGToDAGISel::getGlobalBaseReg() to return a copy from the global base register of the machine function rather than returning the register itself. But that has the potential of causing it to be coalesced to a more restrictive register class: GR32_TC. It can introduce additional copies and spills. For something as important the PIC base, it's not worth it especially since this is not an issue on 64-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99455 91177308-0d34-0410-b5e6-96231b3b80d8
2006-07-19-stwbrx-crash.ll for me, but it's the only likely
patch in the blame list of several bots. Lets see if this
fixes it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99453 91177308-0d34-0410-b5e6-96231b3b80d8
--- Reverse-merging r99440 into '.':
U test/MC/AsmParser/X86/x86_32-bit_cat.s
U test/MC/AsmParser/X86/x86_32-encoding.s
U include/llvm/IntrinsicsX86.td
U include/llvm/CodeGen/SelectionDAGNodes.h
U lib/Target/X86/X86InstrSSE.td
U lib/Target/X86/X86ISelLowering.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99450 91177308-0d34-0410-b5e6-96231b3b80d8
not get an "Unknown immediate size" assert failure when used. All instructions
of this form have an 8-bit immediate. Also added a test case of an example
instruction that is of this form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99435 91177308-0d34-0410-b5e6-96231b3b80d8
--- Reverse-merging r99400 into '.':
D test/CodeGen/Generic/2010-03-24-liveintervalleak.ll
U lib/CodeGen/LiveIntervalAnalysis.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99419 91177308-0d34-0410-b5e6-96231b3b80d8
otherwise the SmallVector it contains doesn't free its memory.
In most cases LiveIntervalAnalysis could get away by not calling the destructor,
because VNInfos are bumpptr-allocated, and smallvectors usually don't grow.
However when the SmallVector does grow it always leaks.
This is the valgrind shown leak from the original testcase:
==8206== 18,304 bytes in 151 blocks are definitely lost in loss record 164 of 164
==8206== at 0x4A079C7: operator new(unsigned long) (vg_replace_malloc.c:220)
==8206== by 0x4DB7A7E: llvm::SmallVectorBase::grow_pod(unsigned long, unsigned long) (in /home/edwin/clam/git/builds/defaul
t/libclamav/.libs/libclamav.so.6.1.0)
==8206== by 0x4F90382: llvm::VNInfo::addKill(llvm::SlotIndex) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libcl
amav.so.6.1.0)
==8206== by 0x5126B5C: llvm::LiveIntervals::handleVirtualRegisterDef(llvm::MachineBasicBlock*, llvm::ilist_iterator<llvm::M
achineInstr>, llvm::SlotIndex, llvm::MachineOperand&, unsigned int, llvm::LiveInterval&) (in /home/edwin/clam/git/builds/defau
lt/libclamav/.libs/libclamav.so.6.1.0)
==8206== by 0x512725E: llvm::LiveIntervals::handleRegisterDef(llvm::MachineBasicBlock*, llvm::ilist_iterator<llvm::MachineI
nstr>, llvm::SlotIndex, llvm::MachineOperand&, unsigned int) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav
.so.6.1.0)
==8206== by 0x51278A8: llvm::LiveIntervals::computeIntervals() (in /home/edwin/clam/git/builds/default/libclamav/.libs/libc
lamav.so.6.1.0)
==8206== by 0x5127CB4: llvm::LiveIntervals::runOnMachineFunction(llvm::MachineFunction&) (in /home/edwin/clam/git/builds/de
fault/libclamav/.libs/libclamav.so.6.1.0)
==8206== by 0x4DAE935: llvm::FPPassManager::runOnFunction(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclama
v/.libs/libclamav.so.6.1.0)
==8206== by 0x4DAEB10: llvm::FunctionPassManagerImpl::run(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclama
v/.libs/libclamav.so.6.1.0)
==8206== by 0x4DAED3D: llvm::FunctionPassManager::run(llvm::Function&) (in /home/edwin/clam/git/builds/default/libclamav/.l
ibs/libclamav.so.6.1.0)
==8206== by 0x4D8BE8E: llvm::JIT::runJITOnFunctionUnlocked(llvm::Function*, llvm::MutexGuard const&) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0)
==8206== by 0x4D8CA72: llvm::JIT::getPointerToFunction(llvm::Function*) (in /home/edwin/clam/git/builds/default/libclamav/.libs/libclamav.so.6.1.0)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99400 91177308-0d34-0410-b5e6-96231b3b80d8
I have audited all getOperandNo calls now, fixing
hidden assumptions. CallSite related uglyness will
be eliminated successively.
Note this patch has a long and griveous history,
for all the back-and-forths have a look at
CallSite.h's log.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99399 91177308-0d34-0410-b5e6-96231b3b80d8
ISD node. The only change in the generated isel code are comments
like:
< // Src: (X86dec_flag:i16 GR16:i16:$src)
---
> // Src: (X86dec_flag:i16:i32 GR16:i16:$src)
because now it knows that X86dec_flag returns both an i16 (for the result)
and an i32 (for EFLAGS) in this case. Wewt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99369 91177308-0d34-0410-b5e6-96231b3b80d8
This is work in progress. So far, SSE execution domain tables are added to
X86InstrInfo, and a skeleton pass is enabled with -sse-domain-fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99345 91177308-0d34-0410-b5e6-96231b3b80d8
for the noinline attribute, and make the inliner refuse to
inline a call site when the call site is marked noinline even
if the callee isn't. This fixes PR6682.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99341 91177308-0d34-0410-b5e6-96231b3b80d8
for ignoring debug info intrinsics everywhere else is to advance
past them, and it needs to be consistent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99332 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions are only needed for codegen, so I've removed all the
explicit encoding bits for now; they should be set in the same way as the for
VLDMD and VSTMD whenever we add encodings for VFP. The use of addrmode5
requires that the instructions be custom-selected so that the number of
registers can be set in the AM5Opc value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99309 91177308-0d34-0410-b5e6-96231b3b80d8
of D registers. Add a separate VST1q instruction with a Q register
source operand for use by storeRegToStackSlot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99265 91177308-0d34-0410-b5e6-96231b3b80d8
of D registers. Add a separate VLD1q instruction with a Q register
destination operand for use by loadRegFromStackSlot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99261 91177308-0d34-0410-b5e6-96231b3b80d8