Calls that use register mask operands don't have implicit defs for
returned values. The register mask operand handles the call clobber,
but it always behaves like a set of dead defs.
Add live implicit defs for any implicitly defined physregs that are
actually used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149715 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAG has 4 different ways of passing physreg defs to users.
Collect all of the uses at the same time, and pass all of them to
MI->setPhysRegsDeadExcept() to mark the remaining defs dead.
The setPhysRegsDeadExcept() function will soon add the required
implicit-defs to instructions with register mask operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149708 91177308-0d34-0410-b5e6-96231b3b80d8
In this patch we optimize this pattern and convert the sequence into extract op of a narrow type.
This allows the BUILD_VECTOR dag optimizations to construct efficient shuffle operations in many cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149692 91177308-0d34-0410-b5e6-96231b3b80d8
Allows command line overrides to be centralized in LLVMTargetMachine.cpp.
LLVMTargetMachine can intercept common passes and give precedence to command line overrides.
Allows adding "internal" target configuration options without touching TargetOptions.
Encapsulates the PassManager.
Provides a good point to initialize all CodeGen passes so that Pass ID's can be used in APIs.
Allows modifying the target configuration hooks without rebuilding the world.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149672 91177308-0d34-0410-b5e6-96231b3b80d8
needed to emit a 64-bit gp-relative relocation entry. Make changes necessary
for emitting jump tables which have entries with directive .gpdword. This patch
does not implement the parts needed for direct object emission or JIT.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149668 91177308-0d34-0410-b5e6-96231b3b80d8
It doesn't seem worthwhile to give meaning to a NULL register mask
pointer. It complicates all the code using register mask operands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149646 91177308-0d34-0410-b5e6-96231b3b80d8
NEON loads and stores accept single and double spaced pairs, triples,
and quads of D registers. This patch adds new register classes to
accurately model those constraints:
Dn, Dn+1 Dn, Dn+2
----------------------
DPair DPairSpc
DTriple DTripleSpc
DQuad DQuadSpc
Also extend the existing QQ and QQQQ register classes to contains all Q
pairs and quads instead of just the aligned ones.
These new register classes will make it possible to accurately model
constraints on NEON loads and stores, and we can get rid of all the NEON
pseudo-instructions. The late scheduler will be able to accurately
model instruction dependencies from the explicit operands.
This more than doubles the number of ARM registers, but the backend
passes are quite good at handling this. The llc -O0 compile time only
regresses by 1.5%. Future work on register mask operands will recover
this regression.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149640 91177308-0d34-0410-b5e6-96231b3b80d8
As suggested by Nick Lewycky, the tree traversal queues have been changed to SmallVectors and the associated loops have been rotated. Also, an 80-col violation was fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149607 91177308-0d34-0410-b5e6-96231b3b80d8
Long basic blocks with many candidate pairs (such as in the SHA implementation in Perl 5.14; thanks to Roman Divacky for the example) used to take an unacceptably-long time to compile. Instead, break long blocks into groups so that no group has too many candidate pairs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149595 91177308-0d34-0410-b5e6-96231b3b80d8
more than two adjacent ranges needed to be merged. The new version should be
able to handle an arbitrary sequence of adjancent ranges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149588 91177308-0d34-0410-b5e6-96231b3b80d8
Adds an instruction itinerary to all x86 instructions, giving each a default latency of 1, using the InstrItinClass IIC_DEFAULT.
Sets specific latencies for Atom for the instructions in files X86InstrCMovSetCC.td, X86InstrArithmetic.td, X86InstrControl.td, and X86InstrShiftRotate.td. The Atom latencies for the remainder of the x86 instructions will be set in subsequent patches.
Adds a test to verify that the scheduler is working.
Also changes the scheduling preference to "Hybrid" for i386 Atom, while leaving x86_64 as ILP.
Patch by Preston Gurd!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149558 91177308-0d34-0410-b5e6-96231b3b80d8
This new scheduler plugs into the existing selection DAG scheduling framework. It is a top-down critical path scheduler that tracks register pressure and uses a DFA for pipeline modeling.
Patch by Sergei Larin!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149547 91177308-0d34-0410-b5e6-96231b3b80d8
The purpose of refactoring is to hide operand roles from SwitchInst user (programmer). If you want to play with operands directly, probably you will need lower level methods than SwitchInst ones (TerminatorInst or may be User). After this patch we can reorganize SwitchInst operands and successors as we want.
What was done:
1. Changed semantics of index inside the getCaseValue method:
getCaseValue(0) means "get first case", not a condition. Use getCondition() if you want to resolve the condition. I propose don't mix SwitchInst case indexing with low level indexing (TI successors indexing, User's operands indexing), since it may be dangerous.
2. By the same reason findCaseValue(ConstantInt*) returns actual number of case value. 0 means first case, not default. If there is no case with given value, ErrorIndex will returned.
3. Added getCaseSuccessor method. I propose to avoid usage of TerminatorInst::getSuccessor if you want to resolve case successor BB. Use getCaseSuccessor instead, since internal SwitchInst organization of operands/successors is hidden and may be changed in any moment.
4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to see how case successors are really mapped in TerminatorInst.
4.1 "resolveSuccessorIndex" was created if you need to level down from SwitchInst to TerminatorInst. It returns TerminatorInst's successor index for given case successor.
4.2 "resolveCaseIndex" converts low level successors index to case index that curresponds to the given successor.
Note: There are also related compatability fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149481 91177308-0d34-0410-b5e6-96231b3b80d8
The pass pointer should never be referenced after sending it to
schedulePass(), which may delete the pass. To fix this bug I had to
clean up the design leading to more goodness.
You may notice now that any non-analysis pass is printed. So things like loop-simplify and lcssa show up, while target lib, target data, alias analysis do not show up. Normally, analysis don't mutate the IR, but you can now check this by using both -print-after and -print-before. The effects of analysis will now show up in between the two.
The llc path is still in bad shape. But I'll be improving it in my next checkin. Meanwhile, print-machineinstrs still works the same way. With print-before/after, many llc passes that were not printed before now are, some of these should be converted to analysis. A few very important passes, isel and scheduler, are not properly initialized, so not printed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149480 91177308-0d34-0410-b5e6-96231b3b80d8
This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure.
Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149468 91177308-0d34-0410-b5e6-96231b3b80d8
Changing arguments from being passed as fixed to varargs is unsafe, as
the ABI may require they be handled differently (stack vs. register, for
example).
Remove two tests which rely on the bitcast being folded into the direct
call, which is exactly the transformation that's unsafe.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149457 91177308-0d34-0410-b5e6-96231b3b80d8
symbol from an assignment. In this case the symbol did not have a fragment so
MCObjectWriter::IsSymbolRefDifferenceFullyResolved() should not have been
calling IsSymbolRefDifferenceFullyResolvedImpl() with a NULL fragment and should
just have returned false in that case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149442 91177308-0d34-0410-b5e6-96231b3b80d8
This new function provides a way to get the Mac OS X version number from
either generic "darwin" triples of macosx triples.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149438 91177308-0d34-0410-b5e6-96231b3b80d8
This removes implicit assumption about the form of MI coming into regalloc. In particular, it should be independent of ProcessImplicitDefs which will eventually become a standard part of coming out of SSA--unless we simply can eliminate IMPLICIT_DEF completely. Current unit tests expose this once I remove incidental pass ordering restrictions.
This is not a final fix. Just a temporary workaround until I figure out the right way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149360 91177308-0d34-0410-b5e6-96231b3b80d8
kicking in the big win of ConstantDataArray. As part of this, change
the implementation of GetConstantStringInfo in ValueTracking to work
with ConstantDataArray (and not ConstantArray) making it dramatically,
amazingly, more efficient in the process and renaming it to
getConstantStringInfo.
This keeps around a GetConstantStringInfo entrypoint that (grossly)
forwards to getConstantStringInfo and constructs the std::string
required, but existing clients should move over to
getConstantStringInfo instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149351 91177308-0d34-0410-b5e6-96231b3b80d8
vectors of all one bits to be printed more cleverly in the AsmPrinter.
Unfortunately, the byte value for all one bits is the same with
-fsigned-char as the error return of '-1'. Force this to be the unsigned
byte value when returning it to avoid this problem, and update the test
case for the shiny new behavior.
Yay for building LLVM and Clang with -funsigned-char.
Chris, please review, and let me know if there is any reason to not
desire this change. It seems good on the surface, and certainly intended
based on the code written.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149299 91177308-0d34-0410-b5e6-96231b3b80d8
The redzones emitted by AddressSanitizer for CFString instances confuse the linker and are of little use, so we shouldn't add them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149243 91177308-0d34-0410-b5e6-96231b3b80d8
- Don't call malloc+free in the very hot forward().
- Don't call isTiedToDefOperand().
- Don't create BitVector temporaries.
- Merge DeadRegs into KillRegs.
- Eliminate the early clobber checks, they were irrelevant to scavenging.
- Remove unnecessary code from -Asserts builds.
This speeds up ARM PEI by 3.4x and overall llc -O0 codegen time by 11%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149189 91177308-0d34-0410-b5e6-96231b3b80d8
Sometimes there is only one 'resume' instruction per function. In those
situations, we don't need a separate block for the call to _Unwind_Resume. In
fact, it adds a lot of overhead to code-gen if we do that -- especially at -O0.
If we have a single 'resume' instruction, just generate the call within that
block.
<rdar://problem/10694814>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149159 91177308-0d34-0410-b5e6-96231b3b80d8
Unfortunately I also had to disable constant-pool-sharing.ll the code it tests has been
updated to use the IL logic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149148 91177308-0d34-0410-b5e6-96231b3b80d8
around within a basic block while maintaining live-intervals.
Updated ScheduleTopDownLive in MachineScheduler.cpp to use the moveInstr API
when reordering MIs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149147 91177308-0d34-0410-b5e6-96231b3b80d8
GEP instructions are there for the compiler and shouldn't really output much
code (if any at all). When a GEP is stored in the entry block, Fast ISel (for
one) will not know that it could fold it into further uses. For instance, inside
of the EH handling code. This results in a lot of unnecessary spills and loads
which bloat code and slows down pretty much everything.
<rdar://problem/10694814>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149114 91177308-0d34-0410-b5e6-96231b3b80d8
mid-level constant folding APIs instead of doing its own analysis.
This makes it more general (e.g. can now share a <2 x i64> with a
<4 x i32>) and avoid duplicating a bunch of logic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149111 91177308-0d34-0410-b5e6-96231b3b80d8
Adjust an example MachObjectWriter diagnostic to use the information
to issue a better message.
Before:
LLVM ERROR: unknown ARM fixup kind!
After:
x.s:6:5: error: unsupported relocation on symbol
beq bar
^
rdar://9800182
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149093 91177308-0d34-0410-b5e6-96231b3b80d8
The Win64 calling convention has xmm6-15 as callee-saved while still
clobbering all ymm registers.
Add a YMM_HI_6_15 pseudo-register that aliases the clobbered part of the
ymm registers, and mark that as call-clobbered. This allows live xmm
registers across calls.
This hack wouldn't be necessary with RegisterMask operands representing
the call clobbers, but they are not quite operational yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149088 91177308-0d34-0410-b5e6-96231b3b80d8
we're at it, allow PatternMatch's "neg" pattern to match integer
vector negations, and enhance ComputeNumSigned bits to handle
shl of vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149082 91177308-0d34-0410-b5e6-96231b3b80d8
ConstantExpr::getWithOperandReplaced and ConstantExpr::replaceUsesOfWithOnConstant
in terms of ConstantExpr::getWithOperands. While we're at it,
make sure that ConstantExpr::getWithOperands covers all instructions: it was
missing insert/extractvalue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149076 91177308-0d34-0410-b5e6-96231b3b80d8
MachineBasicBlock::canFallThrough(). We're interested in the state of the
instruction (i.e., is this a barrier or not?), not if the instruction is
predicable or not.
rdar://10501092
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149070 91177308-0d34-0410-b5e6-96231b3b80d8
The live range of the source register may be extended when a redundant
copy is eliminated. Make sure any kill flags between the two copies are
cleared.
This fixes PR11765.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149069 91177308-0d34-0410-b5e6-96231b3b80d8
This enables the linker to match concrete relocation types (absolute or relative) with whatever library or C++ support code is being linked against.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149057 91177308-0d34-0410-b5e6-96231b3b80d8
. "fptosi" and "fptoui" IR instructions are defined with round-to-zero rounding mode.
. Currently for AVX mode for <4xdouble> and <8xdouble> the "VCVTPD2DQ.128" and "VCVTPD2DQ.256" instructions are selected (for .fp_to_sint. DAG node operation ) by AVX codegen. However they use round-to-nearest-even rounding mode.
. Consequently, the conversion produces incorrect numbers.
The fix is to replace selection of VCVTPD2DQ instructions with VCVTTPD2DQ instructions. The latter use truncate (i.e. round-to-zero) rounding mode.
As .fp_to_sint. DAG node operation is used only for lowering of "fptosi" and "fptoui" IR instructions, the fix in X86InstrSSE.td definition file doesn.t have an impact on other LLVM flows.
The patch includes changes in the .td file, LIT test for the changes and a fix in a legacy LIT test (which produced asm code conflicting with LLVN IR spec).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149056 91177308-0d34-0410-b5e6-96231b3b80d8
more robust) ways to do what it was doing now. Also, add static methods
for decoding a ShuffleVector mask.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149028 91177308-0d34-0410-b5e6-96231b3b80d8
ConstantVector. Fix some outright bugs in the implementation of
ConstantArray and Constant struct, which would cause us to not make
one big UndefValue when asking for an array/struct with all undef
elements. Enhance Constant::isAllOnesValue to work with
ConstantDataVector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@149021 91177308-0d34-0410-b5e6-96231b3b80d8
to reduce the number of cast<>'s we have. This allows someone to use
things like Ty->getVectorNumElements() instead of
cast<VectorType>(Ty)->getNumElements() when you know that a type is a
vector.
It would be a great general cleanup to move the codebase to use these,
I will do so in the code I'm touching.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148999 91177308-0d34-0410-b5e6-96231b3b80d8
This boils down to using MachineOperand::readsReg() more.
This fixes PR11829 where a use ended up after the first def when
lowering REG_SEQUENCE instructions involving IMPLICIT_DEFs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148996 91177308-0d34-0410-b5e6-96231b3b80d8
function. They don't appear to be used, and are inconsistent with handling of
other physreg intervals (i.e. intervals that are not live-in) where ranges are
not inserted for aliases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148986 91177308-0d34-0410-b5e6-96231b3b80d8
"Although a Thumb2 instruction, the IT mnemonic shall be permitted in
ARM mode, and the condition verified to match the condition code(s)
on the following instruction(s)."
PR11853
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148969 91177308-0d34-0410-b5e6-96231b3b80d8
savings from a pointer argument becoming an alloca. Sometimes callees will even
compare a pointer to null and then branch to an otherwise unreachable block!
Detect these cases and compute the number of saved instructions, instead of
bailing out and reporting no savings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@148941 91177308-0d34-0410-b5e6-96231b3b80d8