GCC apparently does this, and code depends on not having to do
emms when this happens. This is x86-64 only so far, second half
should handle x86-32.
rdar://5741668
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47470 91177308-0d34-0410-b5e6-96231b3b80d8
any, we force sdisel to do all regalloc for an asm. This
leads to gross but correct codegen.
This fixes the rest of PR2078.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47454 91177308-0d34-0410-b5e6-96231b3b80d8
inline asms.
Fix PR2078 by marking aliases of registers used when a register is
marked used. This prevents EAX from being allocated when AX is listed
in the clobber set for the asm.
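The alias marking described above can be pictured with a small sketch. It is illustrative only: the AliasTable type, the markRegUsed helper and the register names in the table are hypothetical and are not the SelectionDAG data structures.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical alias table: register name -> registers that overlap it.
using AliasTable = std::map<std::string, std::vector<std::string>>;

// Mark Reg as used, together with every register that overlaps it.
inline void markRegUsed(const std::string &Reg, const AliasTable &Aliases,
                        std::set<std::string> &Used) {
  Used.insert(Reg);
  auto It = Aliases.find(Reg);
  if (It != Aliases.end())
    Used.insert(It->second.begin(), It->second.end());
}

// Usage: an asm that clobbers AX must also make EAX unavailable.
inline void aliasExample() {
  AliasTable Aliases = {{"AX", {"EAX", "AH", "AL"}}};
  std::set<std::string> Used;
  markRegUsed("AX", Aliases, Used);
  assert(Used.count("EAX") == 1); // alias of AX, so no longer allocatable
  assert(Used.count("EBX") == 0); // unrelated register stays free
}
```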
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47426 91177308-0d34-0410-b5e6-96231b3b80d8
Parse reversed smax and umax as smin and umin and express them with negative
or binary-not SCEVs (which are really just subtract under the hood).
Parse 'xor %x, -1' as (-1 - %x).
Remove dead code (ConstantInt::get always returns a ConstantInt).
Don't use getIntegerSCEV(-1, Ty). The first value is an int, then it gets
passed into a uint64_t. Instead, create the -1 directly from
ConstantInt::getAllOnesValue().
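As a rough illustration of the identities involved (not the SCEV code itself), binary-not can be written as the subtraction (-1 - x), and a min can be written as a not of a max of nots. The helper names below are made up for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// 'xor %x, -1' rewritten as a subtraction. For a signed 32-bit value,
// -1 - X cannot overflow, so this is well defined for every input.
inline int32_t notViaSub(int32_t X) { return -1 - X; }

// smin(a, b) == ~smax(~a, ~b), because NOT reverses the signed order.
inline int32_t sminViaSmax(int32_t A, int32_t B) {
  return notViaSub(std::max(notViaSub(A), notViaSub(B)));
}

// umin(a, b) == ~umax(~a, ~b); for unsigned values ~x is UINT32_MAX - x.
inline uint32_t uminViaUmax(uint32_t A, uint32_t B) {
  return UINT32_MAX - std::max<uint32_t>(UINT32_MAX - A, UINT32_MAX - B);
}
```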
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47360 91177308-0d34-0410-b5e6-96231b3b80d8
- X86 now normalizes SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47290 91177308-0d34-0410-b5e6-96231b3b80d8
has plain one-result scalar integer multiplication instructions.
This avoids expanding such instructions into MUL_LOHI sequences that
must be special-cased at isel time, and avoids the problem with that
code that prevented memory operands from being folded.
This fixes PR1874, addressing the most common case. The uncommon
cases of optimizing multiply-high operations will require work
in DAGCombiner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47277 91177308-0d34-0410-b5e6-96231b3b80d8
another sret function, it should pass its own sret parameter to the tail callee, allowing it to fill in the correct
return value. llvm-gcc does not emit this by default. Instead, it allocates space in the caller for the sret of
the tail call and then uses memcpy to copy the result into the caller's sret parameter. This optimization detects
and optimizes that case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47265 91177308-0d34-0410-b5e6-96231b3b80d8
CTTZ and CTPOP. The expansion code differs from
that in LegalizeDAG in that it chooses to take the
CTLZ/CTTZ count from the Hi/Lo part depending on
whether the Hi/Lo value is zero, not on whether
CTLZ/CTTZ of Hi/Lo returned 32 (or whatever the
width of the type is) for it. I made this change
because the optimizers may well know that Hi/Lo
is zero and exploit it. The promotion code for
CTTZ also differs from that in LegalizeDAG: it
uses an "or" to get the right result when the
original value is zero, rather than using a compare
and select. This also means the value doesn't
need to be zero extended.
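A minimal sketch of the two ideas, assuming a 64-bit value split into 32-bit Hi/Lo halves and an i8 promoted to 32 bits. The function names are hypothetical and the loops stand in for the target's count instructions.

```cpp
#include <cassert>
#include <cstdint>

// Reference counts on 32 bits; both return 32 for a zero input.
inline unsigned ctlz32(uint32_t X) {
  unsigned N = 0;
  for (uint32_t Bit = 0x80000000u; Bit != 0 && !(X & Bit); Bit >>= 1)
    ++N;
  return N;
}
inline unsigned cttz32(uint32_t X) {
  unsigned N = 0;
  for (uint32_t Bit = 1; Bit != 0 && !(X & Bit); Bit <<= 1)
    ++N;
  return N;
}

// Expansion: select on whether the half is zero, not on whether the
// count of that half came back as 32.
inline unsigned ctlz64(uint32_t Hi, uint32_t Lo) {
  return Hi != 0 ? ctlz32(Hi) : 32 + ctlz32(Lo);
}
inline unsigned cttz64(uint32_t Hi, uint32_t Lo) {
  return Lo != 0 ? cttz32(Lo) : 32 + cttz32(Hi);
}

// Promotion of an 8-bit CTTZ: or in bit 8 so a zero input yields 8.
// Bits above bit 8 can never be reached, so they need not be cleared.
inline unsigned cttz8(uint8_t X) {
  return cttz32(static_cast<uint32_t>(X) | 0x100u);
}
```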
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47075 91177308-0d34-0410-b5e6-96231b3b80d8
node as soon as we create it in SDISel. Previously we would lower it in
legalize. The problem with this is that it only exposes the argument
loads implied by FORMAL_ARGUMENTs after legalize, so that only dag combine 2
can hack on them. This causes us to miss some optimizations because
datatype expansion also happens here.
Exposing the loads early allows us to do optimizations on them. For example
we now compile arg-cast.ll to:
_foo:
movl $2147483647, %eax
andl 8(%esp), %eax
ret
where we previously produced:
_foo:
subl $12, %esp
movsd 16(%esp), %xmm0
movsd %xmm0, (%esp)
movl $2147483647, %eax
andl 4(%esp), %eax
addl $12, %esp
ret
It might also make sense to do this for ISD::CALL nodes, which have implicit
stores on many targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47054 91177308-0d34-0410-b5e6-96231b3b80d8
variable (with step 1) and m is its final value. Then, the correct trip
count is SMAX(m,n)-n. Previously, we used SMAX(0,m-n), but m-n may
overflow and can't in general be interpreted as signed.
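The difference between the two formulas shows up at the extremes. The sketch below assumes n is the start value of the induction variable and works on 32-bit values; the function names are invented for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// New formula: SMAX(m, n) - n, with the subtraction done modulo 2^32.
inline uint32_t tripCountNew(int32_t N, int32_t M) {
  return static_cast<uint32_t>(std::max(M, N)) - static_cast<uint32_t>(N);
}

// Old formula: SMAX(0, m - n). The wrapped difference is read as signed,
// so a set sign bit means "negative" and the result is clamped to 0.
inline uint32_t tripCountOld(int32_t N, int32_t M) {
  uint32_t D = static_cast<uint32_t>(M) - static_cast<uint32_t>(N);
  return (D & 0x80000000u) ? 0u : D;
}

inline void tripCountExample() {
  // Ordinary loops agree.
  assert(tripCountNew(0, 10) == 10 && tripCountOld(0, 10) == 10);
  // From INT32_MIN up to INT32_MAX the loop runs 2^32 - 1 times, but
  // m - n wraps to -1 and the old formula reports 0.
  assert(tripCountNew(INT32_MIN, INT32_MAX) == 0xFFFFFFFFu);
  assert(tripCountOld(INT32_MIN, INT32_MAX) == 0u);
}
```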
Patch by Nick Lewycky.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@47007 91177308-0d34-0410-b5e6-96231b3b80d8
to the RHS. This simple change allows us to compute the loop iteration count
for loops with a condition similar to the one in the testcase (which seems
to be quite common).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46959 91177308-0d34-0410-b5e6-96231b3b80d8
arbitrary iteration.
The patch:
1) changes SCEVSDivExpr into SCEVUDivExpr,
2) replaces PartialFact() function with BinomialCoefficient(); the
computations (essentially, the division) in BinomialCoefficient() are
performed with the appropriate bitwidth necessary to avoid overflow;
unsigned division is used instead of the signed one.
Computations in BinomialCoefficient() require support from the code
generator for APInts. Currently, we use a hack rounding up the
necessary bitwidth to the nearest power of 2. The hack is easy to turn
off in the future.
One remaining issue: we assume the divisor of the binomial coefficient
formula can be computed accurately using 16 bits. It means we can handle
AddRecs of length up to 9. In future, we should use APInts to evaluate
the divisor.
Thanks to Nicholas for cooperation!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46955 91177308-0d34-0410-b5e6-96231b3b80d8
was incorrectly simplifying "x == (gep x, 1, i)" into false, even
though i could be negative. As it turns out, all the code to
handle this already existed, we just need to disable the incorrect
optimization case and let the general case handle it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46739 91177308-0d34-0410-b5e6-96231b3b80d8
any bugs in the future since to get the crash you also
need hacked in fake libcall support (which creates odd
but legal trees), but since adding it doesn't hurt...
Thanks to Chris for this ultimately reduced version.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46706 91177308-0d34-0410-b5e6-96231b3b80d8
In practice this can only happen on code with already undefined behavior,
but this is still a good thing to handle correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46539 91177308-0d34-0410-b5e6-96231b3b80d8
only two addressing mode nodes, SPUaform and SPUindirect (vice the
three previous ones, SPUaform, SPUdform and SPUxform). This improves
code somewhat because we now avoid using reg+reg addressing when
it can be avoided. It also simplifies the address selection logic,
which was the main point for doing this.
Also, for various global variables that would be loaded using SPU's
A-form addressing, prefer D-form offs[reg] addressing, keeping the
base in a register if the variable is used more than once.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46483 91177308-0d34-0410-b5e6-96231b3b80d8
way or the other. Rewriting the code itself prevents subsequent analysis
passes from making contradictory conclusions about the code that could
cause an infeasible path to be made feasible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46427 91177308-0d34-0410-b5e6-96231b3b80d8
registers if used by a bitconvert or using a bitconvert. This allows us to
avoid constant pool loads and use cheaper integer instructions when the
values come from or end up in integer regs anyway. For example, we now
compile CodeGen/X86/fp-in-intregs.ll to:
_test1:
movl $2147483648, %eax
xorl 4(%esp), %eax
ret
_test2:
movl $1065353216, %eax
orl 4(%esp), %eax
andl $3212836864, %eax
ret
Instead of:
_test1:
movss 4(%esp), %xmm0
xorps LCPI2_0, %xmm0
movd %xmm0, %eax
ret
_test2:
movss 4(%esp), %xmm0
andps LCPI3_0, %xmm0
movss LCPI3_1, %xmm1
andps LCPI3_2, %xmm1
orps %xmm0, %xmm1
movd %xmm1, %eax
ret
Bitconverts can happen due to various calling conventions that require
fp values to be passed in integer regs in some cases, e.g. when returning
a complex.
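The integer constants in the new output are plain IEEE-754 single-precision bit patterns: 2147483648 is the sign bit (0x80000000), 1065353216 is 1.0f (0x3F800000) and 3212836864 is -1.0f (0xBF800000). The sketch below reproduces the two integer sequences; reading _test2 as copysign(1.0, x) is an inference from those constants, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");

// Reinterpret a float as its 32-bit pattern (the bitconvert).
inline uint32_t bitsOf(float F) {
  uint32_t B;
  std::memcpy(&B, &F, sizeof B);
  return B;
}

// _test1: flip the sign bit, i.e. negate without touching an FP register.
inline uint32_t negateBits(uint32_t B) { return B ^ 0x80000000u; }

// _test2: keep the sign of the input and force the magnitude to 1.0f.
inline uint32_t signedOneBits(uint32_t B) {
  return (B | 0x3F800000u) & 0xBF800000u;
}

inline void fpInIntRegsExample() {
  assert(negateBits(bitsOf(2.5f)) == bitsOf(-2.5f));
  assert(signedOneBits(bitsOf(7.0f)) == bitsOf(1.0f));
  assert(signedOneBits(bitsOf(-3.0f)) == bitsOf(-1.0f));
}
```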
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46414 91177308-0d34-0410-b5e6-96231b3b80d8
void bork() {
int *address = 0;
*address = 0;
}
It's compiled into LLVM code that looks like this:
define void @bork() noreturn nounwind {
entry:
unreachable
}
This is bad on some platforms (like PPC) because it will generate the label for
the function but no body. The label could end up being associated with some
non-code related stuff, like a section. This places a "trap" instruction if the
SimplifyCFG pass removed all code from the function leaving only one
"unreachable" instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46387 91177308-0d34-0410-b5e6-96231b3b80d8
delete a node even if it was not dead in some cases. Instead, just add it to
the worklist. Also, make sure to use the CombineTo methods, as it was doing
things that were unsafe: the top level combine loop could touch dangling memory.
This fixes CodeGen/Generic/2008-01-25-dag-combine-mul.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46384 91177308-0d34-0410-b5e6-96231b3b80d8
can't be aliased to other known objects. This allows us to know that byval
pointer args don't alias globals, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46315 91177308-0d34-0410-b5e6-96231b3b80d8
This case returns the value in ST(0) and then has to convert it to an SSE
register. This causes significant codegen ugliness in some cases. For
example in the trivial fp-stack-direct-ret.ll testcase we used to generate:
_bar:
subl $28, %esp
call L_foo$stub
fstpl 16(%esp)
movsd 16(%esp), %xmm0
movsd %xmm0, 8(%esp)
fldl 8(%esp)
addl $28, %esp
ret
because we move the result of foo() into an XMM register, then have to
move it back for the return of bar.
Instead of hacking ever-more special cases into the call result lowering code
we take a much simpler approach: on x86-32, fp return is modeled as always
returning into an f80 register which is then truncated to f32 or f64 as needed.
Similarly for a result, we model it as an extension to f80 + return.
This exposes the truncate and extensions to the dag combiner, allowing target
independent code to hack on them, eliminating them in this case. This gives
us this code for the example above:
_bar:
subl $12, %esp
call L_foo$stub
addl $12, %esp
ret
The nasty aspect of this is that these conversions are not legal, but we want
the second pass of dag combiner (post-legalize) to be able to hack on them.
To handle this, we lie to legalize and say they are legal, then custom expand
them on entry to the isel pass (PreprocessForFPConvert). This is gross, but
less gross than the code it is replacing :)
This also allows us to generate better code in several other cases. For
example on fp-stack-ret-conv.ll, we now generate:
_test:
subl $12, %esp
call L_foo$stub
fstps 8(%esp)
movl 16(%esp), %eax
cvtss2sd 8(%esp), %xmm0
movsd %xmm0, (%eax)
addl $12, %esp
ret
where before we produced (incidentally, the old bad code is identical to what
gcc produces):
_test:
subl $12, %esp
call L_foo$stub
fstpl (%esp)
cvtsd2ss (%esp), %xmm0
cvtss2sd %xmm0, %xmm0
movl 16(%esp), %eax
movsd %xmm0, (%eax)
addl $12, %esp
ret
Note that we generate slightly worse code on pr1505b.ll due to a scheduling
deficiency that is unrelated to this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46307 91177308-0d34-0410-b5e6-96231b3b80d8
Fixed CellSPU's A-form (local store) address mode, so that all globals,
externals, constant pool and jump table symbols are now wrapped within
a SPUISD::AFormAddr pseudo-instruction. This now identifies all local
store memory addresses, although it requires a bit of legerdemain during
instruction selection to properly select loads to and stores from local
store, properly generating "LQA" instructions.
Also added mul_ops.ll test harness for exercising integer multiplication.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46142 91177308-0d34-0410-b5e6-96231b3b80d8
1. Legalize now always promotes truncstore of i1 to i8.
2. Remove patterns and gunk related to truncstore i1 from targets.
3. Rename the StoreXAction stuff to TruncStoreAction in TLI.
4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions.
5. Mark a wide variety of invalid truncstores as such in various targets, e.g.
X86 currently doesn't support truncstore of any of its integer types.
6. Add legalize support for truncstores with invalid value input types.
7. Add a dag combine transform to turn store(truncate) into truncstore when
safe.
The latter allows us to compile CodeGen/X86/storetrunc-fp.ll to:
_foo:
fldt 20(%esp)
fldt 4(%esp)
faddp %st(1)
movl 36(%esp), %eax
fstps (%eax)
ret
instead of:
_foo:
subl $4, %esp
fldt 24(%esp)
fldt 8(%esp)
faddp %st(1)
fstps (%esp)
movl 40(%esp), %eax
movss (%esp), %xmm0
movss %xmm0, (%eax)
addl $4, %esp
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46140 91177308-0d34-0410-b5e6-96231b3b80d8
and the spill is its kill. However, if the local allocator has determined the
register has not been modified (possible when its value was reloaded), it would
not issue a restore. In that case, mark the last use of the virtual register as
kill.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46111 91177308-0d34-0410-b5e6-96231b3b80d8
promoted functions. This is important for varargs calls in
particular. Thanks to duncan for providing a great testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46108 91177308-0d34-0410-b5e6-96231b3b80d8
It's not safe to use the two value CombineTo variant to combine away a dead load.
e.g.
v1, chain2 = load chain1, loc
v2, chain3 = load chain2, loc
v3 = add v2, c
Now we replace use of v1 with undef, use of chain2 with chain1.
ReplaceAllUsesWith() will iterate through uses of the first load and update operands:
v1, chain2 = load chain1, loc
v2, chain3 = load chain1, loc
v3 = add v2, c
Now the second load is the same as the first load, SelectionDAG cse will ensure
the use of second load is replaced with the first load.
v1, chain2 = load chain1, loc
v3 = add v1, c
Then v1 is replaced with undef and bad things happen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46099 91177308-0d34-0410-b5e6-96231b3b80d8
it should work, but I have no machine to test
it on. Committed because it will at least
cause no harm, and maybe someone can test it
for me!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46098 91177308-0d34-0410-b5e6-96231b3b80d8
make the 'fp return in ST(0)' optimization smart enough to
look through token factor nodes. This allows us to compile
testcases like CodeGen/X86/fp-stack-retcopy.ll into:
_carg:
subl $12, %esp
call L_foo$stub
fstpl (%esp)
fldl (%esp)
addl $12, %esp
ret
instead of:
_carg:
subl $28, %esp
call L_foo$stub
fstpl 16(%esp)
movsd 16(%esp), %xmm0
movsd %xmm0, 8(%esp)
fldl 8(%esp)
addl $28, %esp
ret
Still not optimal, but much better and this is a trivial patch. Fixing
the rest requires invasive surgery that is not llvm 2.2 material.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46054 91177308-0d34-0410-b5e6-96231b3b80d8
drop attributes on varargs call arguments. Also, it could generate
invalid IR if the transformed call already had the 'nest' attribute
somewhere (this can never happen for code coming from llvm-gcc,
but it's a theoretical possibility). Fix both problems.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45973 91177308-0d34-0410-b5e6-96231b3b80d8
byval work. This miscompilation is due to the program indexing an array out
of range and us doing a transformation that broke this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45949 91177308-0d34-0410-b5e6-96231b3b80d8
a load/store of i64. The latter prevents promotion/scalarrepl of the
source and dest in many cases.
This fixes the 300% performance regression of the byval stuff on
stepanov_v1p2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45945 91177308-0d34-0410-b5e6-96231b3b80d8
realize that ne & sgt was a signed comparison (it was only
looking at whether the left compare was signed).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45937 91177308-0d34-0410-b5e6-96231b3b80d8
if this becomes a varargs call then deal correctly with any
parameter attributes on the newly vararg call arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45931 91177308-0d34-0410-b5e6-96231b3b80d8
inlining a function if we know that the function does not write
to *any* memory. This implements test/Transforms/Inline/byval2.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45912 91177308-0d34-0410-b5e6-96231b3b80d8
parameter, even if it is a varargs function. Do
allow attributes on the varargs part of a call,
but not beyond the last argument. Only allow
selected attributes to be on the varargs part of
a call (currently only 'byval' is allowed). The
reasoning here is that most attributes, eg inreg,
simply make no sense here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45887 91177308-0d34-0410-b5e6-96231b3b80d8
get away with it, which exposes opportunities to eliminate the memory
objects entirely. For example, we now compile byval.ll to:
define internal void @f1(i32 %b.0, i64 %b.1) {
entry:
%tmp2 = add i32 %b.0, 1 ; <i32> [#uses=0]
ret void
}
define i32 @main() nounwind {
entry:
call void @f1( i32 1, i64 2 )
ret i32 0
}
This seems like it would trigger a lot for code that passes around small
structs (e.g. SDOperand's or _Complex)...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45886 91177308-0d34-0410-b5e6-96231b3b80d8
- struct_2.ll: Completely unaligned load/store testing
- call_indirect.ll, struct_1.ll: Add test lines to exercise
X-form [$reg($reg)] addressing
At this point, loads and stores should be under control (he says
in an optimistic tone of voice.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45882 91177308-0d34-0410-b5e6-96231b3b80d8
- Cleaned up custom load/store logic, common code is now shared [see note
below], cleaned up address modes
- More test cases: various intrinsics, structure element access (load/store
test), updated target data strings, indirect function calls.
Note: This patch contains a refactoring of the LoadSDNode and StoreSDNode
structures: they now share a common base class, LSBaseSDNode, that
provides an interface to their common functionality. There is some hackery
to access the proper operand depending on the derived class; otherwise,
to do a proper job would require finding and rearranging the SDOperands
sent to StoreSDNode's constructor. The current refactor errs on the
side of being conservatively and backwardly compatible while providing
functionality that reduces redundant code for targets where loads and
stores are custom-lowered.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45851 91177308-0d34-0410-b5e6-96231b3b80d8
Likewise fix up a bunch of other libcalls. While
there I remove NEG_F32 and NEG_F64 since they are
not used anywhere. This fixes 9 Ada ACATS failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45833 91177308-0d34-0410-b5e6-96231b3b80d8
the code generated is not wonderful. This turns a miscompilation into
a code quality bug (noted in the ppc readme). This fixes PR642, which
is over 2 years old (!). Nate, please review this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45742 91177308-0d34-0410-b5e6-96231b3b80d8
providing a misleading facility. It's used once in the MIPS backend
and hardcoded as "\t.globl\t" everywhere else.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45676 91177308-0d34-0410-b5e6-96231b3b80d8
direct calls bails out unless caller and callee have essentially
equivalent parameter attributes. This is illogical - the callee's
attributes should be of no relevance here. Rework the logic, which
incidentally fixes a crash when removed arguments have attributes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45658 91177308-0d34-0410-b5e6-96231b3b80d8
a direct call with cast parameters and cast return
value (if any), instcombine was prepared to cast any
non-void return value into any other, whether castable
or not. Add a new predicate for testing whether casting
is valid, and check it both for the return value and
(as a cleanup) for the parameters.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45657 91177308-0d34-0410-b5e6-96231b3b80d8
could theoretically introduce a trap, but is also a performance issue.
This speeds up ptrdist/ks by 8%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45533 91177308-0d34-0410-b5e6-96231b3b80d8
the initial value, while the type fields were not (this
is a qualified union type, so not all fields are always
present). This resulted in the size of the corresponding
LLVM type being larger than the gcc TYPE_SIZE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45522 91177308-0d34-0410-b5e6-96231b3b80d8
values, which means doing extra legalization work.
It would be easier to get this kind of thing right if
there was some documentation...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45472 91177308-0d34-0410-b5e6-96231b3b80d8
eliminating the llvm.x86.sse2.loadl.pd intrinsic?), one shuffle optzn
may be done (if shufps is better than pinsw, Evan, please review), and
we already know about LICM of simple instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45407 91177308-0d34-0410-b5e6-96231b3b80d8
if we are just going to store it back anyway. This improves things
like:
double foo();
void bar(double *P) { *P = foo(); }
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45399 91177308-0d34-0410-b5e6-96231b3b80d8
define void @f() {
...
call void @g()
...
}
define void @g() {
...
}
The hazards are:
- @f and @g have GC, but their GCs differ. Inlining is invalid. This
may never occur.
- @f has no GC, but @g does. g's GC must be propagated to @f.
The other scenarios are safe:
- @f and @g have the same GC.
- @f and @g have no GC.
- @g has no GC.
This patch adds inliner checks for the former two scenarios.
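The case analysis above can be written down as a small decision function. This is a sketch with invented names, using an empty string to mean "no GC"; it is not the inliner's actual code.

```cpp
#include <cassert>
#include <string>

enum class InlineGC { Reject, PropagateToCaller, Ok };

// Decide what inlining callee @g into caller @f means for GC.
// An empty name stands for "no GC" in this sketch.
inline InlineGC classifyInline(const std::string &CallerGC,
                               const std::string &CalleeGC) {
  if (CalleeGC.empty())
    return InlineGC::Ok;                // @g has no GC: always safe
  if (CallerGC.empty())
    return InlineGC::PropagateToCaller; // @f must adopt @g's GC
  return CallerGC == CalleeGC ? InlineGC::Ok : InlineGC::Reject;
}

inline void inlineGCExample() {
  assert(classifyInline("a", "b") == InlineGC::Reject);
  assert(classifyInline("", "a") == InlineGC::PropagateToCaller);
  assert(classifyInline("a", "a") == InlineGC::Ok);
  assert(classifyInline("", "") == InlineGC::Ok);
  assert(classifyInline("a", "") == InlineGC::Ok);
}
```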
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45351 91177308-0d34-0410-b5e6-96231b3b80d8
function with GC.
This will catch the error when the inliner inlines a function with
GC into a caller with no GC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45350 91177308-0d34-0410-b5e6-96231b3b80d8
as on functions. Make it verify invokes and not just
ordinary calls. As a (desired) side-effect, it is no
longer legal to have call attributes on arguments that
are being passed to the varargs part of a varargs
function (llvm-as drops them on the floor anyway).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45286 91177308-0d34-0410-b5e6-96231b3b80d8
return attributes on the floor. In the case of a call
to a varargs function where the varargs arguments are
being removed, any call attributes on those arguments
need to be dropped. I didn't do this because I plan to
make it illegal to have such attributes (see next patch).
With this change, compiling the gcc filter2 eh test at -O0
and then running opt -std-compile-opts on it results in
a correctly working program (compiling at -O1 or higher
results in the test failing due to a problem with how we
output eh info into the IR).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45285 91177308-0d34-0410-b5e6-96231b3b80d8
calls 'nounwind'. It is important for correct C++
exception handling that nounwind markings do not get
lost, so this transformation is actually needed for
correctness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45218 91177308-0d34-0410-b5e6-96231b3b80d8
how to lower them (with no attempt made to be
efficient, since they should only occur for
unoptimized code).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45108 91177308-0d34-0410-b5e6-96231b3b80d8
calls. Remove special casing of inline asm from the
inliner. There is a potential problem: the verifier
rejects invokes of inline asm (not sure why). If an
asm call is not marked "nounwind" in some .ll, and
instcombine is not run, but the inliner is run, then
an illegal module will be created. This is bad but
I'm not sure what the best approach is. I'm tempted
to remove the check in the verifier...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45073 91177308-0d34-0410-b5e6-96231b3b80d8
endianness of the target not of the host. Done by the
simple expedient of reversing bytes for primitive types
if the host and target endianness don't match. This is
correct for integer and pointer types. I don't know if
it is correct for floating point types.
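The byte-reversal approach can be sketched as follows. The function name and the two boolean flags are assumptions made for the illustration; the source bytes are taken to be in host order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy Size bytes of a primitive value, held in host byte order at Src,
// into Dst in the target's byte order. The bytes are reversed only when
// host and target endianness differ.
inline void storeInTargetOrder(const void *Src, unsigned char *Dst,
                               std::size_t Size, bool HostIsLittle,
                               bool TargetIsLittle) {
  assert(Src && Dst);
  std::memcpy(Dst, Src, Size);
  if (HostIsLittle != TargetIsLittle)
    std::reverse(Dst, Dst + Size);
}
```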
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45039 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAG::getConstant, in the same way as vector floating-point
constants. This allows the legalize expansion code for @llvm.ctpop and
friends to be usable with vector types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44954 91177308-0d34-0410-b5e6-96231b3b80d8
2. Using zero-extended value of Scale and unsigned division is safe provided
that Scale doesn't have the sign bit set.
Previously these 2 instructions:
%p = bitcast [100 x {i8,i8,i8}]* %x to i8*
%q = getelementptr i8* %p, i32 -4
were combined into:
%q = getelementptr [100 x { i8, i8, i8 }]* %x, i32 0,
i32 1431655764, i32 0
which was incorrect.
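The bogus index 1431655764 is what falls out when the offset -4 is read as an unsigned 32-bit number and divided by the 3-byte size of { i8, i8, i8 }. A small sketch of that arithmetic, with hypothetical helper names:

```cpp
#include <cassert>
#include <cstdint>

// Divide the byte offset by the element size after reinterpreting the
// offset as unsigned, which is what goes wrong for negative offsets.
inline uint32_t indexViaUnsignedDiv(int32_t Offset, uint32_t Scale) {
  assert(Scale > 0);
  return static_cast<uint32_t>(Offset) / Scale;
}

// The offset maps to a whole number of elements only if the signed
// remainder is zero.
inline bool isWholeElementOffset(int32_t Offset, int32_t Scale) {
  assert(Scale > 0);
  return Offset % Scale == 0;
}

inline void gepScaleExample() {
  assert(indexViaUnsignedDiv(-4, 3) == 1431655764u); // index from the bad GEP
  assert(!isWholeElementOffset(-4, 3)); // -4 is not a multiple of 3
  assert(isWholeElementOffset(-6, 3));  // -6 is exactly two elements back
}
```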
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44936 91177308-0d34-0410-b5e6-96231b3b80d8
regions of memory that have a target specific relationship, as described in the
Embedded C Technical Report.
This also implements the 2007-12-11-AddressSpaces test,
which demonstrates how address space attributes can be used in LLVM IR.
In addition, this patch changes the bitcode signature for stores (in a backwards
compatible manner), such that the pointer type, rather than the pointee type, is
encoded. This permits type information in the pointer (e.g. address space) to be
preserved for stores.
LangRef updates are forthcoming.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44858 91177308-0d34-0410-b5e6-96231b3b80d8
possible before resorting to pextrw and pinsrw.
- Better codegen for v4i32 shuffles masquerading as v8i16 or v16i8 shuffles.
- Improves (i16 extract_vector_element 0) codegen by recognizing
(i32 extract_vector_element 0) does not require a pextrw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44836 91177308-0d34-0410-b5e6-96231b3b80d8
Thompson. Usage should be something like this:
open Llvm
open Llvm_bitreader
match read_bitcode_file fn with
| Bitreader_failure msg ->
prerr_endline msg
| Bitreader_success m ->
...;
dispose_module m
Compile with: ocamlc llvm.cma llvm_bitreader.cma
ocamlopt llvm.cmxa llvm_bitreader.cmxa
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44824 91177308-0d34-0410-b5e6-96231b3b80d8
Reimplement the xform in Analysis/ConstantFolding.cpp where we can use
targetdata to validate that it is safe. While I'm in there, fix some const
correctness issues and generalize the interface to the "operand folder".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44817 91177308-0d34-0410-b5e6-96231b3b80d8
using the minimum possible number of bytes. For little
endian targets run on little endian machines, apints are
stored in memory from LSB to MSB as before. For big endian
targets on big endian machines they are stored from MSB to
LSB which wasn't always the case before (if the target and
host endianness don't match, values are stored according
to the host's endianness). Doing this requires knowing the
endianness of the host, which is determined when configuring -
thanks go to Anton for this. Only having access to little
endian machines I was unable to properly test the big endian
part, which is also the most complicated...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44796 91177308-0d34-0410-b5e6-96231b3b80d8
methods are new to Function:
bool hasCollector() const;
const std::string &getCollector() const;
void setCollector(const std::string &);
void clearCollector();
The assembly representation is as such:
define void @f() gc "shadow-stack" { ...
The implementation uses an on-the-side table to map Functions to
collector names, such that there is no overhead. A StringPool is
further used to unique collector names, which are extremely
likely to be unique per process.
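One way to picture the on-the-side table is sketched below. This is not the LLVM implementation: Fn is a stand-in for Function, a std::set plays the role of the StringPool, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

struct Fn {}; // hypothetical stand-in for llvm::Function

class CollectorTable {
  std::set<std::string> Pool;                      // uniqued names
  std::map<const Fn *, const std::string *> Names; // side table
public:
  bool hasCollector(const Fn *F) const { return Names.count(F) != 0; }
  const std::string &getCollector(const Fn *F) const {
    assert(hasCollector(F));
    return *Names.at(F);
  }
  void setCollector(const Fn *F, const std::string &Name) {
    // std::set never moves its elements, so the pointer stays valid.
    Names[F] = &*Pool.insert(Name).first;
  }
  void clearCollector(const Fn *F) { Names.erase(F); }
  std::size_t poolSize() const { return Pool.size(); }
};
```

Functions without a collector never appear in the table, so they carry no per-function storage in this sketch.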
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44769 91177308-0d34-0410-b5e6-96231b3b80d8
_foo:
movl $12, %eax
andl 4(%esp), %eax
movl _array(%eax), %eax
ret
instead of:
_foo:
movl 4(%esp), %eax
shrl $2, %eax
andl $3, %eax
movl _array(,%eax,4), %eax
ret
As it turns out, this triggers all the time, in a wide variety of
situations, for example, I see diffs like this in various programs:
- movl 8(%eax), %eax
- shll $2, %eax
- andl $1020, %eax
- movl (%esi,%eax), %eax
+ movzbl 8(%eax), %eax
+ movl (%esi,%eax,4), %eax
- shll $2, %edx
- andl $1020, %edx
- movl (%edi,%edx), %edx
+ andl $255, %edx
+ movl (%edi,%edx,4), %edx
Unfortunately, I also see stuff like this, which can be fixed in the
X86 backend:
- andl $85, %ebx
- addl _bit_count(,%ebx,4), %ebp
+ shll $2, %ebx
+ andl $340, %ebx
+ addl _bit_count(%ebx), %ebp
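The sequences above rely on shift-and-mask identities that can be checked directly. The sketch below states them on 32-bit unsigned values; the function names are invented for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// First example: ((x >> 2) & 3) scaled by 4 equals x & 12.
inline uint32_t shiftMaskScale(uint32_t X) { return ((X >> 2) & 3u) * 4u; }
inline uint32_t maskOnly(uint32_t X) { return X & 12u; }

// Diff examples: (x << 2) & 1020 equals (x & 255) scaled by 4, and in
// the other direction (x & 85) scaled by 4 equals (x << 2) & 340.
inline uint32_t shiftThenMask(uint32_t X, uint32_t Mask) {
  return (X << 2) & (Mask << 2);
}
inline uint32_t maskThenScale(uint32_t X, uint32_t Mask) {
  return (X & Mask) * 4u;
}

inline void shiftMaskExample() {
  for (uint32_t X = 0; X < 4096; ++X) {
    assert(shiftMaskScale(X) == maskOnly(X));
    assert(shiftThenMask(X, 255u) == maskThenScale(X, 255u));
    assert(shiftThenMask(X, 85u) == maskThenScale(X, 85u));
  }
}
```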
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44656 91177308-0d34-0410-b5e6-96231b3b80d8
the function type, instead they belong to functions
and function calls. This is an updated and slightly
corrected version of Reid Spencer's original patch.
The only known problem is that auto-upgrading of
bitcode files doesn't seem to work properly (see
test/Bitcode/AutoUpgradeIntrinsics.ll). Hopefully
a bitcode guru (who might that be? :) ) will fix it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44359 91177308-0d34-0410-b5e6-96231b3b80d8
optimized. This avoids creating illegal divisions when the combiner is
running after legalize; this fixes PR1815. Also, it produces better
code in the included testcase by avoiding the subtract and multiply
when the division isn't optimized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44341 91177308-0d34-0410-b5e6-96231b3b80d8