file to become corrupted due to interactions between mmap'd memory segments
and file descriptors closing. The problem is completely avoiding by using
a third temporary file.
Patch provided by Evan Jones
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24527 91177308-0d34-0410-b5e6-96231b3b80d8
generates it. Make MVT::Vector expand-only, and remove the code in
Legalize that attempts to legalize it.
The plan for supporting N x Type is to continually epxand it in ExpandOp
until it gets down to 2 x Type, where it will be scalarized into a pair of
scalars.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24482 91177308-0d34-0410-b5e6-96231b3b80d8
packed types with an element count of 1, although more generic support is
coming. This allows LLVM to turn the following code:
void %foo(<1 x float> * %a) {
entry:
%tmp1 = load <1 x float> * %a;
%tmp2 = add <1 x float> %tmp1, %tmp1
store <1 x float> %tmp2, <1 x float> *%a
ret void
}
Into:
_foo:
lfs f0, 0(r3)
fadds f0, f0, f0
stfs f0, 0(r3)
blr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24416 91177308-0d34-0410-b5e6-96231b3b80d8
(our graphs apparently already fit within the cache on my g5). In any case
this reduces memory usage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24249 91177308-0d34-0410-b5e6-96231b3b80d8
set and eliminating the need to iterate whenever something is removed (which
can be really slow in some cases). Thx to Jim for pointing out something silly
I was getting stuck on. :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24241 91177308-0d34-0410-b5e6-96231b3b80d8
alignment information appropriately. Includes code for PowerPC to support
fixed-size allocas with alignment larger than the stack. Support for
arbitrarily aligned dynamic allocas coming soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24224 91177308-0d34-0410-b5e6-96231b3b80d8
a special case hack for X86, make the hack more general: if an incoming argument
register is not used in any block other than the entry block, don't copy it to
a vreg. This helps us compile code like this:
%struct.foo = type { int, int, [0 x ubyte] }
int %test(%struct.foo* %X) {
%tmp1 = getelementptr %struct.foo* %X, int 0, uint 2, int 100
%tmp = load ubyte* %tmp1 ; <ubyte> [#uses=1]
%tmp2 = cast ubyte %tmp to int ; <int> [#uses=1]
ret int %tmp2
}
to:
_test:
lbz r3, 108(r3)
blr
instead of:
_test:
lbz r2, 108(r3)
or r3, r2, r2
blr
The (dead) copy emitted to copy r3 into a vreg for extra-block uses was
increasing the live range of r3 past the load, preventing the coallescing.
This implements CodeGen/PowerPC/reg-coallesce-simple.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24115 91177308-0d34-0410-b5e6-96231b3b80d8
generating results in vregs that will need them. In the case of something
like this: CopyToReg((add X, Y), reg1024), we no longer emit code like
this:
reg1025 = add X, Y
reg1024 = reg 1025
Instead, we emit:
reg1024 = add X, Y
Whoa! :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@24111 91177308-0d34-0410-b5e6-96231b3b80d8
FP_TO_SINT is preferred to a larger FP_TO_UINT. This seems to be begging
for a TLI.isOperationCustom() helper function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23992 91177308-0d34-0410-b5e6-96231b3b80d8
the input is that type, this caused a failure on gs on X86 last night.
Move the hard checks into Build[US]Div since that is where decisions like
this should be made.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23881 91177308-0d34-0410-b5e6-96231b3b80d8
Add a new flag to TargetLowering indicating if the target has really cheap
signed division by powers of two, make ppc use it. This will probably go
away in the future.
Implement some more ISD::SDIV folds in the dag combiner
Remove now dead code in the x86 backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23853 91177308-0d34-0410-b5e6-96231b3b80d8
for types that aren't legal, and fail a divisor is less than zero
comparison, which would cause us to drop a subtract.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23846 91177308-0d34-0410-b5e6-96231b3b80d8
that the nodes can be folded with other nodes, and we can not duplicate
code in every backend. Alpha will probably want this too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23835 91177308-0d34-0410-b5e6-96231b3b80d8
allows us to lower legal return types to something else, to meet ABI
requirements (such as that i64 be returned in two i32 regs on Darwin/ppc).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23802 91177308-0d34-0410-b5e6-96231b3b80d8
a lot throughout many programs. In particular, specfp triggers it a bunch for
constant FP nodes when you have code like cond ? 1.0 : -1.0.
If the PPC ISel exposed the loads implicit in pic references to external globals,
we would be able to eliminate a load in cases like this as well:
%X = external global int
%Y = external global int
int* %test4(bool %C) {
%G = select bool %C, int* %X, int* %Y
ret int* %G
}
Note that this breaks things that use SrcValue's (see the fixme), but since nothing
uses them yet, this is ok.
Also, simplify some code to use hasOneUse() on an SDOperand instead of hasNUsesOfValue directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23781 91177308-0d34-0410-b5e6-96231b3b80d8
you could be AND'ing with the result of a shift that shifts out all the
bits you care about, in addition to a constant.
Also, move over an add/sub_parts fold from legalize to the dag combiner,
where it works for things other than constants. Woot!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23720 91177308-0d34-0410-b5e6-96231b3b80d8
out, where after the first CombineTo() call, the node the second CombineTo
wishes to replace may no longer exist.
Fix a very real bug with the truncated load optimization on little endian
targets, which do not need a byte offset added to the load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23704 91177308-0d34-0410-b5e6-96231b3b80d8
like turning:
_foo:
fctiwz f0, f1
stfd f0, -8(r1)
lwz r2, -4(r1)
rlwinm r3, r2, 0, 16, 31
blr
into
_foo:
fctiwz f0,f1
stfd f0,-8(r1)
lhz r3,-2(r1)
blr
Also removed an unncessary constraint from sra -> srl conversion, which
should take care of hte only reason we would ever need to handle sra in
MaskedValueIsZero, AFAIK.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23703 91177308-0d34-0410-b5e6-96231b3b80d8
location, replace them with a new store of the last value. This occurs
in the same neighborhood in 197.parser, speeding it up about 1.5%
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23691 91177308-0d34-0410-b5e6-96231b3b80d8
multiple results.
Use this support to implement trivial store->load forwarding, implementing
CodeGen/PowerPC/store-load-fwd.ll. Though this is the most simple case and
can be extended in the future, it is still useful. For example, it speeds
up 197.parser by 6.2% by avoiding an LSU reject in xalloc:
stw r6, lo16(l5_end_of_array)(r2)
addi r2, r5, -4
stwx r5, r4, r2
- lwzx r5, r4, r2
- rlwinm r5, r5, 0, 0, 30
stwx r5, r4, r2
lwz r2, -4(r4)
ori r2, r2, 1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23690 91177308-0d34-0410-b5e6-96231b3b80d8
creating a new vreg and inserting a copy: just use the input vreg directly.
This speeds up the compile (e.g. about 5% on mesa with a debug build of llc)
by not adding a bunch of copies and vregs to be coallesced away. On mesa,
for example, this reduces the number of intervals from 168601 to 129040
going into the coallescer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@23671 91177308-0d34-0410-b5e6-96231b3b80d8