argument that had to be between 0 and 7 to have any value,
firing an assert later in the AsmPrinter. Now, the
disassembler rejects instructions with out-of-range values
for that immediate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100694 91177308-0d34-0410-b5e6-96231b3b80d8
solution. The only reason these don't fire with gcc-4.2 is that gcc turns off
part of -Wsign-compare in C++ on accident.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100581 91177308-0d34-0410-b5e6-96231b3b80d8
When a frame pointer is not otherwise required, and dynamic stack alignment
is necessary solely due to the spilling of a register with larger alignment
requirements than the default stack alignment, the frame pointer can be both
used as a general purpose register and a frame pointer. That goes poorly, for
obvious reasons. This patch brings back a bit of old logic for identifying
the use of such registers and conservatively reserves the frame pointer
during register allocation in such cases.
For now, implement for X86 only since it's 32-bit linux which is hitting this,
and we want a targeted fix for 2.7. As a follow-on, this will be expanded
to handle other targets, as theoretically the problem could arise elsewhere
as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100559 91177308-0d34-0410-b5e6-96231b3b80d8
uint32_t insn;
MemoryObject.readBytes(Address, 4, (uint8_t*)&insn, NULL)
to read 4 bytes of memory contents into a 32-bit uint variable. This leaves the
interpretation of byte order up to the host machine and causes PPC test cases of
arm-tests, neon-tests, and thumb-tests to fail. Fixed to use a byte array for
reading the memory contents and shift the bytes into place for the 32-bit uint
variable in the ARM case and 16-bit halfword in the Thumb case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100403 91177308-0d34-0410-b5e6-96231b3b80d8
When a target instruction wants to set target-specific flags, it should simply
set bits in the TSFlags bit vector defined in the Instruction TableGen class.
This works well because TableGen resolves member references late:
class I : Instruction {
AddrMode AM = AddrModeNone;
let TSFlags{3-0} = AM.Value;
}
let AM = AddrMode4 in
def ADD : I;
TSFlags gets the expected bits from AddrMode4 in this example.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100384 91177308-0d34-0410-b5e6-96231b3b80d8
which is really a property of the section being referenced.
Add a predicate to MCSection to replace it.
Yay for reduction in magic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100367 91177308-0d34-0410-b5e6-96231b3b80d8
"asm printering" happens through MCStreamer. This also
Streamerizes PIC16 debug info, which escaped my attention.
This removes a leak from LLVMTargetMachine of the 'legacy'
output stream.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100327 91177308-0d34-0410-b5e6-96231b3b80d8
raw_ostream to print an instruction to had to be specified
at MCInstPrinter construction time instead of being able
to pick at each call to printInstruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100307 91177308-0d34-0410-b5e6-96231b3b80d8
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100304 91177308-0d34-0410-b5e6-96231b3b80d8
abstraction it brings. And also get rid of the atexit() handler, it does not
belong in the lib directory. :-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100265 91177308-0d34-0410-b5e6-96231b3b80d8
backend (ARMDecoderEmitter) which emits the decoder functions for ARM and Thumb,
and the disassembler core which invokes the decoder function and builds up the
MCInst based on the decoded Opcode.
Reviewed by Chris Latter and Bob Wilson.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100233 91177308-0d34-0410-b5e6-96231b3b80d8
a new subtarget option for AES and check for the support. Add "westmere"
line of processors and add AES-NI support to the core i7.
Add a couple of TODOs for information I couldn't verify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100231 91177308-0d34-0410-b5e6-96231b3b80d8
return an error status in all failure cases, printing
messages to debugs() only when debugging is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100229 91177308-0d34-0410-b5e6-96231b3b80d8
representation. This eliminates the 'DILocation' MDNodes for
file/line/col tuples from -O0 -g codegen.
This remove the old DebugLoc class, making it a typedef for DebugLoc,
I'll rename NewDebugLoc next.
I didn't update the JIT to use the new apis, so it will continue to
work, but be as slow as before. Someone should eventually do this
or, better yet, rip out the JIT debug info stuff and build the JIT
on top of MC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100209 91177308-0d34-0410-b5e6-96231b3b80d8
Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset,
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100191 91177308-0d34-0410-b5e6-96231b3b80d8
in particular, they end up aligning strings at 16-byte boundaries, and
there's no way for GlobalOpt to check OptForSize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100172 91177308-0d34-0410-b5e6-96231b3b80d8
- Do not try to infer GV alignment unless its type is sized. It's not possible to infer alignment if it has opaque type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100118 91177308-0d34-0410-b5e6-96231b3b80d8
1. Makes it possible to lower with floating point loads and stores.
2. Avoid unaligned loads / stores unless it's fast.
3. Fix some memcpy lowering logic bug related to when to optimize a
load from constant string into a constant.
4. Adjust x86 memcpy lowering threshold to make it more sane.
5. Fix x86 target hook so it uses vector and floating point memory
ops more effectively.
rdar://7774704
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100090 91177308-0d34-0410-b5e6-96231b3b80d8
aes instead of sse4.2. Add a brief todo for a subtarget flag and rework
the aeskeygenassist instruction to more closely match the docs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100078 91177308-0d34-0410-b5e6-96231b3b80d8
SSEDomainFix will collapse to the domain with the lower number when it has a
choice. The SSEPackedSingle domain often has smaller instructions, so prefer
that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99952 91177308-0d34-0410-b5e6-96231b3b80d8
e.g., llvm.memcpy.i32(i8*, i8*, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)
A update of langref will occur in a subsequent checkin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99928 91177308-0d34-0410-b5e6-96231b3b80d8
Rewrite the pmulld patterns, and make sure that they fold in loads of
arguments into the instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99910 91177308-0d34-0410-b5e6-96231b3b80d8
create symbols. It is extremely error prone and a source of a lot
of the remaining integrated assembler bugs on x86-64.
This fixes rdar://7807601.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99902 91177308-0d34-0410-b5e6-96231b3b80d8
makes calls a little bit more consistent and allows easy removal of the
specializations in the future. Convert all callers to the templated functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99838 91177308-0d34-0410-b5e6-96231b3b80d8
Most of these were unused, some of them were wrong and unused (isS16Constant<short>,
isS10Constant<short>).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99827 91177308-0d34-0410-b5e6-96231b3b80d8
"the bigstack patch for SPU, with testcase. It is essentially the patch committed as 97091, and reverted as 97099, but with the following additions:
-in vararg handling, registers are marked to be live, to not confuse the register scavenger
-function prologue and epilogue are not emitted, if the stack size is 16. 16 means it is empty - there is only the register scavenger emergency spill slot, which is not used as there is no stack."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99819 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions use byte index in a control vector (M:Vm) to lookup byte
values in a table and generate a new vector (D:Vd). The table is specified via
a list of vectors, which can be:
{Dn}
{Dn D<n+1>}
{Dn D<n+1> D<n+2>}
{Dn D<n+1> D<n+2> D<n+3>}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99789 91177308-0d34-0410-b5e6-96231b3b80d8
input to be v8i8 or v16i8, which buildvectors get canonicalized to.
This allows the patterns that were previously using a bare 'vnot' to
match, before they couldn't.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99754 91177308-0d34-0410-b5e6-96231b3b80d8
it as the format for the appropriate N3V*SL*<> classes. These instructions
require special handling of the M:Vm field which encodes the restricted Dm and
the lane index within Dm.
Examples are A8.6.325 VMLA, VMLAL, VMLS, VMLSL (by scalar):
vmlal.s32 q3, d2, d10[0]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99690 91177308-0d34-0410-b5e6-96231b3b80d8
through to the generic version. The generic functions use STR/LDR, but T2
needs the t2STR/t2LDR instead so we get the addressing mode correct.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99678 91177308-0d34-0410-b5e6-96231b3b80d8
to now take a format argument. N3VDInt<> and N3VQInt<> are modified to take a
format argument as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99676 91177308-0d34-0410-b5e6-96231b3b80d8
to encode the byte location of the extracted result in the concatenation of the
operands, from the least significant end.
Modify VEXTd and VEXTq classes to use the format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99659 91177308-0d34-0410-b5e6-96231b3b80d8
follow the N3RegFrm's operand order of D:Vd N:Vn M:Vm. The operand order of
N3RegVShFrm is D:Vd M:Vm N:Vn (notice that M:Vm is the first src operand).
Add a parent class N3Vf which requires passing a Format argument and which the
N3V class is modified to inherit from. N3V class represents the "normal"
3-Register NEON Instructions with N3RegFrm.
Also add a multiclass N3VSh_QHSD to represent clusters of NEON 3-Register Shift
Instructions and replace 8 invocations with it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99655 91177308-0d34-0410-b5e6-96231b3b80d8
Examples are VABA (Vector Absolute Difference and Accumulate), VABAL (Vector
Absolute Difference and Accumulate Long), and VABD (Vector Absolute Difference).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99628 91177308-0d34-0410-b5e6-96231b3b80d8
dispatch to the appropriate routines to handle the different interpretations of
the shift amount encoded in the imm6 field. The Vd, Vm fields are interpreted
the same between the two, though.
See, for example, A8.6.367 VQSHL, VQSHLU (immediate) for N2RegVShLFrm format and
A8.6.368 VQSHRN, VQSHRUN for N2RegVShRFrm format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99590 91177308-0d34-0410-b5e6-96231b3b80d8
Remove much horribleness from X86InstrFormats as a result. Similar
simplifications are probably possible for other targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99539 91177308-0d34-0410-b5e6-96231b3b80d8
On Nehalem and newer CPUs there is a 2 cycle latency penalty on using a register
in a different domain than where it was defined. Some instructions have
equvivalents for different domains, like por/orps/orpd.
The SSEDomainFix pass tries to minimize the number of domain crossings by
changing between equvivalent opcodes where possible.
This is a work in progress, in particular the pass doesn't do anything yet. SSE
instructions are tagged with their execution domain in TableGen using the last
two bits of TSFlags. Note that not all instructions are tagged correctly. Life
just isn't that simple.
The SSE execution domain issue is very similar to the ARM NEON/VFP pipeline
issue handled by NEONMoveFixPass. This pass may become target independent to
handle both.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99524 91177308-0d34-0410-b5e6-96231b3b80d8
handles dead implicit results more aggressively. More
to come, I think this is now just a data entry problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99486 91177308-0d34-0410-b5e6-96231b3b80d8
addl $12, %esp
popl %esi
popl %edi
popl %ebx
popl %ebp
jmpl *__Block_deallocator-L1$pb(%esi) # TAILCALL
The problem is the global base register is assigned GR32 register class. TCRETURNmi needs the registers making up the address mode to have the GR32_TC register class.
The *proper* fix is for X86DAGToDAGISel::getGlobalBaseReg() to return a copy from the global base register of the machine function rather than returning the register itself. But that has the potential of causing it to be coalesced to a more restrictive register class: GR32_TC. It can introduce additional copies and spills. For something as important the PIC base, it's not worth it especially since this is not an issue on 64-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99455 91177308-0d34-0410-b5e6-96231b3b80d8
--- Reverse-merging r99440 into '.':
U test/MC/AsmParser/X86/x86_32-bit_cat.s
U test/MC/AsmParser/X86/x86_32-encoding.s
U include/llvm/IntrinsicsX86.td
U include/llvm/CodeGen/SelectionDAGNodes.h
U lib/Target/X86/X86InstrSSE.td
U lib/Target/X86/X86ISelLowering.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99450 91177308-0d34-0410-b5e6-96231b3b80d8
not get an "Unknown immediate size" assert failure when used. All instructions
of this form have an 8-bit immediate. Also added a test case of an example
instruction that is of this form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@99435 91177308-0d34-0410-b5e6-96231b3b80d8