llvm-6502/lib/Target
Evan Cheng d51425a82d Use movaps / movapd (instead of movss / movsd) to do FR32 / FR64 reg to reg
transfer.

According to the Intel P4 Optimization Manual:

Moves that write a portion of a register can introduce unwanted
dependences. The movsd reg, reg instruction writes only the bottom
64 bits of a register, not to all 128 bits. This introduces a dependence on
the preceding instruction that produces the upper 64 bits (even if those
bits are no longer wanted). The dependence inhibits register renaming,
and thereby reduces parallelism.

Not to mention movaps is shorter than movss.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26226 91177308-0d34-0410-b5e6-96231b3b80d8
2006-02-16 01:50:02 +00:00
..
Alpha improved zap discovery 2006-02-13 18:52:29 +00:00
CBackend Another hack due to allowing multiple symbols with the same name. 2006-02-13 22:22:42 +00:00
IA64 Switch targets over to using SelectionDAG::getCALLSEQ_START to create 2006-02-13 09:00:43 +00:00
PowerPC If we have zero initialized data with external linkage, use .zerofill to 2006-02-14 22:18:23 +00:00
Skeleton
Sparc Sparc actually *DOES* have a directive for emitting zeros. In fact, it requires 2006-02-15 07:07:14 +00:00
SparcV8 Remove the SparcV8 backend. It has been renamed to be the Sparc backend. 2006-02-05 06:33:29 +00:00
SparcV9 Adjust to MachineConstantPool interface change: instead of keeping a 2006-02-09 04:46:04 +00:00
X86 Use movaps / movapd (instead of movss / movsd) to do FR32 / FR64 reg to reg 2006-02-16 01:50:02 +00:00
Makefile
MRegisterInfo.cpp Finegrainify namespacification 2006-02-01 18:10:56 +00:00
README.txt Remove an entry. 2006-02-15 22:14:34 +00:00
SubtargetFeature.cpp
Target.td
TargetData.cpp
TargetFrameInfo.cpp
TargetInstrInfo.cpp
TargetMachine.cpp
TargetMachineRegistry.cpp
TargetSchedInfo.cpp
TargetSchedule.td
TargetSelectionDAG.td Targets all now request ConstantFP to be legalized into TargetConstantFP. 2006-01-29 06:26:08 +00:00
TargetSubtarget.cpp

Target Independent Opportunities:

===-------------------------------------------------------------------------===

FreeBench/mason contains code like this:

static p_type m0u(p_type p) {
  int m[]={0, 8, 1, 2, 16, 5, 13, 7, 14, 9, 3, 4, 11, 12, 15, 10, 17, 6};
  p_type pu;
  pu.a = m[p.a];
  pu.b = m[p.b];
  pu.c = m[p.c];
  return pu;
}

We currently compile this into a memcpy from a static array into 'm', then
a bunch of loads from m.  It would be better to avoid the memcpy and just do
loads from the static array.
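
A minimal sketch of the desired result, written by hand at the source level:
with the table declared static const, every lookup can be a direct load from
read-only data and no per-call copy is needed. The p_type layout and the names
m0u_table and m0u_direct are assumptions for illustration; the real
FreeBench/mason definitions are not shown here.

```c
/* Hypothetical stand-in for mason's p_type; the real definition may differ. */
typedef struct { int a, b, c; } p_type;

/* Same table as in m0u, but static const: nothing is copied onto the stack. */
static const int m0u_table[] = {0, 8, 1, 2, 16, 5, 13, 7, 14,
                                9, 3, 4, 11, 12, 15, 10, 17, 6};

/* Each field is remapped with a direct load from the static table. */
static p_type m0u_direct(p_type p) {
  p_type pu;
  pu.a = m0u_table[p.a];
  pu.b = m0u_table[p.b];
  pu.c = m0u_table[p.c];
  return pu;
}
```

The optimizer should be able to reach the same code from the original source,
since 'm' is never written after it is initialized.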

===-------------------------------------------------------------------------===

Get the C front-end to expand hypot(x,y) -> llvm.sqrt(x*x+y*y) when errno and
precision don't matter (-ffast-math).  Misc/mandel will like this. :)

//===---------------------------------------------------------------------===//

Solve this DAG isel folding deficiency:

int X, Y;

void fn1(void)
{
  X = X | (Y << 3);
}

compiles to

fn1:
	movl Y, %eax
	shll $3, %eax
	orl X, %eax
	movl %eax, X
	ret

The problem is the store's chain operand is not the load X but rather
a TokenFactor of the load X and load Y, which prevents the folding.

There are two ways to fix this:

1. The DAG combiner can start using alias analysis to realize that X and Y
   don't alias, making the store to X not dependent on the load from Y.
2. The generated isel could be made smarter in the case it can't
   disambiguate the pointers.

Number 1 is the preferred solution.

//===---------------------------------------------------------------------===//

DAG combine this into mul A, 8:

int %test(int %A) {
  %B = mul int %A, 8  ;; shift
  %C = add int %B, 7  ;; dead, no demanded bits.
  %D = and int %C, -8 ;; dead once add is gone.
  ret int %D
}

This sort of thing occurs in the alloca lowering code and other places that
generate alignment code for an already aligned value.
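
The identity behind this combine can be checked with a short C sketch. It
assumes 32-bit unsigned arithmetic so that wraparound is well defined, and the
function names are hypothetical.

```c
#include <stdint.h>

/* The mul/add/and sequence from the example above. */
static uint32_t align_after_mul(uint32_t a) {
  uint32_t b = a * 8u;      /* low three bits are always zero */
  uint32_t c = b + 7u;      /* only fills the low three bits, no carry out */
  return c & ~UINT32_C(7);  /* same mask as -8: clears the low three bits */
}

/* What the combiner should produce instead. */
static uint32_t just_mul(uint32_t a) { return a * 8u; }

/* Returns 1 if both forms agree for every input in [lo, hi]. */
static int forms_agree(uint32_t lo, uint32_t hi) {
  for (uint32_t a = lo;; ++a) {
    if (align_after_mul(a) != just_mul(a)) return 0;
    if (a == hi) break;
  }
  return 1;
}
```

Because the multiply leaves the low three bits zero, the add cannot carry into
the higher bits, and the mask then removes exactly what the add put in.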