llvm-6502/lib/Target
Evan Cheng d51425a82d Use movaps / movapd (instead of movss / movsd) to do FR32 / FR64 reg to reg
transfer.

According to the Intel P4 Optimization Manual:

Moves that write a portion of a register can introduce unwanted
dependences. The movsd reg, reg instruction writes only the bottom
64 bits of a register, not to all 128 bits. This introduces a dependence on
the preceding instruction that produces the upper 64 bits (even if those
bits are not longer wanted). The dependence inhibits register renaming,
and thereby reduces parallelism.

Not to mention movaps is shorter than movss.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@26226 91177308-0d34-0410-b5e6-96231b3b80d8
2006-02-16 01:50:02 +00:00
..
Alpha improved zap discovery 2006-02-13 18:52:29 +00:00
CBackend Another hack due to allowing multiple symbols with the same name. 2006-02-13 22:22:42 +00:00
IA64 Switch targets over to using SelectionDAG::getCALLSEQ_START to create 2006-02-13 09:00:43 +00:00
PowerPC If we have zero initialized data with external linkage, use .zerofill to 2006-02-14 22:18:23 +00:00
Skeleton PHI and INLINEASM are now built-in instructions provided by Target.td 2006-01-27 01:46:15 +00:00
Sparc Sparc actually *DOES* have a directive for emitting zeros. In fact, it requires 2006-02-15 07:07:14 +00:00
SparcV8 Remove the SparcV8 backend. It has been renamed to be the Sparc backend. 2006-02-05 06:33:29 +00:00
SparcV9 Adjust to MachineConstantPool interface change: instead of keeping a 2006-02-09 04:46:04 +00:00
X86 Use movaps / movapd (instead of movss / movsd) to do FR32 / FR64 reg to reg 2006-02-16 01:50:02 +00:00
Makefile DONT_BUILD_RELINKED is gone and implied by BUILD_ARCHIVE now 2005-10-24 02:26:13 +00:00
MRegisterInfo.cpp Finegrainify namespacification 2006-02-01 18:10:56 +00:00
README.txt Remove an entry. 2006-02-15 22:14:34 +00:00
SubtargetFeature.cpp Improve compatibility with VC2005, patch by Morten Ofstad! 2006-01-26 20:41:32 +00:00
Target.td Subtarget feature can now set any variable to any value 2006-01-27 08:09:42 +00:00
TargetData.cpp Implement a new InvalidateStructLayoutInfo method and add some comments 2006-01-14 00:07:34 +00:00
TargetFrameInfo.cpp Eliminate all remaining tabs and trailing spaces. 2005-07-27 06:12:32 +00:00
TargetInstrInfo.cpp Convert tabs to spaces 2005-04-22 17:54:37 +00:00
TargetMachine.cpp Remove the X86 and PowerPC Simple instruction selectors; their time has 2005-08-18 23:53:15 +00:00
TargetMachineRegistry.cpp 1. Use SubtargetFeatures in llc/lli. 2005-09-01 21:38:21 +00:00
TargetSchedInfo.cpp Convert tabs to spaces 2005-04-22 17:54:37 +00:00
TargetSchedule.td Add a default NoItinerary class for targets to use. 2006-01-27 01:41:38 +00:00
TargetSelectionDAG.td Targets all now request ConstantFP to be legalized into TargetConstantFP. 2006-01-29 06:26:08 +00:00
TargetSubtarget.cpp Eliminate all remaining tabs and trailing spaces. 2005-07-27 06:12:32 +00:00

Target Independent Opportunities:

===-------------------------------------------------------------------------===

FreeBench/mason contains code like this:

static p_type m0u(p_type p) {
  int m[]={0, 8, 1, 2, 16, 5, 13, 7, 14, 9, 3, 4, 11, 12, 15, 10, 17, 6};
  p_type pu;
  pu.a = m[p.a];
  pu.b = m[p.b];
  pu.c = m[p.c];
  return pu;
}

We currently compile this into a memcpy from a static array into 'm', then
a bunch of loads from m.  It would be better to avoid the memcpy and just do
loads from the static array.

===-------------------------------------------------------------------------===

Get the C front-end to expand hypot(x,y) -> llvm.sqrt(x*x+y*y) when errno and
precision don't matter (ffastmath).  Misc/mandel will like this. :)

//===---------------------------------------------------------------------===//

Solve this DAG isel folding deficiency:

int X, Y;

void fn1(void)
{
  X = X | (Y << 3);
}

compiles to

fn1:
	movl Y, %eax
	shll $3, %eax
	orl X, %eax
	movl %eax, X
	ret

The problem is the store's chain operand is not the load X but rather
a TokenFactor of the load X and load Y, which prevents the folding.

There are two ways to fix this:

1. The dag combiner can start using alias analysis to realize that y/x
   don't alias, making the store to X not dependent on the load from Y.
2. The generated isel could be made smarter in the case it can't
   disambiguate the pointers.

Number 1 is the preferred solution.

//===---------------------------------------------------------------------===//

DAG combine this into mul A, 8:

int %test(int %A) {
  %B = mul int %A, 8  ;; shift
  %C = add int %B, 7  ;; dead, no demanded bits.
  %D = and int %C, -8 ;; dead once add is gone.
  ret int %D
}

This sort of thing occurs in the alloca lowering code and other places that
are generating alignment of an already aligned value.