Chris Lattner b3f0673d52 Teach valuetracking that byval arguments with a specified alignment are
aligned.

Teach memcpyopt to not give up all hope when confonted with an underaligned
memcpy feeding an overaligned byval.  If the *source* of the memcpy can be
determined to be adequeately aligned, or if it can be forced to be, we can
eliminate the memcpy.

This addresses PR9794.  We now compile the example into:

define i32 @f(%struct.p* nocapture byval align 8 %q) nounwind ssp {
entry:
  %call = call i32 @g(%struct.p* byval align 8 %q) nounwind
  ret i32 %call
}

in both x86-64 and x86-32 mode.  We still don't get a tailcall though,
because tailcalls apparently can't handle byval.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131884 91177308-0d34-0410-b5e6-96231b3b80d8
2011-05-23 00:03:39 +00:00
..
2011-03-01 00:02:51 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n), however
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,

ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))

//===---------------------------------------------------------------------===//