llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-02-22 13:29:44 +00:00

Author	SHA1	Message	Date
Elena Demikhovsky	3d1ae71813	AVX-512: masked load/store + intrinsics for them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203790 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 12:05:52 +00:00
Hal Finkel	ab849adec4	[PowerPC] Initial support for the VSX instruction set VSX is an ISA extension supported on the POWER7 and later cores that enhances floating-point vector and scalar capabilities. Among other things, this adds <2 x double> support and generally helps to reduce register pressure. The interesting part of this ISA feature is the register configuration: there are 64 new 128-bit vector registers, the 32 of which are super-registers of the existing 32 scalar floating-point registers, and the second 32 of which overlap with the 32 Altivec vector registers. This makes things like vector insertion and extraction tricky: this can be free but only if we force a restriction to the right register subclass when needed. A new "minipass" PPCVSXCopy takes care of this (although it could do a more-optimal job of it; see the comment about unnecessary copies below). Please note that, currently, VSX is not enabled by default when targeting anything because it is not yet ready for that. The assembler and disassembler are fully implemented and tested. However: - CodeGen support causes miscompiles; test-suite runtime failures: MultiSource/Benchmarks/FreeBench/distray/distray MultiSource/Benchmarks/McCat/08-main/main MultiSource/Benchmarks/Olden/voronoi/voronoi MultiSource/Benchmarks/mafft/pairlocalalign MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4 SingleSource/Benchmarks/CoyoteBench/almabench SingleSource/Benchmarks/Misc/matmul_f64_4x4 - The lowering currently falls back to using Altivec instructions far more than it should. Worse, there are some things that are scalarized through the stack that shouldn't be. - A lot of unnecessary copies make it past the optimizers, and this needs to be fixed. - Many more regression tests are needed. 
Normally, I'd fix these things prior to committing, but there are some students and other contributors who would like to work on this, and so it makes sense to move this development process upstream where it can be subject to the regular code-review procedures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203768 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 07:58:58 +00:00
Saleem Abdulrasool	0ed4ef85a8	ARM: support emission of complex SO expressions Support to the IAS was added to actually parse and handle the complex SO expressions. However, the object file lowering was not updated to compensate for the fact that the shift operand may be an absolute expression. When trying to assemble to an object file, the lowering would fail while succeeding when emitting purely assembly. Add an appropriate test. The test case is inspired by the test case provided by Jiangning Liu who also brought the issue to light. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203762 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 07:02:41 +00:00
Saleem Abdulrasool	b0f12dfab6	Support: add support to identify WinCOFF/ARM objects Add the Windows COFF ARM object file magic. This enables the LLVM tools to interact with COFF object files for Windows on ARM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203761 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 07:02:35 +00:00
Karthik Bhat	178df3f1bb	Fix PR18800. The LLVM memcpy intrinsic takes 5 arguments: void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>, i32 <len>, i32 <align>, i1 <isvolatile>). The test case incorrectly used the old format, causing the isVolatile function in MemIntrinsic to crash during the SROA transformation. Modified the test case to use the correct signature of memcpy and memset. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203750 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 04:50:29 +00:00
NAKAMURA Takumi	2b5925ba74	llvm/test/BugPoint/compile-custom.ll.py: Make it py3-compatible. [PR19112] FIXME: Get rid of invoking this. I guess it wouldn't run on win32 due to lacking of shell support. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203740 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 00:10:37 +00:00
NAKAMURA Takumi	9a4d525b7d	decl-derived-member.ll: Try to unbreak. Don't add -mtriple to %llc_dwarf. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203732 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 23:08:19 +00:00
Rafael Espindola	eb8eef0b3f	This test needs the X86 backend, move it to the X86 sub directory. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203725 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 22:03:43 +00:00
Justin Bogner	efa9416a21	Back out Profile library and dependent commits Chandler voiced some concern with checking this in without some discussion first. Reverting for now. This reverts r203703, r203704, r203708, and r203709. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203723 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 22:00:57 +00:00
Michael Zolotukhin	4a0593ccd3	PR17473: Don't normalize an expression during postinc transformation unless it's invertible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203719 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 21:31:05 +00:00
Adam Nemet	a65ca9dcf0	[X86] Add peephole for masked rotate amount Extend what's currently done for shift because the HW performs this masking implicitly: (rotl:i32 x, (and y, 31)) -> (rotl:i32 x, y) I use the newly factored out multiclass that was only supporting shifts so far. For testing I extended my testcase for the new rotation idiom. <rdar://problem/15295856> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203718 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 21:20:55 +00:00
Rafael Espindola	9367c49f5d	Fix the ocaml test to not create an alias to a declaration. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203717 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 21:20:42 +00:00
Raul E. Silvera	230eda4bdf	Resubmit "[SLPV] Recognize vectorizable intrinsics during SLP vectorization ..." This reverts commit 86cb795388643710dab34941ddcb5a9470ac39d8. The problems previously found have been resolved through other CLs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203707 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 20:21:50 +00:00
Rafael Espindola	12a405757c	Add a triple to fix the test on OS X. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203706 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 20:21:35 +00:00
Rafael Espindola	38048cdb1c	Reject alias to undefined symbols in the verifier. On ELF and COFF an alias is just another name for a position in the file. There is no way to refer to a position in another file, so an alias to undefined is meaningless. MachO currently doesn't support aliases. The spec has a N_INDR, which when implemented will have a different set of restrictions. Adding support for it shouldn't be harder than any other IR extension. For now, having the IR represent what is actually possible with current tools makes it easier to fix the design of GlobalAlias. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203705 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 20:15:49 +00:00
Justin Bogner	4207c6759c	llvm-profdata: Use the Profile library, implement show and generate This replaces the llvm-profdata tool with a version that uses the recently introduced Profile library. The new tool has the ability to generate and summarize profdata files as well as merging them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203704 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 20:14:17 +00:00
Eric Christopher	7eb747e373	Fix two thinkos in testcase and remove XFAIL. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203702 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 20:12:02 +00:00
Roman Divacky	060c0eb1d2	Allow exclamation and tilde to be parsed as a part of the ppc asm operand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203699 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 19:25:57 +00:00
Eric Christopher	b2ff2c7efa	XFAIL this temporarily. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203698 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 19:06:03 +00:00
Eric Christopher	2aeb92e640	Move test to X86 only for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203697 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 19:02:44 +00:00
Matt Arsenault	054f4eccd2	R600: Fix trunc store from i64 to i1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203695 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 18:45:52 +00:00
Hans Wennborg	09a31f3154	Allow switch-to-lookup table for tables with holes by adding bitmask check This allows us to generate table lookups for code such as: unsigned test(unsigned x) { switch (x) { case 100: return 0; case 101: return 1; case 103: return 2; case 105: return 3; case 107: return 4; case 109: return 5; case 110: return 6; default: return f(x); } } Since cases 102, 104, etc. are not constants, the lookup table has holes in those positions. We therefore guard the table lookup with a bitmask check. Patch by Jasper Neumann! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203694 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 18:35:40 +00:00
Eric Christopher	365f0455a6	When computing the size of a base type be conservative if the type is a declaration and return the size of the type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203690 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 18:18:05 +00:00
Evan Cheng	9225686155	Revert r203488 and r203520. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203687 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 18:09:37 +00:00
Eric Christopher	020026c5f6	Turn on hashing by default for split dwarf compile units. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203680 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 17:14:43 +00:00
Rafael Espindola	3b8cc2299b	Try harder to evaluate expressions when printing assembly. When printing assembly we don't have a Layout object, but we can still try to fold some constants. Testcase by Ulrich Weigand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203677 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 16:55:59 +00:00
Daniel Sanders	fe6bd52bf2	[mips] BSEL's and BINS[RL] operands are reversed compared to the vselect node used in the pattern. Summary: Correct the match patterns and the lowerings that made the CodeGen tests pass despite the mistakes. The original testcase that discovered the problem was SingleSource/UnitTests/SignlessType/factor.c in test-suite. During review, we also found that some of the existing CodeGen tests were incorrect and fixed them: * bitwise.ll: In bsel_v16i8 the IfSet/IfClear were reversed because bsel and bmnz have different operand orders and the test didn't correctly account for this. bmnz goes 'IfClear, IfSet, CondMask', while bsel goes 'CondMask, IfClear, IfSet'. * vec.ll: In the cases where a bsel is emitted as a bmnz (they are the same operation with a different input tied to the result) the operands were in the wrong order. * compare.ll and compare_float.ll: The bsel operand order was correct for a greater-than comparison, but a greater-than comparison instruction doesn't exist. Lowering this operation inverts the condition so the IfSet/IfClear need to be swapped to match. The differences between BSEL, BMNZ, and BMZ and how they map to/from vselect are rather confusing. I've therefore added a note to MSA.txt to explain this in a single place in addition to the comments that explain each case. Reviewers: matheusalmeida, jacksprat Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3028 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203657 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 11:54:00 +00:00
Tim Northover	d4517fa24d	ARM: correct Dwarf output for non-contiguous VFP saves. When the list of VFP registers to be saved was non-contiguous (so multiple vpush/vpop instructions were needed) these were being ordered oddly, as in: vpush {d8, d9} vpush {d11} This led to the layout in memory being [d11, d8, d9] which is ugly and doesn't match the CFI_INSTRUCTIONs we're generating either (so Dwarf info would be broken). This switches the order of vpush/vpop (in both prologue and epilogue, obviously) so that the Dwarf locations are correct again. rdar://problem/16264856 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203655 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 11:29:23 +00:00
Hans Wennborg	e03daa01f6	[ARM] Use DWARF register numbers for CFI directives in ELF assembly It seems gas can't handle CFI directives with VFP register names ("d12", etc.). This broke us trying to build Chromium for Android after 201423. A gas bug has been filed: https://sourceware.org/bugzilla/show_bug.cgi?id=16694 compnerd suggested making this conditional on whether we're using the integrated assembler or not. I'll look into that in a follow-up patch. Differential Revision: http://llvm-reviews.chandlerc.com/D3049 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203635 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 03:52:34 +00:00
David Blaikie	9d3e746b85	DebugInfo: Omit pubnames/pubtypes when compiling with -gmlt git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203634 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 03:34:38 +00:00
David Blaikie	1c7fef193f	DebugInfo: Do not emit pubnames/pubtypes sections if they are empty git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203622 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 23:35:06 +00:00
David Blaikie	c4d9cc0e09	Test for empty pubnames/pubtypes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203621 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 23:35:03 +00:00
David Blaikie	e98b0e466f	DebugInfo: Refactor emitDebugPubNames/Types into a common implementation I could fold the callers into their one call site, but the indirection (given how verbose choosing the section is) seemed helpful. The use of a member function pointer's a bit "tricky", but seems limited enough, the call sites are simple/clean/clear, and there's only one use. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203619 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 23:18:15 +00:00
David Blaikie	31add9cbed	Clean up test/DebugInfo/empty.ll now that we have an alias for "llc with dwarf output" git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203616 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 22:46:12 +00:00
Sasa Stankovic	ff73a2bf86	[mips] Implement NaCl sandboxing of function calls: * Add masking instructions before indirect calls (in MC layer). * Align call + branch delay to the bundle end (in MC layer). Differential Revision: http://llvm-reviews.chandlerc.com/D3032 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203606 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 21:23:40 +00:00
Rafael Espindola	7e3e9aa8e1	Don't assume an empty stderr. GuardMalloc can print info to stderr, causing these tests to fail. Since FileCheck errors on empty inputs, just add a bit of dummy data to make it happy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203595 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 18:25:33 +00:00
Hans Wennborg	1332459dbb	X86: Don't generate 64-bit movd after cmpneqsd in 32-bit mode (PR19059) This fixes the bug where we would bitcast the 64-bit floating point result of cmpneqsd to a 64-bit integer even on 32-bit targets. Differential Revision: http://llvm-reviews.chandlerc.com/D3009 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203581 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 15:49:24 +00:00
Saleem Abdulrasool	90d0ed297f	ARM: honour -f{no-,}optimize-sibling-calls Use the options in the ARMISelLowering to control whether tail calls are optimised or not. Previously, this option was entirely ignored on the ARM target and only honoured on x86. This option is mostly useful in profiling scenarios. The default remains that tail call optimisations will be applied. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203577 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 15:09:54 +00:00
Saleem Abdulrasool	2b42ff6fdb	ARM: remove ancient -arm-tail-calls option This option is from 2010, designed to work around a linker issue on Darwin for ARM. According to grosbach this is no longer an issue and this option can safely be removed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203576 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 15:09:49 +00:00
Saleem Abdulrasool	cde1f2eae2	ARM: enable tail call optimisation on Thumb 2 Tail call optimisation was previously disabled on all targets other than iOS5.0+. This enables the tail call optimisation on all Thumb 2 capable platforms. The test adjustments are to remove the IR hint "tail" to function invocation. The tests were designed assuming that tail call optimisations would not kick in which no longer holds true. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203575 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 15:09:44 +00:00
Erik Verbruggen	0638609d66	Fix crash in PRE. After r203553 overflow intrinsics and their non-intrinsic (normal) instruction get hashed to the same value. This patch prevents PRE from moving an instruction into a predecessor block, and trying to add a phi node that gets two different types (the intrinsic result and the non-intrinsic result), resulting in a failing assert. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203574 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 15:07:32 +00:00
Tim Northover	ca396e391e	IR: add a second ordering operand to cmpxchg for failure The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203559 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 10:48:52 +00:00
Erik Verbruggen	cf5240642b	GVN: merge overflow intrinsics with non-overflow instructions. When an overflow intrinsic is followed by a non-overflow instruction, replace the latter with an extract. For example: %sadd = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b) %sadd3 = add i32 %a, %b Here the add statement will be replaced by an extract. When an overflow intrinsic follows a non-overflow instruction, a clone of the intrinsic is inserted before the normal instruction, which makes it the same as the previous case. Subsequent runs of GVN can then clean up the duplicate instructions and insert the extract. This fixes PR8817. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203553 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 09:36:48 +00:00
Jim Grosbach	7a37166a7a	X86: Enable ISel of 16-bit MOVBE instructions. When the MOVBE instructions are available, use them for 16-bit endian swapping as well as for 32 and 64 bit. The patterns were already present on the instructions, but weren't being matched because the operation was unconditionally marked to 'Expand.' Change that to be conditional on whether the MOVBE instructions are available. Use 'rolw' to implement the in-register version (32 and 64 bit have the dedicated 'bswap' instruction for that). Patch by Louis Gerbarg <lgg@apple.com>. rdar://15479984 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203524 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 00:44:14 +00:00
Matt Arsenault	53131629dc	Fix undefined behavior in vector shift tests. These were all shifting the same amount as the bitwidth. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203519 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 00:01:41 +00:00
Duncan P. N. Exon Smith	f5d17528ee	Module: Don't rename in getOrInsertFunction() During LTO, user-supplied definitions of C library functions often exist. -instcombine uses Module::getOrInsertFunction() to get a handle on library functions (e.g., @puts, when optimizing @printf). Previously, Module::getOrInsertFunction() would rename any matching functions with local linkage, and create a new declaration. In LTO, this is the opposite of desired behaviour, as it skips by the user-supplied version of the library function and creates a new undefined reference which the linker often cannot resolve. After some discussion with Rafael on the list, it looks like it's undesired behaviour. If a consumer actually needs this behaviour, we should add new API with a more explicit name. I added two testcases: one specifically for the -instcombine behaviour and one for the LTO flow. <rdar://problem/16165191> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203513 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-10 23:42:28 +00:00
Raul E. Silvera	6df2b69098	When analyzing vectors of element type that require legalization, the legalization cost must be included to get an accurate estimation of the total cost of the scalarized vector. The inaccurate cost triggered unprofitable SLP vectorization on 32-bit X86. Summary: Include legalization overhead when computing scalarization cost Reviewers: hfinkel, nadav CC: chandlerc, rnk, llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2992 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203509 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-10 22:59:13 +00:00
Diego Novillo	87393cfd6b	Use discriminator information in sample profiles. Summary: When the sample profiles include discriminator information, use the discriminator values to distinguish instruction weights in different basic blocks. This modifies the BodySamples mapping to map <line, discriminator> pairs to weights. Instructions on the same line but different blocks, will use different discriminator values. This, in turn, means that the blocks may have different weights. Other changes in this patch: - Add tests for positive values of line offset, discriminator and samples. - Change data types from uint32_t to unsigned and int and do additional validation. Reviewers: chandlerc CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2857 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203508 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-10 22:41:28 +00:00
Benjamin Kramer	8da0b7358d	MemCpyOpt: When merging memsets also merge the trivial case of two memsets with the same destination. The testcase is from PR19092, but I think the bug described there is actually a clang issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203489 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-10 21:05:13 +00:00
Evan Cheng	d89b0f200c	For functions with ARM target specific calling convention, when simplify-libcall optimizes a call to an LLVM intrinsic to something that involves a call to a C library call, make sure it sets the right calling convention on the call. e.g. extern double pow(double, double); double t(double x) { return pow(10, x); } Compiles to something like this for AAPCS-VFP: define arm_aapcs_vfpcc double @t(double %x) #0 { entry: %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x) ret double %0 } declare double @llvm.pow.f64(double, double) #1 Simplify libcall (part of instcombine) will turn the above into: define arm_aapcs_vfpcc double @t(double %x) #0 { entry: %__exp10 = call double @__exp10(double %x) #1 ret double %__exp10 } declare double @__exp10(double) The pre-instcombine code works because calls to LLVM builtins are special. Instruction selection will choose the right calling convention for the call. However, the code after instcombine is wrong. The call to __exp10 will use the C calling convention. I can think of 3 options to fix this. 1. Make "C" calling convention just work since the target should know what CC is being used. This doesn't work because each function can use different CC with the "pcs" attribute. 2. Have Clang add the right CC keyword on the calls to LLVM builtin. This will work but it doesn't match the LLVM IR specification which states these are "Standard C Library Intrinsics". 3. Fix simplify libcall so the resulting calls to the C routines will have the proper CC keyword. e.g. %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1 This works and is the solution I implemented here. Both solutions #2 and #3 would work. After carefully considering the pros and cons, I decided to implement #3 for the following reasons. 1. It doesn't change the "spec" of the intrinsics. 2. It's a self-contained fix. There are a couple of potential downsides. 1. There could be other places in the optimizer that are broken in the same way and are not addressed by this. 2. There could be other calling conventions that need to be propagated by simplify-libcall that are not handled. But for now, this is the fix that I'm most comfortable with. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203488 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-10 20:49:45 +00:00

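The X86 rotate peephole (r203718) in the log above relies on the hardware masking the rotate amount implicitly, so `(rotl:i32 x, (and y, 31))` can become `(rotl:i32 x, y)`. A small Python model can illustrate why the explicit `and` is redundant. It assumes a 32-bit rotate whose amount is taken modulo 32, and `rotl32_hw` is a hypothetical name for that model, not an LLVM or X86 API.

```python
def rotl32_hw(x, amount):
    # Model of a 32-bit rotate-left whose amount is masked to 5 bits,
    # as the commit message says the hardware does implicitly.
    amount &= 31
    x &= 0xFFFFFFFF
    return ((x << amount) | (x >> (32 - amount))) & 0xFFFFFFFF


def rotate_with_explicit_mask(x, y):
    # Source pattern: the program masks the amount itself.
    return rotl32_hw(x, y & 31)


def rotate_without_mask(x, y):
    # Pattern after the peephole: the explicit mask is dropped.
    return rotl32_hw(x, y)
```

Under this model both functions agree for every amount, which is the property the peephole depends on.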

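The cmpxchg commit (r203559) in the log above states two constraints on the new failure ordering: it should be no stronger than the success ordering, and it cannot be "release" or "acq_rel". The sketch below encodes those two rules in Python. The linear strength ranking in `RANK` is an assumption made for illustration (the commit message does not define how "stronger" is compared), and `is_valid_cmpxchg_orderings` is a hypothetical helper, not part of LLVM.

```python
# Assumed linear ranking of orderings, weakest to strongest (illustrative only).
RANK = {
    "monotonic": 0,
    "acquire": 1,
    "release": 2,
    "acq_rel": 3,
    "seq_cst": 4,
}


def is_valid_cmpxchg_orderings(success, failure):
    # Rule 1: no store happens on failure, so release semantics make no sense.
    if failure in ("release", "acq_rel"):
        return False
    # Rule 2: the failure ordering must be no stronger than the success ordering.
    return RANK[failure] <= RANK[success]
```

For instance, the pair from the commit message, `acquire monotonic`, passes both rules, while `monotonic acquire` fails the second one.
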
23086 Commits