llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-16 11:30:51 +00:00

Author	SHA1	Message	Date
Michael Zolotukhin	7d5100d14e	Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN, safe.srem.iN, safe.urem.iN (iN = i8, i16, i32, or i64). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206732 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-21 05:33:09 +00:00
Duncan P. N. Exon Smith	f44eda4764	Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" This reverts commit r206704, as expected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206707 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-19 22:46:00 +00:00
Duncan P. N. Exon Smith	c404a5334e	Revert "blockfreq: Temporarily turn on -debug-only=block-freq" This reverts commit r206705, as planned. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206706 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-19 22:45:44 +00:00
Duncan P. N. Exon Smith	69552aa77e	blockfreq: Temporarily turn on -debug-only=block-freq These tests fail after my BlockFrequencyInfo rewrite on two buildbots [1][2]. I can't reproduce it locally, so I'm temporarily turning on -debug-only=block-freq so I can find the problem. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1860 [2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18477 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206705 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-19 22:40:56 +00:00
Duncan P. N. Exon Smith	f465370a49	Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl" This reverts commit r206677, reapplying my BlockFrequencyInfo rewrite. I've done a careful audit, added some asserts, and fixed a couple of bugs (unfortunately, they were in unlikely code paths). There's a small chance that this will appease the failing bots [1][2]. (If so, great!) If not, I have a follow-up commit ready that will temporarily add -debug-only=block-freq to the two failing tests, allowing me to compare the code path between what the failing bots and what my machines (and the rest of the bots) are doing. Once I've triggered those builds, I'll revert both commits so the bots go green again. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 [2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445 <rdar://problem/14292693> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206704 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-19 22:34:26 +00:00
Yaron Keren	64b2297786	Patch by Vadim Chugunov Win64 stack unwinder gets confused when execution flow "falls through" after a call to a 'noreturn' function. This fixes the "missing epilogue" problem by emitting a trap instruction for IR 'unreachable' on x86_64-pc-windows. A secondary use for it would be for anyone wanting to make double-sure that 'noreturn' functions, indeed, do not return. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206684 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-19 13:47:43 +00:00
Yaron Keren	2fa9e6ca34	Patch by Ray Donnelly to print register names instead of numbers. http://reviews.llvm.org/D3422 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206683 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-19 05:40:09 +00:00
Duncan P. N. Exon Smith	2033057de8	Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2) This reverts commit r206666, as planned. Still stumped on why the bots are failing. Sanitizer bots haven't turned anything up. If anyone can help me debug either of the failures (referenced in r206666) I'll owe them a beer. (In the meantime, I'll be auditing my patch for undefined behaviour.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206677 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-19 00:42:46 +00:00
Justin Bogner	aae82fb2f7	llvm-profdata: Avoid writing to /dev/null in tests We fseek on our output file in llvm-profdata, which errors on some systems. Avoid getting into that situation by not writing to /dev/null. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206670 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 23:25:35 +00:00
Kevin Enderby	1a47d66496	Change the ARM assembler to require a :lower16: or :upper16: on non-constant expressions for mov instructions instead of silently truncating by default. For the ARM assembler, we want to avoid misleadingly allowing something like "mov r0, <symbol>" especially when we turn it into a movw and the expression <symbol> does not have a :lower16: or :upper16: as part of the expression. We don't want the behavior of silently truncating, which can be unexpected and lead to bugs that are difficult to find since this is an easy mistake to make. This does change the previous behavior of llvm but actually matches an older gnu assembler that would not allow this but printed less useful errors like “invalid constant (0x927c0) after fixup” and “unsupported relocation on symbol foo”. The error for llvm is "immediate expression for mov requires :lower16: or :upper16" with correct location information on the operand as shown in the added test cases. rdar://12342160 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206669 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 23:06:39 +00:00
Justin Bogner	ad326ae3f6	test: Add extra run lines to investigate an error on the bots git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206668 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 23:05:31 +00:00
Duncan P. N. Exon Smith	036e26bc29	Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2) This reverts commit r206628, reapplying r206622 (and r206626). Two tests are failing only on buildbots [1][2]: i.e., I can't reproduce on Darwin, and Chandler can't reproduce on Linux. Asan and valgrind don't tell us anything, but we're hoping the msan bot will catch it. So, I'm applying this again to get more feedback from the bots. I'll leave it in long enough to trigger builds in at least the sanitizer buildbots (it was failing for reasons unrelated to my commit last time it was in), and hopefully a few others.... and then I expect to revert a third time. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 [2]: http://llvm-amd64.freebsd.your.org/b/builders/clang-i386-freebsd/builds/18445 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206666 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 22:30:03 +00:00
Alexey Samsonov	e0d2d7fb26	[llvm-symbolizer] Print file/line for a PC even if there is no DIE describing it. This is important for symbolizing executables with debug info in unavailable .dwo files. Even if all DIE entries are missing, we can still symbolize an address: function name can be fetched from symbol table, and file/line info can be fetched from line table. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206665 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 22:22:44 +00:00
David Blaikie	2e3463ec43	Compress debug sections only when beneficial. Both ZLIB and the debug info compressed section header ("ZLIB" + the size of the uncompressed data) take some constant overhead so in some cases the compressed data is actually larger than the uncompressed data. In these cases, just don't compress or rename the section at all. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206659 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 21:52:26 +00:00
Justin Bogner	e153fb33e4	ProfileData: Add support for the indexed instrprof format This adds support for an indexed instrumentation based profiling format, which is just a small header and an on disk hash table. This format will be used by clang's -fprofile-instr-use= for PGO. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206656 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 21:48:40 +00:00
David Blaikie	037da24c10	Update the fragments of symbols in compressed sections. While unnamed relocations are already cached in side tables in ELFObjectWriter::RecordRelocation, symbols still need their fragments updated to refer to the newly compressed fragment (even if that fragment isn't big enough to fit the offset). Even though we only create temporary symbols in debug info sections this comes up in 32 bit builds where even temporary symbols in mergeable sections (such as debug_str) have to be emitted as named symbols. I tried a few other ways to do this but they all didn't work for various reasons: 1) Canonicalize the MCSymbolData in RecordRelocation, nulling out the Fragment (so it didn't have to be updated by CompressDebugSection). This doesn't work because some code relies on symbols having fragments to indicate that they're defined, I think. 2) Canonicalize the MCSymbolData in RecordRelocation to be "first fragment + absolute offset" so it would be cheaper to just test and update the fragment in CompressDebugSections. This doesn't work because the offset computed in RecordRelocation isn't that of the symbol's fragment, it's the passed in fragment (I haven't figured out what that fragment is - perhaps it's the location where the relocation is to be written). And if the fragment offset has to be computed only for this use we might as well just do it when we need to, in CompressDebugSection. I also added an assert to help catch this a bit more clearly, even though it is UB. The test case improvements would either assert fail and/or valgrind fail without the fix, even if they wouldn't necessarily fail the FileCheck output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206653 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 21:24:12 +00:00
Chad Rosier	6c4ec69c6b	[ARM64] Ports the Cortex-A53 Machine Model description from AArch64. Summary: This port includes the rudimentary latencies that were provided for the Cortex-A53 Machine Model in the AArch64 backend. It also changes the SchedAlias for COPY in the Cyclone model to an explicit WriteRes mapping to avoid conflicts in other subtargets. Differential Revision: http://reviews.llvm.org/D3427 Patch by Dave Estes <cestes@codeaurora.org>! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206652 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 21:22:04 +00:00
Yaron Keren	904f8dcaa4	Expanded test for x86-pc-windows-gnu and x86_64-pc-windows-gnu environments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206649 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 21:10:11 +00:00
Adam Nemet	d290fa608f	[X86] Improve buildFromShuffleMostly for AVX For a 256-bit BUILD_VECTOR consisting mostly of shuffles of 256-bit vectors, both the BUILD_VECTOR and its operands may need to be legalized in multiple steps. Consider: (v8f32 (BUILD_VECTOR (extract_vector_elt (v8f32 %vreg0), Constant<1>), (extract_vector_elt %vreg0, Constant<2>), (extract_vector_elt %vreg0, Constant<3>), (extract_vector_elt %vreg0, Constant<4>), (extract_vector_elt %vreg0, Constant<5>), (extract_vector_elt %vreg0, Constant<6>), (extract_vector_elt %vreg0, Constant<7>), %vreg1)) a. We can't build a 256-bit vector efficiently, so we need to split it into two 128-bit vecs and combine them with VINSERTX128. b. Operands like (extract_vector_elt (v8f32 %vreg0), Constant<7>) need to be split into a VEXTRACTX128 and a further extract_vector_elt from the resulting 128-bit vector. c. The extract_vector_elt from b. is lowered into a shuffle to the first element and a movss. Depending on the order in which we legalize the BUILD_VECTOR and its operands[1], buildFromShuffleMostly may be faced with: (v4f32 (BUILD_VECTOR (extract_vector_elt (vector_shuffle<1,u,u,u> (extract_subvector %vreg0, Constant<4>), undef), Constant<0>), (extract_vector_elt (vector_shuffle<2,u,u,u> (extract_subvector %vreg0, Constant<4>), undef), Constant<0>), (extract_vector_elt (vector_shuffle<3,u,u,u> (extract_subvector %vreg0, Constant<4>), undef), Constant<0>), %vreg1)) In order to figure out the underlying vector and their identity we need to see through the shuffles. [1] Note that the order in which operations and their operands are legalized is only guaranteed in the first iteration of LegalizeDAG. Fixes <rdar://problem/16296956> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206634 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 19:44:16 +00:00
Duncan P. N. Exon Smith	ebb5d29473	Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2) This reverts commit r206622 and the MSVC fixup in r206626. Apparently the remotely failing tests are still failing, despite my attempt to fix the nondeterminism in r206621. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206628 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 17:56:08 +00:00
Duncan P. N. Exon Smith	54850bedf2	Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl" This reverts commit r206556, effectively reapplying commit r206548 and its fixups in r206549 and r206550. In an intervening commit I've added target triples to the tests that were failing remotely [1] (but passing locally). I'm hoping the mystery is solved? I'll revert this again if the tests are still failing remotely. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206622 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 17:22:25 +00:00
Duncan P. N. Exon Smith	1e1954f749	Add some target triples for better determinism These tests were failing on some buildbots after r206548 (reverted in r206556), but passing locally. They were missing target triples, so maybe that's the problem? git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206621 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 17:22:19 +00:00
Tim Northover	7b4b261611	AArch64/ARM64: add more NEON tests. Mostly no testing this time, since they were just wrangling target-specific intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206613 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 14:54:53 +00:00
Tim Northover	f34a512a68	ARM64: disable generation of .loh directives outside MachO. Part of PR19455. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206611 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 14:54:46 +00:00
Tim Northover	9cfd368302	ARM64: don't emit .subsections_via_symbols on ELF. Part of PR19455. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206610 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 14:54:41 +00:00
Tim Northover	1d5a2ad8a6	ARM64: add extra NEG pattern. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206609 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 14:54:35 +00:00
Tim Northover	936285440b	AArch64/ARM64: port more AArch64 tests to ARM64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206592 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 13:16:55 +00:00
Tim Northover	753cfe6172	AArch64/ARM64: add non-scalar lowering for more FCVT operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206591 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 13:16:42 +00:00
Tim Northover	7b4b522ec8	AArch64/ARM64: improve spotting of EXT instructions from VECTOR_SHUFFLE. We couldn't cope if the first mask element was UNDEF before, which isn't ideal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206588 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 12:50:58 +00:00
Evgeniy Stepanov	f19e327319	[msan] Add -msan-instrumentation-with-call-threshold. This flag replaces inline instrumentation for checks and origin stores with calls into MSan runtime library. This is a workaround for PR17409. Disabled by default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206585 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 12:17:20 +00:00
Chandler Carruth	4c7edb1240	[LCG] Add support for building persistent and connected SCCs to the LazyCallGraph. This is the start of the whole point of this different abstraction, but it is just the initial bits. Here is a run-down of what's going on here. I'm planning to incorporate some (or all) of this into comments going forward, hopefully with better editing and wording. =] The crux of the problem with the traditional way of building SCCs is that they are ephemeral. The new pass manager however really needs the ability to associate analysis passes and results of analysis passes with SCCs in order to expose these analysis passes to the SCC passes. Making this work is kind-of the whole point of the new pass manager. =] So, when we're building SCCs for the call graph, we actually want to build persistent nodes that stick around and can be reasoned about later. We'd also like the ability to walk the SCC graph in more complex ways than just the traditional postorder traversal of the current CGSCC walk. That means that in addition to being persistent, the SCCs need to be connected into a useful graph structure. However, we still want the SCCs to be formed lazily where possible. These constraints are quite hard to satisfy with the SCC iterator. Also, using that would bypass our ability to actually add data to the nodes of the call graph to facilitate implementing the Tarjan walk. So I've re-implemented things in a more direct and embedded way. This immediately makes it easy to get the persistence and connectivity correct, and it also allows leveraging the existing nodes to simplify the algorithm. I've worked somewhat to make this implementation more closely follow the traditional paper's nomenclature and strategy, although it is still a bit obtuse because it isn't recursive, using an explicit stack and a tail call instead, and it is interruptible, resuming each time we need another SCC. 
The other tricky bit here, and what actually took almost all the time and trials and errors I spent building this, is exactly what graph structure to build for the SCCs. The naive thing to build is the call graph in its newly acyclic form. I wrote about 4 versions of this which did precisely this. Inevitably, when I experimented with them across various use cases, they became incredibly awkward. It was all implementable, but it felt like a complete wrong fit. Square peg, round hole. There were two overriding aspects that pushed me in a different direction: 1) We want to discover the SCC graph in a postorder fashion. That means the root node will be the last node we find. Using the call-SCC DAG as the graph structure of the SCCs results in an orphaned graph until we discover a root. 2) We will eventually want to walk the SCC graph in parallel, exploring distinct sub-graphs independently, and synchronizing at merge points. This again is not helped by the call-SCC DAG structure. The structure which, quite surprisingly, ended up being completely natural to use is the inverse of the call-SCC DAG. We add the leaf SCCs to the graph as "roots", and have edges to the caller SCCs. Once I switched to building this structure, everything just fell into place elegantly. Aside from general cleanups (there are FIXMEs and too few comments overall) that are still needed, the other missing piece of this is support for iterating across levels of the SCC graph. These will become useful for implementing #2, but they aren't an immediate priority. Once SCCs are in good shape, I'll be working on adding mutation support for incremental updates and adding the pass manager that this analysis enables. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206581 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 10:50:32 +00:00
Benjamin Kramer	c32e261a1a	X86: Pattern match scalar loads + vcvtph2ps into just vcvtph2ps. vcvtph2ps only reads the lower 64 bits of the address passed to the intrinsic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206579 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 10:45:33 +00:00
Chandler Carruth	64eed05a3a	Revert r206565 (and r206566 which updated tests). This commit was attributed to a different person from the person who posted the patch to the list, and the person who posted it to the list claimed when they did that they were not the author, but that the author was yet a third person. I don't know what is going on here, but reverting until the attribution is clear and the author has explicitly contributed the patch. Also, the review hasn't really involved any of the MC maintainers and that seems questionable too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206576 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:35:51 +00:00
Tim Northover	fb96efa7dd	AArch64/ARM64: port atomics test to ARM64. Covers quite a few extra instructions (like any of the max/min ones which were broken until recently on ARM64). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206575 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:31:31 +00:00
Tim Northover	0d6995985a	AArch64/ARM64: spot a greater variety of concat_vector operations. Code mostly copied from AArch64, just tidied up a trifle and plumbed into the ARM64 way of doing things. This also enables the AArch64 tests which inspired the previous untested commits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206574 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:31:27 +00:00
Tim Northover	70b63374f2	ARM64: implement cunning optimisation from AArch64 A vector extract followed by a dup can become a single instruction even if the types don't match. AArch64 handled this in ISelLowering, but a few reasonably simple patterns can take care of it in TableGen, so that's where I've put it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206573 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:31:20 +00:00
Tim Northover	66643da8fc	AArch64/ARM64: emit all vector FP comparisons as such. ARM64 was scalarizing some vector comparisons which don't quite map to AArch64's compare and mask instructions. AArch64's approach of sacrificing a little efficiency to emulate them with the limited set available was better, so I ported it across. More "inspired by" than copy/paste since the backend's internal expectations were a bit different, but the tests were invaluable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206570 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:31:07 +00:00
Tim Northover	937290d7ed	AArch64/ARM64: port BSL logic from AArch64 & enable test. I enhanced it a little in the process. The decision shouldn't really be based on whether a BUILD_VECTOR is a splat: any set of constants will do the job provided they're related in the correct way. Also, the BUILD_VECTOR could be any operand of the incoming AND nodes, so it's best to check for all 4 possibilities rather than assuming it'll be the RHS. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206569 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:31:01 +00:00
Tim Northover	2f5d14af9d	AArch64/ARM64: copy byval implementation from AArch64. It's not actually used to handle C or C++ ABI rules on ARM64, but could well be emitted by other language front-ends, so it's as well to have a sensible implementation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206568 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:30:52 +00:00
Jiangning Liu	eea662fead	Add missing config file for newly added test case introduced by r206563. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206567 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:05:50 +00:00
Yaron Keren	188195c3f9	Updated test with register names following r206565. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206566 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 08:50:09 +00:00
Kostya Serebryany	40a9c0f58b	[asan] one more workaround for PR17409: don't do BB-level coverage instrumentation if there are more than N (=1500) basic blocks. This makes ASanCoverage work on libjpeg_turbo/jchuff.c used by Chrome, which has 1824 BBs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206564 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 08:02:42 +00:00
Jiangning Liu	a1da819896	This commit allows vectorized loops to be unrolled by a factor of 2 for AArch64. A new test case is also added for ARM64. Patched by Z.Zheng git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206563 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 07:57:54 +00:00
Jiangning Liu	bc3655f9c8	This is one of the optimizations ported from ARM64 to AArch64 to address the performance gap between these two back ends. The test case newly added for AArch64 already exists in ARM64. Patched by Z.Zheng git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206559 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 05:58:09 +00:00
Matt Arsenault	746734df1a	R600/SI: Try to use scalar BFE. Use scalar BFE with constant shift and offset when possible. This is complicated by the fact that the scalar version packs the two operands of the vector version into one. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206558 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 05:19:26 +00:00
Jiangning Liu	532a5ffe4c	This commit enables unaligned memory accesses of vector types on AArch64 back end. This should boost vectorized code performance. Patched by Z. Zheng git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206557 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 03:58:38 +00:00
Duncan P. N. Exon Smith	c7a3b95c0f	Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" This reverts commits r206548, r206549 and r206550. There are some unit tests failing that aren't failing locally [1], so reverting until I have time to investigate. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206556 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 02:17:43 +00:00
Duncan P. N. Exon Smith	cc1e1707b8	blockfreq: Rewrite BlockFrequencyInfoImpl Rewrite the shared implementation of BlockFrequencyInfo and MachineBlockFrequencyInfo entirely. The old implementation had a fundamental flaw: precision losses from nested loops (or very wide branches) compounded past loop exits (and convergence points). The @nested_loops testcase at the end of test/Analysis/BlockFrequencyAnalysis/basic.ll is motivating. This function has three nested loops, with branch weights in the loop headers of 1:4000 (exit:continue). The old analysis gives non-sensical results: Printing analysis 'Block Frequency Analysis' for function 'nested_loops': ---- Block Freqs ---- entry = 1.0 for.cond1.preheader = 1.00103 for.cond4.preheader = 5.5222 for.body6 = 18095.19995 for.inc8 = 4.52264 for.inc11 = 0.00109 for.end13 = 0.0 The new analysis gives correct results: Printing analysis 'Block Frequency Analysis' for function 'nested_loops': block-frequency-info: nested_loops - entry: float = 1.0, int = 8 - for.cond1.preheader: float = 4001.0, int = 32007 - for.cond4.preheader: float = 16008001.0, int = 128064007 - for.body6: float = 64048012001.0, int = 512384096007 - for.inc8: float = 16008001.0, int = 128064007 - for.inc11: float = 4001.0, int = 32007 - for.end13: float = 1.0, int = 8 Most importantly, the frequency leaving each loop matches the frequency entering it. The new algorithm leverages BlockMass and PositiveFloat to maintain precision, separates "probability mass distribution" from "loop scaling", and uses dithering to eliminate probability mass loss. I have unit tests for these types out of tree, but it was decided in the review to make the classes private to BlockFrequencyInfoImpl, and try to shrink them (or remove them entirely) in follow-up commits. The new algorithm should generally have a complexity advantage over the old. The previous algorithm was quadratic in the worst case. 
The new algorithm is still worst-case quadratic in the presence of irreducible control flow, but it's linear without it. The key difference between the old algorithm and the new is that control flow within a loop is evaluated separately from control flow outside, limiting propagation of precision problems and allowing loop scale to be calculated independently of mass distribution. Loops are visited bottom-up, their loop scales are calculated, and they are replaced by pseudo-nodes. Mass is then distributed through the function, which is now a DAG. Finally, loops are revisited top-down to multiply through the loop scales and the masses distributed to pseudo nodes. There are some remaining flaws. - Irreducible control flow isn't modelled correctly. LoopInfo and MachineLoopInfo ignore irreducible edges, so this algorithm will fail to scale accordingly. There's a note in the class documentation about how to get closer. See also the comments in test/Analysis/BlockFrequencyInfo/irreducible.ll. - Loop scale is limited to 4096 per loop (2^12) to avoid exhausting the 64-bit integer precision used downstream. - The "bias" calculation proposed on llvmdev is not incorporated here. This will be added in a follow-up commit, once comments from this review have been handled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206548 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 01:57:45 +00:00
Matt Arsenault	6834a55df3	R600/SI: Match sign_extend_inreg to s_sext_i32_i8 and s_sext_i32_i16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206547 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 01:53:18 +00:00
Tom Stellard	cfe02c46dc	R600/SI: Use SReg_64 instead of VSrc_64 when selecting BUILD_PAIR git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206541 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 00:36:21 +00:00
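
The block frequencies quoted in the r206548 message (cc1e1707b8) can be checked by hand. The sketch below is not the BlockFrequencyInfoImpl algorithm; it assumes a simple geometric model in which a loop whose header exits with probability p runs 1/p times per entry, and the function names are hypothetical. With the 1:4000 (exit:continue) branch weights from the message, it reproduces the float values listed for for.cond1.preheader, for.cond4.preheader and for.body6.

```python
from fractions import Fraction


def loop_scale(exit_weight, continue_weight):
    """Expected iterations per loop entry under the assumed geometric model: 1 / P(exit)."""
    p_exit = Fraction(exit_weight, exit_weight + continue_weight)
    return 1 / p_exit


def nested_frequencies(depth, exit_weight, continue_weight):
    """Frequency at each nesting level, relative to an entry frequency of 1."""
    scale = loop_scale(exit_weight, continue_weight)
    freqs = []
    freq = Fraction(1)
    for _ in range(depth):
        freq *= scale  # each enclosing loop multiplies the frequency by its scale
        freqs.append(freq)
    return freqs


# Three nested loops with 1:4000 exit:continue weights, as in @nested_loops.
freqs = nested_frequencies(3, 1, 4000)
print([int(f) for f in freqs])  # [4001, 16008001, 64048012001]
```

Exact rational arithmetic is used here so that no precision is lost across nesting levels, which is the failure mode the message attributes to the old analysis. A scale of 4001 also stays under the 4096-per-loop cap the message mentions.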

23742 Commits