llvm-6502/lib/Support/PrettyStackTrace.cpp

149 lines
4.7 KiB
C++
Raw Normal View History

//===- PrettyStackTrace.cpp - Pretty Crash Handling -----------------------===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file defines some helpful functions for dealing with the possibility of
// Unix signals occurring while your program is running.
//
//===----------------------------------------------------------------------===//
#include "llvm/Support/PrettyStackTrace.h"
#include "llvm-c/Core.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/Config/config.h" // Get autoconf configuration settings
#include "llvm/Support/Signals.h"
#include "llvm/Support/Watchdog.h"
#include "llvm/Support/raw_ostream.h"
#ifdef HAVE_CRASHREPORTERCLIENT_H
#include <CrashReporterClient.h>
#endif
using namespace llvm;
[LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack tracing code. Managed static was just insane overhead for this. We took memory fences and external function calls in every path that pushed a pretty stack frame. This includes a multitude of layers setting up and tearing down passes, the parser in Clang, everywhere. For the regression test suite or low-overhead JITs, this was contributing to really significant overhead. Even the LLVM ThreadLocal is really overkill here because it uses pthread_{set,get}_specific logic, and has careful code to both allocate and delete the thread local data. We don't actually want any of that, and this code in particular has problems coping with deallocation. What we want is a single TLS pointer that is valid to use during global construction and during global destruction, any time we want. That is exactly what every host compiler and OS we use has implemented for a long time, and what was standardized in C++11. Even though not all of our host compilers support the thread_local keyword, we can directly use the platform-specific keywords to get the minimal functionality needed. Provided this limited trial survives the build bots, I will move this to Compiler.h so it is more widely available as a light weight if limited alternative to the ThreadLocal class. Many thanks to David Majnemer for helping me think through the implications across platforms and craft the MSVC-compatible syntax. The end result is *substantially* faster. When running llc in a tight loop over a small IR file targeting the aarch64 backend, this improves its performance by over 10% for me. It also seems likely to fix the remaining regressions seen by JIT users with threading enabled. This may actually have more impact on real-world compile times due to the use of the pretty stack tracing utility throughout the rest of Clang or LLVM, but I've not collected any detailed measurements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-28 09:52:14 +00:00
// We need a thread local pointer to manage the stack of our stack trace
// objects, but we *really* cannot tolerate destructors running and do not want
// to pay any overhead of synchronizing. As a consequence, we use a raw
// thread-local variable. Some day, we should be able to use a limited subset
// of C++11's thread_local, but compilers aren't up for it today.
// FIXME: This should be moved to a Compiler.h abstraction.
#ifdef _MSC_VER // MSVC supports this with a __declspec.
static __declspec(thread) const PrettyStackTraceEntry
*PrettyStackTraceHead = nullptr;
#else // Clang, GCC, and all compatible compilers tend to use __thread.
static __thread const PrettyStackTraceEntry *PrettyStackTraceHead = nullptr;
#endif
static unsigned PrintStack(const PrettyStackTraceEntry *Entry, raw_ostream &OS){
unsigned NextID = 0;
if (Entry->getNextEntry())
NextID = PrintStack(Entry->getNextEntry(), OS);
OS << NextID << ".\t";
{
sys::Watchdog W(5);
Entry->print(OS);
}
return NextID+1;
}
/// PrintCurStackTrace - Print the current stack trace to the specified stream.
static void PrintCurStackTrace(raw_ostream &OS) {
// Don't print an empty trace.
[LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack tracing code. Managed static was just insane overhead for this. We took memory fences and external function calls in every path that pushed a pretty stack frame. This includes a multitude of layers setting up and tearing down passes, the parser in Clang, everywhere. For the regression test suite or low-overhead JITs, this was contributing to really significant overhead. Even the LLVM ThreadLocal is really overkill here because it uses pthread_{set,get}_specific logic, and has careful code to both allocate and delete the thread local data. We don't actually want any of that, and this code in particular has problems coping with deallocation. What we want is a single TLS pointer that is valid to use during global construction and during global destruction, any time we want. That is exactly what every host compiler and OS we use has implemented for a long time, and what was standardized in C++11. Even though not all of our host compilers support the thread_local keyword, we can directly use the platform-specific keywords to get the minimal functionality needed. Provided this limited trial survives the build bots, I will move this to Compiler.h so it is more widely available as a light weight if limited alternative to the ThreadLocal class. Many thanks to David Majnemer for helping me think through the implications across platforms and craft the MSVC-compatible syntax. The end result is *substantially* faster. When running llc in a tight loop over a small IR file targeting the aarch64 backend, this improves its performance by over 10% for me. It also seems likely to fix the remaining regressions seen by JIT users with threading enabled. This may actually have more impact on real-world compile times due to the use of the pretty stack tracing utility throughout the rest of Clang or LLVM, but I've not collected any detailed measurements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-28 09:52:14 +00:00
if (!PrettyStackTraceHead) return;
// If there are pretty stack frames registered, walk and emit them.
OS << "Stack dump:\n";
[LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack tracing code. Managed static was just insane overhead for this. We took memory fences and external function calls in every path that pushed a pretty stack frame. This includes a multitude of layers setting up and tearing down passes, the parser in Clang, everywhere. For the regression test suite or low-overhead JITs, this was contributing to really significant overhead. Even the LLVM ThreadLocal is really overkill here because it uses pthread_{set,get}_specific logic, and has careful code to both allocate and delete the thread local data. We don't actually want any of that, and this code in particular has problems coping with deallocation. What we want is a single TLS pointer that is valid to use during global construction and during global destruction, any time we want. That is exactly what every host compiler and OS we use has implemented for a long time, and what was standardized in C++11. Even though not all of our host compilers support the thread_local keyword, we can directly use the platform-specific keywords to get the minimal functionality needed. Provided this limited trial survives the build bots, I will move this to Compiler.h so it is more widely available as a light weight if limited alternative to the ThreadLocal class. Many thanks to David Majnemer for helping me think through the implications across platforms and craft the MSVC-compatible syntax. The end result is *substantially* faster. When running llc in a tight loop over a small IR file targeting the aarch64 backend, this improves its performance by over 10% for me. It also seems likely to fix the remaining regressions seen by JIT users with threading enabled. This may actually have more impact on real-world compile times due to the use of the pretty stack tracing utility throughout the rest of Clang or LLVM, but I've not collected any detailed measurements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-28 09:52:14 +00:00
PrintStack(PrettyStackTraceHead, OS);
OS.flush();
}
// Integrate with crash reporter libraries.
#if defined (__APPLE__) && HAVE_CRASHREPORTERCLIENT_H
// If any clients of llvm try to link to libCrashReporterClient.a themselves,
// only one crash info struct will be used.
extern "C" {
CRASH_REPORTER_CLIENT_HIDDEN
struct crashreporter_annotations_t gCRAnnotations
__attribute__((section("__DATA," CRASHREPORTER_ANNOTATIONS_SECTION)))
= { CRASHREPORTER_ANNOTATIONS_VERSION, 0, 0, 0, 0, 0, 0 };
}
#elif defined (__APPLE__) && HAVE_CRASHREPORTER_INFO
static const char *__crashreporter_info__ = 0;
asm(".desc ___crashreporter_info__, 0x10");
#endif
/// CrashHandler - This callback is run if a fatal signal is delivered to the
/// process, it prints the pretty stack trace.
static void CrashHandler(void *) {
#ifndef __APPLE__
// On non-apple systems, just emit the crash stack trace to stderr.
PrintCurStackTrace(errs());
#else
// Otherwise, emit to a smallvector of chars, send *that* to stderr, but also
// put it into __crashreporter_info__.
SmallString<2048> TmpStr;
{
raw_svector_ostream Stream(TmpStr);
PrintCurStackTrace(Stream);
}
if (!TmpStr.empty()) {
#ifdef HAVE_CRASHREPORTERCLIENT_H
// Cast to void to avoid warning.
(void)CRSetCrashLogMessage(std::string(TmpStr.str()).c_str());
#elif HAVE_CRASHREPORTER_INFO
__crashreporter_info__ = strdup(std::string(TmpStr.str()).c_str());
#endif
errs() << TmpStr.str();
}
#endif
}
PrettyStackTraceEntry::PrettyStackTraceEntry() {
// Link ourselves.
[LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack tracing code. Managed static was just insane overhead for this. We took memory fences and external function calls in every path that pushed a pretty stack frame. This includes a multitude of layers setting up and tearing down passes, the parser in Clang, everywhere. For the regression test suite or low-overhead JITs, this was contributing to really significant overhead. Even the LLVM ThreadLocal is really overkill here because it uses pthread_{set,get}_specific logic, and has careful code to both allocate and delete the thread local data. We don't actually want any of that, and this code in particular has problems coping with deallocation. What we want is a single TLS pointer that is valid to use during global construction and during global destruction, any time we want. That is exactly what every host compiler and OS we use has implemented for a long time, and what was standardized in C++11. Even though not all of our host compilers support the thread_local keyword, we can directly use the platform-specific keywords to get the minimal functionality needed. Provided this limited trial survives the build bots, I will move this to Compiler.h so it is more widely available as a light weight if limited alternative to the ThreadLocal class. Many thanks to David Majnemer for helping me think through the implications across platforms and craft the MSVC-compatible syntax. The end result is *substantially* faster. When running llc in a tight loop over a small IR file targeting the aarch64 backend, this improves its performance by over 10% for me. It also seems likely to fix the remaining regressions seen by JIT users with threading enabled. This may actually have more impact on real-world compile times due to the use of the pretty stack tracing utility throughout the rest of Clang or LLVM, but I've not collected any detailed measurements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-28 09:52:14 +00:00
NextEntry = PrettyStackTraceHead;
PrettyStackTraceHead = this;
}
PrettyStackTraceEntry::~PrettyStackTraceEntry() {
[LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack tracing code. Managed static was just insane overhead for this. We took memory fences and external function calls in every path that pushed a pretty stack frame. This includes a multitude of layers setting up and tearing down passes, the parser in Clang, everywhere. For the regression test suite or low-overhead JITs, this was contributing to really significant overhead. Even the LLVM ThreadLocal is really overkill here because it uses pthread_{set,get}_specific logic, and has careful code to both allocate and delete the thread local data. We don't actually want any of that, and this code in particular has problems coping with deallocation. What we want is a single TLS pointer that is valid to use during global construction and during global destruction, any time we want. That is exactly what every host compiler and OS we use has implemented for a long time, and what was standardized in C++11. Even though not all of our host compilers support the thread_local keyword, we can directly use the platform-specific keywords to get the minimal functionality needed. Provided this limited trial survives the build bots, I will move this to Compiler.h so it is more widely available as a light weight if limited alternative to the ThreadLocal class. Many thanks to David Majnemer for helping me think through the implications across platforms and craft the MSVC-compatible syntax. The end result is *substantially* faster. When running llc in a tight loop over a small IR file targeting the aarch64 backend, this improves its performance by over 10% for me. It also seems likely to fix the remaining regressions seen by JIT users with threading enabled. This may actually have more impact on real-world compile times due to the use of the pretty stack tracing utility throughout the rest of Clang or LLVM, but I've not collected any detailed measurements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-28 09:52:14 +00:00
assert(PrettyStackTraceHead == this &&
"Pretty stack trace entry destruction is out of order");
[LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack tracing code. Managed static was just insane overhead for this. We took memory fences and external function calls in every path that pushed a pretty stack frame. This includes a multitude of layers setting up and tearing down passes, the parser in Clang, everywhere. For the regression test suite or low-overhead JITs, this was contributing to really significant overhead. Even the LLVM ThreadLocal is really overkill here because it uses pthread_{set,get}_specific logic, and has careful code to both allocate and delete the thread local data. We don't actually want any of that, and this code in particular has problems coping with deallocation. What we want is a single TLS pointer that is valid to use during global construction and during global destruction, any time we want. That is exactly what every host compiler and OS we use has implemented for a long time, and what was standardized in C++11. Even though not all of our host compilers support the thread_local keyword, we can directly use the platform-specific keywords to get the minimal functionality needed. Provided this limited trial survives the build bots, I will move this to Compiler.h so it is more widely available as a light weight if limited alternative to the ThreadLocal class. Many thanks to David Majnemer for helping me think through the implications across platforms and craft the MSVC-compatible syntax. The end result is *substantially* faster. When running llc in a tight loop over a small IR file targeting the aarch64 backend, this improves its performance by over 10% for me. It also seems likely to fix the remaining regressions seen by JIT users with threading enabled. This may actually have more impact on real-world compile times due to the use of the pretty stack tracing utility throughout the rest of Clang or LLVM, but I've not collected any detailed measurements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-28 09:52:14 +00:00
PrettyStackTraceHead = getNextEntry();
}
void PrettyStackTraceString::print(raw_ostream &OS) const {
OS << Str << "\n";
}
void PrettyStackTraceProgram::print(raw_ostream &OS) const {
OS << "Program arguments: ";
// Print the argument list.
for (unsigned i = 0, e = ArgC; i != e; ++i)
OS << ArgV[i] << ' ';
OS << '\n';
}
static bool RegisterCrashPrinter() {
sys::AddSignalHandler(CrashHandler, nullptr);
return false;
}
void llvm::EnablePrettyStackTrace() {
// The first time this is called, we register the crash printer.
static bool HandlerRegistered = RegisterCrashPrinter();
(void)HandlerRegistered;
}
void LLVMEnablePrettyStackTrace() {
EnablePrettyStackTrace();
}