Improve handling of end of file in the bitcode reader.

Before this patch, the bitcode reader would read a module from a file
that contained, in order:

* Any number of non MODULE_BLOCK sub blocks.
* One MODULE_BLOCK
* Any number of non MODULE_BLOCK sub blocks.
* 4 '\n' characters to handle OS X's ranlib.
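
The count of four follows from alignment arithmetic. A bitstream is written
in 32-bit words, so the size of a .bc file is a multiple of 4, and padding it
to a multiple of 8 bytes adds either 0 or 4 bytes. A minimal sketch, where
paddingFor is a hypothetical helper and not part of LLVM or ranlib:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (an assumption for illustration, not an LLVM or ranlib
// API): bytes needed to round Size up to the next multiple of Align.
inline std::uint64_t paddingFor(std::uint64_t Size, std::uint64_t Align) {
  assert(Align != 0 && "alignment must be nonzero");
  return (Align - Size % Align) % Align;
}
```

For example, paddingFor(12, 8) is 4 and paddingFor(16, 8) is 0.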

Since we support lazy reading of modules, any information that is relevant
for the module has to be in the MODULE_BLOCK or before it. We don't gain
anything from checking what is after.

This patch changes the reader to stop once the MODULE_BLOCK has been
successfully parsed.

This avoids the ugly special case for .bc files in an archive and makes it
easier to embed bitcode files.
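
The new control flow can be sketched with a toy model. Everything below
(BlockID, ScanResult, scanTopLevel) is a hypothetical stand-in rather than
the BitcodeReader API; it only illustrates that scanning returns at the first
module block, so whatever follows it is never examined:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins; these names are assumptions, not LLVM types.
enum class BlockID { Module, Other };

struct ScanResult {
  bool FoundModule;     // true if a module block was reached
  std::size_t Consumed; // number of top-level blocks consumed
};

// Walk the top-level blocks in order and stop at the first module block.
inline ScanResult scanTopLevel(const std::vector<BlockID> &Blocks) {
  std::size_t I = 0;
  for (; I < Blocks.size(); ++I) {
    if (Blocks[I] == BlockID::Module)
      return {true, I + 1}; // stop here; trailing content is not read
    // Any other block is skipped.
  }
  assert(I == Blocks.size());
  return {false, I}; // reached the end without a module: malformed input
}
```

With the input {Other, Module, Other, Other}, the scan reports a module after
consuming two blocks, regardless of what the remaining entries contain.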

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239845 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael Espindola 2015-06-16 20:03:39 +00:00
parent 3f53fc8f5f
commit 0c650627ca
3 changed files with 12 additions and 42 deletions


@@ -3079,7 +3079,7 @@ std::error_code BitcodeReader::parseModule(bool Resume,
 std::error_code BitcodeReader::parseBitcodeInto(Module *M,
                                                 bool ShouldLazyLoadMetadata) {
-  TheModule = nullptr;
+  TheModule = M;
   if (std::error_code EC = initStream())
     return EC;
@@ -3097,8 +3097,6 @@ std::error_code BitcodeReader::parseBitcodeInto(Module *M,
   // need to understand them all.
   while (1) {
     if (Stream.AtEndOfStream()) {
-      if (TheModule)
-        return std::error_code();
       // We didn't really read a proper Module.
       return error("Malformed IR file");
     }
@@ -3106,47 +3104,14 @@ std::error_code BitcodeReader::parseBitcodeInto(Module *M,
     BitstreamEntry Entry =
       Stream.advance(BitstreamCursor::AF_DontAutoprocessAbbrevs);
-    switch (Entry.Kind) {
-    case BitstreamEntry::Error:
+    if (Entry.Kind != BitstreamEntry::SubBlock)
       return error("Malformed block");
-    case BitstreamEntry::EndBlock:
-      return std::error_code();
-    case BitstreamEntry::SubBlock:
-      switch (Entry.ID) {
-      case bitc::BLOCKINFO_BLOCK_ID:
-        if (Stream.ReadBlockInfoBlock())
-          return error("Malformed block");
-        break;
-      case bitc::MODULE_BLOCK_ID:
-        // Reject multiple MODULE_BLOCK's in a single bitstream.
-        if (TheModule)
-          return error("Invalid multiple blocks");
-        TheModule = M;
-        if (std::error_code EC = parseModule(false, ShouldLazyLoadMetadata))
-          return EC;
-        if (Streamer)
-          return std::error_code();
-        break;
-      default:
-        if (Stream.SkipBlock())
-          return error("Invalid record");
-        break;
-      }
-      continue;
-    case BitstreamEntry::Record:
-      // There should be no records in the top-level of blocks.
-      // The ranlib in Xcode 4 will align archive members by appending newlines
-      // to the end of them. If this file size is a multiple of 4 but not 8, we
-      // have to read and ignore these final 4 bytes :-(
-      if (Stream.getAbbrevIDWidth() == 2 && Entry.ID == 2 &&
-          Stream.Read(6) == 2 && Stream.Read(24) == 0xa0a0a &&
-          Stream.AtEndOfStream())
-        return std::error_code();
+    if (Entry.ID == bitc::MODULE_BLOCK_ID)
+      return parseModule(false, ShouldLazyLoadMetadata);
+    if (Stream.SkipBlock())
       return error("Invalid record");
-    }
   }
 }

Binary file not shown.


@@ -1,7 +1,7 @@
 Test that both llvm-dis (uses a data streamer) and opt (no data streamer)
-handle a .bc file padded with '\n' at the end.
+handle a .bc file with any padding.
-This files can be produced under a peculiar situation:
+A file padded with '\n' can be produced under a peculiar situation:
 * A .bc is produced os OS X, but without a darwin triple, so it has no
   wrapper.
@@ -9,5 +9,10 @@ This files can be produced under a peculiar situation:
 * ranlib is ran on that archive. It will pad the members to make them multiple
   of 8 bytes.
+and there is no reason to not handle the general case.
 RUN: llvm-dis -disable-output %p/Inputs/padding.bc
 RUN: opt -disable-output %p/Inputs/padding.bc
+RUN: llvm-dis -disable-output %p/Inputs/padding-garbage.bc
+RUN: opt -disable-output %p/Inputs/padding-garbage.bc