mirror of
https://github.com/sheumann/hush.git
synced 2024-12-27 01:32:08 +00:00
added small doc about tar 'pax header' format
parent
664733f1a3
commit
ec0c920a78
239
docs/tar_pax.txt
Normal file
@@ -0,0 +1,239 @@
'pax headers' are a POSIX.1-2001 addition designed to fix
tar format limitations: the older tar format has fixed-width fields
for everything (filename, uid, file size, etc.), which can overflow.

pax Header Block

The pax header block shall be identical to the ustar header block
described in ustar Interchange Format, except that two additional
typeflag values are defined:

x
Represents extended header records for the following file in
the archive (which shall have its own ustar header block).

g
Represents global extended header records for the following
files in the archive. Each value shall affect all subsequent files
that do not override that value in their own extended header
record and until another global extended header record is reached
that provides another value for the same field. The typeflag g
global headers should not be used with interchange media that
could suffer partial data loss in transporting the archive.

For both of these types, the size field shall be the size of the
extended header records in octets. The other fields in the header
block are not meaningful to this version of the pax utility.
However, if this archive is read by a pax utility conforming to
the ISO POSIX-2:1993 standard, the header block fields are used to
create a regular file that contains the extended header records as
data. Therefore, header block field values should be selected to
provide reasonable file access to this regular file.
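
As a rough illustration of the two typeflag values and the size
field, here is a minimal Python sketch that classifies one 512-octet
header block. The offsets (size at 124, typeflag at 156) come from
the ustar header layout, not from the text above, and the names
parse_octal and pax_header_info are hypothetical helpers made up
for this example.

```python
BLOCK_SIZE = 512                 # tar block size in octets
SIZE_OFFSET, SIZE_LEN = 124, 12  # ustar "size" field (octal text)
TYPEFLAG_OFFSET = 156            # ustar "typeflag" field (one octet)

PAX_KINDS = {b"x": "extended", b"g": "global"}


def parse_octal(field: bytes) -> int:
    """Parse an octal ustar field terminated by NUL or space."""
    text = field.split(b"\0", 1)[0].strip(b" ").decode("ascii")
    return int(text, 8) if text else 0


def pax_header_info(block: bytes):
    """Return (kind, size, padded_size) for a pax header block, else None."""
    if len(block) != BLOCK_SIZE:
        raise ValueError("a tar header block is exactly 512 octets")
    kind = PAX_KINDS.get(block[TYPEFLAG_OFFSET:TYPEFLAG_OFFSET + 1])
    if kind is None:
        return None  # ordinary ustar entry, not a pax header
    size = parse_octal(block[SIZE_OFFSET:SIZE_OFFSET + SIZE_LEN])
    # The records are stored in whole blocks, so round up to 512.
    padded = -(-size // BLOCK_SIZE) * BLOCK_SIZE
    return kind, size, padded
```

A reader would consume padded_size octets after the header block,
treat the first size octets as extended header records, and then
expect the next ustar header block.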

A further difference from the ustar header block is that data
blocks for files of typeflag 1 (the digit one) (hard link) may be
included, which means that the size field may be greater than
zero.

pax Extended Header

An extended header shall consist of one or more records, each
constructed as follows:

"%d %s=%s\n", <length>, <keyword>, <value>

The <length> field shall be the decimal length of the extended
header record in octets, including the length string itself and the
trailing <newline>.
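
Because <length> counts its own digits, a writer cannot simply
prepend len(rest). The following Python sketch shows one way to
find the self-consistent length; make_record is a hypothetical
name, and encoding keyword and value as UTF-8 is an assumption of
this example.

```python
def make_record(keyword: str, value: str) -> bytes:
    """Build one pax extended header record: b"<length> <keyword>=<value>\\n"."""
    body = b" " + keyword.encode("utf-8") + b"=" + value.encode("utf-8") + b"\n"
    # The length field includes its own decimal digits, so iterate
    # until adding the digit count no longer changes the total.
    length = len(body)
    while True:
        total = len(body) + len(str(length))
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body
```

For example, make_record("path", "a/b") yields b"12 path=a/b\n",
which is 12 octets long. The loop matters at digit boundaries: a
body of 98 octets produces a 101-octet record, not a 100-octet one.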

[skip]

atime
The file access time for the following file(s), equivalent to
the value of the st_atime member of the stat structure for a file,
as described by the stat() function. The access time shall be
restored if the process has the appropriate privilege required to
do so. The format of the <value> shall be as described in pax
Extended Header File Times.

charset
The name of the character set used to encode the data in the
following file(s).

The encoding is included in an extended header for information
only; when pax is used as described in IEEE Std 1003.1-2001, it
shall not translate the file data into any other encoding. The
BINARY entry indicates unencoded binary data.

When used in write or copy mode, it is implementation-defined
whether pax includes a charset extended header record for a file.

comment
A series of characters used as a comment. All characters in
the <value> field shall be ignored by pax.

gid
The group ID of the group that owns the file, expressed as a
decimal number using digits from the ISO/IEC 646:1991 standard.
This record shall override the gid field in the following header
block(s). When used in write or copy mode, pax shall include a gid
extended header record for each file whose group ID is greater
than 2097151 (octal 7777777).

gname
The group of the file(s), formatted as a group name in the
group database. This record shall override the gid and gname
fields in the following header block(s), and any gid extended
header record. When used in read, copy, or list mode, pax shall
translate the name from the UTF-8 encoding in the header record to
the character set appropriate for the group database on the
receiving system. If any of the UTF-8 characters cannot be
translated, and if the -o invalid=UTF-8 option is not specified,
the results are implementation-defined. When used in write or copy
mode, pax shall include a gname extended header record for each
file whose group name cannot be represented entirely with the
letters and digits of the portable character set.

linkpath
The pathname of a link being created to another file, of any
type, previously archived. This record shall override the linkname
field in the following ustar header block(s). The following ustar
header block shall determine the type of link created. If typeflag
of the following header block is 1, it shall be a hard link. If
typeflag is 2, it shall be a symbolic link and the linkpath value
shall be the contents of the symbolic link. The pax utility shall
translate the name of the link (contents of the symbolic link)
from the UTF-8 encoding to the character set appropriate for the
local file system. When used in write or copy mode, pax shall
include a linkpath extended header record for each link whose
pathname cannot be represented entirely with the members of the
portable character set other than NUL.

mtime
The file modification time of the following file(s),
equivalent to the value of the st_mtime member of the stat
structure for a file, as described in the stat() function. This
record shall override the mtime field in the following header
block(s). The modification time shall be restored if the process
has the appropriate privilege required to do so. The format of the
<value> shall be as described in pax Extended Header File Times.

path
The pathname of the following file(s). This record shall
override the name and prefix fields in the following header
block(s). The pax utility shall translate the pathname of the file
from the UTF-8 encoding to the character set appropriate for the
local file system.

When used in write or copy mode, pax shall include a path
extended header record for each file whose pathname cannot be
represented entirely with the members of the portable character
set other than NUL.

realtime.any
The keywords prefixed by "realtime." are reserved for future
standardization.

security.any
The keywords prefixed by "security." are reserved for future
standardization.

size
The size of the file in octets, expressed as a decimal number
using digits from the ISO/IEC 646:1991 standard. This record shall
override the size field in the following header block(s). When
used in write or copy mode, pax shall include a size extended
header record for each file with a size value greater than
8589934591 (octal 77777777777).

uid
The user ID of the file owner, expressed as a decimal number
using digits from the ISO/IEC 646:1991 standard. This record shall
override the uid field in the following header block(s). When used
in write or copy mode, pax shall include a uid extended header
record for each file whose owner ID is greater than 2097151 (octal
7777777).
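
The gid, size and uid limits quoted above are simply the largest
values that fit in the octal ustar fields. A writer might decide
which numeric records to emit roughly as in this Python sketch;
numeric_overrides and the two constants are hypothetical names
chosen for illustration.

```python
USTAR_MAX_ID = 0o7777777        # 2097151: 7 octal digits (uid, gid)
USTAR_MAX_SIZE = 0o77777777777  # 8589934591: 11 octal digits (size)


def numeric_overrides(uid: int, gid: int, size: int) -> dict:
    """Return the pax records needed for values that overflow ustar fields."""
    records = {}
    if uid > USTAR_MAX_ID:
        records["uid"] = str(uid)    # decimal, per the uid keyword
    if gid > USTAR_MAX_ID:
        records["gid"] = str(gid)    # decimal, per the gid keyword
    if size > USTAR_MAX_SIZE:
        records["size"] = str(size)  # decimal, per the size keyword
    return records
```

A file of exactly 8589934591 octets still fits in the ustar size
field, so no size record is required; one octet more and the record
becomes mandatory.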

uname
The owner of the following file(s), formatted as a user name
in the user database. This record shall override the uid and uname
fields in the following header block(s), and any uid extended
header record. When used in read, copy, or list mode, pax shall
translate the name from the UTF-8 encoding in the header record to
the character set appropriate for the user database on the
receiving system. If any of the UTF-8 characters cannot be
translated, and if the -o invalid=UTF-8 option is not specified,
the results are implementation-defined. When used in write or copy
mode, pax shall include a uname extended header record for each
file whose user name cannot be represented entirely with the
letters and digits of the portable character set.

If the <value> field is zero length, it shall delete any header
block field, previously entered extended header value, or global
extended header value of the same name.

If a keyword in an extended header record (or in a -o
option-argument) overrides or deletes a corresponding field in the
ustar header block, pax shall ignore the contents of that header
block field.
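
The override and delete rules can be pictured as a layered merge:
ustar header fields first, then global records, then the file's own
extended records, with an empty value removing the entry. The
Python sketch below is a simplification that assumes the ustar
fields have already been keyed by pax keyword (for instance name
and prefix combined into "path"); effective_attributes is a
hypothetical name, and cross-keyword effects such as gname
overriding gid are not modelled.

```python
def effective_attributes(ustar_fields: dict,
                         global_records: dict,
                         extended_records: dict) -> dict:
    """Merge header layers; later layers override, empty values delete."""
    result = dict(ustar_fields)
    for layer in (global_records, extended_records):
        for keyword, value in layer.items():
            if value == "":
                # Zero-length value: drop whatever was set so far.
                result.pop(keyword, None)
            else:
                result[keyword] = value
    return result
```

With ustar fields {"path": "short", "uname": "root"}, a global
record {"uname": "builder"} and an extended record {"uname": ""},
the result keeps "path" and has no "uname" entry at all.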

Unlike the ustar header block fields, NULs shall not delimit
<value>s; all characters within the <value> field shall be
considered data for the field. None of the length limitations of
the ustar header block fields in ustar Header Block shall apply to
the extended header records.
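
One consequence is that a reader has to split records by their
<length> prefix rather than by scanning for NUL or newline. A
minimal Python sketch of such a reader follows; parse_records is a
hypothetical name, the input is assumed to be exactly the octets
counted by the header's size field (no block padding), and values
are kept as bytes because they may contain any octet.

```python
def parse_records(data: bytes) -> dict:
    """Split pax extended header data into {keyword: value-bytes}."""
    records = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)  # ValueError if no separator
        digits = data[pos:space]
        if not digits.isdigit():
            raise ValueError("record length is not a decimal number")
        length = int(digits.decode("ascii"))
        record = data[pos:pos + length]
        prefix_len = space - pos + 1   # length digits plus the space
        if (length <= prefix_len or len(record) != length
                or not record.endswith(b"\n")):
            raise ValueError("malformed pax record")
        # Only the first '=' separates keyword from value; the value
        # itself may contain '=', NUL or newline octets.
        keyword, sep, value = record[prefix_len:-1].partition(b"=")
        if not sep:
            raise ValueError("record has no '=' separator")
        records[keyword.decode("utf-8")] = value
        pos += length
    return records
```

For instance, b"17 comment=a\x00b\nc\n" parses to a single comment
record whose value is the five octets a, NUL, b, newline, c.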

pax Extended Header File Times

Time records shall be formatted as a decimal representation of the
time in seconds since the Epoch. If a period ( '.' ) decimal point
character is present, the digits to the right of the point shall
represent the units of a subsecond timing granularity. In read or
copy mode, the pax utility shall truncate the time of a file to
the greatest value that is not greater than the input header
file time. In write or copy mode, the pax utility shall output a
time exactly if it can be represented exactly as a decimal number,
and otherwise shall generate only enough digits so that the same
time shall be recovered if the file is extracted on a system whose
underlying implementation supports the same time granularity.
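
The read-side rule (truncate to the greatest representable value
not greater than the recorded time) is a floor operation. The
Python sketch below parses a time value exactly and floors it to a
chosen granularity; parse_pax_time and granularity_ns are
hypothetical names, and accepting a leading minus sign for times
before the Epoch is an assumption of this example rather than
something stated above.

```python
import re
from fractions import Fraction
from math import floor


def parse_pax_time(value: str, granularity_ns: int = 1) -> int:
    """Return nanoseconds since the Epoch, floored to granularity_ns."""
    match = re.fullmatch(r"(-?)(\d+)(?:\.(\d*))?", value)
    if match is None:
        raise ValueError("not a decimal time value: %r" % value)
    sign, whole, frac = match.group(1), match.group(2), match.group(3) or ""
    # Exact rational arithmetic avoids float rounding on long fractions.
    seconds = Fraction(int(whole + frac), 10 ** len(frac))
    if sign:
        seconds = -seconds
    units = floor(seconds * 1_000_000_000 / granularity_ns)
    return units * granularity_ns
```

With one-second granularity, "1.75" becomes 1000000000 ns, and
under the sign assumption "-0.5" becomes -1000000000 ns, since the
result must not be greater than the recorded time.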

Example from Linux kernel archive tarball:

00000000 70 61 78 5f 67 6c 6f 62 61 6c 5f 68 65 61 64 65 |pax_global_heade|
00000010 72 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |r...............|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000060 00 00 00 00 30 30 30 30 36 36 36 00 30 30 30 30 |....0000666.0000|
00000070 30 30 30 00 30 30 30 30 30 30 30 00 30 30 30 30 |000.0000000.0000|
00000080 30 30 30 30 30 36 34 00 30 30 30 30 30 30 30 30 |0000064.00000000|
00000090 30 30 30 00 30 30 31 34 30 35 33 00 67 00 00 00 |000.0014053.g...|
000000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000100 00 75 73 74 61 72 00 30 30 67 69 74 00 00 00 00 |.ustar.00git....|
00000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000120 00 00 00 00 00 00 00 00 00 67 69 74 00 00 00 00 |.........git....|
00000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000140 00 00 00 00 00 00 00 00 00 30 30 30 30 30 30 30 |.........0000000|
00000150 00 30 30 30 30 30 30 30 00 00 00 00 00 00 00 00 |.0000000........|
00000160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000200 35 32 20 63 6f 6d 6d 65 6e 74 3d 62 31 30 35 30 |52 comment=b1050|
00000210 32 62 32 32 61 31 32 30 39 64 36 62 34 37 36 33 |2b22a1209d6b4763|
00000220 39 64 38 38 62 38 31 32 62 32 31 66 62 35 39 34 |9d88b812b21fb594|
00000230 39 65 34 0a 00 00 00 00 00 00 00 00 00 00 00 00 |9e4.............|
00000240 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
...
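
To tie the dump back to the field descriptions, this Python sketch
rebuilds the first 512 octets from the values visible in the dump
and checks three things: the typeflag is g, the octal size field
00000000064 equals 52 (the length of the single comment record at
offset 0x200), and the stored checksum 0014053 matches the block
contents. The field offsets follow the ustar header layout and the
checksum rule (sum of all octets, with the checksum field counted
as eight spaces) is the usual ustar one; both are background
assumptions not spelled out in this document, and the function
names are hypothetical.

```python
def build_example_block() -> bytes:
    """Recreate the 512-octet header block shown in the hex dump."""
    block = bytearray(512)

    def put(offset: int, data: bytes) -> None:
        block[offset:offset + len(data)] = data

    put(0, b"pax_global_header")  # name
    put(100, b"0000666")          # mode
    put(108, b"0000000")          # uid
    put(116, b"0000000")          # gid
    put(124, b"00000000064")      # size (octal)
    put(136, b"00000000000")      # mtime
    put(148, b"0014053")          # chksum (octal)
    put(156, b"g")                # typeflag: global extended header
    put(257, b"ustar\x0000")      # magic "ustar" + NUL, version "00"
    put(265, b"git")              # uname
    put(297, b"git")              # gname
    put(329, b"0000000")          # devmajor
    put(337, b"0000000")          # devminor
    return bytes(block)


def header_checksum(block: bytes) -> int:
    """Sum all octets, treating the 8-octet chksum field as spaces."""
    return sum(block[:148]) + 8 * 0x20 + sum(block[156:])


block = build_example_block()
typeflag = block[156:157]
size = int(block[124:135].decode("ascii"), 8)
stored_checksum = int(block[148:155].decode("ascii"), 8)
record = b"52 comment=b10502b22a1209d6b47639d88b812b21fb5949e4\n"
```

Here size evaluates to 52, which equals len(record), and
header_checksum(block) equals stored_checksum (octal 14053). The
remaining octets of the 512-octet data block after the record are
NUL padding.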