TAR(5)               UNIX Programmer's Manual               TAR(5)

NAME
     tar - tape archive file format

DESCRIPTION
     Tar (the tape archive command) dumps several files into one,
     in a medium suitable for transportation.

     A ``tar tape'' or file is a series of blocks. Each block is
     of size TBLOCK. A file on the tape is represented by a header
     block which describes the file, followed by zero or more
     blocks which give the contents of the file. At the end of the
     tape are two blocks filled with binary zeros, as an
     end-of-file indicator.

     The blocks are grouped for physical I/O operations. Each
     group of n blocks (where n is set by the b keyletter on the
     tar(1) command line - default is 20 blocks) is written with a
     single system call; on nine-track tapes, the result of this
     write is a single tape record. The last group is always
     written at the full size, so blocks after the two zero blocks
     contain random data. On reading, the specified or default
     group size is used for the first read, but if that read
     returns less than a full tape block, the reduced block size
     is used for further reads.

     The header block looks like:

          #define TBLOCK 512
          #define NAMSIZ 100

          union hblock {
               char dummy[TBLOCK];
               struct header {
                    char name[NAMSIZ];
                    char mode[8];
                    char uid[8];
                    char gid[8];
                    char size[12];
                    char mtime[12];
                    char chksum[8];
                    char linkflag;
                    char linkname[NAMSIZ];
               } dbuf;
          };

     Name is a null-terminated string. The other fields are
     zero-filled octal numbers in ASCII. Each field (of width w)
     contains w-2 digits, a space, and a null, except size and
     mtime, which do not contain the trailing null, and chksum,
     which has a null followed by a space. Name is the name of the
     file, as specified on the tar command line. Files dumped
     because they were in a directory which was named in the
     command line have the directory name as prefix and /filename
     as suffix. Mode is the file mode, with the top bit masked off.
     Uid and gid are the user and group numbers which own the
     file. Size is the size of the file in bytes. Links and
     symbolic links are dumped with this field specified as zero.
     Mtime is the modification time of the file at the time it was
     dumped. Chksum is an octal ASCII value which represents the
     sum of all the bytes in the header block. When calculating
     the checksum, the chksum field is treated as if it were all
     blanks. Linkflag is a null (zero) byte if the file is
     ``normal'' or a special file, ASCII `1' if it is a hard link,
     and ASCII `2' if it is a symbolic link. The name linked-to,
     if any, is in linkname, with a trailing null. Unused fields
     of the header are binary zeros (and are included in the
     checksum).

     The first time a given i-node number is dumped, it is dumped
     as a regular file. The second and subsequent times, it is
     dumped as a link instead. Upon retrieval, if a link entry is
     retrieved, but not the file it was linked to, an error
     message is printed and the tape must be manually re-scanned
     to retrieve the linked-to file.

     The encoding of the header is designed to be portable across
     machines.

SEE ALSO
     tar(1)

BUGS
     Names or linknames longer than NAMSIZ produce error reports
     and cannot be dumped.

Printed 11/26/99                                 November 7, 1985
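
     The field encoding and checksum rule described above can be
     illustrated with a short C sketch. It is not taken from the
     tar(1) source. The helper names parse_octal, header_checksum,
     store_checksum, checksum_ok, data_blocks and is_zero_block
     are hypothetical, and the sketch assumes that header bytes
     are summed as unsigned values and that file contents occupy
     whole blocks (the size rounded up to a multiple of TBLOCK).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TBLOCK 512
#define NAMSIZ 100

union hblock {
    char dummy[TBLOCK];
    struct header {
        char name[NAMSIZ];
        char mode[8];
        char uid[8];
        char gid[8];
        char size[12];
        char mtime[12];
        char chksum[8];
        char linkflag;
        char linkname[NAMSIZ];
    } dbuf;
};

/* Decode an octal ASCII field: skip leading blanks, then read octal
 * digits until the field ends or a non-digit (space or null) appears. */
static unsigned long parse_octal(const char *field, size_t width)
{
    unsigned long value = 0;
    size_t i = 0;

    while (i < width && field[i] == ' ')
        i++;
    for (; i < width && field[i] >= '0' && field[i] <= '7'; i++)
        value = value * 8 + (unsigned long)(field[i] - '0');
    return value;
}

/* Sum every byte of the header block, counting the chksum field as
 * if it held blanks. Assumption: bytes are summed as unsigned. */
static unsigned long header_checksum(const union hblock *hb)
{
    const size_t start = offsetof(struct header, chksum);
    const size_t end = start + sizeof hb->dbuf.chksum;
    unsigned long sum = 0;

    assert(sizeof *hb == TBLOCK);
    for (size_t i = 0; i < TBLOCK; i++) {
        if (i >= start && i < end)
            sum += ' ';
        else
            sum += (unsigned char)hb->dummy[i];
    }
    return sum;
}

/* Write the checksum as six octal digits, a null, then a space. */
static void store_checksum(union hblock *hb)
{
    unsigned long sum = header_checksum(hb);

    snprintf(hb->dbuf.chksum, 7, "%06lo", sum); /* digits + null */
    hb->dbuf.chksum[7] = ' ';
}

/* Compare the stored checksum with a freshly computed one. */
static int checksum_ok(const union hblock *hb)
{
    return parse_octal(hb->dbuf.chksum, sizeof hb->dbuf.chksum)
           == header_checksum(hb);
}

/* Number of content blocks assumed to follow a header for a file of
 * the given size in bytes. */
static long data_blocks(long size)
{
    return (size + TBLOCK - 1) / TBLOCK;
}

/* True if the block is all binary zeros (end-of-archive marker). */
static int is_zero_block(const union hblock *hb)
{
    for (size_t i = 0; i < TBLOCK; i++)
        if (hb->dummy[i] != 0)
            return 0;
    return 1;
}
```

     A reader following this sketch would read one block, stop
     after two consecutive blocks for which is_zero_block() is
     true, and otherwise check checksum_ok(), decode size with
     parse_octal(), and skip or extract data_blocks(size) blocks
     before reading the next header.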