yauzl

yet another unzip library for node. For zipping, see
yazl.

Design principles:

* Follow the spec.
  Don't scan for local file headers.
  Read the central directory for file metadata.
  (see No Streaming Unzip API).
* Don't block the JavaScript thread.
  Use and provide async APIs.
* Keep memory usage under control.
  Don't attempt to buffer entire files in RAM at once.
* Never crash (if used properly).
  Don't let malformed zip files bring down client applications that are trying to catch errors.
* Catch unsafe file names.
  See validateFileName().

## Usage

```js
var yauzl = require("yauzl");

yauzl.open("path/to/file.zip", {lazyEntries: true}, function(err, zipfile) {
  if (err) throw err;
  zipfile.readEntry();
  zipfile.on("entry", function(entry) {
    if (/\/$/.test(entry.fileName)) {
      // Directory file names end with '/'.
      // Note that entries for directories themselves are optional.
      // An entry's fileName implicitly requires its parent directories to exist.
      zipfile.readEntry();
    } else {
      // file entry
      zipfile.openReadStream(entry, function(err, readStream) {
        if (err) throw err;
        readStream.on("end", function() {
          zipfile.readEntry();
        });
        readStream.pipe(somewhere);
      });
    }
  });
});
```

See also examples/ for more usage examples.

## API

The default for every optional callback parameter is:

```js
function defaultCallback(err) {
  if (err) throw err;
}
```

### open(path, [options], [callback])

Calls fs.open(path, "r") and reads the fd effectively the same as fromFd() would.

options may be omitted or null. The defaults are {autoClose: true, lazyEntries: false, decodeStrings: true, validateEntrySizes: true, strictFileNames: false}.

autoClose is effectively equivalent to:

```js
zipfile.once("end", function() {
  zipfile.close();
});
```

lazyEntries indicates that entries should be read only when readEntry() is called. If lazyEntries is false, entry events will be emitted as fast as possible to allow pipe()ing file data from all entries in parallel. This is not recommended, as it can lead to out of control memory usage for zip files with many entries. See issue #22. If lazyEntries is true, an entry or end event will be emitted in response to each call to readEntry(). This allows processing of one entry at a time, and will keep memory usage under control for zip files with many entries.

decodeStrings is enabled by default and causes yauzl to decode strings with CP437 or UTF-8 as required by the spec. The exact effects of turning this option off are:

* zipfile.comment, entry.fileName, and entry.fileComment will be Buffer objects instead of Strings.
* Any Info-ZIP Unicode Path Extra Field will be ignored. See extraFields.
* Automatic file name validation will not be performed. See validateFileName().

validateEntrySizes is enabled by default and ensures that an entry's reported uncompressed size matches its actual uncompressed size. This check happens as early as possible, which is either before emitting each "entry" event (for entries with no compression), or during the readStream piping after calling openReadStream(). See openReadStream() for more information on defending against zip bomb attacks.

When strictFileNames is false (the default) and decodeStrings is true, all backslash (\) characters in each entry.fileName are replaced with forward slashes (/). The spec forbids file names with backslashes, but Microsoft's System.IO.Compression.ZipFile class in .NET versions 4.5.0 until 4.6.1 creates non-conformant zipfiles with backslashes in file names. strictFileNames is false by default so that clients can read these non-conformant zipfiles without knowing about this Microsoft-specific bug. When strictFileNames is true and decodeStrings is true, entries with backslashes in their file names will result in an error. See validateFileName(). When decodeStrings is false, strictFileNames has no effect.

The callback is given the arguments (err, zipfile). An err is provided if the End of Central Directory Record cannot be found, or if its metadata appears malformed. This kind of error usually indicates that this is not a zip file. Otherwise, zipfile is an instance of ZipFile.

### fromFd(fd, [options], [callback])

Reads from the fd, which is presumed to be an open .zip file. Note that random access is required by the zip file specification, so the fd cannot be an open socket or any other fd that does not support random access.

options may be omitted or null. The defaults are {autoClose: false, lazyEntries: false, decodeStrings: true, validateEntrySizes: true, strictFileNames: false}.

See open() for the meaning of the options and callback.

### fromBuffer(buffer, [options], [callback])

Like fromFd(), but reads from a RAM buffer instead of an open file. buffer is a Buffer.

If a ZipFile is acquired from this method, it will never emit the close event, and calling close() is not necessary.

options may be omitted or null. The defaults are {lazyEntries: false, decodeStrings: true, validateEntrySizes: true, strictFileNames: false}.

See open() for the meaning of the options and callback. The autoClose option is ignored for this method.

### fromRandomAccessReader(reader, totalSize, [options], [callback])

This method of reading a zip file allows clients to implement their own back-end file system. For example, a client might translate read calls into network requests.

The reader parameter must be of a type that is a subclass of RandomAccessReader that implements the required methods. The totalSize is a Number and indicates the total file size of the zip file.

options may be omitted or null. The defaults are {autoClose: true, lazyEntries: false, decodeStrings: true, validateEntrySizes: true, strictFileNames: false}.

See open() for the meaning of the options and callback.

### dosDateTimeToDate(date, time)

Deprecated. Since yauzl 3.2.0, it is highly recommended to call entry.getLastModDate() instead of this function due to enhanced support for reading third-party extra fields. If you ever have a use case for calling this function directly, please open an issue against yauzl requesting that this function be properly supported again.

This function only remains exported in order to maintain compatibility with older versions of yauzl. It will be removed in yauzl 4.0.0 unless someone asks for it to remain supported.

### getFileNameLowLevel(generalPurposeBitFlag, fileNameBuffer, extraFields, strictFileNames)

If you are setting decodeStrings to false, then this function can be used to decode the file name yourself. This function is effectively used internally by yauzl to populate the entry.fileName field when decodeStrings is true.

WARNING: This method of getting the file name bypasses the security checks in validateFileName(). You should call that function yourself to be sure to guard against malicious file paths.

generalPurposeBitFlag can be found on an Entry or LocalFileHeader. Only General Purpose Bit 11 is used, and only when an Info-ZIP Unicode Path Extra Field cannot be found in extraFields.

fileNameBuffer is a Buffer representing the file name field of the entry. This is entry.fileNameRaw or localFileHeader.fileName.

extraFields is the parsed extra fields array from entry.extraFields or parseExtraFields().

strictFileNames is a boolean, the same as the option of the same name in open(). When false, backslash characters (\) will be replaced with forward slash characters (/).

This function always returns a string, although it may not be a valid file name. See validateFileName().

### validateFileName(fileName)

Returns null or a String error message depending on the validity of fileName. If fileName starts with "/" or /[A-Za-z]:\// or if it contains ".." path segments or "\\", this function returns an error message appropriate for use like this:

```js
var errorMessage = yauzl.validateFileName(fileName);
if (errorMessage != null) throw new Error(errorMessage);
```

This function is automatically run for each entry, as long as decodeStrings is true. See open(), strictFileNames, and Event: "entry" for more information.
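The rules listed above can be sketched as a standalone function. This is only an illustration of the documented checks, not yauzl's actual implementation; the function name and the wording of the messages are assumptions.

```js
// Hypothetical sketch of the documented checks; not yauzl's own code.
// Returns null for an acceptable name, or a String describing the problem.
function checkFileNameSketch(fileName) {
  // Backslashes are forbidden by the spec.
  if (fileName.indexOf("\\") !== -1) {
    return "invalid characters in fileName: " + fileName;
  }
  // Absolute paths: a leading "/" or a drive prefix such as "C:/".
  if (/^\//.test(fileName) || /^[A-Za-z]:\//.test(fileName)) {
    return "absolute path: " + fileName;
  }
  // ".." as a whole path segment could escape the extraction directory.
  if (fileName.split("/").indexOf("..") !== -1) {
    return "invalid relative path: " + fileName;
  }
  return null;
}
```

Note that a name such as "a..b/c" passes, because ".." only counts when it is an entire path segment.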

### parseExtraFields(extraFieldBuffer)

This function is used internally by yauzl to compute entry.extraFields. It is exported in case you want to call it on localFileHeader.extraField.

extraFieldBuffer is a Buffer, such as localFileHeader.extraField. Returns an Array with each item in the form {id: id, data: data}, where id is a Number and data is a Buffer. Throws an Error if the data encodes an item with a size that exceeds the bounds of the buffer.

You may want to surround calls to this function with try { ... } catch (err) { ... } to handle the error.
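To make the return shape and the error case concrete, here is a minimal sketch of such a parser. It relies on the zip spec's extra field layout (a 2-byte little-endian header ID, a 2-byte little-endian data size, then the data). The function name and the error message are assumptions, and this is not yauzl's own code.

```js
// Hypothetical sketch; mirrors the documented behavior, not yauzl's source.
function parseExtraFieldsSketch(extraFieldBuffer) {
  var fields = [];
  var i = 0;
  // Each item needs at least a 4-byte header (id + size).
  // Fewer than 4 trailing bytes are ignored in this sketch.
  while (i + 4 <= extraFieldBuffer.length) {
    var id = extraFieldBuffer.readUInt16LE(i);
    var size = extraFieldBuffer.readUInt16LE(i + 2);
    var start = i + 4;
    var end = start + size;
    if (end > extraFieldBuffer.length) {
      throw new Error("extra field length exceeds extra field buffer size");
    }
    fields.push({id: id, data: extraFieldBuffer.subarray(start, end)});
    i = end;
  }
  return fields;
}

// Usage: wrap the call in try/catch, as suggested above.
var parsed;
try {
  parsed = parseExtraFieldsSketch(Buffer.from([0x01, 0x00, 0x02, 0x00, 0xaa, 0xbb]));
} catch (err) {
  parsed = [];
}
```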

### Class: ZipFile

The constructor for the class is not part of the public API. Use open(), fromFd(), fromBuffer(), or fromRandomAccessReader() instead.

#### Event: "entry"

Callback gets (entry), which is an Entry. See open() and readEntry() for when this event is emitted.

If decodeStrings is true, entries emitted via this event have already passed file name validation. See validateFileName() and open() for more information.

If validateEntrySizes is true and this entry's compressionMethod is 0 (stored without compression), this entry has already passed entry size validation. See open() for more information.

#### Event: "end"

Emitted after the last entry event has been emitted. See open() and readEntry() for more info on when this event is emitted.

#### Event: "close"

Emitted after the fd is actually closed. This is after calling close() (or after the end event when autoClose is true), and after all stream pipelines created from openReadStream() have finished reading data from the fd.

If this ZipFile was acquired from fromRandomAccessReader(), the "fd" in the previous paragraph refers to the RandomAccessReader implemented by the client.

If this ZipFile was acquired from fromBuffer(), this event is never emitted.

#### Event: "error"

Emitted in the case of errors with reading the zip file. (Note that other errors can be emitted from the streams created from openReadStream() as well.) After this event has been emitted, no further entry, end, or error events will be emitted, but the close event may still be emitted.

#### readEntry()

Causes this ZipFile to emit an entry or end event (or an error event). This method must only be called when this ZipFile was created with the lazyEntries option set to true (see open()). When this ZipFile was created with the lazyEntries option set to true, entry and end events are only ever emitted in response to this method call.

The event that is emitted in response to this method will not be emitted until after this method has returned, so it is safe to call this method before attaching event listeners.

After calling this method, calling this method again before the response event has been emitted will cause undefined behavior. Calling this method after the end event has been emitted will cause undefined behavior. Calling this method after calling close() will cause undefined behavior.

#### openReadStream(entry, [options], callback)

entry must be an Entry object from this ZipFile. callback gets (err, readStream), where readStream is a Readable Stream that provides the file data for this entry. If this zipfile is already closed (see close()), the callback will receive an err.

options may be omitted or null, and has the following defaults:

```js
{
  decompress: entry.isCompressed() ? true : null,
  decrypt: null,
  start: 0,                  // actually the default is null, see below
  end: entry.compressedSize, // actually the default is null, see below
}
```

If the entry is compressed (with a supported compression method), and the decompress option is true (or omitted), the read stream provides the decompressed data. Omitting the decompress option is what most clients should do.

The decompress option must be null (or omitted) when the entry is not compressed (see isCompressed()), and either true (or omitted) or false when the entry is compressed. Specifying decompress: false for a compressed entry causes the read stream to provide the raw compressed file data without going through a zlib inflate transform.

If the entry is encrypted (see isEncrypted()), clients may want to avoid calling openReadStream() on the entry entirely. Alternatively, clients may call openReadStream() for encrypted entries and specify decrypt: false. If the entry is also compressed, clients must also specify decompress: false. Specifying decrypt: false for an encrypted entry causes the read stream to provide the raw, still-encrypted file data. (This data includes the 12-byte header described in the spec.)

The decrypt option must be null (or omitted) for non-encrypted entries, and false for encrypted entries. Omitting the decrypt option (or specifying it as null) for an encrypted entry will result in the callback receiving an err. This default behavior is so that clients not accounting for encrypted files aren't surprised by bogus file data.

The start (inclusive) and end (exclusive) options are byte offsets into this entry's file data, and can be used to obtain part of an entry's file data rather than the whole thing. If either of these options is specified and non-null, then the above options must be used to obtain the file's raw data. Specifying {start: 0, end: entry.compressedSize} will result in the complete file, which is effectively the default values for these options, but note that unlike omitting the options, when you specify start or end as any non-null value, the above requirement is still enforced that you must also pass the appropriate options to get the file's raw data.

It's possible for the readStream provided to the callback to emit errors for several reasons. For example, if zlib cannot decompress the data, the zlib error will be emitted from the readStream. Two more error cases (when validateEntrySizes is true) are if the decompressed data has too many or too few actual bytes compared to the reported byte count from the entry's uncompressedSize field. yauzl notices this false information and emits an error from the readStream after some number of bytes have already been piped through the stream.

This check allows clients to trust the uncompressedSize field in Entry objects. Guarding against zip bomb attacks can be accomplished by doing some heuristic checks on the size metadata and then watching out for the above errors. Such heuristics are outside the scope of this library, but enforcing the uncompressedSize is implemented here as a security feature.
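As an illustration of the kind of heuristic meant here, a client might reject entries based on their size metadata before opening a read stream. The limits, the function name, and the messages below are all assumptions chosen for the example; yauzl provides nothing like this itself.

```js
// Hypothetical limits; tune them for your application.
var MAX_UNCOMPRESSED_SIZE = 100 * 1024 * 1024; // 100 MiB per entry
var MAX_COMPRESSION_RATIO = 100;               // uncompressed / compressed

// Returns null if the size metadata looks acceptable, or a String reason.
// `entry` only needs the compressedSize and uncompressedSize fields.
function checkEntrySizesSketch(entry) {
  if (entry.uncompressedSize > MAX_UNCOMPRESSED_SIZE) {
    return "entry too large: " + entry.uncompressedSize + " bytes";
  }
  if (entry.compressedSize === 0) {
    // No compressed bytes cannot legitimately expand into data.
    return entry.uncompressedSize === 0 ? null : "inconsistent size metadata";
  }
  if (entry.uncompressedSize / entry.compressedSize > MAX_COMPRESSION_RATIO) {
    return "suspicious compression ratio";
  }
  return null;
}
```

A check like this only screens the reported sizes. It is the validateEntrySizes errors described above that catch an entry whose actual data disagrees with its metadata.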

It is possible to destroy the readStream before it has piped all of its data. To do this, call readStream.destroy(). You must unpipe() the readStream from any destination before calling readStream.destroy(). If this zipfile was created using fromRandomAccessReader(), the RandomAccessReader implementation must provide readable streams that implement a ._destroy() method according to https://nodejs.org/api/stream.html#writable_destroyerr-callback (see randomAccessReader._readStreamForRange()) in order for calls to readStream.destroy() to work in this context.

#### readLocalFileHeader(entry, [options], callback)

This is a low-level function you probably don't need to call. The intended use case is either preparing to call openReadStreamLowLevel() or simply examining the content of the local file header out of curiosity or for debugging zip file structure issues.

entry is an entry obtained from Event: "entry". An entry in this library is a file's metadata from a Central Directory Header, and this function gives the corresponding redundant data in a Local File Header.

options may be omitted or null, and has the following defaults:

```js
{
  minimal: false,
}
```

If minimal is false (or omitted or null), the callback receives a full LocalFileHeader. If minimal is true, the callback receives an object with a single property and no prototype {fileDataStart: fileDataStart}. For typical zipfile reading usecases, this field is the only one you need, and yauzl internally effectively uses the {minimal: true} option as part of openReadStream().

The callback receives (err, localFileHeaderOrAnObjectWithJustOneFieldDependingOnTheMinimalOption), where the type of the second parameter is described in the above discussion of the minimal option.

#### openReadStreamLowLevel(fileDataStart, compressedSize, relativeStart, relativeEnd, decompress, uncompressedSize, callback)

This is a low-level function available for advanced use cases. You probably want openReadStream() instead.

The intended use case for this function is calling readEntry() and readLocalFileHeader() with {minimal: true} first, and then opening the read stream at a later time, possibly after closing and reopening the entire zipfile, possibly even in a different process. The parameters are all integers and booleans, which are friendly to serialization.

* fileDataStart - from localFileHeader.fileDataStart
* compressedSize - from entry.compressedSize
* relativeStart - the resolved value of options.start from openReadStream(). Must be a non-negative integer, not null. Typically 0 to start at the beginning of the data.
* relativeEnd - the resolved value of options.end from openReadStream(). Must be a non-negative integer, not null. Typically entry.compressedSize to include all the data.
* decompress - boolean indicating whether the data should be piped through a zlib inflate stream.
* uncompressedSize - from entry.uncompressedSize. Only used when validateEntrySizes is true. If validateEntrySizes is false, this value is ignored, but must still be present, not omitted, in the arguments; you have to give it some value, even if it's null.
* callback - receives (err, readStream), the same as for openReadStream()

This low-level function does not read any metadata from the underlying storage before opening the read stream. This is both a performance feature and a safety hazard. None of the integer parameters are bounds checked. None of the validation from openReadStream() with respect to compression and encryption is done here either. Only the bounds checks from validateEntrySizes are done, because that is part of processing the stream data.
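Because every parameter is an integer or a boolean, the values can be collected into a plain object and stored as JSON until the stream is needed. The helper below is a hypothetical sketch: the function name and the object shape are assumptions, and it assumes an unencrypted entry that should be read in full.

```js
// Hypothetical helper: gather the openReadStreamLowLevel() arguments
// for reading a whole, unencrypted entry.
// `fileDataStart` comes from readLocalFileHeader(entry, {minimal: true}, ...).
function buildReadPlanSketch(entry, fileDataStart) {
  return {
    fileDataStart: fileDataStart,
    compressedSize: entry.compressedSize,
    relativeStart: 0,                         // start of the file data
    relativeEnd: entry.compressedSize,        // end of the file data
    decompress: entry.compressionMethod === 8, // same test as isCompressed()
    uncompressedSize: entry.uncompressedSize,
  };
}

// The plan survives a JSON round trip unchanged.
var plan = buildReadPlanSketch(
  {compressedSize: 10, uncompressedSize: 25, compressionMethod: 8}, 1234);
var restoredPlan = JSON.parse(JSON.stringify(plan));
```

Later, possibly in another process, the restored values would be passed in the documented order: zipfile.openReadStreamLowLevel(p.fileDataStart, p.compressedSize, p.relativeStart, p.relativeEnd, p.decompress, p.uncompressedSize, callback).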

#### close()

Causes all future calls to openReadStream() to fail, and closes the fd, if any, after all streams created by openReadStream() have emitted their end events.

If the autoClose option is set to true (see open()), this function will be called automatically effectively in response to this object's end event.

If the lazyEntries option is set to false (see open()) and this object's end event has not been emitted yet, this function causes undefined behavior. If the lazyEntries option is set to true, you can call this function instead of calling readEntry() to abort reading the entries of a zipfile.

It is safe to call this function multiple times; after the first call, successive calls have no effect. This includes situations where the autoClose option effectively calls this function for you.

If close() is never called, then the zipfile is "kept open". For zipfiles created with fromFd(), this will leave the fd open, which may be desirable. For zipfiles created with open(), this will leave the underlying fd open, thereby "leaking" it, which is probably undesirable. For zipfiles created with fromRandomAccessReader(), the reader's close() method will never be called. For zipfiles created with fromBuffer(), the close() function has no effect whether called or not.

Regardless of how this ZipFile was created, there are no resources other than those listed above that require cleanup from this function. This means it may be desirable to never call close() in some usecases.

#### isOpen

Boolean. true until close() is called; then it's false.

#### entryCount

Number. Total number of central directory records.

#### comment

String. Always decoded with CP437 per the spec.

If decodeStrings is false (see open()), this field is the undecoded Buffer instead of a decoded String.

### Class: Entry

Objects of this class represent Central Directory Records. Refer to the zipfile specification for more details about these fields.

These fields are of type Number:

* versionMadeBy
* versionNeededToExtract
* generalPurposeBitFlag
* compressionMethod
* lastModFileTime (MS-DOS format, see getLastModDate())
* lastModFileDate (MS-DOS format, see getLastModDate())
* crc32
* compressedSize
* uncompressedSize
* fileNameLength (in bytes)
* extraFieldLength (in bytes)
* fileCommentLength (in bytes)
* internalFileAttributes
* externalFileAttributes
* relativeOffsetOfLocalHeader

These fields are of type Buffer, and represent variable-length bytes before being processed:

* fileNameRaw
* extraFieldRaw
* fileCommentRaw

There are additional fields described below: fileName, extraFields, fileComment. These are the processed versions of the *Raw fields listed above. See their own sections below. (Note the inconsistency in pluralization of "field" vs "fields" in extraField, extraFields, and extraFieldRaw. Sorry about that.)

The new Entry() constructor is available for clients to call, but it's usually not useful. The constructor takes no parameters and does nothing; no fields will exist.

#### fileName

String. Following the spec, the bytes for the file name are decoded with UTF-8 if generalPurposeBitFlag & 0x800, otherwise with CP437. Alternatively, this field may be populated from the Info-ZIP Unicode Path Extra Field (see extraFields).

This field is automatically validated by validateFileName() before yauzl emits an "entry" event. If this field would contain unsafe characters, yauzl emits an error instead of an entry.

If decodeStrings is false (see open()), this field is the undecoded Buffer instead of a decoded String. Therefore, generalPurposeBitFlag and any Info-ZIP Unicode Path Extra Field are ignored. Furthermore, no automatic file name validation is performed for this file name.

#### extraFields

Array with each item in the form {id: id, data: data}, where id is a Number and data is a Buffer.

This library looks for and reads the ZIP64 Extended Information Extra Field (0x0001) in order to support ZIP64 format zip files.

This library also looks for and reads the Info-ZIP Unicode Path Extra Field (0x7075) in order to support some zipfiles that use it instead of General Purpose Bit 11 to convey UTF-8 file names. When the field is identified and verified to be reliable (see the zipfile spec), the file name in this field is stored in the fileName property, and the file name in the central directory record for this entry is ignored. Note that when decodeStrings is false, all Info-ZIP Unicode Path Extra Fields are ignored.

None of the other fields are considered significant by this library. Fields that this library reads are left unaltered in the extraFields array.

#### fileComment

String decoded with the charset indicated by generalPurposeBitFlag & 0x800 as with the fileName. (The Info-ZIP Unicode Path Extra Field has no effect on the charset used for this field.)

If decodeStrings is false (see open()), this field is the undecoded Buffer instead of a decoded String.

Prior to yauzl version 2.7.0, this field was erroneously documented as comment instead of fileComment. For compatibility with any code that uses the field name comment, yauzl creates an alias field named comment which is identical to fileComment.

#### getLastModDate([options])

Returns the modification time of the file as a JavaScript Date object. The timezone situation is a mess; read on to learn more.

Due to the zip file specification having lackluster support for specifying timestamps natively, there are several third-party extensions that add better support. yauzl supports these encodings:

1. InfoZIP "universal timestamp" extended field (0x5455 aka "UT"): signed 32-bit seconds since 1970-01-01 00:00:00Z, which supports the years 1901-2038 (partially inclusive) with 1-second precision. The value is timezone agnostic, i.e. always UTC.
2. NTFS extended field (0x000a): 64-bit signed 100-nanoseconds since 1601-01-01 00:00:00Z, which supports the approximate years 20,000BCE-20,000CE with precision rounded to 1-millisecond (due to the JavaScript Date type). The value is timezone agnostic, i.e. always UTC.
3. DOS lastModFileDate and lastModFileTime: supports the years 1980-2108 (inclusive) with 2-second precision. Timezone is interpreted either as the local timezone or UTC depending on the timezone option documented below.

If both the InfoZIP "universal timestamp" and NTFS extended fields are found, yauzl uses one of them, but which one is unspecified. If neither are found, yauzl falls back to the built-in DOS lastModFileDate and lastModFileTime. Every possible bit pattern of every encoding can be represented by a JavaScript Date object, meaning this function cannot fail (barring parameter validation), and will never return an Invalid Date object.
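The arithmetic behind the two extended encodings can be sketched as follows. This is not yauzl's code: the function names are assumptions, locating the values inside the extra fields is left out, and the NTFS conversion truncates to the millisecond where the library may round differently.

```js
// InfoZIP "universal timestamp": signed 32-bit seconds since 1970-01-01 00:00:00Z.
function utSecondsToDateSketch(seconds) {
  return new Date(seconds * 1000);
}

// NTFS: signed 64-bit count of 100-nanosecond ticks since 1601-01-01 00:00:00Z.
// `ticks` is a BigInt, for example from buffer.readBigInt64LE(offset).
var NTFS_EPOCH_MS = Date.UTC(1601, 0, 1);
function ntfsTicksToDateSketch(ticks) {
  // 10,000 ticks per millisecond; BigInt division truncates toward zero.
  var ms = Number(ticks / 10000n);
  return new Date(NTFS_EPOCH_MS + ms);
}

console.log(utSecondsToDateSketch(0).toISOString());   // 1970-01-01T00:00:00.000Z
console.log(ntfsTicksToDateSketch(0n).toISOString());  // 1601-01-01T00:00:00.000Z
```

Both results are plain UTC instants, which is why no timezone option is needed for these encodings.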

options may be omitted or null, and has the following defaults:

```js
{
  timezone: "local", // or "UTC"
  forceDosFormat: false,
}
```

Set forceDosFormat to true (and do not set timezone) to enable pre-yauzl 3.2.0 behavior where the InfoZIP "universal timestamp" and NTFS extended fields are ignored.

The timezone option is only used in the DOS fallback. If timezone is omitted, null or "local", the lastModFileDate and lastModFileTime are interpreted in the system's current timezone (using new Date(year, ...)). If timezone is "UTC", the interpretation is in UTC+00:00 (using new Date(Date.UTC(year, ...))).
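For reference, the DOS fallback can be sketched like this, using the standard MS-DOS bit layout for the two 16-bit values. The function name is an assumption and this is an illustration rather than yauzl's implementation.

```js
// Hypothetical sketch of decoding lastModFileDate / lastModFileTime.
function dosToDateSketch(date, time, timezone) {
  var day = date & 0x1f;                  // bits 0-4
  var month = ((date >> 5) & 0xf) - 1;    // bits 5-8, converted to 0-based for JS
  var year = ((date >> 9) & 0x7f) + 1980; // bits 9-15, years since 1980
  var second = (time & 0x1f) * 2;         // bits 0-4, stored in 2-second units
  var minute = (time >> 5) & 0x3f;        // bits 5-10
  var hour = (time >> 11) & 0x1f;         // bits 11-15
  if (timezone === "UTC") {
    return new Date(Date.UTC(year, month, day, hour, minute, second));
  }
  // Omitted, null, or "local": interpret in the system's current timezone.
  return new Date(year, month, day, hour, minute, second);
}

// 2020-02-29 13:45:58 encoded in MS-DOS format.
var exampleDosDate = ((2020 - 1980) << 9) | (2 << 5) | 29;
var exampleDosTime = (13 << 11) | (45 << 5) | (58 / 2);
console.log(dosToDateSketch(exampleDosDate, exampleDosTime, "UTC").toISOString());
// 2020-02-29T13:45:58.000Z
```

Out-of-range field values (such as month 15) are normalized by the Date constructor instead of being rejected, which is consistent with the statement above that every bit pattern yields a valid Date.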

The JavaScript Date object has several inherent limitations surrounding timezones. There is an ECMAScript proposal to add better timezone support to JavaScript called the Temporal API. Last I checked, it was at stage 3. https://github.com/tc39/proposal-temporal Once that new API is available and stable, better timezone handling should be possible here somehow. If you notice that the new API has become widely available, please open a feature request against this library to add support for it.

#### isEncrypted()

Returns whether this entry is encrypted with "Traditional Encryption". Effectively implemented as:

```js
return (this.generalPurposeBitFlag & 0x1) !== 0;
```

See openReadStream() for the implications of this value.

Note that "Strong Encryption" is not supported, and will result in an "error" event emitted from the ZipFile.

#### isCompressed()

Effectively implemented as:

```js
return this.compressionMethod === 8;
```

See openReadStream() for the implications of this value.

### Class: LocalFileHeader

This is a trivial class that has no methods and only the following properties. The constructor is available to call, but it doesn't do anything. See readLocalFileHeader().

See the zipfile spec for what these fields mean.

* fileDataStart - Number: inferred from fileNameLength, extraFieldLength, and this struct's position in the zipfile.
* versionNeededToExtract - Number
* generalPurposeBitFlag - Number
* compressionMethod - Number
* lastModFileTime - Number
* lastModFileDate - Number
* crc32 - Number
* compressedSize - Number
* uncompressedSize - Number
* fileNameLength - Number
* extraFieldLength - Number
* fileName - Buffer
* extraField - Buffer

Note that unlike Class: Entry, the fileName and extraField are completely unprocessed. This notably lacks Unicode and ZIP64 handling as well as any kind of safety validation on the file name. See also parseExtraFields().

Also note that if your object is missing some of these fields, make sure to read the docs on the minimal option in readLocalFileHeader().

### Class: RandomAccessReader

This class is meant to be subclassed by clients and instantiated for the fromRandomAccessReader() function.

An example implementation can be found in test/test.js.

#### randomAccessReader._readStreamForRange(start, end)

Subclasses must implement this method.

start and end are Numbers and indicate byte offsets from the start of the file. end is exclusive, so _readStreamForRange(0x1000, 0x2000) would indicate to read 0x1000 bytes. end - start will always be at least 1.

This method should return a readable stream which will be pipe()ed into another stream. It is expected that the readable stream will provide data in several chunks if necessary. If the readable stream provides too many or too few bytes, an error will be emitted. (Note that validateEntrySizes has no effect on this check, because this is a low-level API that should behave correctly regardless of the contents of the file.) Any errors emitted on the readable stream will be handled and re-emitted on the client-visible stream (returned from zipfile.openReadStream()) or provided as the err argument to the appropriate callback (for example, for fromRandomAccessReader()).

If you call readStream.destroy() on streams you get from openReadStream(), the returned stream must implement a method ._destroy() according to https://nodejs.org/api/stream.html#writable_destroyerr-callback . If you never call readStream.destroy(), then streams returned from this method do not need to implement a method ._destroy(). ._destroy() should abort any streaming that is in progress and clean up any associated resources. ._destroy() will only be called after the stream has been unpipe()d from its destination.

Note that the stream returned from this method might not be the same object that is provided by openReadStream(). The stream returned from this method might be pipe()d through one or more filter streams (for example, a zlib inflate stream).

#### randomAccessReader.read(buffer, offset, length, position, callback)

Subclasses may implement this method. The default implementation uses createReadStream() to fill the buffer.

This method should behave like fs.read().

#### randomAccessReader.close(callback)

Subclasses may implement this method. The default implementation is effectively setImmediate(callback);.

callback takes parameters (err).

This method is called once all the streams returned from _readStreamForRange() have ended, and no more _readStreamForRange() or read() requests will be issued to this object.

## How to Avoid Crashing

When a malformed zipfile is encountered, the default behavior is to crash (throw an exception). If you want to handle errors more gracefully than this, be sure to do the following:

* Provide callback parameters where they are allowed, and check the err parameter.
* Attach a listener for the error event on any ZipFile object you get from open(), fromFd(), fromBuffer(), or fromRandomAccessReader().
* Attach a listener for the error event on any stream you get from openReadStream().

Minor version updates to yauzl will not add any additional requirements to this list.
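A minimal sketch of the first two items, listing the file names in a zip file without ever throwing. The helper name is hypothetical, and the yauzl module is passed in as a parameter (an assumption made so the helper has no hard dependency). If you also call openReadStream(), attach an error listener to each stream in the same way.

```js
// Hypothetical helper. `yauzlModule` is whatever require("yauzl") returns.
// done(err, names) is called exactly once.
function listFileNamesSketch(yauzlModule, zipPath, done) {
  var finished = false;
  function finish(err, names) {
    if (finished) return;
    finished = true;
    done(err, names);
  }
  yauzlModule.open(zipPath, {lazyEntries: true}, function(err, zipfile) {
    // 1. Provide the callback and check its err parameter.
    if (err) return finish(err);
    var names = [];
    // 2. Listen for "error" on the ZipFile so a malformed entry cannot crash the process.
    zipfile.on("error", finish);
    zipfile.on("entry", function(entry) {
      names.push(entry.fileName);
      zipfile.readEntry();
    });
    zipfile.on("end", function() {
      finish(null, names);
    });
    zipfile.readEntry();
  });
}
```

Usage would look like listFileNamesSketch(require("yauzl"), "path/to/file.zip", function(err, names) { ... }), with all failures arriving through err.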

## Limitations

The automated tests for this project run on node versions 12 and up. Older versions of node are not supported.

### Files corrupted by the Mac Archive Utility are not my problem

For a lengthy discussion, see issue #69. In summary, the Mac Archive Utility is buggy when creating large zip files, and this library does not make any effort to work around the bugs. This library will attempt to interpret the zip file data at face value, which may result in errors, or even silently incomplete data. If this bothers you, that's good! Please complain to Apple. :) I have accepted that this library will simply not support that nonsense.

### No Streaming Unzip API

Due to the design of the .zip file format, it's impossible to interpret a .zip file from start to finish (such as from a readable stream) without sacrificing correctness. The Central Directory, which is the authority on the contents of the .zip file, is at the end of a .zip file, not the beginning. A streaming API would need to either buffer the entire .zip file to get to the Central Directory before interpreting anything (defeating the purpose of a streaming interface), or rely on the Local File Headers which are interspersed through the .zip file. However, the Local File Headers are explicitly denounced in the spec as being unreliable copies of the Central Directory, so trusting them would be a violation of the spec.

Any library that offers a streaming unzip API must make one of the above two compromises, which makes the library either dishonest or nonconformant (usually the latter). This library insists on correctness and adherence to the spec, and so does not offer a streaming API.

Here is a way to create a spec-conformant .zip file using the zip command line program (Info-ZIP) available in most unix-like environments, that is (nearly) impossible to parse correctly with a streaming parser:

```
$ echo -ne '\x50\x4b\x07\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' > file.txt
$ zip -q0 - file.txt | cat > out.zip
```

This .zip file contains a single file entry that uses General Purpose Bit 3, which means the Local File Header doesn't know the size of the file. Any streaming parser that encounters this situation will either immediately fail, or attempt to search for the Data Descriptor after the file's contents. The file's contents are a sequence of 16 bytes crafted to exactly mimic a valid Data Descriptor for an empty file, which will fool any parser that gets this far into thinking that the file is empty rather than containing 16 bytes. What follows the file's real contents is the file's real Data Descriptor, which will likely cause some kind of signature mismatch error for a streaming parser (if one hasn't occurred already).

By using General Purpose Bit 3 (and compression method 0), it's possible to create arbitrarily ambiguous .zip files that distract parsers with file contents that contain apparently valid .zip file metadata.
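The ambiguity can be seen by decoding the 16 bytes of file.txt as if they were a Data Descriptor. The following is a minimal sketch, assuming the non-ZIP64 layout from the spec (signature, CRC-32, compressed size, uncompressed size, each 4 bytes little-endian). The helper name parseDataDescriptor is hypothetical and is not part of yauzl.

```js
// The 16 bytes written to file.txt by the echo command above.
var fileContents = Buffer.from("504b0708000000000000000000000000", "hex");

// Hypothetical helper: interpret 16 bytes as a non-ZIP64 Data Descriptor
// (signature, crc32, compressed size, uncompressed size; all little-endian).
function parseDataDescriptor(buffer) {
  if (buffer.length < 16) return null;
  if (buffer.readUInt32LE(0) !== 0x08074b50) return null;
  return {
    crc32: buffer.readUInt32LE(4),
    compressedSize: buffer.readUInt32LE(8),
    uncompressedSize: buffer.readUInt32LE(12),
  };
}

var descriptor = parseDataDescriptor(fileContents);
console.log(descriptor);
```

Every field decodes to zero, which is exactly what a Data Descriptor for an empty file looks like, so a parser scanning forward for a descriptor has no way to tell these bytes are file contents.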

`$3`

For ZIP64, only zip files smaller than 8 PiB are supported, not the full 16 EiB range that a 64-bit integer should be able to index. This is due to the JavaScript Number type being an IEEE 754 double precision float, which represents integers exactly only up to 2^53.

The Node.js fs module probably has this same limitation.
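The 8 PiB figure corresponds to Number.MAX_SAFE_INTEGER, which is 2^53 - 1. The sketch below shows one way a 64-bit field could be read while rejecting values a Number cannot hold exactly. The helper readUInt64LESafe is hypothetical and is not how yauzl itself reads these fields.

```js
var EIGHT_PIB = 8 * Math.pow(2, 50); // 2^53 bytes

// Hypothetical helper: read an unsigned 64-bit little-endian integer,
// refusing values that a Number cannot represent exactly.
function readUInt64LESafe(buffer, offset) {
  var value = buffer.readBigUInt64LE(offset);
  if (value > BigInt(Number.MAX_SAFE_INTEGER)) {
    throw new Error("value too large for a JavaScript Number: " + value);
  }
  return Number(value);
}

var small = Buffer.alloc(8);
small.writeBigUInt64LE(BigInt(5000000000)); // needs more than 32 bits
var huge = Buffer.alloc(8, 0xff); // 2^64 - 1

console.log(Number.MAX_SAFE_INTEGER === EIGHT_PIB - 1);
console.log(readUInt64LESafe(small, 0));
```

Calling readUInt64LESafe(huge, 0) throws, because 2^64 - 1 is far beyond what a double can index precisely.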

`$3`

The spec does not allow zip file creators to put arbitrary data here, but rather reserves its use for PKWARE and mentions something about Z390. This doesn't seem useful to expose in this library, so it is ignored.

`$3`

This library does not support multi-disk zip files. The multi-disk fields in the zipfile spec were intended for a zip file to span multiple floppy disks, which probably never happens now. If the "number of this disk" field in the End of Central Directory Record is not 0, the open(), fromFd(), fromBuffer(), or fromRandomAccessReader() callback will receive an err. By extension the following zip file fields are ignored by this library and not provided to clients:

* Disk where central directory starts
* Number of central directory records on this disk
* Disk number where file starts
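As an illustration of the check described above, here is a minimal sketch that inspects the "number of this disk" field, which sits at byte offset 4 of the End of Central Directory Record. The helper checkSingleDisk and its error text are assumptions for the example, not yauzl's actual code.

```js
// Hypothetical helper, not part of yauzl's API. `eocdr` is a Buffer that
// starts at the End of Central Directory Record signature.
function checkSingleDisk(eocdr) {
  if (eocdr.readUInt32LE(0) !== 0x06054b50) {
    return new Error("invalid end of central directory record signature");
  }
  var diskNumber = eocdr.readUInt16LE(4); // "number of this disk"
  if (diskNumber !== 0) {
    return new Error("multi-disk zip files are not supported: found disk number: " + diskNumber);
  }
  return null;
}

// A minimal 22-byte record for an empty, single-disk archive.
var singleDisk = Buffer.alloc(22);
singleDisk.writeUInt32LE(0x06054b50, 0);

// The same record claiming to be disk 1 of a multi-disk archive.
var multiDisk = Buffer.from(singleDisk);
multiDisk.writeUInt16LE(1, 4);

console.log(checkSingleDisk(singleDisk));
console.log(checkSingleDisk(multiDisk).message);
```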

`$3`

You can detect when a file entry is encrypted with "Traditional Encryption" via isEncrypted(), but yauzl will not help you decrypt it. See openReadStream().

If a zip file contains file entries encrypted with "Strong Encryption", yauzl emits an error.

If the central directory is encrypted or compressed, yauzl emits an error.

`$3`

Many unzip libraries mistakenly read the Local File Header data in zip files. This data is officially defined to be redundant with the Central Directory information, and is not to be trusted. Aside from checking the signature, yauzl ignores the content of the Local File Header.

`$3`

This library provides the crc32 field of Entry objects read from the Central Directory. However, this field is not used for anything in this library.
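Clients that want integrity checking can compute the checksum themselves and compare it against entry.crc32. The following is a sketch of the standard table-driven CRC-32 (the same algorithm zip uses); the names CRC_TABLE and crc32Update are chosen for this example only.

```js
// Build the standard CRC-32 lookup table (reflected polynomial 0xEDB88320).
var CRC_TABLE = (function() {
  var table = new Uint32Array(256);
  for (var n = 0; n < 256; n++) {
    var c = n;
    for (var k = 0; k < 8; k++) {
      c = (c & 1) ? (0xedb88320 ^ (c >>> 1)) : (c >>> 1);
    }
    table[n] = c >>> 0;
  }
  return table;
})();

// Feed chunks through this function in order, starting with crc = 0.
// Returns the running checksum as an unsigned 32-bit Number.
function crc32Update(crc, chunk) {
  crc = ~crc;
  for (var i = 0; i < chunk.length; i++) {
    crc = CRC_TABLE[(crc ^ chunk[i]) & 0xff] ^ (crc >>> 8);
  }
  return ~crc >>> 0;
}

// The checksum can be accumulated chunk by chunk, as a stream would deliver it.
var crc = 0;
crc = crc32Update(crc, Buffer.from("1234"));
crc = crc32Update(crc, Buffer.from("56789"));
console.log(crc.toString(16));
```

In a real program, the chunks would come from the "data" events of the stream returned by openReadStream(), and the final value would be compared with entry.crc32 once the stream ends.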

`$3`

The field versionNeededToExtract is ignored, because this library doesn't support the complete zip file spec at any version.

`$3`

Regarding the compressionMethod field of Entry objects, only method 0 (stored with no compression) and method 8 (deflated) are supported. Any of the other 15 official methods will cause the openReadStream() callback to receive an err.
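To make the two supported methods concrete, here is a small sketch using Node's zlib module. Method 8 data is a raw DEFLATE stream with no zlib or gzip wrapper, so inflateRawSync is the matching decoder. The helper decodeFileData is hypothetical, and it works on whole buffers only for brevity; yauzl itself streams data rather than buffering it.

```js
var zlib = require("zlib");

// Hypothetical helper: turn an entry's raw file data into its original bytes.
function decodeFileData(compressionMethod, rawData) {
  switch (compressionMethod) {
    case 0: // stored: the raw data is the file contents
      return rawData;
    case 8: // deflated: a raw DEFLATE stream without zlib or gzip headers
      return zlib.inflateRawSync(rawData);
    default:
      throw new Error("unsupported compression method: " + compressionMethod);
  }
}

var original = Buffer.from("hello hello hello hello");
var deflated = zlib.deflateRawSync(original);
console.log(decodeFileData(0, original).toString());
console.log(decodeFileData(8, deflated).toString());
```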

`$3`

There may or may not be Data Descriptor sections in a zip file. This library provides no support for finding or interpreting them.

`$3`

There may or may not be an Archive Extra Data Record section in a zip file. This library provides no support for finding or interpreting it.

`$3`

Zip files officially support charset encodings other than CP437 and UTF-8, but the zip file spec does not specify how it works. This library makes no attempt to interpret the Language Encoding Flag.

`$3`

The zip file specification has several ambiguities inherent in its design. Yikes!

* The .ZIP file comment must not contain the end of central dir signature bytes 50 4b 05 06. This corresponds to the text "PK☺☻" in CP437. While this is allowed by the specification, yauzl will hopefully reject this situation with an "Invalid comment length" error. However, in some situations unpredictable incorrect behavior will ensue, which will probably manifest in either an invalid signature error or some kind of bounds check error, such as "Unexpected EOF".
* In non-ZIP64 files, the last central directory header must not have the bytes 50 4b 06 07 ("PK♠•" in CP437) exactly 20 bytes from its end, which might be in the file name, the extra field, or the file comment. The presence of these bytes indicates that this is a ZIP64 file.
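A zip file creator can guard against the first ambiguity before writing an archive. This is a minimal sketch; the helper commentIsAmbiguous is hypothetical and expects the comment as an already-encoded Buffer.

```js
var EOCD_SIGNATURE = Buffer.from([0x50, 0x4b, 0x05, 0x06]);

// Hypothetical helper for zip file creators: returns true if the encoded
// archive comment contains the End of Central Directory signature bytes.
function commentIsAmbiguous(commentBuffer) {
  return commentBuffer.indexOf(EOCD_SIGNATURE) !== -1;
}

var safeComment = Buffer.from("created by a build script");
var unsafeComment = Buffer.concat([Buffer.from("oops: "), EOCD_SIGNATURE]);

console.log(commentIsAmbiguous(safeComment));
console.log(commentIsAmbiguous(unsafeComment));
```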

`Change History`

* 3.2.0
  * Added support for reading third-party extensions for timestamps: InfoZIP "universal timestamp" extra field and NTFS extra field. pull #160
  * entry.getLastModDate() takes options forceDosFormat to revert the above change, and timezone to allow UTC interpretation of DOS timestamps.
  * Documented dosDateTimeToDate() as now deprecated.
* 3.1.3
  * Fixed a crash when using fromBuffer() to read corrupt zip files that specify out of bounds file offsets. issue #156
  * Enhanced the test suite to run the error tests through fromBuffer() and fromRandomAccessReader() in addition to open(), which would have caught the above.
* 3.1.2
  * Fixed handling non-64 bit entries (similar to the version 3.1.1 fix) that actually have exactly 0xffffffff values in the fields. This fixes erroneous "expected zip64 extended information extra field" errors. issue #109
* 3.1.1
  * Fixed handling non-64 bit files that actually have exactly 0xffff or 0xffffffff values in End of Central Directory Record. This fixes erroneous "invalid zip64 end of central directory locator signature" errors. issue #108
  * Fixed handling of 64-bit zip files that put 0xffff or 0xffffffff in every field overridden in the Zip64 end of central directory record even if the value would have fit without overflow. In particular, this fixes an incorrect "multi-disk zip files are not supported" error. pull #118
* 3.1.0
  * Added readLocalFileHeader() and Class: LocalFileHeader.
  * Added openReadStreamLowLevel().
  * Added getFileNameLowLevel() and parseExtraFields(). Added fields to Class: Entry: fileNameRaw, extraFieldRaw, fileCommentRaw.
  * Added examples/compareCentralAndLocalHeaders.js that demonstrates many of these low level APIs.
  * Noted dropped support of node versions before 12 in the "engines" field of package.json.
  * Fixed a crash when calling openReadStream() with an explicitly null options parameter (as opposed to omitted).
* 3.0.0
  * BREAKING CHANGE: implementations of RandomAccessReader that implement a destroy method must instead implement _destroy in accordance with the node standard https://nodejs.org/api/stream.html#writable_destroyerr-callback (note the error and callback parameters). If you continue to override destroy instead, some error handling may be subtly broken. Additionally, this is required for async iterators to work correctly in some versions of node. issue #110
  * BREAKING CHANGE: Drop support for node versions older than 12.
  * Maintenance: Fix buffer deprecation warning by bundling fd-slicer with a 1-line change, rather than depending on it. issue #114
  * Maintenance: Upgrade bl dependency; add package-lock.json; drop deprecated istanbul dependency. This resolves all security warnings for this project. pull #125
  * Maintenance: Replace broken Travis CI with GitHub Actions. pull #148
  * Maintenance: Fixed a long-standing issue in the test suite where a premature exit would incorrectly signal success.
  * Officially gave up on supporting Mac Archive Utility corruption in order to rescue my motivation for this project. issue #69
* 2.10.0
  * Added support for non-conformant zipfiles created by Microsoft, and added option strictFileNames to disable the workaround. issue #66, issue #88
* 2.9.2
  * Removed tools/hexdump-zip.js and tools/hex2bin.js. Those tools are now located here: thejoshwolfe/hexdump-zip and thejoshwolfe/hex2bin
  * Worked around performance problem with zlib when using fromBuffer() and readStream.destroy() for large compressed files. issue #87
* 2.9.1
  * Removed console.log() accidentally introduced in 2.9.0. issue #64
* 2.9.0
  * Throw an exception if readEntry() is called without lazyEntries:true. Previously this caused undefined behavior. issue #63
* 2.8.0
  * Added option validateEntrySizes. issue #53
  * Added examples/promises.js
  * Added ability to read raw file data via decompress and decrypt options. issue #11, issue #38, pull #39
  * Added start and end options to openReadStream(). issue #38
* 2.7.0
  * Added option decodeStrings. issue #42
  * Fixed documentation for entry.fileComment and added compatibility alias. issue #47
* 2.6.0
  * Support Info-ZIP Unicode Path Extra Field, used by WinRAR for Chinese file names. issue #33
* 2.5.0
  * Ignore malformed Extra Field that is common in Android .apk files. issue #31
* 2.4.3
  * Fix crash when parsing malformed Extra Field buffers. issue #31
* 2.4.2
  * Remove .npmignore and .travis.yml from npm package.
* 2.4.1
  * Fix error handling.
* 2.4.0
  * Add ZIP64 support. issue #6
  * Add lazyEntries option. issue #22
  * Add readStream.destroy() method. issue #26
  * Add fromRandomAccessReader(). issue #14
  * Add examples/unzip.js.
* 2.3.1
  * Documentation updates.
* 2.3.0
  * Check that uncompressedSize is correct, or else emit an error. issue #13
* 2.2.1
  * Update dependencies.
* 2.2.0
  * Update dependencies.
* 2.1.0
  * Remove dependency on iconv.
* 2.0.3
  * Fix crash when trying to read a 0-byte file.
* 2.0.2
  * Fix event behavior after errors.
* 2.0.1
  * Fix bug with using iconv.
* 2.0.0
  * Initial release.

`Development`

One of the trickiest things in development is crafting test cases located in test/{success,failure}/. These are zip files that have been specifically generated or designed to test certain conditions in this library. I recommend using hexdump-zip to examine the structure of a zipfile.

For making new error cases, I typically start by copying test/success/linux-info-zip.zip, and then editing a few bytes with a hex editor.


decodeStrings is the default and causes yauzl to decode strings with CP437 or UTF-8 as required by the spec. The exact effects of turning this option off are:

`$3`

options may be omitted or null. The defaults are {autoClose: false, lazyEntries: false, decodeStrings: true, validateEntrySizes: true, strictFileNames: false}.

See open() for the meaning of the options and callback.

`$3`

Like fromFd(), but reads from a RAM buffer instead of an open file. buffer is a Buffer.

If a ZipFile is acquired from this method, it will never emit the close event, and calling close() is not necessary.

options may be omitted or null. The defaults are {lazyEntries: false, decodeStrings: true, validateEntrySizes: true, strictFileNames: false}.

See open() for the meaning of the options and callback. The autoClose option is ignored for this method.

`$3`

This method of reading a zip file allows clients to implement their own back-end file system. For example, a client might translate read calls into network requests.

The reader parameter must be of a type that is a subclass of RandomAccessReader that implements the required methods. The totalSize is a Number and indicates the total file size of the zip file.

options may be omitted or null. The defaults are {autoClose: true, lazyEntries: false, decodeStrings: true, validateEntrySizes: true, strictFileNames: false}.

See open() for the meaning of the options and callback.

`$3`

This function only remains exported in order to maintain compatibility with older versions of yauzl. It will be removed in yauzl 4.0.0 unless someone asks for it to remain supported.

`$3`

WARNING: This method of getting the file name bypasses the security checks in validateFileName(). You should call that function yourself to be sure to guard against malicious file paths.

generalPurposeBitFlag can be found on an Entry or LocalFileHeader. Only General Purpose Bit 11 is used, and only when an Info-ZIP Unicode Path Extra Field cannot be found in extraFields.

fileNameBuffer is a Buffer representing the file name field of the entry. This is entry.fileNameRaw or localFileHeader.fileName.

extraFields is the parsed extra fields array from entry.extraFields or parseExtraFields().

strictFileNames is a boolean, the same as the option of the same name in open(). When false, backslash characters (\) will be replaced with forward slash characters (/).

This function always returns a string, although it may not be a valid file name. See validateFileName().

`$3`

```js
var errorMessage = yauzl.validateFileName(fileName);
if (errorMessage != null) throw new Error(errorMessage);
```

This function is automatically run for each entry, as long as decodeStrings is true. See open(), strictFileNames, and Event: "entry" for more information.

`$3`

This function is used internally by yauzl to compute entry.extraFields. It is exported in case you want to call it on localFileHeader.extraField.

You may want to surround calls to this function with try { ... } catch (err) { ... } to handle the error.

`$3`

The constructor for the class is not part of the public API. Use open(), fromFd(), fromBuffer(), or fromRandomAccessReader() instead.

#### Event: "entry"

Callback gets (entry), which is an Entry. See open() and readEntry() for when this event is emitted.

If decodeStrings is true, entries emitted via this event have already passed file name validation. See validateFileName() and open() for more information.

If validateEntrySizes is true and this entry's compressionMethod is 0 (stored without compression), this entry has already passed entry size validation. See open() for more information.

#### Event: "end"

Emitted after the last entry event has been emitted. See open() and readEntry() for more info on when this event is emitted.

#### Event: "close"

If this ZipFile was acquired from fromRandomAccessReader(), the "fd" in the previous paragraph refers to the RandomAccessReader implemented by the client.

If this ZipFile was acquired from fromBuffer(), this event is never emitted.

#### Event: "error"

#### readEntry()

The event that is emitted in response to this method will not be emitted until after this method has returned, so it is safe to call this method before attaching event listeners.
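The ordering guarantee can be illustrated with a toy model. The ToyZipFile class below is a hypothetical stand-in built on Node's EventEmitter, not yauzl's implementation; it only shows why deferring the emit makes it safe to call readEntry() before attaching listeners.

```js
var EventEmitter = require("events");

// Toy stand-in for a ZipFile; not yauzl's implementation.
function ToyZipFile(names) {
  EventEmitter.call(this);
  this.names = names.slice();
}
ToyZipFile.prototype = Object.create(EventEmitter.prototype);
ToyZipFile.prototype.constructor = ToyZipFile;
ToyZipFile.prototype.readEntry = function() {
  var self = this;
  // Defer the emit so that listeners attached after this call still fire.
  setImmediate(function() {
    if (self.names.length === 0) return self.emit("end");
    self.emit("entry", {fileName: self.names.shift()});
  });
};

var seen = [];
var toy = new ToyZipFile(["a.txt", "b.txt"]);
toy.readEntry(); // called before any listener is attached
toy.on("entry", function(entry) {
  seen.push(entry.fileName);
  toy.readEntry();
});
toy.on("end", function() {
  console.log(seen);
});
```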

#### openReadStream(entry, [options], callback)

options may be omitted or null, and has the following defaults:

```js
{
  decompress: entry.isCompressed() ? true : null,
  decrypt: null,
  start: 0, // actually the default is null, see below
  end: entry.compressedSize, // actually the default is null, see below
}
```

#### readLocalFileHeader(entry, [options], callback)

options may be omitted or null, and has the following defaults:

```js
{
  minimal: false,
}
```

#### openReadStreamLowLevel(fileDataStart, compressedSize, relativeStart, relativeEnd, decompress, uncompressedSize, callback)

This is a low-level function available for advanced use cases. You probably want openReadStream() instead.

#### close()

Causes all future calls to openReadStream() to fail, and closes the fd, if any, after all streams created by openReadStream() have emitted their end events.

If the autoClose option is set to true (see open()), this function is called automatically, effectively in response to this object's end event.

It is safe to call this function multiple times; after the first call, successive calls have no effect. This includes situations where the autoClose option effectively calls this function for you.

#### isOpen

Boolean. true until close() is called; then it's false.

#### entryCount

Number. Total number of central directory records.

#### comment

String. Always decoded with CP437 per the spec.

If decodeStrings is false (see open()), this field is the undecoded Buffer instead of a decoded String.

`$3`

Objects of this class represent Central Directory Records. Refer to the zipfile specification for more details about these fields.

These fields are of type Number:

These fields are of type Buffer, and represent variable-length bytes before being processed:

* fileNameRaw
* extraFieldRaw
* fileCommentRaw

The new Entry() constructor is available for clients to call, but it's usually not useful. The constructor takes no parameters and does nothing; no fields will exist.

#### fileName

This field is automatically validated by validateFileName() before yauzl emits an "entry" event. If this field would contain unsafe characters, yauzl emits an error instead of an entry.

#### extraFields

Array with each item in the form {id: id, data: data}, where id is a Number and data is a Buffer.

This library looks for and reads the ZIP64 Extended Information Extra Field (0x0001) in order to support ZIP64 format zip files.

None of the other fields are considered significant by this library. Fields that this library reads are left unaltered in the extraFields array.
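For reference, the raw extra field is a sequence of records, each made of a 2-byte id, a 2-byte data size, and then the data, all little-endian. The sketch below shows how such a buffer maps onto the {id, data} items described above. The function parseExtraFieldsSketch is written for this example and may differ from yauzl's exported parseExtraFields() in its details and error messages.

```js
// Sketch of the extra field layout; yauzl exports its own parseExtraFields().
function parseExtraFieldsSketch(extraFieldBuffer) {
  var extraFields = [];
  var i = 0;
  while (i < extraFieldBuffer.length - 3) {
    var id = extraFieldBuffer.readUInt16LE(i + 0);
    var size = extraFieldBuffer.readUInt16LE(i + 2);
    var start = i + 4;
    var end = start + size;
    if (end > extraFieldBuffer.length) {
      throw new Error("extra field length exceeds extra field buffer size");
    }
    extraFields.push({id: id, data: extraFieldBuffer.subarray(start, end)});
    i = end;
  }
  return extraFields;
}

// One ZIP64 Extended Information Extra Field (0x0001) holding an 8-byte value.
var zip64Field = Buffer.alloc(4 + 8);
zip64Field.writeUInt16LE(0x0001, 0);
zip64Field.writeUInt16LE(8, 2);
zip64Field.writeUInt32LE(5, 4); // low 32 bits; the high 32 bits stay zero

var parsed = parseExtraFieldsSketch(zip64Field);
console.log(parsed[0].id, parsed[0].data.length);
```

A record whose declared size runs past the end of the buffer makes the sketch throw, which is why calls of this kind are worth wrapping in try { ... } catch (err) { ... }.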

#### fileComment

String decoded with the charset indicated by generalPurposeBitFlag & 0x800 as with the fileName. (The Info-ZIP Unicode Path Extra Field has no effect on the charset used for this field.)

If decodeStrings is false (see open()), this field is the undecoded Buffer instead of a decoded String.

#### getLastModDate([options])

Returns the modification time of the file as a JavaScript Date object. The timezone situation is a mess; read on to learn more.

Due to the zip file specification having lackluster support for specifying timestamps natively, there are several third-party extensions that add better support. yauzl supports these encodings:

options may be omitted or null, and has the following defaults:

```js
{
  timezone: "local", // or "UTC"
  forceDosFormat: false,
}
```

Set forceDosFormat to true (and do not set timezone) to enable pre-yauzl 3.2.0 behavior where the InfoZIP "universal timestamp" and NTFS extended fields are ignored.
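To show what the timezone option means for DOS timestamps, here is a minimal sketch of decoding the two 16-bit MS-DOS fields. The DOS format stores wall-clock fields with no timezone, so the same bits can be read either as local time or as UTC. The helper dosFieldsToDate is hypothetical and is not yauzl's implementation.

```js
// Hypothetical helper, not yauzl's implementation.
// `date` and `time` are the 16-bit MS-DOS fields from the Central Directory.
function dosFieldsToDate(date, time, timezone) {
  var day = date & 0x1f; // 1-31
  var month = ((date >> 5) & 0xf) - 1; // stored as 1-12, Date wants 0-11
  var year = ((date >> 9) & 0x7f) + 1980;
  var second = (time & 0x1f) * 2; // stored in 2-second units
  var minute = (time >> 5) & 0x3f;
  var hour = (time >> 11) & 0x1f;
  if (timezone === "UTC") {
    return new Date(Date.UTC(year, month, day, hour, minute, second));
  }
  return new Date(year, month, day, hour, minute, second);
}

// Encode 2020-01-02 03:04:10 into DOS fields.
var dosDate = ((2020 - 1980) << 9) | (1 << 5) | 2;
var dosTime = (3 << 11) | (4 << 5) | (10 / 2);

console.log(dosFieldsToDate(dosDate, dosTime, "UTC").toISOString());
console.log(dosFieldsToDate(dosDate, dosTime, "local").toString());
```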

#### isEncrypted()

Returns whether this entry is encrypted with "Traditional Encryption". Effectively implemented as:

```js
return (this.generalPurposeBitFlag & 0x1) !== 0;
```

See openReadStream() for the implications of this value.

Note that "Strong Encryption" is not supported, and will result in an "error" event emitted from the ZipFile.

#### isCompressed()

Effectively implemented as:

```js
return this.compressionMethod === 8;
```

See openReadStream() for the implications of this value.

`$3`

This is a trivial class that has no methods and only the following properties. The constructor is available to call, but it doesn't do anything. See readLocalFileHeader().

See the zipfile spec for what these fields mean.

Also note that if your object is missing some of these fields, make sure to read the docs on the minimal option in readLocalFileHeader().

`$3`

This class is meant to be subclassed by clients and instantiated for the fromRandomAccessReader() function.

An example implementation can be found in test/test.js.

#### randomAccessReader._readStreamForRange(start, end)

Subclasses must implement this method.
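As a rough illustration, the function below could serve as the body of _readStreamForRange() for a reader whose backing store is a Buffer already in memory. It is a sketch under two assumptions that this section does not state: that end is an exclusive byte offset, and that returning any readable stream of the requested bytes is sufficient. The name readStreamForBufferRange is hypothetical.

```js
var stream = require("stream");

// Hypothetical helper: returns a readable stream of buffer[start, end).
// Assumes `end` is exclusive.
function readStreamForBufferRange(buffer, start, end) {
  var readStream = new stream.PassThrough();
  readStream.end(buffer.subarray(start, end));
  return readStream;
}

var backing = Buffer.from("0123456789");
var chunks = [];
readStreamForBufferRange(backing, 2, 6)
  .on("data", function(chunk) { chunks.push(chunk); })
  .on("end", function() {
    console.log(Buffer.concat(chunks).toString());
  });
```

A client translating reads into network requests would return the response body stream here instead of a PassThrough.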

#### randomAccessReader.read(buffer, offset, length, position, callback)

Subclasses may implement this method. The default implementation uses createReadStream() to fill the buffer.

This method should behave like fs.read().

#### randomAccessReader.close(callback)

Subclasses may implement this method. The default implementation is effectively setImmediate(callback);.

callback takes parameters (err).

This method is called once all the streams returned from _readStreamForRange() have ended, and no more _readStreamForRange() or read() requests will be issued to this object.

`How to Avoid Crashing`

When a malformed zipfile is encountered, the default behavior is to crash (throw an exception). If you want to handle errors more gracefully than this, be sure to do the following:

Minor version updates to yauzl will not add any additional requirements to this list.

`Limitations`

The automated tests for this project run on node versions 12 and up. Older versions of node are not supported.
