Pack200 and Compression
To increase server and network availability and bandwidth, two new compression formats are available for the deployment of Java applications and applets: gzip and Pack200.
With both techniques the compressed JAR files are transmitted over the network and the receiving application decompresses and restores them.
The HTTP 1.1 protocol (RFC 2616) discusses HTTP compression. HTTP compression allows an application's JAR files to be deployed as compressed JAR files. The supported compression techniques are gzip, compress, and deflate.
As of SDK/JRE version 5.0, HTTP compression is implemented in Java Web Start
and Java Plug-in in compliance with RFC 2616. The supported techniques are
gzip and pack200-gzip.
The requesting application sends an HTTP request to the server. An HTTP request has multiple fields. The Accept-Encoding (AE) field is set to pack200-gzip or gzip, indicating to the server that the application can handle the pack200-gzip or gzip format.
The server implementation will search for the requested JAR file with the .pack.gz or .gz file extension and respond with the located file. The server will set the response header Content-Encoding (CE) field to pack200-gzip, gzip, or NULL depending on the type of file that is being sent, and optionally may set the Content-Type (CT) to application/java-archive. Therefore, by inspecting the CE field, the requesting application can apply the corresponding transformation to restore the original JAR file.
The above can be achieved using a simple servlet or server module with any HTTP 1.1 compliant web server. Compressing files on the fly will degrade server performance, especially with Pack200, and is therefore not recommended.
Sample Tomcat Servlet:
/**
 * A simple HTTP Compression Servlet
 */
import java.util.*;
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
import java.util.zip.*;
import java.net.*;

/**
 * The servlet class.
 */
public class ContentType extends HttpServlet {

    private static final String JNLP_MIME_TYPE = "application/x-java-jnlp-file";
    private static final String JAR_MIME_TYPE = "application/x-java-archive";
    private static final String PACK200_MIME_TYPE = "application/x-java-pack200";

    // HTTP Compression RFC 2616 : Standard headers
    public static final String ACCEPT_ENCODING = "accept-encoding";
    public static final String CONTENT_TYPE = "content-type";
    public static final String CONTENT_ENCODING = "content-encoding";

    // HTTP Compression RFC 2616 : Standard header for HTTP/Pack200 Compression
    public static final String GZIP_ENCODING = "gzip";
    public static final String PACK200_GZIP_ENCODING = "pack200-gzip";

    private void sendHtml(HttpServletResponse response, String s)
            throws IOException {
        PrintWriter out = response.getWriter();
        out.println("<html>");
        out.println("<head>");
        out.println("<title>ContentType</title>");
        out.println("</head>");
        out.println("<body>");
        out.println(s);
        out.println("</body>");
        out.println("</html>");
    }

    /*
     * Copy the input stream to the output stream, then close both.
     */
    private void sendOut(InputStream in, OutputStream ostream)
            throws IOException {
        byte buf[] = new byte[8192];
        int n = in.read(buf);
        while (n > 0) {
            ostream.write(buf, 0, n);
            n = in.read(buf);
        }
        ostream.close();
        in.close();
    }

    /*
     * If the named file exists, set the length and date headers for it.
     */
    boolean doFile(String name, HttpServletResponse response) {
        File f = new File(name);
        if (f.exists()) {
            getServletContext().log("Found file " + name);
            response.setContentLength(Integer.parseInt(Long.toString(f.length())));
            response.setDateHeader("Last-Modified", f.lastModified());
            return true;
        }
        getServletContext().log("File not found " + name);
        return false;
    }

    /**
     * Called when someone accesses the servlet.
     */
    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException, ServletException {
        String encoding = request.getHeader(ACCEPT_ENCODING);
        String pathInfo = request.getPathInfo();
        String pathInfoEx = request.getPathTranslated();
        String contentType = request.getContentType();
        StringBuffer requestURL = request.getRequestURL();
        String requestName = pathInfo;

        ServletContext sc = getServletContext();
        sc.log("----------------------------");
        sc.log("pathInfo=" + pathInfo);
        sc.log("pathInfoEx=" + pathInfoEx);
        sc.log("Accept-Encoding=" + encoding);
        sc.log("Content-Type=" + contentType);
        sc.log("requestURL=" + requestURL);

        if (pathInfoEx == null) {
            response.sendError(HttpServletResponse.SC_NOT_FOUND);
            return;
        }

        String outFile = pathInfo;
        boolean found = false;
        String contentEncoding = null;

        // Pack200 Compression
        if (encoding != null && contentType != null
                && contentType.compareTo(JAR_MIME_TYPE) == 0
                && encoding.toLowerCase().indexOf(PACK200_GZIP_ENCODING) > -1) {
            contentEncoding = PACK200_GZIP_ENCODING;
            if (doFile(pathInfoEx.concat(".pack.gz"), response)) {
                outFile = pathInfo.concat(".pack.gz");
                found = true;
            } else {
                // Pack/Compress and transmit, not very efficient.
                found = false;
            }
        }

        // HTTP Compression
        if (found == false && encoding != null && contentType != null
                && contentType.compareTo(JAR_MIME_TYPE) == 0
                && encoding.toLowerCase().indexOf("gzip") > -1) {
            contentEncoding = GZIP_ENCODING;
            if (doFile(pathInfoEx.concat(".gz"), response)) {
                outFile = pathInfo.concat(".gz");
                found = true;
            }
        }

        // No Compression
        if (found == false) {
            // just send the file
            contentEncoding = null;
            sc.log(CONTENT_ENCODING + "=" + "null");
            doFile(pathInfoEx, response);
            outFile = pathInfo;
        }

        // Only advertise an encoding when a compressed file is being sent.
        if (contentEncoding != null) {
            response.setHeader(CONTENT_ENCODING, contentEncoding);
        }
        sc.log(CONTENT_ENCODING + "=" + contentEncoding + " : outFile=" + outFile);

        if (sc.getMimeType(pathInfo) != null) {
            response.setContentType(sc.getMimeType(pathInfo));
        }

        InputStream in = sc.getResourceAsStream(outFile);
        OutputStream out = response.getOutputStream();
        if (in != null) {
            try {
                sendOut(in, out);
            } catch (IOException ioe) {
                // getMessage() may be null, so compare from the literal side.
                if ("Broken pipe".equals(ioe.getMessage())) {
                    sc.log("Broken Pipe while writing");
                    return;
                } else {
                    throw ioe;
                }
            }
        } else {
            response.sendError(HttpServletResponse.SC_NOT_FOUND);
        }
    }
}
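On the receiving side, the requesting application inspects the Content-Encoding field to decide how to restore the original JAR file. The following is a minimal sketch of that decision, not the Java Web Start or Java Plug-in implementation. The class name ContentEncodingClient, the Restore enum, and the helper methods are hypothetical names introduced for illustration. The sketch only undoes the gzip layer; a pack200-gzip response would additionally need an unpacker such as unpack200 after that step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class ContentEncodingClient {

    /** Value a client could send in Accept-Encoding (encodings named in the text). */
    static final String ACCEPT_ENCODING_VALUE = "pack200-gzip,gzip";

    /** Transformation needed to get the original JAR back (hypothetical names). */
    enum Restore { NONE, GUNZIP, GUNZIP_THEN_UNPACK200 }

    /** Maps the Content-Encoding response header to the required transformation. */
    static Restore restoreFor(String contentEncoding) {
        if (contentEncoding == null || contentEncoding.trim().isEmpty()) {
            return Restore.NONE; // plain JAR, nothing to undo
        }
        switch (contentEncoding.trim().toLowerCase(Locale.ROOT)) {
            case "gzip":
                return Restore.GUNZIP;
            case "pack200-gzip":
                return Restore.GUNZIP_THEN_UNPACK200;
            default:
                throw new IllegalArgumentException(
                        "Unexpected Content-Encoding: " + contentEncoding);
        }
    }

    /** Undoes the gzip layer, which both compressed encodings share. */
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream in =
                     new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: gzip-compresses bytes so the sketch needs no server. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "stand-in for JAR bytes".getBytes(StandardCharsets.UTF_8);
        byte[] body = gzip(original); // what a server would send with CE: gzip
        if (restoreFor("gzip") == Restore.GUNZIP) {
            System.out.println(new String(gunzip(body), StandardCharsets.UTF_8));
        }
    }
}
```

Keeping the header-to-transformation mapping separate from the decompression code makes it easy to add the unpack step later without touching the negotiation logic.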
Pack200 compresses large files very efficiently,
depending on the density and size of the class files in the JAR file. One can
expect compression to 1/9 the size of the JAR file, if it contains only class
files and is in the order of several MB.
Using the same jar in the previous example:
Notepad.jar            46.25 kb
Notepad.jar.pack.gz    22.58 kb
In this case the same jar can be reduced by 50%.
Please note: when signing large jars, step 5 may fail with a Security Error. A likely cause is bug 5078608. Please use one of the workarounds detailed in the release notes.
Pack200 works most efficiently on Java class files. It uses several techniques to efficiently reduce the size of JAR files:
Steps to Pack a file
1. Consider the size of the JAR file, the contents of the JAR file, and the bandwidth of your target audience.
All these factors play into choosing a compression technique. The unpack200 tool is designed to be as efficient as possible, and it takes little time to restore the original file. If you have large JAR files (2 MB or more) composed mostly of class files, Pack200 is the preferred compression technique. If you have large JAR files composed mostly of resource files (JPEG, GIF, data, etc.), then gzip is the preferred compression technique.
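The guidance above can be sketched as a small decision helper. This is only an illustration under stated assumptions: the class name CompressionAdvisor is hypothetical, "mostly class files" is interpreted as more than half of the uncompressed entry bytes, and because the text gives no rule for JAR files under 2 MB the helper returns MEASURE_BOTH for them. Bandwidth of the target audience is not modeled.

```java
import java.io.File;
import java.io.IOException;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class CompressionAdvisor {

    enum Technique { PACK200, GZIP, MEASURE_BOTH }

    /** "Large" threshold taken from the text: 2 MB or more. */
    static final long LARGE_JAR_BYTES = 2L * 1024 * 1024;

    /** Assumption: "mostly class files" means more than half of the entry bytes. */
    static final double MOSTLY_CLASSES = 0.5;

    /** Applies the rule of thumb from the text to a JAR size and class-file share. */
    static Technique recommend(long jarBytes, double classFraction) {
        if (jarBytes < LARGE_JAR_BYTES) {
            return Technique.MEASURE_BOTH; // the text states no rule for small JARs
        }
        return classFraction > MOSTLY_CLASSES ? Technique.PACK200 : Technique.GZIP;
    }

    /** Fraction of uncompressed entry bytes that belong to .class files. */
    static double classFraction(JarFile jar) {
        long classBytes = 0;
        long totalBytes = 0;
        Enumeration<JarEntry> entries = jar.entries();
        while (entries.hasMoreElements()) {
            JarEntry e = entries.nextElement();
            long size = e.getSize(); // -1 when the size is unknown
            if (e.isDirectory() || size < 0) {
                continue;
            }
            totalBytes += size;
            if (e.getName().endsWith(".class")) {
                classBytes += size;
            }
        }
        return totalBytes == 0 ? 0.0 : (double) classBytes / totalBytes;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: CompressionAdvisor <file.jar>");
            return;
        }
        File f = new File(args[0]);
        try (JarFile jar = new JarFile(f)) {
            System.out.println(recommend(f.length(), classFraction(jar)));
        }
    }
}
```

In practice the result is a starting point; packing a representative JAR both ways and comparing the sizes is the more reliable check.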
2. Pack200 segmenting.
Pack200 loads the entire packed file into memory. However, when target systems are memory and resource constrained, setting Pack200.Packer.SEGMENT_LIMIT to a lower value will reduce the memory requirements during packing and unpacking. Setting Pack200.Packer.SEGMENT_LIMIT=-1 will force one segment to be generated, which will be effective in size reduction, but will require a much larger Java heap on the packing and unpacking systems. Note that several of these packed segments may be concatenated to produce a single packed file.
3. Signing the JAR files.
Pack200 rearranges the contents of the resultant JAR file. The jarsigner hashes the contents of the class file and stores the hash in an encrypted digest in the manifest. When the unpacker runs on a packed file, the contents of the classes will be rearranged and thus invalidate the signature. Therefore, the JAR file must be normalized first using pack200 and unpack200, and thereafter signed.
(Here's why this works: Any reordering the packer does of any classfile structures is idempotent, so the second packing does not change the orderings produced by the first packing. Also, the unpacker is guaranteed by the JSR 200 specification to produce a specific bytewise image for any given transmission ordering of archive elements.)
An Example
Suppose you wish to use HelloWorld.jar.
Step 1: Repack the file to normalize the jar. Repacking calls the packer and unpacks the file in one step.
% pack200 --repack HelloWorld.jar
Step 2: Sign the jar after we normalize using repack.
% jarsigner -keystore myKeystore HelloWorld.jar ksrini
Verify the just signed jar to ensure the signing worked.
% jarsigner -verify HelloWorld.jar
jar verified.
Ensure the jar still works.
% java -jar HelloWorld.jar
HelloWorld
Step 3: Now we pack the file.
% pack200 HelloWorld.jar.pack.gz HelloWorld.jar
Step 4: Unpack the file.
% unpack200 HelloWorld.jar.pack.gz HelloT1.jar
Step 5: Verify the jar.
% jarsigner -verify HelloT1.jar
jar verified.
// Test the jar ...
% java -jar HelloT1.jar
HelloWorld
After verification, the compressed pack file HelloWorld.jar.pack.gz can be deployed.
4. Reduction techniques:
Pack200 by default behaves in a High Fidelity (Hi-Fi) mode, meaning all the original attributes present in the classes as well as the attributes of each individual entry in a JAR file are retained. These typically tend to add to the packed file size. Here are some of the techniques one can use to further reduce the size of the download:
- Modification times: If the modification time of the individual entries in a JAR file is not a concern, you can specify the option Pack200.Packer.MODIFICATION_TIME="LATEST". This will allow one modification time to be transmitted in the pack file for each segment. The latest time will be the latest time of any entry within that segment.
- Deflation hint: Similar to the above, if the compression state of the individual entries in the archive is not required, set Pack200.Packer.DEFLATE_HINT="false". This will fractionally reduce the download size, as individual compression hints will not be transmitted. However, the jar when recomposed will contain "stored" entries and hence may consume more disk space on the target system.
For example:
pack200 --modification-time=latest --deflate-hint="true" tools-md.jar.pack.gz tools.jar
Note: the above optimizations will yield better results with a JAR file containing thousands of entries.
- Attributes: Several class attributes are not required when deploying JAR files. These attributes can be stripped out of class files, significantly reducing download size. However, care must be taken to ensure that required runtime attributes are maintained.
- Debugging attributes: If debugging information, such as Line Numbers and Source File, is not required (typically in applications stack traces), then these attributes can be discarded by specifying Pack200.Packer.STRIP_DEBUG=true. This typically reduces the packed file by about 10%.
Example:
pack200 --strip-debug tools-stripped.jar.pack.gz tools.jar
- Other attributes: Advanced users may use some of the other strip-related properties to strip out additional attributes. However, extreme caution should be used when doing so: the resultant JAR file must be tested on all possible Java runtime systems to ensure that the runtime does not depend on the stripped attributes.
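A build script that applies these reduction techniques might assemble the pack200 command line programmatically. The sketch below only builds the argument list from the options shown in this step (--modification-time, --deflate-hint, --strip-debug); the class name Pack200Options and its method are hypothetical, and actually running the command assumes a JDK that still ships the pack200 tool on the PATH.

```java
import java.util.ArrayList;
import java.util.List;

class Pack200Options {

    /**
     * Builds a pack200 command line.
     *
     * @param latestModTime send one modification time per segment
     * @param deflateHint   "true" or "false", or null to keep per-entry hints
     * @param stripDebug    discard debugging attributes
     */
    static List<String> command(boolean latestModTime, String deflateHint,
                                boolean stripDebug, String packFile, String jarFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add("pack200");
        if (latestModTime) {
            cmd.add("--modification-time=latest");
        }
        if (deflateHint != null) {
            cmd.add("--deflate-hint=" + deflateHint);
        }
        if (stripDebug) {
            cmd.add("--strip-debug");
        }
        cmd.add(packFile); // output pack file comes first
        cmd.add(jarFile);  // input JAR file comes last
        return cmd;
    }

    public static void main(String[] args) {
        // File names here are placeholders for illustration.
        List<String> cmd =
                command(true, "false", true, "tools-small.jar.pack.gz", "tools.jar");
        System.out.println(String.join(" ", cmd));
        // To run it: new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Passing the list to ProcessBuilder rather than a single shell string avoids quoting problems with file names that contain spaces.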
5. Handling unknown attributes:
Pack200 deals with standard attributes defined by the Java Virtual Machine Specification; however, compilers are free to introduce custom attributes. When such attributes are present, by default, Pack200 passes through the class, emitting a warning message. These "passed-through" class files may contribute to bloating of packed files. If the unknown attributes are prevalent in the classes of a JAR file, this may lead to a very large bloat of the packed output. In such cases, consider the following strategies:
- Strip the attribute if it is deemed to be redundant at runtime. This can be achieved by setting the property Pack200.Packer.UNKNOWN_ATTRIBUTE=STRIP or
pack200 --unknown-attribute=strip foo.pack.gz foo.jar
- If the attributes are required at runtime, and they do contribute to an inflation, then identify the attribute from the warning message and apply a suitable layout for these, as described in the Pack200 JSR 200 specification and the Java API reference section for Pack200.Packer.
It's possible that a compiler could define an attribute not implemented in the layout specification of Pack200, which may cause the packer to malfunction. In such cases an entire class file (or several) can be "passed through", as if it were a resource, by virtue of its name, and can be specified as follows:
pack200 --pass-file="com/acme/foo/bar/baz.class" foo.pack.gz foo.jar
or an entire directory and its contents,
pack200 --pass-file="com/acme/foo/bar/" foo.pack.gz foo.jar
6. Installers:
You may wish to take advantage of the Pack200 technology in your installation program, whereby a product's jars may need to be compressed using Pack200 and decompressed during the installation. If the JRE or SDK is bundled in the installation, you are free to use the unpack200 (Unix) or unpack200.exe (Windows) in the distribution 'bin' directory; this implementation is a pure C++ application requiring no Java runtime to be present for it to run.
Windows: Installers may use a better algorithm than the one in GZIP to compress entries. In such cases, one will get better compression using the installer's intrinsic compression, by using pack200 as follows:
pack200 --no-gzip foo.jar.pack foo.jar
This will prevent the output file from being gzip compressed.
unpack200 is a Windows console application, i.e., it will display an MS-DOS window during the install. To suppress this window, you can use a launcher with a WinMain, as shown below.
Sample Code:
#include <windows.h>
#include <stdio.h>
#include <string.h>

int APIENTRY WinMain(HINSTANCE hInstance,
                     HINSTANCE hPrevInstance,
                     LPSTR lpCmdLine,
                     int nCmdShow) {
    STARTUPINFO si;
    memset(&si, 0, sizeof(si));
    si.cb = sizeof(si);

    PROCESS_INFORMATION pi;
    memset(&pi, 0, sizeof(pi));

    // Test
    // lpCmdLine = "c:/build/windows-i586/bin/unpack200 -l c:/Temp/log c:/Temp/rt.pack c:/Temp/rt.jar";
    int ret = CreateProcess(NULL,       /* Exec. name */
                            lpCmdLine,  /* cmd line */
                            NULL,       /* proc. sec. attr. */
                            NULL,       /* thread sec. attr */
                            TRUE,       /* inherit file handle */
                            CREATE_NO_WINDOW | DETACHED_PROCESS, /* detach the process/suppress console */
                            NULL,       /* env block */
                            NULL,       /* inherit cwd */
                            &si,        /* startup info */
                            &pi);       /* process info */
    if (ret == 0) ExitProcess(255);

    // Wait until child process exits.
    WaitForSingleObject(pi.hProcess, INFINITE);

    DWORD exit_val;
    // Be conservative and return
    if (GetExitCodeProcess(pi.hProcess, &exit_val) == 0) ExitProcess(255);

    ExitProcess(exit_val); // Return the error code of the child process
    return -1;             // Not reached; ExitProcess does not return
}
It is required that all JAR files, packed and unpacked, be tested for correctness with your application's test qualifiers.
When using the command line interface pack200, the output file will be compressed using gzip with default values. A user may create a simple pack file and compress using gzip with user-specified options or using some other compressor.
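As an illustration of compressing a simple pack file with user-specified gzip options, the sketch below gzips arbitrary bytes at a chosen compression level. It assumes the input is an uncompressed pack file, for example one produced with pack200 --no-gzip. The class names CustomGzip and LevelGzipOutputStream are hypothetical; the level is set through the protected def field that GZIPOutputStream inherits from DeflaterOutputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CustomGzip {

    /** GZIPOutputStream whose Deflater uses a caller-chosen compression level. */
    static final class LevelGzipOutputStream extends GZIPOutputStream {
        LevelGzipOutputStream(OutputStream out, int level) throws IOException {
            super(out);
            def.setLevel(level); // 'def' is the protected Deflater of the superclass
        }
    }

    /** Compresses data with gzip at the given level (0 to 9). */
    static byte[] gzip(byte[] data, int level) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (LevelGzipOutputStream gz = new LevelGzipOutputStream(out, level)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Decompresses gzip data; used here to check the round trip. */
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream in =
                     new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CustomGzip <in.pack> <out.pack.gz>");
            return;
        }
        byte[] pack = Files.readAllBytes(Paths.get(args[0]));
        Files.write(Paths.get(args[1]), gzip(pack, Deflater.BEST_COMPRESSION));
    }
}
```

The output is ordinary gzip data regardless of the level chosen, so only the compression time and resulting size differ.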
For more information see pack200 and unpack200 in Java Deployment Tools.
In Java SE 6, the Java class file format has been updated. For more information see JSR 202: Java Class File Specification Update. Due to JSR 202 the Pack200 engine needs to be updated accordingly for the following reasons:
- Align with the new class file format for Java SE 6
- Ensure that Java SE 6 class files are compressed effectively.
To keep the changes minimal and seamless for users, the packer will generate appropriately versioned pack files based on the version of the input class files.
Also, to maintain backward compatibility, if the input JAR files consist solely of JDK 1.5 or older class files, a 1.5-compatible pack file is produced. Otherwise, a Java SE 6-compatible pack200 file is produced. For more information, refer to the Pack200 man page.
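The versioning rule above can be illustrated by reading the major version from a class file header. This is a sketch of the rule as stated in the text, not the packer's own code. The class name ClassFileVersion and the returned labels are hypothetical; the header layout (4-byte magic 0xCAFEBABE, 2-byte minor version, 2-byte major version) and the major version numbers 49 for JDK 1.5 and 50 for Java SE 6 come from the class file format.

```java
class ClassFileVersion {

    /** Major version of class files produced by JDK 1.5. */
    static final int JDK_1_5_MAJOR = 49;

    /** Reads the big-endian major version from the first 8 bytes of a class file. */
    static int majorVersion(byte[] classBytes) {
        if (classBytes == null || classBytes.length < 8) {
            throw new IllegalArgumentException("too short to be a class file");
        }
        int magic = ((classBytes[0] & 0xFF) << 24)
                | ((classBytes[1] & 0xFF) << 16)
                | ((classBytes[2] & 0xFF) << 8)
                | (classBytes[3] & 0xFF);
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        // Bytes 4-5 hold the minor version, bytes 6-7 the major version.
        return ((classBytes[6] & 0xFF) << 8) | (classBytes[7] & 0xFF);
    }

    /**
     * Applies the rule from the text: only JDK 1.5 or older class files
     * gives a 1.5-compatible pack file, otherwise a Java SE 6-compatible one.
     */
    static String packFileKind(int... majorVersions) {
        for (int major : majorVersions) {
            if (major > JDK_1_5_MAJOR) {
                return "Java SE 6-compatible";
            }
        }
        return "1.5-compatible";
    }

    public static void main(String[] args) {
        // Hand-built header of a Java SE 6 class file (major version 50).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 50};
        int major = majorVersion(header);
        System.out.println(major + " -> " + packFileKind(major));
    }
}
```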