Web Site Compression

The problem with caching is that it is typically used for serving static pages, whereas the majority of data sent over the Internet is dynamic and so does not lend itself to conventional caching technologies.

Solution? Compression.

The main idea is self-explanatory: data sent out from your Web server is compressed (and decompressed by the browser on the fly) to reduce the amount of data transferred and thus speed up page display.
There are two ways to do this:

* For static requests: pre-compressed text-based data is generated beforehand, stored on the server (.html.gz files, etc.) and served to multiple requests. This saves CPU cycles and makes for faster responses.
* For dynamic pages (such as Active Server Pages or ColdFusion pages, e-commerce applications and database-driven sites): dynamic content acceleration compresses the response on the fly. Because the output can differ on every request, the response has to be compressed each time it is generated.

Both types of compression use HTTP compression and can typically shrink HTML files to about a third of their original size.
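
The two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `precompress_static` and `compress_dynamic` are hypothetical, and a real Web server would do this work inside its own compression module rather than in application code.

```python
import gzip
from pathlib import Path


def precompress_static(path: Path) -> Path:
    """Write a gzip copy next to a static file (page.html -> page.html.gz)."""
    gz_path = path.with_name(path.name + ".gz")
    # Done once, ahead of time, so the highest compression level is affordable.
    gz_path.write_bytes(gzip.compress(path.read_bytes(), compresslevel=9))
    return gz_path


def compress_dynamic(body: bytes) -> bytes:
    """Compress a freshly generated response body; called on every request."""
    # A moderate level trades a little size for less CPU time per request.
    return gzip.compress(body, compresslevel=6)
```

The static variant pays the compression cost once per file, while the dynamic variant pays it on every request, which is why the choice of compression level matters more there.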

Some of its advantages include:

* faster page loading
* savings on CPU cycles
* reduced bandwidth usage, and therefore a lower cost of operating Web sites

There are, however, problems that need to be solved to enable seamless transmission from the server to the consumer:

* compression should not conflict with MIME types
* dynamic compression should not affect server performance
* the server should know whether the user’s browser can decompress the content

Different types of files compress differently. Files that are already compressed (such as JPEG and GIF images) or that consist of essentially random bits do not compress well, while files that contain a lot of white space or text (such as HTML) compress very well, by up to 90%. PDFs should not be served compressed, since the Adobe Acrobat reader can’t handle compressed files.
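
The difference is easy to see with a small experiment. The sketch below is an illustration only: it uses Python's `zlib` module, a synthetic block of repetitive markup, and random bytes as a stand-in for already-compressed data such as images. The helper name `saving` is hypothetical, and real pages are less repetitive than this sample, so their savings will be smaller.

```python
import os
import zlib


def saving(data: bytes) -> float:
    """Return the fraction of bytes saved by deflate (0.0 means no saving)."""
    compressed = zlib.compress(data, 9)
    return 1 - len(compressed) / len(data)


# Repetitive markup with plenty of white space, similar in spirit to HTML.
html_like = b"<tr>\n    <td>cell</td>\n</tr>\n" * 1000
# Random bytes behave like data that has already been compressed.
random_like = os.urandom(len(html_like))

print(f"markup saving: {saving(html_like):.0%}")
print(f"random saving: {saving(random_like):.0%}")
```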

Different compression tools also wrap their output differently, although most of them use deflate as the underlying compression method. Gzip is one example. It wraps the compressed page for transmission in its own format: a ten-byte header (longer when optional fields are present), followed by the compressed bytes in deflate format, followed by a trailer that holds a CRC-32 checksum and the original file size. The first two bytes of any gzip file are 0x1f and 0x8b, and the third byte identifies the compression method (0x08 is deflate).
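
The layout described above can be checked directly. The following sketch assumes a single gzip member produced by Python's `gzip.compress`, which writes a plain ten-byte header with no optional fields; the function names `inspect_gzip` and `deflate_body` are hypothetical.

```python
import gzip
import struct
import zlib


def inspect_gzip(blob: bytes) -> dict:
    """Read the fixed header bytes and the 8-byte trailer of a gzip member."""
    magic1, magic2, method, flags = struct.unpack("<BBBB", blob[:4])
    # The trailer is two little-endian 32-bit values: CRC-32, then original size.
    crc32, size = struct.unpack("<II", blob[-8:])
    return {
        "is_gzip": (magic1, magic2) == (0x1F, 0x8B),
        "method": method,          # 8 means deflate
        "flags": flags,            # non-zero means optional header fields follow
        "crc32": crc32,
        "original_size": size,
    }


def deflate_body(blob: bytes) -> bytes:
    """Inflate the raw deflate stream between a 10-byte header and the trailer."""
    if blob[3] != 0:
        raise ValueError("optional header fields present; header exceeds 10 bytes")
    return zlib.decompress(blob[10:-8], wbits=-15)


page = b"<html><body>Hello</body></html>"
info = inspect_gzip(gzip.compress(page))
```

Comparing `info["crc32"]` with `zlib.crc32(page)` and `info["original_size"]` with `len(page)` confirms that the trailer describes the uncompressed data.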

According to Wayne Berry, in his article Web Site Compression, Internet Explorer 4.0 and above and Netscape 3.0 and above can both decompress compressed responses from Web servers. The feature is built in, enabled by default and does not require a plug-in. Browsers with this feature signal the Web server by sending a request header called “Accept-Encoding:”. Internet Explorer sends the header as “Accept-Encoding: gzip, deflate” and Netscape sends the header as “Accept-Encoding: deflate”. Each header lists the types of compression that the browser can decompress.
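
On the server side, the decision comes down to reading that header before compressing anything. Here is a minimal sketch, assuming the server supports gzip and deflate and prefers gzip. The function name `choose_encoding` is hypothetical, and the sketch deliberately ignores the `*` wildcard and does not rank codings by their q-values.

```python
def choose_encoding(accept_encoding, supported=("gzip", "deflate")):
    """Pick the first server-supported coding the client accepts, or None."""
    if not accept_encoding:
        return None  # no header: send the response uncompressed
    offered = {}
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        token = token.strip().lower()
        params = params.strip().lower().replace(" ", "")
        q = 1.0
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        if token:
            offered[token] = q
    for coding in supported:
        # A coding listed with q=0 is explicitly refused by the client.
        if offered.get(coding, 0.0) > 0:
            return coding
    return None
```

With the two headers quoted above, `choose_encoding("gzip, deflate")` selects gzip and `choose_encoding("deflate")` selects deflate. The server would then set the matching Content-Encoding response header.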

IIS 5.0 has an HTTP compression option (turned off by default) that allows you to compress static pages, dynamic pages, or both. Instructions for enabling and customizing this feature can be found in “Using HTTP Compression On Your IIS 5.0 Web Site,” by Dan Evers.

NOTE: Internet Explorer 5.5 and 6.0 have a bug with decompression that affects some users (with plug-ins like Adobe Photoshop). A fix is available from Microsoft via its product support with reference Q313712.

Compression products made by third parties are also available. These include httpZip, which George Petrov discusses in detail in the article Speed Up Your Site – Using HTTP Compression!, and HTML2Zip, a powerful HTML compression tool for web developers.

Other products can also be found at InnerMedia – Mastering Compression for Development and the Web or you can use online services such as: Net-Bizz Site Compression Service or Web Page Analyzer – 0.82, which includes compression detection.

In Apache, compression is achieved using different methods:

* With content negotiation, two separate sets of HTML files need to be generated: one for clients that can handle GZIP-encoding and one for those that can’t. The problem with this method is that it makes no provision for GZIP-encoding dynamically generated pages.
* Those who want to add GZIP-encoding to Apache can use mod_gzip. Stephen Pierzchala discusses this in detail in the following articles: Compressing Web Output Using mod_gzip for Apache 1.3.x and 2.0.x, Compress Web Output Using mod_gzip and Apache, and Compressing Web Content with mod_gzip and mod_deflate.
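
The idea behind the first method, serving one of two pre-generated file sets, can be illustrated outside Apache. The sketch below is not Apache configuration and does not reproduce how Apache implements content negotiation. It assumes the compressed copy sits next to the original with a `.gz` suffix, ignores q-values, and uses a hypothetical function name, `pick_variant`.

```python
from pathlib import Path


def pick_variant(requested: Path, accept_encoding: str):
    """Return (file_to_send, content_encoding) for a static request."""
    gz = requested.with_name(requested.name + ".gz")
    tokens = [t.split(";")[0].strip().lower() for t in accept_encoding.split(",")]
    # Serve the compressed copy only if the client accepts gzip and it exists.
    if "gzip" in tokens and gz.is_file():
        return gz, "gzip"
    return requested, None
```

Because the choice is made between files that already exist on disk, there is nothing here that could compress a page generated at request time, which is the limitation noted in the first bullet.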
