One-Click Backup Program

Here is another unique and great feature for Art Sites clients. With just a click on your administration pages, you can back up your entire site. Our client backup program packs your entire site into a single file that you may download to your computer for safekeeping. The backup is also constructed such that it can be ported to almost any Web server and look exactly like your site hosted by Art Sites.

For all the whys and hows of this backup program, take a look at the readme file, below. A copy of this readme file is custom written and packed with each backup. The one below is from a test run made on a client site, earlier today:


readme.txt file generated by
Art Sites Client Backup Program version 0.1.8
written by Patrick Moore, 2006.
for yourdomain.com

Both backup files, yourdomain_com_backup_20060905_145956.zip and yourdomain_com_backup_20060905_145956.tgz, have exactly the same contents. The only reason for two is that one format may better suit your unpacking preferences. Either can be unpacked with common decompression utilities like WinZip or gunzip and be placed on practically any Web server, presenting your website to the public with exactly the same content and appearance as the original.

Behind the scenes, however, there are some significant differences between the backup and the original. This document will explain the nature and reasons for the differences and it will suggest how to launch the backup as a publicly viewable replacement for the original should the need or desire arise. The main points addressed are:

How the backup is the same as the original.

From the general public perspective, the backup of yourdomain.com will look and work exactly like the original. It includes all of the publicly viewable pages and images of the original, and clicking the various links on the site will produce exactly the same results and appearance to a visitor.

How the backup is different from the original.

The major difference between the backup and the original is in how it must be maintained when you wish to make any changes to the website content. Unlike the original, the backup does not include any administration pages. To make any changes in the content of the backup will require the employment of someone who is familiar with editing HTML code, JavaScript and CSS stylesheets.

Why the differences are necessary.

Art Sites websites are dynamic while the backup is necessarily static. What this means is that your original Art Sites website is run by a suite of programs and databases on an especially configured server that creates the content delivered to the public on the fly. These programs, databases and configurations were written and developed especially for our target clientele, artists, and it would not be feasible to bundle all of the features that Art Sites offers into a backup file for you. The Art Sites server and programming is a one of a kind thing, and to provide you with a portable backup of your website would do you little good if the Art Sites server were the only platform that could run it. The purpose of this backup is to provide you with an easily portable copy of your website that can be delivered from practically any server. And that's just what it is. Give the backup to any modestly competent Web host, and they can have your site up and running in a matter of minutes—not the years it has taken to develop Art Sites.

Dynamic vs. Static websites

Dynamic websites, as mentioned above, generate content on the fly by running various programs, querying databases and collecting images from various folders, etc. For a simple example, when a visitor to your gallery clicks on a thumbnail to view a large image, a dynamic site doesn't have a separate page for each large image sitting there ready and waiting to be viewed. There is just one program called "large_image." Whether a visitor wants to see "large_image/tree" or "large_image/bush," the exact same program is run. In the first case it tells a database that information about a tree is wanted, then the database returns the requested information to the program—including where to find the right pictures elsewhere on the server. The large_image program then puts it all together and creates and delivers the wanted page to the visitor.
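The idea can be sketched in a few lines of Python. This is only an illustration, not the actual Art Sites code: the function name large_image comes from the example above, while the lookup table standing in for the database and the image file paths are invented.

```python
# Hypothetical stand-in for the database: one record per subject.
IMAGE_DB = {
    "tree": {"title": "Tree", "file": "/images/tree_large.jpg"},
    "bush": {"title": "Bush", "file": "/images/bush_large.jpg"},
}


def large_image(subject):
    """Build the page for one subject on the fly, as a dynamic site would."""
    record = IMAGE_DB.get(subject)
    if record is None:
        return "404 Not Found"
    return (
        "<html><body>"
        "<h1>" + record["title"] + "</h1>"
        '<img src="' + record["file"] + '">'
        "</body></html>"
    )


# The same function serves both "large_image/tree" and "large_image/bush".
print(large_image("tree"))
print(large_image("bush"))
```

No page exists until someone asks for it, which is why the program and its data both have to be present on the server.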

By comparison, in a static website, like your backup, there is a separate page for every large image—and every other page on the site. It's the only way to do it without the databases and so forth of a dynamic site. It's also the way that almost all websites once were and maybe most still are. (And Sheesh! Backing up a static website is about a zillion times simpler. It's just a matter of copying a directory tree from one place to another. But that's what you have now, a website that can be ported as simply as that.)

How the backup was made.

To back up your site, converting it from dynamic to static, the backup program systematically read through every publicly viewable page of the dynamic site. Starting with the home page, it harvested all of its links to other pages, images, stylesheets, JavaScripts, etc. Along the way, it then harvested all of the links from those resources as well. In the process, it found and considered 1,560 internal links to various resources at yourdomain.com. Every time it found some new resource, it made a copy of it as the public would see it. Then, when it had exhausted every possibility of what the site could show to the public, the program packed all of the files together into one big file for downloading.
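The link harvesting described above can be sketched roughly as follows. This is a simplified Python illustration, not the backup program itself: the three-page SITE dictionary is a made-up stand-in for the live pages, and the names LinkHarvester and crawl are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical three-page site, keyed by address, standing in for live pages.
SITE = {
    "/": '<a href="/gallery">Gallery</a> <a href="/contact">Contact</a>',
    "/gallery": '<a href="/">Home</a> <img src="/images/tree.jpg">',
    "/contact": '<a href="/">Home</a> <a href="http://example.com/">Elsewhere</a>',
}


class LinkHarvester(HTMLParser):
    """Collect href and src attribute values from one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def crawl(start="/"):
    """Breadth-first walk of internal links; returns addresses in order found."""
    found = [start]
    queue = deque([start])
    while queue:
        address = queue.popleft()
        harvester = LinkHarvester()
        # Resources with no entry in SITE (images, etc.) contribute no links.
        harvester.feed(SITE.get(address, ""))
        for link in harvester.links:
            # Keep only internal links (starting with "/") not seen before.
            if link.startswith("/") and link not in found:
                found.append(link)
                queue.append(link)
    return found


print(crawl())
```

Starting from the home page, the walk ends once no page yields an address that has not already been recorded.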

Beyond just a point of interest, most if not all of those resource links were changed within each page, too. This was necessary because dynamic, database-driven sites often have link addresses that look like "/large_image.html?subject=tree." Without the database, however, no other Web server would know what to do with that address. To manage this, the backup program created a folder called "/large_image" and then created a page called "tree.html" to put in the folder. Then, on every page where the backup program came across the link, "/large_image.html?subject=tree," it changed the link to "/large_image/tree.html."
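As a rough Python sketch of that rewriting rule: the function name static_path is hypothetical, and the sketch assumes a link with a single query parameter, which may be simpler than what the backup program actually handles.

```python
from urllib.parse import parse_qsl, urlsplit


def static_path(link):
    """Map a single-parameter dynamic link to a static file path."""
    parts = urlsplit(link)
    params = parse_qsl(parts.query)
    if not params:
        return link  # no query string, so nothing to rewrite
    # "/large_image.html" becomes the folder "/large_image".
    folder = parts.path.rsplit(".", 1)[0]
    # The parameter value becomes the page name: subject=tree gives tree.html.
    _, value = params[0]
    return folder + "/" + value + ".html"


print(static_path("/large_image.html?subject=tree"))
```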

Continuing the example, most websites developed by Art Sites now use what are called "clean URLs." They are called "clean" because they do away with all the question marks and equal signs and ampersands in the URLs of so many dynamic websites, making them prettier and easier to read. Using a clean URL, the address in the previous example might look like "/large_image/tree" instead of "/large_image.html?subject=tree." The Art Sites server knows exactly what to do with either address, but most servers wouldn't have a clue. In the case of "/large_image/tree," unless the server had a program named "large_image" and knew to feed "tree" to that program in order to make the appropriate database connections, etc., what nearly all Web servers would do with that address would be to look for a file called "/large_image/tree/index.html." Well, no such file would exist unless the backup program made it and called it that, so that's what it did and why.
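The clean URL case can be sketched the same way. The function name file_for_clean_url is hypothetical, and the rule shown (anything without a file extension is treated as a folder holding an index.html) is an assumption made for illustration.

```python
import posixpath


def file_for_clean_url(address):
    """Return the static file a typical server would look for."""
    # An address that already names a file (has an extension) is left alone.
    if posixpath.splitext(address)[1]:
        return address
    # Otherwise treat the address as a folder and point at its index page.
    return posixpath.join(address, "index.html")


print(file_for_clean_url("/large_image/tree"))
```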

The .htaccess file

Bundled in your backup is a file called .htaccess. This is a very important file. The problem created by changing all or nearly all of your page names as described above, is that other sites that have previously linked to your site will probably get a dreaded "404 file not found" error when visitors try to click through from a remote site. When someone goes directly to http://yourdomain.com or http://www.yourdomain.com, they won't have any problems. And, once there, they will be able to click through all of the pages of the backup site and never know it isn't exactly the same as the original.

Still, it's a serious concern that external links to other pages of your site would become broken if your website backup ever needs to be deployed on a non-Art Sites server. There are several possible solutions to this problem, but the simplest and best is to implement "301 Redirects." And that's what the .htaccess file does. The file includes redirect instructions that will automatically redirect visitors from the original address to the new, backup address.

For the .htaccess file to be useful, an Apache server offering mod_alias is required. Over 60% of public Web servers are Apache, and mod_alias is a very basic Apache module, so meeting this hosting requirement should not be difficult. Some Apache hosts limit how .htaccess files may be used on their servers. Again, however, providing for .htaccess is fairly basic, so if any prospective host denies this, they don't need your business. Nearly all Web servers provide for 301 redirects in one way or another, but the .htaccess file provided with the backup is the friendliest solution. There are more comments about implementing the .htaccess file in the .htaccess file, itself.
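To give a feel for what a redirect instruction can look like, the Python sketch below prints one mod_alias RedirectMatch line for an invented pair of addresses. It is not taken from the .htaccess file in your backup, which may be written differently; the function name redirect_line and the example paths are assumptions.

```python
import re


def redirect_line(old_path, new_path, domain="yourdomain.com"):
    """Build one mod_alias directive sending old_path to new_path with a 301."""
    # Anchor the pattern so that only the exact old path matches.
    pattern = "^" + re.escape(old_path) + "$"
    return "RedirectMatch 301 " + pattern + " http://" + domain + new_path


# Hypothetical old and new addresses, for illustration only.
print(redirect_line("/about.html", "/about/index.html"))
```

Each old address gets a line of its own, so a visitor following an outdated external link is sent on to the matching page of the backup.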

Your contact page

If your site has a contact form on your contact page, it has been commented out for the purpose of the backup. It still exists in the backup page's source code, but in order to keep it from being publicly viewable, it has simply been put between HTML comment tags that look like this: <!-- and -->. Again, because the static backup does not include the backend programming necessary to send mail from a server, it was better to make the form invisible than to let a visitor write a message to you only to be met with a "not found" error when they tried to send it. Any webmaster can produce a script for sending mail from the form, dependent on the particular server's mail programming, and then make the form publicly visible again by removing the comment tags. I'm mentioning this for the sake of that webmaster, too, because the work is already half done. By looking between the comment tags in the source code of the contact page, you can see the variables that the form will want to post, and the mailing script can be written accordingly.
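For the webmaster, here is a small Python sketch of how the field names might be pulled out of a commented-out form. The sample page and its field names (sender_email, message) are invented, so the real contact page will differ.

```python
from html.parser import HTMLParser


class FieldNames(HTMLParser):
    """Collect the name attribute of form fields."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "textarea", "select"):
            name = dict(attrs).get("name")
            if name:
                self.names.append(name)


class CommentedForms(HTMLParser):
    """Find HTML comments and list the form fields hidden inside them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_comment(self, data):
        # Parse the comment body as HTML in its own right.
        inner = FieldNames()
        inner.feed(data)
        inner.close()
        self.names.extend(inner.names)


# Hypothetical contact page; the real field names will differ.
PAGE = """
<h1>Contact</h1>
<!--
<form method="post" action="/contact">
  <input type="text" name="sender_email">
  <textarea name="message"></textarea>
</form>
-->
"""

finder = CommentedForms()
finder.feed(PAGE)
print(finder.names)
```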

The robots.txt file

If you go perusing the files in your backup, you may wonder what the robots.txt file is, especially if you find yours to be empty. robots.txt is simply a file that virtually every search engine robot looks at first when it comes to crawl your site to find interesting content for its search engine. The robots.txt file is an industry standard for the primary purpose of telling the robots what pages of your site are off limits and should not be indexed. Because, typically, there are no publicly linked pages on an Art Sites developed website that are off limits to search engines, I often just make robots.txt a blank file. If it's blank, why have it at all? Because the robots are always looking for it by default, anyway, and the server returns a 404, file not found, error when there isn't any file of any sort by that name. By serving at least a blank robots.txt, it stops those errors. Plus it gives the robots what they want: the message to go for it!

As a point of interest, on some sites I do use robots.txt to give instructions for robots NOT to visit some phony and otherwise unlinked pages. When a robot goes there anyway, I can be pretty sure it's up to no good—like trying to harvest email addresses to later send out spam—and I can then block them from all access to the server in the future.
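Both cases can be checked with Python's standard robots.txt parser. In this sketch the helper name allowed and the /trap/ folder are invented for illustration; they are not the names used on any Art Sites server.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, path, agent="*"):
    """Report whether a well-behaved robot may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# A blank robots.txt puts nothing off limits.
print(allowed("", "/gallery/index.html"))

# A hypothetical trap folder that polite robots are told to skip.
TRAP_RULES = "User-agent: *\nDisallow: /trap/\n"
print(allowed(TRAP_RULES, "/trap/addresses.html"))
print(allowed(TRAP_RULES, "/gallery/index.html"))
```

A polite robot honors the Disallow line, so any visitor that turns up in the trap folder has ignored the rules.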

The favicon.ico file

This is just a little picture, sixteen pixels square, of your logo or some image to help carry the motif of your site. I rarely mention it, but I typically make favicon.icos for all Art Sites websites. These little guys were first implemented by Internet Explorer, way back when. The file name stands for "favorites icon," and they were initially used to represent a site by including the little picture on the browser's "favorites" list when an Internet Explorer user bookmarked a website. Since then, many other browsers have also come to use the favicon.ico file. Internet Explorer still only uses it for bookmarks, but browsers like Firefox and Opera also use them in places like the browser address bar and various buttons and tabs to help visitors identify which open window is which.

As with robots.txt, I first became aware of what favicon.icos were (also way back when) when I looked at my Web server logs and saw all of those 404 errors because so many browsers look for a favicon.ico by default, and I didn't have any! So, I made my first favicon.ico mostly just to stop those "not found" errors. Since then, however, I've come to greatly appreciate the professional touch that a well-designed favicon.ico adds to a website. I'll also say that, small as they are, it's often difficult to make them look like anything. So, cherish your favicon.ico!

The backup_log.html file

Also bundled in the backup file is a copy of the backup_log.html that was generated during the backup process and was displayed to you on the page after the backup was completed. This is provided mostly as a reference to make sure that the backup process went smoothly and to make sure that everything that should have been included was included. Beyond that, once the backup is functioning on a server, every file name in backup_log.html is a link that will open that file. A copy of each backup log is also saved on the Art Sites server, by the way, so I can review them to ensure that things are working as they should and to help me sort out any bugs.

Original images

Well, I said earlier that the only files in the backup file are those that include public links. As the .htaccess, robots.txt and backup_log.html attest, there is actually more than that. In addition to those, all of the original images that you first uploaded to your site for various galleries and other pages have also been saved in the backup. These original images are not directly viewable by the public, but they have been saved on the server from the beginning. Our purpose for keeping them is to provide the highest quality images possible when you use the online image editing features provided on your Art Sites administration pages.

When you use our proprietary image editing program to edit a gallery or page image by changing the compression, resizing, cropping, rotating or editing any of the many other image qualities, these edits are always made from your originally uploaded image so that there is no quality degradation no matter how many times or ways that you edit the image, and the original image is never changed.

While you will not be able to do any such image editing with the static backup of your site, it seemed like a good idea to include the originals of all the images that are displayed on the site. And, while there are no links to those images from the publicly viewable pages of the site, the backup_log.html does include links to all of the files included in the backup, including the original images.

How to make the whole thing work

Both the .zip and .tgz files can be unpacked with a large number of utilities. For Windows, for example, the most common such utility is WinZip, http://www.winzip.com/, which can open either format. The unpacked files will all be in a folder named "htdocs_as" that branches from there, so you needn't worry that the unpacked files will get scattered all over your computer.
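For anyone who prefers a script to a utility like WinZip, unpacking can also be sketched in Python. The function name unpack_backup is hypothetical; the htdocs_as folder name comes from the paragraph above.

```python
import shutil
from pathlib import Path


def unpack_backup(archive, destination):
    """Unpack a .zip or .tgz backup and return the htdocs_as folder path."""
    # shutil picks the format (zip or gzipped tar) from the file extension.
    shutil.unpack_archive(str(archive), str(destination))
    return Path(destination) / "htdocs_as"
```

Calling unpack_backup("yourdomain_com_backup_20060905_145956.zip", "restored") would place the files under restored/htdocs_as, assuming the archive is in the current folder.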

Once the backup is unpacked it will properly work only if it is placed on a Web server. For a home test, if you were to copy everything in htdocs_as into the root directory of your computer (e.g., C:), it will probably all work, even without a server, but it depends on your computer. I don't recommend that you do that, but otherwise you can still click on various files to view what they contain. If the home page (/htdocs_as/index.html) is not in the root directory, however, images will not load into the pages and the type fonts will look funky because the style sheets cannot be properly included, either.

If you are brave and want to see how it will work without putting it on a proper Web server, you can try testing it by copying everything from htdocs_as to the root directory of your computer, but be careful not to hurt anything else in your computer's root directory. Because the backup only took seconds to be generated in the first place, it will be relatively easy to clean the files out of your root directory after you are satisfied that everything works. To clean up when you are done testing, in Windows Explorer or My Computer, you could sort all the files by time modified. That will group all of the backup files together so they can more easily be deleted. Just be careful not to accidentally delete something important to the functioning of your computer when you do so.

Technical requirements for launching on a public server.

Not much. I really can't imagine what kind of Web server the backup site wouldn't run on. The only issue that narrows the field a bit is that the server must run Apache with mod_alias enabled in order for the .htaccess file to work out of the box. That isn't necessary for the site, itself, to work, but it is necessary to keep external links to your site from breaking. Even with the requirements of Apache/mod_alias, that still leaves more than half of the Web servers "out there." A good resource for checking out prospective Web hosts is http://webhostingtalk.com. In the end, you should be able to simply hand the backup file to virtually any Web hosting company, and they could have the site working in a matter of minutes.

Webmaster/developer requirements for maintaining the backup.

That's a slightly tougher nut to crack. If you don't have anyone in mind, I recommend doing some research at http://webmasterworld.com. A lot of top-notch talent hangs out there.

The technical skills required to create or maintain a static site are not all that great. The backup version of your site has been reduced to basic HTML with some JavaScript sprinkled in, and it also uses CSS stylesheets. This is stuff that any webmaster can manage with ease. The bad part is that it is very tedious work to maintain a static site as large as yourdomain.com.

In an emergency, the best solution might be a two step approach. First, get somebody, anybody, to get the backup files online, ASAP. Then, with some breathing room, find a webmaster/developer who is capable of using your backup site as a template for reproducing a dynamic site that can be more easily updated and maintained than the static one. Making that second step is a much taller order. Anybody can make a static website. Making one with the power of an Art Sites website is...well, there is nothing else like it.

The last piece of the puzzle for implementing your website on a different server would be repointing your domain name, yourdomain.com, so that prospective visitors can find it. That isn't a direct part of the backup, however, so I'll address the matter elsewhere.

Finally, this whole client-backup program is provided to you not because Art Sites isn't making its own off-server backups of all the programs, databases and images that power your dynamic site. Nor is it because we have plans to go anywhere or because we want you to take your business elsewhere. Quite the opposite. We are religiously making redundant off-site backups of all the dynamic features that power your site, and we are confident that no one else can serve our niche market, independent artists, providing the value and offering the control over their own sites that we offer you. We want to give you wings to fly on the Web. And that is all this client-backup program is: A set of wings for you to even fly away from us should the desire or need ever arise. -- Patrick


