Hack 100 Ultrahosting: Mass Web Site Hosting with Wildcards, Proxy, and Rewrite
Support thousands of internal web servers without lifting a finger

Suppose you have a large private network hiding from the Internet behind a NAT router. Your network layout looks something like Figure 8-3.

Figure 8-3. Typical corporate LANs use private addressing internally and at least one Internet gateway providing Network Address Translation.

You want to allow anyone on your private network to set up their own web server. But like all good network administrators, you are smart and lazy, and you don't want to fiddle with the forwarding rules on your firewall every time someone needs to make a change.

Through the careful use of named virtual hosts, mod_proxy, and mod_rewrite, you can reduce the administrative overhead of your entire network to simple DNS updates. Then there is little keeping you from delegating that responsibility to the departments that wanted the web servers in the first place.

To start, you'll need Apache running on your gateway machine, with mod_rewrite and mod_proxy installed. You'll also need a DNS server running your own top-level internal domain (as discussed in [Hack #80]). We'll assume that you own the Internet domain shelbyville.com and have the internal TLD of .springfield already set up, serving your internal machines.

Add the following to the Apache configuration on your gateway machine:

Port 80
BindAddress *
NameVirtualHost *
<VirtualHost *>
ServerName mux.shelbyville.com
ServerAlias *.shelbyville.com
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(.*)\.shelbyville\.com
RewriteRule ^/(.*) http://%1.springfield/$1 [P]
</VirtualHost>
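Briefly, this configuration tells Apache to listen on port 80 on every address, accept requests for any hostname under shelbyville.com, and proxy each request to the same hostname under the internal .springfield TLD.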
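
The hostname mapping that the RewriteCond and RewriteRule lines perform can be mimicked outside Apache. The following Python is only a rough sketch of that mapping, not a description of how mod_rewrite is implemented: the function name internal_url is hypothetical, and tolerating an optional port in the Host header is an assumption made for the illustration.

```python
import re

# Capture everything before the public domain, as the RewriteCond does.
# Allowing an optional ":port" suffix is an assumption of this sketch.
HOST_PATTERN = re.compile(r"^(.*)\.shelbyville\.com(?::\d+)?$", re.IGNORECASE)


def internal_url(host_header, path):
    """Return the internal URL a request would map to, or None if the host does not match."""
    match = HOST_PATTERN.match(host_header)
    if match is None:
        return None
    # The RewriteRule pattern ^/(.*) drops the leading slash before re-adding it.
    rest = path[1:] if path.startswith("/") else path
    return "http://%s.springfield/%s" % (match.group(1), rest)


print(internal_url("jimbo.shelbyville.com", "/index.html"))
print(internal_url("ralph.wiggum.shelbyville.com", "/"))
print(internal_url("example.org", "/"))
```

Hostnames with several labels pass through unchanged, so a single pattern covers every department without any per-host configuration.
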
This works because, under the HTTP/1.1 specification, the host is specified by name in the HTTP Host header and isn't tied to a particular IP address. Apache takes the contents of %1 (everything before .shelbyville.com), appends .springfield, looks up the resulting name via the system resolver, and attempts to connect to it. Since the gateway is using a DNS server that serves your .springfield TLD, it will proxy to the proper internal host.

For example, suppose that the original request was for http://jimbo.shelbyville.com/index.html. After the RewriteCond line, the %1 variable simply contains jimbo. Apache then attempts to proxy to %1 (that is, jimbo) with .springfield appended to it. The result? A proxy request to the internal web server jimbo.springfield, with the original URI passed along as if the gateway weren't even there.

This is the simplest configuration, but it will break internal servers that require cookies. To support cookies on servers residing on the internal network, try something like this:

<VirtualHost *>
ServerName mux.shelbyville.com
ServerAlias *.shelbyville.com
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(.*)\.shelbyville\.com
RewriteRule (.*) $1 [E=WHERETO:%1.springfield]
ProxyPassReverse / http://%{ENV:WHERETO}/
RewriteRule ^/(.*) http://%{ENV:WHERETO}/$1 [P]
</VirtualHost>
This uses a "fake-out" RewriteRule that is called only for its side effect of setting the WHERETO environment variable. The variable is set to the originally requested HTTP host with .shelbyville.com stripped off and .springfield appended. We need it in order to feed the amended hostname to ProxyPassReverse.

By manipulating the DNS configuration for .shelbyville.com and .springfield, you can bring internal web servers up and down at will, without ever touching the Apache configuration on the gateway. Of course, to make the job even easier, you could use wildcard DNS for .shelbyville.com:

*.shelbyville.com IN A 12.34.56.78

(Naturally, substitute the external IP address of the gateway for 12.34.56.78.) Now, requests for anything.shelbyville.com will be fed to the gateway and proxied, without ever changing the zone file again.

That just leaves internal DNS maintenance (for .springfield). The simplest way to dispense with that responsibility is to divide the domain into multiple subdomains and defer to other internal DNS servers. For example, you could put something like this in your named.conf:

zone "wiggum.springfield" {
type slave;
file "wiggum.db";
masters { 10.42.5.6; };
};
zone "krabappel.springfield" {
type slave;
file "krabappel.db";
masters { 10.42.6.43; };
};
zone "syzlak.springfield" {
type slave;
file "syzlak.db";
masters { 10.42.7.2; };
};
and so on. Now each department can have its own master DNS server and can add new hosts (and therefore new, Internet-ready web servers) without even dropping you an email. They can get to their new servers from the Internet by browsing to an address like http://ralph.wiggum.shelbyville.com/, which translates to http://ralph.wiggum.springfield/ internally and is ultimately looked up from the DNS server for the division that is responsible for the wiggum subdomain.

As far as Internet users are concerned, the information came directly from the gateway machine; they aren't even aware that .springfield exists.

Now, just what will you do with all of the time that this saves you?
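
The slave zone stanzas shown earlier all follow one pattern, so they could be generated instead of typed by hand. Here is a minimal sketch in Python, assuming the three departments and master addresses from the example; the helper names slave_zone and render_zones are hypothetical, and the output should be checked before it goes anywhere near a live named.conf.

```python
# Department subdomains and their master DNS servers, taken from the example zones.
DEPARTMENTS = {
    "wiggum": "10.42.5.6",
    "krabappel": "10.42.6.43",
    "syzlak": "10.42.7.2",
}


def slave_zone(name, master_ip, tld="springfield"):
    """Build one slave zone stanza for a department subdomain."""
    return (
        'zone "%s.%s" {\n'
        "type slave;\n"
        'file "%s.db";\n'
        "masters { %s; };\n"
        "};\n" % (name, tld, name, master_ip)
    )


def render_zones(departments):
    """Concatenate the stanzas for every department, sorted by name."""
    return "".join(slave_zone(n, ip) for n, ip in sorted(departments.items()))


print(render_zones(DEPARTMENTS), end="")
```

Adding a department then means adding one entry to the dictionary and regenerating the stanzas.
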