Apache module mod_proxy
Apache can be configured in both a forward and reverse proxy mode.
An ordinary forward proxy is an intermediate server that sits between the client
and the origin server. In order to get content from the origin server, the client
sends a request to the proxy naming the origin server as the target and the proxy then
requests the content from the origin server and returns it to the client. The client must be
specially configured to use the forward proxy to access other sites.
A typical usage of a forward proxy is to provide Internet access to internal clients that
are otherwise restricted by a firewall. The forward proxy can also use caching to reduce
network usage.
The forward proxy is activated using the ProxyRequests directive. Because forward proxies
allow clients to access arbitrary sites through your server and to hide their true origin,
it is essential that you secure your server so that only authorized clients can access the
proxy before you activate a forward proxy.
A reverse proxy, by contrast, appears to the client just like an ordinary web
server. No special configuration on the client is necessary. The client makes ordinary
requests for content in the name-space of the reverse proxy. The reverse proxy then decides
where to send those requests, and returns the content as if it were itself the origin.
A typical usage of a reverse proxy is to provide Internet users access to a server that is
behind a firewall. Reverse proxies can also be used to balance load among several back-end
servers, or to provide caching for a slower back-end server. In addition, reverse proxies can
be used simply to bring several servers into the same URL space.
A reverse proxy is activated using the ProxyPass directive or the [P] flag to the
RewriteRule directive. It is not necessary to turn ProxyRequests on in order to
configure a reverse proxy.
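As a minimal sketch of bringing several servers into one URL space, the fragment below maps
two path prefixes to two back-end servers. The host names app1.internal.example.com and
app2.internal.example.com and the path prefixes are assumptions made up for illustration.

```apache
# Hypothetical back-end hosts; replace with your own servers.
ProxyPass        /app1/ http://app1.internal.example.com/
ProxyPassReverse /app1/ http://app1.internal.example.com/
ProxyPass        /app2/ http://app2.internal.example.com/
ProxyPassReverse /app2/ http://app2.internal.example.com/
```

ProxyPassReverse is included so that redirects issued by a back-end server are rewritten to
point at the proxy rather than at the internal host.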
The examples below give only a very basic idea to help you get started. Please read the
documentation on the individual directives.
Forward Proxy
ProxyRequests On
ProxyVia On
<Directory proxy:*>
Order deny,allow
Deny from all
Allow from internal.example.com
</Directory>
CacheRoot "/usr/local/apache/proxy"
CacheSize 5
CacheGcInterval 4
CacheMaxExpire 24
CacheLastModifiedFactor 0.1
CacheDefaultExpire 1
NoCache a-domain.com another-domain.edu joes.garage-sale.com
Reverse Proxy
ProxyRequests Off
ProxyPass /foo http://foo.example.com/bar
ProxyPassReverse /foo http://foo.example.com/bar
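The same kind of mapping could also be sketched with the [P] flag of RewriteRule instead of
ProxyPass. This assumes that mod_rewrite is compiled into the server alongside mod_proxy; the
pattern and target roughly mirror the /foo example above and are not a recommendation.

```apache
RewriteEngine On
# Proxy every request under /foo/ to the back-end; $1 carries the rest of the path.
RewriteRule ^/foo/(.*)$ http://foo.example.com/bar/$1 [P]
# Still needed so that redirects from the back-end point at the proxy.
ProxyPassReverse /foo/ http://foo.example.com/bar/
```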
You can control who can access your proxy via the normal <Directory> control block using
the following example:
<Directory proxy:*>
Order Deny,Allow
Deny from all
Allow from yournetwork.example.com
</Directory>
A <Files> block will also work, and is the only method known to work for all possible
URLs in Apache versions earlier than 1.2b10.
For more information, see mod_access.
Strictly limiting access is essential if you are using a forward proxy (using the
ProxyRequests directive). Otherwise, your server can be used by any client to access
arbitrary hosts while hiding the client's true identity. This is dangerous both for your
network and for the Internet at large. When using a reverse proxy (using the ProxyPass
directive with ProxyRequests Off), access control is less critical because clients can only
contact the hosts that you have specifically configured.
There is an optional patch to the proxy module to allow Netscape-like hostname shortcuts to be
used. It's available from the contrib/patches/1.2
directory on the Apache Web site.
If a file of a particular type does not download correctly via FTP, you probably don't have
that file type defined as application/octet-stream in your proxy's mime.types configuration
file. A useful line can be
application/octet-stream bin dms lha lzh exe class tgz taz
In the rare situation where you must download a specific file using the FTP ASCII
transfer method (while the default transfer is in binary mode), you can
override mod_proxy's default by suffixing the request with ;type=a to force an
ASCII transfer. (FTP directory listings are always executed in ASCII mode, however.)
An FTP URI is interpreted relative to the home directory of the user who is logging in.
Alas, to reach higher directory levels you cannot use /../, as the dots are interpreted by the
browser and not actually sent to the FTP server. To address this problem, the so-called
"Squid %2f hack" was implemented in the Apache FTP proxy; it is a solution which is
also used by other popular proxy servers like the Squid
Proxy Cache. By prepending /%2f to the path of your request, you can make such a proxy
change the FTP starting directory to / (instead of the home directory).
Example: To retrieve the file /etc/motd, you would use the URL
ftp://user@host/%2f/etc/motd
To log in to an FTP server by username and password, Apache uses different strategies. In
the absence of a user name and password in the URL, Apache sends an anonymous login to
the FTP server, i.e.,
user: anonymous
password: apache_proxy@
This works for all popular FTP servers which are configured for anonymous access.
For a personal login with a specific username, you can embed the user name into the URL, like
in: ftp://username@host/myfile. If the FTP server asks for a password
when given this username (which it should), then Apache will reply with a [401 Authorization
required] response, which causes the browser to pop up the username/password dialog. Upon
entering the password, the connection attempt is retried, and if successful, the requested
resource is presented. The advantage of this procedure is that your browser does not display
the password in cleartext (which it would if you had used ftp://username:password@host/myfile
in the first place).
Note that a password transmitted in this way is not encrypted. It travels between your
browser and the Apache proxy server as a base64-encoded cleartext string, and between the
Apache proxy and the FTP server as plaintext. You should therefore think twice before
accessing your FTP server via HTTP (or before accessing your personal files via FTP at all!).
When using insecure channels, an eavesdropper might intercept your password on its way.
If you're using the ProxyBlock or NoCache directives, hostnames' IP addresses are looked up
and cached during startup for later match tests. This may delay startup by a few seconds (or
more), depending on the speed with which the hostname lookups occur.
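For illustration, a configuration such as the following would trigger those startup lookups.
The host names are placeholders chosen for this sketch, not hosts taken from a real setup.

```apache
# Placeholder hosts; requests to these sites are refused by the proxy.
ProxyBlock ads.example.com tracker.example.net
# Documents from this placeholder host are fetched but never cached.
NoCache intranet.example.com
```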
Apache can be used with a SOCKS proxy. Build Apache with the rule SOCKS4=yes in your
Configuration file, and follow the instructions there. SOCKS5 capability can be added in a
similar way; there is no SOCKS5 rule yet, so use the EXTRA_LDFLAGS definition, or build
Apache normally and run it with the runsocks wrapper provided with SOCKS5, if your OS
supports dynamically linked libraries.
Some users have reported problems when using SOCKS version 4.2 on Solaris. The problem was
solved by upgrading to SOCKS 4.3.
Remember that you'll also have to grant access to your Apache proxy machine by permitting
connections on the appropriate ports in your SOCKS daemon's configuration.
An Apache proxy server situated in an intranet needs to forward external requests through
the company's firewall (for this, configure the ProxyRemote directive to forward the
respective scheme to the firewall proxy). However, when it has to access resources within
the intranet, it can bypass the firewall and contact those hosts directly. The NoProxy
directive is useful for specifying which hosts belong to the intranet and should be
accessed directly.
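A minimal sketch of such an intranet setup follows. The firewall proxy address
firewall.example.com:8080, the domain .example.com, and the subnet are assumptions made for
illustration only.

```apache
# Forward outbound HTTP and FTP requests to the (hypothetical) firewall proxy.
ProxyRemote http http://firewall.example.com:8080
ProxyRemote ftp  http://firewall.example.com:8080
# Hosts in this domain or subnet are contacted directly, bypassing ProxyRemote.
NoProxy .example.com 192.168.112.0/21
```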
Users within an intranet tend to omit the local domain name from their WWW requests, thus
requesting "http://somehost/" instead of "http://somehost.my.dom.ain/".
Some commercial proxy servers let them get away with this and simply serve the request,
implying a configured local domain. When the ProxyDomain directive
is used and the server is configured for proxy service, Apache
can return a redirect response and send the client to the correct, fully qualified, server
address. This is the preferred method since the user's bookmark files will then contain fully
qualified hosts.
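As a sketch, assuming the intranet's domain is the placeholder .example.com, the following
line would make the proxy answer a request for http://somehost/ with a redirect to
http://somehost.example.com/:

```apache
# Placeholder domain appended to unqualified host names in proxy requests.
ProxyDomain .example.com
```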