How to Use Robots.txt For Your Proxy Websites

If you are running a free web proxy and do not use a robots.txt file, you may find trouble coming your way from angry webmasters claiming that you have stolen their web content. If you do not understand why, then at least remember the term “proxy hijacking” well. When a visitor uses your free web proxy to retrieve another website’s content, that content is rewritten by the proxy script and automatically appears to be hosted on your proxy website. What used to live on other websites becomes content on your domain as soon as proxy users visit those third-party sites through your proxy.
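For example, depending on which script you run, a page such as http://example.com/article.html ends up being served from your own domain under URLs like the ones below (these assume the scripts’ default installations, with yourproxy.example standing in for your proxy’s domain):

    Glype:     http://yourproxy.example/browse.php?u=http%3A%2F%2Fexample.com%2Farticle.html
    Phproxy:   http://yourproxy.example/index.php?q=http%3A%2F%2Fexample.com%2Farticle.html
    CGI Proxy: http://yourproxy.example/nph-proxy.cgi/http/example.com/article.html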

Next, search engine bots from Google, Yahoo, MSN and the like crawl your proxy website’s content and index those automatically created, so-called stolen pages, attributing them to your proxy website. When the real owners and authors of that content search for it and find it listed under your web proxy (and not on their own websites), they get angry and start sending abuse emails to your hosting provider and to the search engines. Your proxy website can end up being removed from the search engine results, which may mean a great loss of web traffic and profits for you.

Some hosting companies will also suspend your hosting account, although this is less likely with specialized proxy hosting providers that are used to handling such complaints and know the real cause of the alleged abuse. If you are using AdSense or any other advertising network to monetize your web proxy, these complainers may even go as far as trying to get your AdSense account banned by reporting you as a spammer who is publishing duplicate content.

If you do not know which web proxy script you are using but you know you got it for free, then most likely you are running one of the three big proxy scripts: CGI Proxy, Phproxy or Glype. For convenience, here is a sample robots.txt that works with their default installations:

    User-agent: *
    Disallow: /browse.php
    Disallow: /nph-proxy.pl/
    Disallow: /nph-proxy.cgi/
    Disallow: /index.php?q*

Copy the above into a robots.txt file and upload it to the root directory of each proxy website. Creating proper robots.txt files is an often forgotten but essential step for many proxy owners, especially those who run large proxy networks consisting of hundreds of web proxies.
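If you run many proxies, you can check each deployment automatically instead of eyeballing every robots.txt by hand. The sketch below uses Python’s built-in urllib.robotparser to fetch each live robots.txt and confirm that the default proxy entry points are disallowed; the domain names are placeholders for your own proxy sites.

    import urllib.robotparser

    # Placeholder domains; replace with your own proxy websites.
    proxy_sites = [
        "http://proxy1.example.com",
        "http://proxy2.example.com",
    ]

    # Entry points that the sample robots.txt above is meant to block.
    # Note: urllib.robotparser does not implement the "*" wildcard
    # extension, so the /index.php?q* rule cannot be verified this way
    # even though crawlers such as Googlebot do honor wildcards.
    blocked_paths = [
        "/browse.php?u=http://example.com/",
        "/nph-proxy.pl/http/example.com/",
        "/nph-proxy.cgi/http/example.com/",
    ]

    for site in proxy_sites:
        parser = urllib.robotparser.RobotFileParser()
        parser.set_url(site + "/robots.txt")
        parser.read()  # fetch and parse the live robots.txt
        for path in blocked_paths:
            allowed = parser.can_fetch("*", site + path)
            status = "NOT blocked" if allowed else "blocked"
            print(f"{site}{path}: {status}")

Run it after every new proxy deployment; any line reporting “NOT blocked” means that site is still exposing proxied pages to crawlers.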