Thursday, February 24, 2011

How to use wget behind a proxy


wget is a powerful yet handy tool for crawling a small part of the web. It makes restricted crawling much easier.

Using wget behind a proxy can be a bit tricky. Here is a sample command and the error you may see.

For Windows:
cmd> wget www.google.com
......................
....................
... failed: Connection Refused.

If you get the same error, try:

cmd> wget -U "Firefox" www.google.com

Sometimes the proxy only checks the user agent. You can try to get past that check with the -U option, which sets the user agent string wget sends. If this does not work either, try:

cmd> set http_proxy=myproxy.proxy.com
cmd> wget --proxy=on www.google.com
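
If you are on Linux or macOS instead of Windows, the same idea can be sketched in a shell. This is only a rough sketch: set_proxy is a made-up helper (not part of wget), the host is the placeholder from the example above, and port 8080 is an assumption, so use whatever your proxy really listens on.

```shell
# set_proxy is a hypothetical helper, not something wget provides.
# Usage: set_proxy HOST [PORT]   (8080 is only an assumed default port)
set_proxy() {
    host="$1"
    port="${2:-8080}"
    # wget reads the proxy location from the http_proxy environment variable.
    http_proxy="http://${host}:${port}/"
    export http_proxy
}

# Placeholder host from the example above.
set_proxy myproxy.proxy.com

# Show the value wget will pick up.
echo "$http_proxy"
```

After this, a plain "wget www.google.com" in the same shell should go through the proxy.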

You should now be able to download index.html. If it still fails because the proxy requires a username and password, try:

cmd> set http_proxy=myproxy.proxy.com
cmd> wget --proxy=on --proxy-user="domain\foo" --proxy-password="chocolate" www.google.com
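
A password typed on the command line can be seen by other users in the process list, so you may prefer to keep the proxy settings in a wgetrc file. Below is a minimal shell sketch for Linux or macOS. The file location is a made-up example, port 8080 is an assumption, and the credentials are the placeholders from above. It points the WGETRC environment variable at a separate file so that an existing ~/.wgetrc is not overwritten.

```shell
# Hypothetical location for a private wgetrc file; pick any path you like.
WGETRC="${TMPDIR:-/tmp}/wgetrc-proxy-example"
export WGETRC

# Write the proxy settings. The quoted 'EOF' keeps the backslash literal.
# Host, port 8080, user and password are placeholders and assumptions.
cat > "$WGETRC" <<'EOF'
use_proxy = on
http_proxy = http://myproxy.proxy.com:8080/
proxy_user = domain\foo
proxy_password = chocolate
EOF

# Make the file readable and writable by its owner only.
chmod 600 "$WGETRC"
```

With WGETRC exported, "wget www.google.com" in the same shell should read these settings without any proxy options on the command line.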

If this still does not work, then sorry, you are on your own!

Hope this is helpful. To find your proxy settings, check Internet Options or simply Google it.
