LinuxQuestions.org


Jinouchi 12-14-2007 09:05 AM

Wget: Error 403- Can I get around this?
 
Is there a way I can get around an error 403: Forbidden when using wget?

theNbomr 12-14-2007 10:32 AM

If this is a problem related only to wget, but not other browsers, you may be able to spoof the site by using the wget '-U' option, giving it a user-agent description of another browser.
Code:

wget -U 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) Gecko/20070802 SeaMonkey/1.1.4' http://yourURL.com
To see what a working browser sends as a user-agent header, you can run netcat on your localhost, and have a browser try to fetch a page from it:
Code:

nc -l -p 8000 -v
Now, in your browser, go to 'http://localhost:8000'. Observe the user-agent header received by netcat. Cut and paste the string into the wget -U argument.
--- rod.
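A minimal sketch of pulling just the user-agent string out of what netcat receives, so it can be pasted straight into wget -U. The function name extract_user_agent is made up for illustration, and the nc syntax in the comment is the traditional one used above (some netcat variants want 'nc -l 8000' instead):

```shell
# Hypothetical helper: read a raw HTTP request on stdin and print
# only the value of its User-Agent header.
extract_user_agent() {
    # HTTP lines end in CR LF, so drop the CR first; then keep the
    # text after "User-Agent:" and stop at the first match.
    tr -d '\r' | sed -n 's/^[Uu]ser-[Aa]gent:[[:space:]]*//p' | head -n 1
}

# Possible usage (press Ctrl-C once the browser has connected):
#   nc -l -p 8000 | extract_user_agent
```

Quote the resulting string when passing it to wget, since it almost always contains spaces.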

Jinouchi 12-14-2007 11:27 AM

Ty, I didn't know you could do that. (Unfortunately, it still didn't work.)

OK, so here's the user agent:

Code:

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.10) Gecko/20071115 Iceweasel/2.0.0.10 (Debian-2.0.0.10-0etch1)
Here's what happened. (In case you're wondering, I'm trying to download some wallpapers and don't want to fetch all x00+ individually.)
Code:

***@***:/dir$ wget -U= 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.10) Gecko/20071115 Iceweasel/2.0.0.10 (Debian-2.0.0.10-0etch1)' -r http://www.wizards.com/magic/images/mtgcom/wallpapers
--11:19:07--  http://mozilla/5.0%20(X11;%20U;%20Linux%20i686;%20en-US;%20rv:1.8.1.10)%20Gecko/20071115%20Iceweasel/2.0.0.10%20(Debian-2.0.0.10-0etch1)
          => `mozilla/5.0 (X11'
Resolving mozilla... failed: Name or service not known.
--11:19:07--  http://www.wizards.com/magic/images/mtgcom/wallpapers
          => `www.wizards.com/magic/images/mtgcom/wallpapers'
Resolving www.wizards.com... 64.223.12.31
Connecting to www.wizards.com|64.223.12.31|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.wizards.com/magic/images/mtgcom/wallpapers/ [following]
--11:19:08--  http://www.wizards.com/magic/images/mtgcom/wallpapers/
          => `www.wizards.com/magic/images/mtgcom/wallpapers/index.html'
Reusing existing connection to www.wizards.com:80.
HTTP request sent, awaiting response... 403 Forbidden
11:19:08 ERROR 403: Forbidden.


FINISHED --11:19:08--
Downloaded: 0 bytes in 0 files

Did I do it correctly?
(PS: if I did get this to work, I would of course use the --limit-rate and -w options.)

theNbomr 12-14-2007 12:43 PM

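Nope. The -U option does not need an '=' between it and the argument. With '-U= ' followed by a space, wget takes '=' as the user agent and treats the quoted string as another URL, which is why it tried to resolve 'mozilla'.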
--- rod.
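The argument splitting behind that error can be seen without wget at all. This is only an illustration: show_args is a made-up helper, the agent string is shortened, and example.com is a placeholder URL.

```shell
# Print each argument on its own line, in brackets, to show how the
# shell hands a command line to a program such as wget.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

show_args -U= 'Mozilla/5.0 (X11; U; Linux i686)' http://example.com/
# [-U=]
# [Mozilla/5.0 (X11; U; Linux i686)]
# [http://example.com/]
```

The quoted string arrives as a separate argument, so wget sees it as a URL. Either -U 'agent string' or --user-agent='agent string' keeps the string attached to the option.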

theNbomr 12-14-2007 12:56 PM

I was not able to use Mozilla to access the URL you cited, either. Seems to be a different problem.
--- rod.

Jinouchi 12-22-2007 11:33 PM

Well, that worked for a different website. Anyone got any other suggestions?

matthewg42 12-23-2007 01:12 AM

If you go to the URL you specified using Firefox, you get the same error (at least I do). wget won't fix what doesn't work in a browser either.
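To compare what the server answers for different clients, it can help to pull the final status code out of wget's log. A rough sketch, assuming the log wording shown earlier in this thread ("HTTP request sent, awaiting response... NNN"); last_status is a hypothetical name, and other wget versions or locales may word the line differently:

```shell
# Hypothetical helper: read a wget log on stdin and print the last
# HTTP status code it reports (the one after any redirects).
last_status() {
    sed -n 's/^HTTP request sent, awaiting response\.\.\. \([0-9][0-9][0-9]\).*/\1/p' | tail -n 1
}

# Possible usage (wget writes its log to stderr, hence 2>&1):
#   wget -U firefox http://yourURL.com 2>&1 | last_status
```

If this prints 403 even with the exact user agent of a browser that also gets refused, the restriction is on the server side and changing -U will not help.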

arivendu 06-11-2008 05:36 AM

Can't get rid of ERROR 403
 
Quote:

Originally Posted by theNbomr (Post 2990635)
If this is a problem related only to wget, but not other browsers, you may be able to spoof the site by using the wget '-U' option, giving it a user-agent description of another browser.
Code:

wget -U 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.6) Gecko/20070802 SeaMonkey/1.1.4' http://yourURL.com
To see what a working browser sends as a user-agent header, you can run netcat on your localhost, and have a browser try to fetch a page from it:
Code:

nc -l -p 8000 -v
Now, in your browser, go to 'http://localhost:8000'. Observe the user-agent header received by netcat. Cut and paste the string into the wget -U argument.
--- rod.


I did as you advised, but I am still getting ERROR 403.
I used this user agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
Help needed.

Jinouchi 06-12-2008 12:31 PM

Well, there was one specific site that wouldn't accept a full user-agent string like the one you posted. I tried simply
Code:

wget -U firefox http://xxx.xxx.com/
and, lol, surprisingly, it worked! You should try that if you're still stuck. Or maybe I set the user agent as "mozilla" instead of "firefox"; I can't remember. Just remember, the user agent you're trying to use has to be able to access the website in the first place.
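Trying short agent names one after another can be scripted. This is a sketch under assumptions, not a tested recipe for any particular site: fetch and first_working_agent are hypothetical names, and --spider (check the URL without downloading) may be answered differently from a normal download by some servers.

```shell
# Check one URL with one user agent; succeeds only if wget gets a
# successful response. Kept separate so it is easy to swap out.
fetch() {
    wget -q --spider -U "$1" "$2"
}

# Try each candidate agent in order and print the first one that works.
first_working_agent() {
    url=$1
    shift
    for agent in "$@"; do
        if fetch "$agent" "$url"; then
            printf '%s\n' "$agent"
            return 0
        fi
    done
    return 1   # no candidate was accepted
}

# Possible usage:
#   first_working_agent http://xxx.xxx.com/ firefox mozilla
```

If nothing is printed, none of the candidates got through, which again points at something other than the user agent.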

arivendu 06-12-2008 11:53 PM

I am still stuck
 
Quote:

Originally Posted by Jinouchi (Post 3182747)
Well, there was one specific site that wouldn't accept a full user-agent string like the one you posted. I tried simply
Code:

wget -U firefox http://xxx.xxx.com/
and, lol, surprisingly, it worked! You should try that if you're still stuck. Or maybe I set the user agent as "mozilla" instead of "firefox"; I can't remember. Just remember, the user agent you're trying to use has to be able to access the website in the first place.

Ty, I tried your suggestion, but I am still not able to access the site through wget, though the same URL works in my browser, Firefox 2.0.

theNbomr 06-13-2008 12:16 AM

Since your browser ID contains whitespace characters, you must enclose it in quotes, "double" or 'single'.
--- rod.

EDIT:
Oops. Misread that post, thinking the URL was part of the browser ID. Never mind.

Jinouchi 06-13-2008 01:46 PM

Would you mind sharing the URL? That way other people could work on the same situation. If not, that's ok.

ras 09-15-2008 11:37 AM

Thanks!
 
Just wanted to say THANKS. I've been having this problem, and using -U Mozilla works great.

(I wrote a quick script to update my web site with curl if the version on my disk differs from a cached copy. Much nicer to run a command than all these strange gooey programs...)
-r

glarrain 10-26-2009 09:20 PM

Thanks
 
Many thanks to those who have helped enormously!
wget -U firefox ...

So simple, so elegant!

yeseen 11-20-2010 06:10 PM

Well, I agree, that was a very bright answer that helped me: -U mozilla
Very good.


All times are GMT -5.