Tuesday, October 11, 2011

HTTP, Web Scraping and Python - Part 2

In Part 1, I talked about User Agents. Today we'll check whether what I said is actually true, i.e. do servers really see that User Agent value? Do they really identify you with it?


Last time we proposed this as a hypothesis. Today we'll see whether it's a fact or not. ;)

We'll do a little experiment. For this you'll need Firefox, so switch to Firefox if you haven't already.

Now, go to this url. It's an add-on which allows you to switch User Agents. Just download it and restart Firefox.

Oye. Stop. Go and download that add-on before you move on! Such a lazy person you are! :) Just kidding ;) (but you'll really learn a lot more if you do this)


Now go to
Tools > Default User Agent > Edit User Agents.
Select New > New User Agents.


Now fill in random crap in each text field. Yes, you heard me. Fill them all with crap!! Utter nonsense, or write poetry. At the very least, change the User Agent field.


Okay. Done? Select Okay. Okay.
Now go again to
Tools > Default User Agent, and select the user agent you made.


Okay, done?


Now go back to the same add-on page. Notice anything at the top? It says: "To try the thousands of add-ons available here, download Mozilla Firefox, a fast, free way to surf the Web!"

Huh?? Download Firefox? But I am in Firefox!!! :)
Okay cool.


Now go again to
Tools > {Your User Agent Name}, and select the default user agent.


Now reload the page again. Bam! That banner is gone!!


This shows that the value of the User Agent string does matter, and that servers do read that value to identify who you are! Why this is important to know, you'll see in a later part of the series...
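If you'd rather repeat the experiment from Python than from the browser, here is a minimal sketch. It does not talk to the real add-ons site. Instead it starts a tiny local server that I made up for illustration (`BannerHandler`, `BANNER`, `fetch` and `run_experiment` are all hypothetical names, and the banner text is only a paraphrase), which shows a "download Firefox" banner unless the User-Agent header contains the word "Firefox". A real server's check is surely more involved, so treat this as a toy model of the idea, not how Mozilla does it.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Paraphrased banner text, used only for this demo.
BANNER = "To try the add-ons available here, download Mozilla Firefox!"


class BannerHandler(BaseHTTPRequestHandler):
    """Hypothetical server: shows a banner unless the client claims to be Firefox."""

    def do_GET(self):
        # This is the value the browser (or our script) sent in its request headers.
        user_agent = self.headers.get("User-Agent", "")
        body = "Welcome back!" if "Firefox" in user_agent else BANNER
        payload = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(url, user_agent):
    """GET the url while sending the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    # An empty ProxyHandler makes sure we talk to the local server directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return response.read().decode("utf-8")


def run_experiment():
    """Ask the same page twice, once as 'Firefox' and once as nonsense."""
    server = HTTPServer(("127.0.0.1", 0), BannerHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    try:
        as_firefox = fetch(
            url, "Mozilla/5.0 (X11; Linux x86_64; rv:7.0) Gecko/20100101 Firefox/7.0"
        )
        as_nonsense = fetch(url, "roses are red, violets are blue")
    finally:
        server.shutdown()
        server.server_close()
    return as_firefox, as_nonsense


if __name__ == "__main__":
    firefox_reply, nonsense_reply = run_experiment()
    print("Firefox UA  ->", firefox_reply)
    print("Nonsense UA ->", nonsense_reply)
```

Same page, same machine, and the only thing that changes between the two requests is the User-Agent header. That is exactly what we did by hand with the add-on.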

Hence, Proved! :)

PS - This idea came from a problem I had. For the past few months Gmail had not been recognizing my Firefox, and it always used to load the basic mode rather than the standard mode. I tried googling for it, but I didn't understand what was going on, or what to search for! It was a weird problem! I knew about User Agents, but it was all theory!! :) No practicals about it, hence it never clicked that my user agent might be the problem!

While writing Part 1, I noticed that the User Agent header my browser was sending was some rubbish, and it just struck me!! I set it back to default and everything was fine again :)

Moral of this post - Don't just believe what you read anywhere, no matter who says it. Try it out on your own and really test it before you believe it! Ask questions! Challenge what you learn! Learn the same thing in different ways. And you'll get more awesome every day :)

Have Fun! :)

1 comment:

  1. Good explanation. Another example => many websites check User Agents to load scripts or stylesheets that work in that browser, which helps solve cross-browser compatibility problems as well. :-)
