How Websites Work
All of us use websites on a daily basis, but do we really know how they work? What actually happens the moment we type a website's name into the browser? It is still a mystery to many of us. Here I try to unveil that mystery in very simple language.
First let's see things in brief, and then we shall go into detail. The moment you type an address, e.g. http://facebook.com, the following happens:
- Your browser makes a DNS query and retrieves the IP address of the requested website.

      ; <<>> DiG 9.9.7-P3 <<>> facebook.com
      ;; global options: +cmd
      ;; Got answer:
      ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2594
      ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
      ;; OPT PSEUDOSECTION:
      ; EDNS: version: 0, flags:; udp: 512
      ;; QUESTION SECTION:
      ;facebook.com. IN A
      ;; ANSWER SECTION:
      facebook.com. 190 IN A 22.214.171.124
      ;; Query time: 7 msec
      ;; SERVER: 126.96.36.199#53(188.8.131.52)
      ;; WHEN: Sat Jan 27 14:43:14 IST 2018
      ;; MSG SIZE rcvd: 57

- Once the browser knows the IP address, it sends a message to that IP address with content like this:

      > GET / HTTP/1.1
      > Host: facebook.com
      > User-Agent: curl/7.54.0
      > Accept: */*

- The message sent by the browser is then interpreted by the web server listening on that IP address. (How is the web server able to understand some junk text sent to it? This is where protocols such as HTTP and HTTPS, and the RFCs that define them, e.g. RFC 2616, come into the picture.)
- The web server hands the request over to a web application for further processing. (Can the web server simply give it to a web app? Again, some protocols are involved, e.g. WSGI.)
- The web application / website processes the request and gives a response back to the web server.
- The web server then gives the response back to the browser:

      < HTTP/1.1 200 OK
      < Server: GitHub.com
      < Date: Sat, 27 Jan 2018 09:22:03 GMT
      < Content-Type: text/html; charset=utf-8
      < Content-Length: 25624
      < Vary: Accept-Encoding
      < Last-Modified: Tue, 06 Jun 2017 11:18:56 GMT
      < Vary: Accept-Encoding
      < Access-Control-Allow-Origin: *
      < Expires: Sat, 27 Jan 2018 09:32:03 GMT
      < Cache-Control: max-age=600
      < Accept-Ranges: bytes
      < X-GitHub-Request-Id: C84D:2CA71:3643970:4DA8875:5A6C44BB
      <!-- content -->

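The browser-side steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how a real browser is implemented: the function names `resolve`, `build_request` and `fetch` are made up for this example, and it assumes plain HTTP on port 80 with no redirects, TLS or caching.

```python
import socket


def resolve(hostname):
    """Step 1: ask the system resolver (which uses DNS) for an IPv4 address."""
    infos = socket.getaddrinfo(
        hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (ip, port) pair.
    return infos[0][4][0]


def build_request(host, path="/"):
    """Step 2: the plain-text message that is sent to the IP address."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: */*",
        "Connection: close",  # ask the server to close the socket when done
        "",                   # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(hostname, path="/"):
    """Steps 1, 2 and 6 together: resolve, send the request, read the reply."""
    ip = resolve(hostname)
    with socket.create_connection((ip, 80), timeout=5) as conn:
        conn.sendall(build_request(hostname, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch("facebook.com")` on a machine with network access should return raw bytes that start with a status line and headers similar to the response shown above. Many sites answer plain HTTP with a redirect to HTTPS, so the status may well be something other than `200 OK`.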
How can a web server interpret some junk text message?
Let's take the case of two people. If they do not speak a common language, they won't be able to communicate. The same principle applies here: both the client and the server must speak the same language. By language I mean protocols like HTTP, HTTPS, etc. Who defines these protocols? If different vendors each had their own protocol formats, it would not work. So there is a standards body called the Internet Engineering Task Force (IETF) that defines the protocols. It publishes documents called Requests for Comments (RFCs), which define how a system should behave. All browsers and web servers try to implement the behaviour described in the RFCs published by the IETF, and that is how they are able to communicate with each other.
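To make the "common language" idea concrete, here is a rough sketch of how a server could read the request text shown earlier. It assumes the HTTP/1.1 layout: a request line, then `Name: value` header lines, then an empty line. The function name `parse_request` is invented for this example, and real servers handle many more cases (malformed input, repeated headers, request bodies and so on).

```python
def parse_request(raw):
    """Split a raw HTTP/1.1 request into its parts."""
    # Headers and body are separated by an empty line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: METHOD SP TARGET SP VERSION
    method, target, version = lines[0].split(" ", 2)

    # Header lines: "Name: value". Names are case-insensitive, so lowercase them.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return method, target, version, headers, body


# The request from the earlier step, as bytes on the wire.
SAMPLE = (
    b"GET / HTTP/1.1\r\n"
    b"Host: facebook.com\r\n"
    b"User-Agent: curl/7.54.0\r\n"
    b"Accept: */*\r\n"
    b"\r\n"
)

method, target, version, headers, body = parse_request(SAMPLE)
```

Because the browser writes the message in this agreed layout and the server reads it the same way, the "junk text" turns back into a method (`GET`), a target (`/`) and a set of headers such as `host`.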
How do web servers and web applications communicate?
Just as the browser and the web server communicate through a protocol, the web server and the web application communicate through an agreed interface. For Python web applications, this is the Web Server Gateway Interface (WSGI).
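As a small illustration, a WSGI application is just a callable that receives the request as a dictionary (`environ`) and a `start_response` function, and returns the response body as an iterable of bytes. The sketch below uses the `wsgiref` module from Python's standard library. The names `application` and `serve` and the port number 8000 are choices made for this example only.

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """A minimal WSGI app: the web server calls this once per request."""
    # The server has already parsed the HTTP text into the environ dict.
    path = environ.get("PATH_INFO", "/")
    body = f"You asked for {path}\n".encode("utf-8")

    # Hand the status line and headers back to the server...
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    # ...and return the body as an iterable of byte strings.
    return [body]


def serve(port=8000):
    """Run the app behind wsgiref's simple development server."""
    with make_server("", port, application) as httpd:
        httpd.serve_forever()
```

Calling `serve()` and then opening http://localhost:8000/hello in a browser should show the text "You asked for /hello". Any other WSGI-compatible server could call the same `application` function without changes, which is the point of having a shared interface.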