curl – get only numeric HTTP response code
Most browsers have developer tools or plugins that show the HTTP status code along with the other request and response headers. For automation, though, you are more likely to use tools such as curl, httpie or the Python requests module. In this post, we will see how to use curl to get only the numeric response code from an HTTP response.
1. First attempt – use the ‘-I’ option to fetch only the HTTP headers.
The first line will show the response code.
daniel@linubuvma:~$ curl -I http://www.google.com
HTTP/1.1 200 OK
Date: Sun, 09 Apr 2017 06:45:00 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
domain=.google.com; Htty
Transfer-Encoding: chunked
Accept-Ranges: none
Vary: Accept-Encoding
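Since the status line is always the first line of the header dump, the numeric code can be cut out of it with a small filter. The following is a minimal sketch, not part of curl itself; status_from_headers is a hypothetical helper name chosen for illustration.

```shell
# status_from_headers (hypothetical helper): read response headers on stdin
# and print only the numeric code from the status line.
status_from_headers() {
    # The status line looks like "HTTP/<version> <code> <reason>",
    # so the second whitespace-separated field of line 1 is the code.
    awk 'NR == 1 { print $2 }'
}

# Usage (needs network access):
#   curl -s -I http://www.google.com | status_from_headers
```

The ‘-s’ option in the usage line only hides curl's progress meter, which would otherwise appear once the output is piped.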
But does ‘-I’ work all the time? No, some web services have problems with HTTP HEAD requests. Let us try amazon.com, for instance:
daniel@linubuvma:~$ curl -I https://www.amazon.com
HTTP/1.1 503 Service Unavailable
Content-Type: text/html
Content-Length: 6450
Connection: keep-alive
Server: Server
Date: Sun, 09 Apr 2017 06:50:02 GMT
Set-Cookie: skin=noskin; path=/; domain=.amazon.com
Vary: Content-Type,Host,Cookie,Accept-Encoding,User-Agent
X-Cache: Error from cloudfront
Via: 1.1 a8dc63f9c2d878908bcd53ddc78da27f.cloudfront.net (CloudFront)
daniel@linubuvma:~$ curl -I -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" https://www.amazon.com
HTTP/1.1 405 MethodNotAllowed
Content-Type: text/html; charset=ISO-8859-1
Connection: keep-alive
Server: Server
Date: Sun, 09 Apr 2017 06:49:47 GMT
Set-Cookie: skin=noskin; path=/; domain=.amazon.com
Strict-Transport-Security: max-age=47474747; includeSubDomains; preload
x-amz-id-1: N2RDV79SBB791BTYG2K8
allow: POST, GET
Vary: Accept-Encoding,User-Agent
X-Frame-Options: SAMEORIGIN
X-Cache: Error from cloudfront
Via: 1.1 f3459bfce7b7b7b8e8bfb19301f39bef.cloudfront.net (CloudFront)
In the first attempt, amazon.com was apparently blocking automated checks by looking at the User-Agent request header, and the response code was 503. I had to trick it by changing the User-Agent with the ‘-A’ option. Once I did that, the response became 405 (Method Not Allowed): the web server does not accept the HEAD request that ‘-I’ sends, and its ‘allow’ header lists only POST and GET.
2. Second attempt – use the ‘-w’ option to write out a specific variable.
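The 405 response above carries an ‘allow’ header, which is what tells us HEAD is not accepted. As a rough sketch, a script could check that header instead of reading it by eye; allows_method is a hypothetical helper name, and the check assumes the server actually sends an ‘allow’ header, which not every server does.

```shell
# allows_method (hypothetical helper): read response headers on stdin and
# succeed (exit 0) only if the Allow header lists the given HTTP method.
allows_method() {
    # Strip carriage returns first, since HTTP header lines end in CRLF.
    tr -d '\r' | awk -v m="$1" '
        BEGIN { found = 1 }                  # exit status 1 unless we match
        tolower($0) ~ /^allow:/ {
            sub(/^[^:]*:/, "")               # drop the header name
            n = split($0, parts, ",")        # methods are comma-separated
            for (i = 1; i <= n; i++) {
                gsub(/[ \t]/, "", parts[i])  # trim surrounding blanks
                if (toupper(parts[i]) == toupper(m)) found = 0
            }
        }
        END { exit found }'
}

# Usage (needs network access; <user-agent> is a placeholder):
#   curl -s -I -A "<user-agent>" https://www.amazon.com | allows_method HEAD \
#       || echo "HEAD is not allowed, use GET instead"
```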
curl has a ‘-w’ (write-out) option that prints selected variables to stdout after a transfer completes. Some of the available variables are content_type, size_header and http_code. In our case, we are interested in http_code, which holds the numerical response code from the last HTTP transfer. Let us try it:
daniel@linubuvma:~$ curl -I -s -w "%{http_code}\n" -o /dev/null http://www.google.com
200
We use ‘-I’ to request only the headers, ‘-o /dev/null’ to discard them, and ‘-s’ to silence the progress meter, so only http_code is printed to stdout. This is the most efficient way of doing it, as we are not transferring the whole page. If the ‘-I’ option does not work, though, as with amazon.com, we can drop ‘-I’ as follows:
daniel@linubuvma:~$ curl -s -w "%{http_code}\n" -o /dev/null -A "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" https://www.amazon.com
200
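The same write-out format can print several of the variables mentioned earlier in one go. The sketch below is only an illustration: fetch_summary and print_summary are hypothetical helper names, and the tab-separated layout is an arbitrary choice made so that a content type containing spaces stays in one field.

```shell
# fetch_summary (hypothetical helper): print status code, header size and
# content type for a URL as one tab-separated line.
fetch_summary() {
    curl -s -o /dev/null -w '%{http_code}\t%{size_header}\t%{content_type}\n' "$1"
}

# print_summary (hypothetical helper): label the fields of each summary line.
print_summary() {
    # content type comes last so that read puts the whole remainder into it
    while IFS="$(printf '\t')" read -r code size ctype; do
        printf 'status=%s header_bytes=%s type=%s\n' "$code" "$size" "$ctype"
    done
}

# Usage (needs network access):
#   fetch_summary http://www.google.com | print_summary
```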
The ‘-w’ approach is very useful when you are writing scripts that need only the HTTP status code.
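As a small sketch of how such a script might look, the helpers below wrap the curl command and turn the numeric code into a coarse label. Both http_status and classify_status are hypothetical names, and the grouping by first digit is just one possible convention.

```shell
# http_status (hypothetical helper): print only the numeric status code.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# classify_status (hypothetical helper): map a numeric code to a coarse label.
classify_status() {
    case "$1" in
        2??) echo ok ;;
        3??) echo redirect ;;
        4??) echo client_error ;;
        5??) echo server_error ;;
        *)   echo unknown ;;  # includes 000, printed when no HTTP response arrived
    esac
}

# Usage (needs network access):
#   classify_status "$(http_status http://www.google.com)"
```

Keeping the network call and the classification in separate functions means the branching logic can be exercised without contacting any server.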
References –
https://curl.haxx.se/docs/manpage.html
https://superuser.com/questions/272265/getting-curl-to-output-http-status-code