curl – get only numeric HTTP response code
Most browsers have developer tools or plugins that show the HTTP status code along with the other request and response headers. For automation, though, you are more likely to use tools such as curl, HTTPie, or the Python requests module. In this post, we will see how to use curl to get only the response code from an HTTP response.
1. First attempt – use the ‘-I’ option to fetch only the HTTP headers.
The first line will show the response code.
daniel@linubuvma:~$ curl -I http://www.google.com
HTTP/1.1 200 OK
Date: Sun, 09 Apr 2017 06:45:00 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
domain=.google.com; Htty
Transfer-Encoding: chunked
Accept-Ranges: none
Vary: Accept-Encoding
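To pull just the number out of header output like this in a script, the status line can be parsed with standard text tools. The following is a minimal sketch rather than part of the original commands: the function name status_from_headers is made up for illustration, and it assumes the header text arrives on stdin.

```shell
#!/bin/sh
# status_from_headers (hypothetical helper): read HTTP response headers on
# stdin and print the numeric status code from the last status line seen.
status_from_headers() {
    # Strip carriage returns first, since HTTP header lines end in CRLF.
    # The status line looks like "HTTP/1.1 200 OK", so the code is field 2.
    tr -d '\r' | awk '/^HTTP\// { code = $2 } END { if (code != "") print code }'
}

# Intended usage (needs network access):
#   curl -s -I http://www.google.com | status_from_headers
```

Taking the last status line rather than the first is a design choice for this sketch, so that output containing more than one header block (for example, when redirects are followed) yields the final code.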
But does this work all the time? No, some web services have problems with HTTP HEAD requests. Let us try amazon.com, for instance:
daniel@linubuvma:~$ curl -I https://www.amazon.com
HTTP/1.1 503 Service Unavailable
Content-Type: text/html
Content-Length: 6450
Connection: keep-alive
Server: Server
Date: Sun, 09 Apr 2017 06:50:02 GMT
Set-Cookie: skin=noskin; path=/; domain=.amazon.com
Vary: Content-Type,Host,Cookie,Accept-Encoding,User-Agent
X-Cache: Error from cloudfront
Via: 1.1 a8dc63f9c2d878908bcd53ddc78da27f.cloudfront.net (CloudFront)

daniel@linubuvma:~$ curl -I -A "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" https://www.amazon.com
HTTP/1.1 405 MethodNotAllowed
Content-Type: text/html; charset=ISO-8859-1
Connection: keep-alive
Server: Server
Date: Sun, 09 Apr 2017 06:49:47 GMT
Set-Cookie: skin=noskin; path=/; domain=.amazon.com
Strict-Transport-Security: max-age=47474747; includeSubDomains; preload
x-amz-id-1: N2RDV79SBB791BTYG2K8
allow: POST, GET
Vary: Accept-Encoding,User-Agent
X-Frame-Options: SAMEORIGIN
X-Cache: Error from cloudfront
Via: 1.1 f3459bfce7b7b7b8e8bfb19301f39bef.cloudfront.net (CloudFront)
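In the first attempt, amazon.com was blocking automated checks by looking at the User-Agent header, and the response code was 503, so I had to trick it by changing the user agent. Once I changed the user agent, I got 405: the web server does not accept the HEAD request that ‘-I’ sends, and the "allow: POST, GET" header in the response confirms that only POST and GET are permitted. Note that ‘-A’ takes the user-agent string itself, so the "User-Agent: " prefix in the command above is not needed.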
2. Second attempt – use the ‘-w’ option to write out a specific parameter.
curl has a ‘-w’ (write-out) option that defines what to print to stdout after a transfer completes. The available variables include content_type, size_header, and http_code. In our case, we are interested in http_code, which is the numerical response code from the last HTTP transfer. Let us try it:
daniel@linubuvma:~$ curl -I -s -w "%{http_code}\n" -o /dev/null http://www.google.com
200
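The same ‘-w’ format string can name several variables at once. As a rough sketch, the helper below prints the three variables mentioned above, one per line. The function name response_summary and the name=value output layout are assumptions made for this example, not something curl prescribes.

```shell
#!/bin/sh
# response_summary (hypothetical helper): print the status code, content type
# and header size for a URL, one "name=value" pair per line.
response_summary() {
    # -s silences the progress meter, -o /dev/null discards the body,
    # and -w prints the listed variables after the transfer completes.
    curl -s -o /dev/null \
        -w 'http_code=%{http_code}\ncontent_type=%{content_type}\nsize_header=%{size_header}\n' \
        "$1"
}

# Intended usage (needs network access):
#   response_summary http://www.google.com
```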
Here ‘-I’ requests only the headers, ‘-o /dev/null’ discards them, and ‘-w’ prints only http_code to stdout. This is by far the most efficient way of doing it, as we are not transferring the whole page. If the ‘-I’ option does not work, as with sites such as amazon.com, we can drop it, at the cost of downloading the full page before discarding it:
daniel@linubuvma:~$ curl -s -w "%{http_code}\n" -o /dev/null -A "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" https://www.amazon.com
200
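The two variants can be combined so that the cheaper HEAD request is tried first and the full GET is used only when needed. This is a minimal sketch under a few assumptions: the function name status_with_fallback is invented here, treating 501 like 405 is my own choice, and no user agent is set, so a site such as amazon.com may still need the ‘-A’ option shown above.

```shell
#!/bin/sh
# status_with_fallback (hypothetical helper): try a HEAD request first and
# repeat the check with GET only if the server rejects HEAD.
status_with_fallback() {
    url=$1
    # HEAD request: headers only, discarded; print just the status code.
    code=$(curl -I -s -o /dev/null -w '%{http_code}' "$url")
    case "$code" in
        405|501)
            # HEAD not allowed (405) or not implemented (501, an assumption
            # of this sketch): fall back to a full GET request.
            code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
            ;;
    esac
    printf '%s\n' "$code"
}

# Intended usage (needs network access):
#   status_with_fallback https://www.amazon.com
```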
The ‘-w’ approach is very useful when you are writing scripts that need only the HTTP status code.
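As an illustration of that kind of script, here is a small sketch that turns the status code into a pass/fail result. The helper names is_ok_status and check_url are hypothetical, and counting 2xx and 3xx as success is an assumption you may want to tighten.

```shell
#!/bin/sh
# is_ok_status (hypothetical helper): succeed for 2xx and 3xx codes only.
is_ok_status() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# check_url (hypothetical helper): report whether a URL answers with an
# acceptable status code, and return non-zero if it does not.
check_url() {
    # curl prints 000 when no HTTP response was received at all.
    code=$(curl -s -o /dev/null -w '%{http_code}' "$1")
    if is_ok_status "$code"; then
        echo "OK   $code $1"
    else
        echo "FAIL $code $1"
        return 1
    fi
}

# Intended usage (needs network access):
#   check_url http://www.google.com || echo "site needs attention"
```

Because check_url returns a non-zero exit status on failure, it can be used directly in shell conditionals or in a loop over a list of URLs.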
References –
https://curl.haxx.se/docs/manpage.html
https://superuser.com/questions/272265/getting-curl-to-output-http-status-code