If you ask an average web developer if HTTP headers are dangerous, they will very likely say they are not. If you ask them if HTTP headers can be trusted, they may say, "some, such as Host, are reliable, whereas others like Referer are not". If you ask them if users can set HTTP headers themselves, many will say, "No way!".

Here is the truth: HTTP headers can be set by users and they can be very dangerous if you are using their values anywhere in the application logic, or are writing or storing (and eventually writing) their values anywhere.

Let's find out why blindly trusting HTTP headers is dangerous.

Do not trust HTTP headers

Most of your users will be interacting with your website or application using a browser or a mobile app. These users, under normal circumstances, do not have the ability to set HTTP headers. However, some curious folks might use a command-line tool like curl, a REST client like Insomnia, or an interception proxy like Burp to interact with your web application, and they can set any HTTP header they like. Let's refer to this second category of folks as "hackers".

Your first thought might be, "Wouldn't these hackers be fooling themselves by sending random and incorrect request headers? The web application is going to send them improper responses if the headers are not right. Why would anyone want improper and incorrect responses?"

Incorrect and "invalid" headers are excellent vectors for hacking web applications. Hackers don't care if the response they got was a broken or an incomplete page, or if their request was rejected outright, as long as their payload was delivered and it did what it was supposed to do.

Did you think HTTP headers were "low-level" data in the HTTP protocol, inaccessible to the user, and therefore safe? Think again. Here are some of the ways HTTP headers can cause you big trouble.

1. Erroneous data

The least bad of the bad things that can happen from blindly trusting HTTP headers is erroneous data. Depending on what is collected and the sample size, it may have no, little, or devastating effect.

A very simple example is the User-Agent header. There is no guarantee that the user is actually using the client they say they are using.

One can easily make a request like this:

GET / HTTP/1.1
User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Unless you verify the IP address of the client, there is no way you can say for sure the request was made by Googlebot.
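One way to do that verification is a reverse DNS lookup followed by a forward lookup that must map back to the same IP. The sketch below is a rough illustration, not a definitive implementation: the function names are made up for this example, and the hostname suffixes are an assumption that you should confirm against the crawler operator's own documentation.

```javascript
const dns = require('dns').promises;

// Assumed suffixes for the crawler's reverse-DNS hostnames; confirm them
// against the crawler operator's current documentation before relying on them.
const CRAWLER_SUFFIXES = ['.googlebot.com', '.google.com'];

// Pure check: does a reverse-DNS hostname belong to an expected domain?
function isCrawlerHostname(hostname) {
  const name = String(hostname).toLowerCase().replace(/\.$/, '');
  return CRAWLER_SUFFIXES.some((suffix) => name.endsWith(suffix));
}

// Reverse-resolve the IP, check the domain, then forward-resolve the
// hostname and confirm that it maps back to the same IP.
// Note: IPv6 addresses are compared as plain strings here.
async function isVerifiedCrawler(ip) {
  try {
    const hostnames = await dns.reverse(ip);
    for (const hostname of hostnames) {
      if (!isCrawlerHostname(hostname)) continue;
      const records = await dns.lookup(hostname, { all: true });
      if (records.some((record) => record.address === ip)) return true;
    }
  } catch (err) {
    // DNS failure: treat the client as unverified.
  }
  return false;
}
```

The important detail is that `ip` must be the address of the actual connection, not something read from a request header.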

Similarly, Date is not the time when the request was made; it is just what the client claims. The real time should be determined by the backend code.

2. Privilege escalation

If the data from HTTP headers is used for enforcing access control systems, the odd erroneous data can turn malevolent.

X-Forwarded-For and Referer are sometimes used for implementing rudimentary access control systems. It is often done out of ignorance, and sometimes as a form of security through obscurity. The outcome is always the same, and the impact varies according to what access was controlled.

An example of X-Forwarded-For being used for authorization is checking if the value is a certain IP address, or it is in a list of IP addresses, and then granting access to a resource.

if (req.headers['x-forwarded-for'] === '192.168.1.10') res.send(secret);
else res.status(404).send('Not Found');

This looks foolproof.

However, there is always a level of uncertainty in how your reverse proxy setting or web framework or the application logic will handle unexpected headers.

A reverse proxy acts as a gateway to one or more websites. It faces the Internet while the web servers are hidden behind it. Reverse proxies make it possible to host multiple websites on a single server, and they help with caching and managing load, among other things. nginx and HAProxy are two popular examples of reverse proxies.

i. What if the client unexpectedly sets the X-Forwarded-For header (normally set by the reverse proxy)?

GET / HTTP/1.1
X-Forwarded-For: 192.168.1.10

ii. What if the client sets the X-Forwarded-For header twice? The proxy, the framework you are using, or your application may strip off the first instance, but the second one might still make it through.

GET / HTTP/1.1
X-Forwarded-For: 192.168.1.10
X-Forwarded-For: 192.168.1.10

Successfully sneaking in X-Forwarded-For: 192.168.1.10 effectively convinces your app that the user is making the request from 192.168.1.10.

Now your app will send the secret in response to this request, because it believes the request came from 192.168.1.10.

You should rigorously test and check your application's behavior and all the components involved in determining the IP address of the user. Do not trust X-Forwarded-For unless you have tried every possible way to hack it.
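As a rough sketch of a more defensive way to determine the client IP: only look at X-Forwarded-For when the connection itself comes from a proxy you operate, and then read the chain from the right. The proxy addresses and the function name below are hypothetical, and the sketch assumes your proxy appends the address it saw to the header rather than passing the client's value through untouched.

```javascript
// Hypothetical list of proxy addresses that you operate yourself.
const TRUSTED_PROXIES = new Set(['10.0.0.1', '10.0.0.2']);

// remoteAddress: the socket peer address (it cannot be set via a header).
// xffValues: every X-Forwarded-For header value received, in order.
function clientIp(remoteAddress, xffValues) {
  // The request did not arrive via one of our proxies: ignore the header.
  if (!TRUSTED_PROXIES.has(remoteAddress)) return remoteAddress;

  // Flatten repeated headers and comma-separated lists into one chain.
  const chain = xffValues
    .flatMap((value) => String(value).split(','))
    .map((entry) => entry.trim())
    .filter(Boolean);

  // Walk from the right: skip entries that are our own proxies; the first
  // entry that is not one of them is the address our proxy actually saw.
  for (let i = chain.length - 1; i >= 0; i--) {
    if (!TRUSTED_PROXIES.has(chain[i])) return chain[i];
  }
  return remoteAddress;
}
```

With this approach, a spoofed `192.168.1.10` sent by the client ends up on the left of the chain, where it is never read. Even so, an IP address is a weak basis for granting access to a secret.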

Don't trust Referer either.

Normally, it contains the URL of the page that linked to the requested page. However, the Referer value is as unreliable as any other header.

A request like this to your website:

GET / HTTP/1.1
Referer: https://www.cia.gov/news-information/blog

will make you think the CIA has links to your website on its blog. In fact, referrer spam is a thing.

Ever been asked by Quora to register or log in when you click on the link to a question on its website? Copy the title of the question, paste it into Google search, and click on the link in the search result; the "blocked" page will load now. They are using the Referer header to determine where the user came from. I am sure it is not a security lapse; it's more like a first-question-free-for-those-who-came-via-Google-search functionality.

When enabling URL rewrites, X-Original-URL and X-Rewrite-URL are used by some frameworks to determine the original URL of a request. In such cases a direct request to an access-controlled path like this may be denied:

GET /secrets/43234 HTTP/1.1
HOST: example.com

HTTP/1.1 403 Forbidden
...
Access is denied

However, a request like this may bypass the access control and lead to resource access.

GET /lolwat.html HTTP/1.1
HOST: example.com
X-Original-URL: /secrets/43234

HTTP/1.1 200 OK
...
FCKGW-RHQQ2-YXRKT-8TG6W-2B7Q8

/lolwat.html is not an access-controlled file (this file doesn't exist on your server in the first place), so this request will get through the access control component. However, if there is a bug in your framework or the application code where X-Original-URL overrides the actual request path /lolwat.html, the resource for /secrets/43234 will be served as the response for this request.
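A minimal sketch of one defence, assuming you control the code that sits in front of the router: drop the path-override headers from incoming requests, and make sure the access check and the routing both use the path from the request line. The `handle` and `isProtected` functions and the `/secrets/` rule are hypothetical stand-ins for your own application.

```javascript
// Headers that some stacks honor as a replacement for the request path.
const PATH_OVERRIDE_HEADERS = ['x-original-url', 'x-rewrite-url'];

// Return a copy of the headers with every path-override header removed.
function stripPathOverrides(headers) {
  const clean = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!PATH_OVERRIDE_HEADERS.includes(name.toLowerCase())) {
      clean[name] = value;
    }
  }
  return clean;
}

// Hypothetical rule: everything under /secrets/ needs a logged-in user.
function isProtected(path) {
  return path.startsWith('/secrets/');
}

// The access check and the routing both use the request-line path.
function handle(request) {
  const headers = stripPathOverrides(request.headers);
  if (isProtected(request.path) && !request.user) {
    return { status: 403, path: request.path, headers };
  }
  return { status: 200, path: request.path, headers };
}
```

If your own reverse proxy legitimately sets one of these headers, the stripping belongs at the proxy, applied to whatever the client sent, before the proxy writes its own value.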

3. Cross-site scripting

Now start the obviously bad things.

HTTP headers are user inputs; treat them like any other user input: validate and sanitize.

Imagine you have a comment system on your website and you show the IP address of anonymous commenters to discourage them from making offensive comments. The code looks like this:

const ip = req.headers['x-forwarded-for'];
const comment = encodeHTML(req.body.comment); // No XSS here!
const template = 'Anonymous coward from ' + ip + ': ' + comment;

The template when rendered looks like:

Anonymous coward from 8.8.4.4: Hello there!

What if someone made a request like this?

GET / HTTP/1.1
X-Forwarded-For: <script>alert(document.cookie)</script>

The assumed IP address value is not an IP address anymore; it is now a script tag, which opens up your website to XSS attacks.

Anonymous coward from <script>alert(document.cookie)</script>: Hello there!

It would be even more dangerous if the "IP address" was printed in the admin panel.

X-Forwarded-For is not the only header that can lead to XSS. Any header whose unsanitized value is printed on the website can lead to XSS.
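Here is a minimal sketch of validating and then escaping a header value before it is printed. It uses Node's built-in `net.isIP` for validation; the `escapeHtml`, `displayIp` and `renderComment` helpers are written for this example and are not part of any library.

```javascript
const net = require('net');

// Escape the characters that are significant in HTML text and attributes.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Accept the value only if it parses as an IPv4 or IPv6 address.
function displayIp(headerValue) {
  const candidate = String(headerValue || '').trim();
  return net.isIP(candidate) !== 0 ? candidate : 'unknown address';
}

// Validate first, then escape everything that goes into the page.
function renderComment(headerValue, comment) {
  return 'Anonymous coward from ' + escapeHtml(displayIp(headerValue)) +
    ': ' + escapeHtml(comment);
}
```

Validation rejects anything that is not shaped like an IP address, and escaping is the safety net in case a validation rule is ever loosened.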

Let's say you have this code, which generates the Open Graph tag for your website.

const template = '<meta property="og:image" content="https://' + host + '/images/website.jpg" />';

It looks safe enough. However, if there is a bug in how the value of host is derived, the host value can be set to an XSS payload:

GET / HTTP/1.1
Host: "><script>alert(document.cookie)</script>

and the rendered value will become:

<meta property="og:image" content="https://"><script>alert(document.cookie)</script>/images/website.jpg" />

"How could host ever be set to something that's not the hostname of the website?"

The value of host is first determined by the reverse proxy, then passed on to the underlying web framework, and then to any application logic before the value is set. Any bug in those three can result in host being set wrong.

How does your proxy, the web framework you are using, and your application logic handle a request like this?

GET / HTTP/1.1
Host: evil.com

This one?

GET / HTTP/1.1
Host: evil.com
Host: evil.com

This one?

GET / HTTP/1.1
X-Forwarded-Host: evil.com

How about this one?

GET / HTTP/1.1
X-Forwarded-Host: evil.com
X-Forwarded-Host: evil.com

And this one?

GET / HTTP/1.1
X-Host: evil.com

This one?

GET / HTTP/1.1
X-Host: example.com
X-Forwarded-Server: evil.com

Sneaking in a custom hostname using Host and other functional variants of the header, like X-Forwarded-Host, X-Forwarded-Server, and X-Host, is called a Host header attack. A Host header vulnerability can lead to all sorts of bad things.

If your reverse proxy or the web framework you are using or your application has a misconfiguration or a bug somewhere, the value of host can be set to what a hacker wants.
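One way to make the application itself less dependent on everything upstream is to check the host against a fixed allowlist and to never consult the X-Forwarded-Host style variants at all. This is only a sketch: the allowlist, the default and the `resolveHost` name are assumptions for the example, and a host that carries a port simply falls back to the default here.

```javascript
// Hypothetical allowlist; in a real app this would come from config.
const ALLOWED_HOSTS = new Set(['example.com', 'www.example.com']);
const DEFAULT_HOST = 'example.com';

// hostValues: every Host header value received for the request.
function resolveHost(hostValues) {
  // Zero or several Host headers: do not guess which one is right.
  if (!Array.isArray(hostValues) || hostValues.length !== 1) {
    return DEFAULT_HOST;
  }
  const host = String(hostValues[0]).trim().toLowerCase();
  return ALLOWED_HOSTS.has(host) ? host : DEFAULT_HOST;
}
```

Rejecting the request outright instead of falling back to a default is an equally reasonable choice; what matters is that an unknown value never reaches a template.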

"Alright, if that's what they want, let the hackers XSS themselves to their heart's content. It does not affect other users, so it's a non-issue."

Self-XSS is a sign that input is not being sanitized properly in the codebase, so SQL injection and XSS might be possible in other areas of the website; also it is often an indication of the possibility of web cache poisoning, a vulnerability with an even larger scope 😈

4. Web cache poisoning

A web cache is a system which maintains copies of webpages of a website and serves them to clients at a much higher speed than what the web application can do by itself. This greatly improves the speed of the website and spares the computational resources on the web server for application logic, resulting in an overall great performance for the website.

Websites with huge traffic should have, and mostly do have, caching systems or content delivery networks (CDN) serving many of their webpages. Even smaller websites deploy caching servers to significantly speed up their serving capabilities.

When a webpage is requested, the caching server checks for the page in its cache. If it is found, it serves the page to the client; if not, it forwards the request to the web application, which generates the page and passes it back through the cache. Depending on the caching criteria set by the application server, the caching server may or may not keep a copy of this generated page.

Now, imagine a hacker discovered a self-XSS on your homepage.

What if the homepage happens to be cached?

It is not self-XSS anymore. It is the XSS of the worst kind - the poisoned (with the XSS payload) webpage will be served by the cache server to everyone who visits the homepage.

Going back to the previous example; this code:

<meta property="og:image" content="https://"><script>alert(document.cookie)</script>/images/website.jpg" />

will be served to all your visitors.
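The whole sequence can be simulated in a few lines. This is a toy model, not a real cache: `renderHomepage` stands in for the vulnerable application, and the `Map` stands in for a caching server that keys pages by path only.

```javascript
// Toy origin server: reflects the Host header into the page (the bug).
function renderHomepage(headers) {
  return '<meta property="og:image" content="https://' + headers.host +
    '/images/website.jpg" />';
}

// Toy cache keyed by path only, so the Host header is not part of the key.
const cache = new Map();

function serve(path, headers) {
  if (cache.has(path)) return cache.get(path); // cache hit
  const page = renderHomepage(headers);        // cache miss: ask the origin
  cache.set(path, page);
  return page;
}

// 1. The attacker's request arrives while the page is not cached.
serve('/', { host: '"><script>alert(document.cookie)</script>' });

// 2. A normal visitor asks for the same path and gets the poisoned copy.
const victimPage = serve('/', { host: 'example.com' });
console.log(victimPage.includes('<script>')); // true
```

The visitor sent a perfectly normal Host header and still received the attacker's payload, because the cache never looked at the header that shaped the page.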

5. Password reset token hijack

A password reset link looks something like this:

https://example.com/reset-pasword?token=c672b8d1ef56ed28ab87c3622c5114069bdd3ad7b8f9737498d0c01ecef0967a

It may be generated by clicking on a "Forgot password" link and submitting the user's email. You can do that for any email address that exists in the website's user database.

What would happen if the password reset endpoint was vulnerable to a Host header attack?

POST https://example.com/reset-password HTTP/1.1
Accept: */*
Content-Type: application/json
Host: evil.com

{"email": "victim@example.com"}

The website will send the password reset email as usual to the victim from the official email address, with all the official branding and formatting. It is a legit email from the company, except the link will be:

https://evil.com/reset-pasword?token=c672b8d1ef56ed28ab87c3622c5114069bdd3ad7b8f9737498d0c01ecef0967a

Now all the hacker has to do is listen for requests to /reset-pasword on his website and capture the password reset token. Then he can load https://example.com/reset-pasword?token=c672b8d1ef56ed28ab87c3622c5114069bdd3ad7b8f9737498d0c01ecef0967a to create a new password for the victim's account, effectively taking over the account.
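The difference between a vulnerable and a safer link builder can be sketched like this. The `config.publicOrigin` setting and both function names are hypothetical; the path is the one used in the example links above.

```javascript
// Hypothetical app config: the public origin is fixed at deploy time.
const config = { publicOrigin: 'https://example.com' };

// Vulnerable: the host in the emailed link comes from the request.
function resetLinkFromHeader(headers, token) {
  return 'https://' + headers.host + '/reset-pasword?token=' + token;
}

// Safer: the host in the emailed link comes from configuration only.
function resetLinkFromConfig(token) {
  const url = new URL('/reset-pasword', config.publicOrigin);
  url.searchParams.set('token', token);
  return url.toString();
}
```

With the second version, no request header has any influence on where the token is sent.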

6. Sign-up token hijack

Similar to password reset token hijack is the sign-up token hijack.

A sign-up verification link may look something like this:

https://example.com/register/complete?token=c672b8d1ef56ed28ab87c3622c5114069bdd3ad7b8f9737498d0c01ecef0967a

If the sign-up endpoint was vulnerable to a Host header attack, the link would become:

https://evil.com/register/complete?token=c672b8d1ef56ed28ab87c3622c5114069bdd3ad7b8f9737498d0c01ecef0967a

The victim clicks on the "Click here to complete your registration" link and the registration token is handed over to the hacker.

7. CRLF injection

So far the headers were all HTTP request headers. What about HTTP response headers? Are they safe?

They aren't safe either.

"But aren't response headers created and sent by the server?"

Response headers are sent by the server, but if any value, in part or whole, submitted by users is contained in the value of a response header, the application could be vulnerable to CRLF attacks if the values are not sanitized.

In the HTTP protocol, the CRLF (\r\n, encoded as %0D%0A in URLs) combination is used to mark the end of a response header, and two CRLFs mark the beginning of the response body. A header value containing CRLF characters can be used to manipulate the response or even terminate it and create a new malicious one. This is known as a CRLF injection attack.

Let's say a website stores the user's search term in a cookie.

http://example.com/search?t=cars

HTTP/1.1 200 OK
...
Set-Cookie: search=cars
...

Cookies are set via the Set-Cookie HTTP response header; they are sent back to the backend via the Cookie HTTP request header.

What if a hacker crafts this query and sends it to a user? http://example.com/search?t=x%0D%0AContent-Length%3A%200%0D%0A%0D%0AHTTP%2F1.1%20200%20OK%0D%0AContent-Type%3A%20text%2Fhtml%0D%0AContent-Length%3A%2039%0D%0A%0D%0A%3Cscript%3Ealert(document.cookie)%3C%2Fscript%3E?

This is how the response will play out:

HTTP/1.1 200 OK
...
Set-Cookie: search=x
Content-Length: 0

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 39

<script>alert(document.cookie)</script>

An XSS vulnerability achieved by HTTP response splitting!

Obvious XSS like this example is not the only way this vulnerability can be exploited. One can set the Location header to redirect users to malicious URLs, or set cookies to manipulate the behavior of the app. The possibilities are vast and dangerous.
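A small sketch of the problem and two possible defences, using a shortened version of the payload above. The function names are made up for this example, and the header lines are built as plain strings purely for illustration; they are not how any particular framework writes its headers.

```javascript
// Vulnerable: the decoded query value is copied straight into the header.
function unsafeSetCookie(term) {
  return 'Set-Cookie: search=' + term;
}

// Safer for cookies: percent-encode the value so CR and LF become %0D%0A.
function safeSetCookie(term) {
  return 'Set-Cookie: search=' + encodeURIComponent(String(term));
}

// For header values that cannot be encoded, refuse control characters.
// This is deliberately strict and rejects tabs as well.
function assertNoControlChars(value) {
  if (/[\u0000-\u001f\u007f]/.test(String(value))) {
    throw new Error('control character in header value');
  }
  return String(value);
}

const term = decodeURIComponent('x%0D%0AContent-Length%3A%200');
console.log(unsafeSetCookie(term).split('\r\n').length); // 2
console.log(safeSetCookie(term).split('\r\n').length);   // 1
```

The unsafe version turns one header into two lines, which is exactly the foothold response splitting needs; the encoded version keeps everything on a single line.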

Mitigation

HTTP headers are user-controlled inputs. They pose the same risks as the inputs that come via user-submitted forms, and should be treated the same way.

Here are some things you can do to prevent and mitigate header related security issues:

  • Keep an active watch for any vulnerabilities discovered in all the third-party components of your system (reverse proxies, CDNs, web frameworks, libraries)
  • Update and patch affected components as soon as possible
  • Run automated and manual tests on your web infrastructure for all known HTTP header vulnerabilities
  • Re-run the tests whenever any setting has been changed, or a component has been updated or replaced
  • Keep updating the test cases with the latest known vulnerabilities
  • Validate and sanitize all request headers that are read, whether they are written somewhere or not
  • Validate and sanitize all user inputs that are written in the response headers
  • If possible, avoid using HTTP header values in application logic and data
  • Do not derive the domain name / hostname from HTTP headers
  • Hardcode your domain name / hostname somewhere in the app config

Summary

HTTP headers are user inputs; treat them like any other user input: validate and sanitize them. DO NOT TRUST HTTP HEADERS!
