The OWASP Encoders package is a collection of high-performance, low-overhead contextual encoders that, when used correctly, is an effective tool for preventing web application security vulnerabilities such as cross-site scripting (XSS).

// Copyright (c) 2012 Jeff Ichnowski
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions
// are met:
//
//     * Redistributions of source code must retain the above
//       copyright notice, this list of conditions and the following
//       disclaimer.
//
//     * Redistributions in binary form must reproduce the above
//       copyright notice, this list of conditions and the following
//       disclaimer in the documentation and/or other materials
//       provided with the distribution.
//
//     * Neither the name of the OWASP nor the names of its
//       contributors may be used to endorse or promote products
//       derived from this software without specific prior written
//       permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
// FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
// COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
// INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
// (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
// HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
// STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
// OF THE POSSIBILITY OF SUCH DAMAGE.
package org.owasp.encoder;

import java.nio.CharBuffer;
import java.nio.charset.CoderResult;

/**
 * JavaScriptEncoder -- An encoder for JavaScript string contexts.
 *
 * @author jeffi
 */
class JavaScriptEncoder extends Encoder {

    /**
     * Mode of operation constants for the JavaScriptEncoder.
     */
    enum Mode {
        /**
         * Standard encoding of JavaScript strings. For each character, the
         * shortest possible escape sequence is chosen.
         */
        SOURCE,
        /**
         * Encoding for use in HTML attributes. Quote characters are escaped
         * using hex escapes instead of backslashes, because the alternative
         * would be a longer sequence of escapes. In this mode double-quote is
         * "\x22" and single-quote is "\x27". (In HTML attributes the
         * alternative would be to encode "\"" and "\'" with entity escapes as
         * "\&quot;" and "\&#39;".)
         */
        ATTRIBUTE,
        /**
         * Encoding for use in HTML script blocks. The main concern here is
         * prematurely terminating a script block with a closing "</" inside
         * the string. This encoding escapes "/" as "\/" to prevent such
         * termination.
         */
        BLOCK,
        /**
         * Encodes for use in either HTML script attributes or blocks.
         * Essentially this combines the special escapes of both ATTRIBUTE
         * and BLOCK.
         */
        HTML;
    }

    /**
     * The mode of operation, used by the toString implementation.
     */
    private final Mode _mode;
    /**
     * True if quotation characters should be hex encoded. Hex encoding quotes
     * allows JavaScript to be included in XML attributes without additional
     * XML-based encoding.
     */
    private final boolean _hexEncodeQuotes;
    /**
     * An array of four 32-bit integers used as bitmasks to check whether a
     * character needs encoding. Together the masks cover code points 0 to
     * 127, 32 per array element. If a character's bit is set, the character
     * is valid and does not need encoding.
     */
    private final int[] _validMasks;
    /**
     * True if the output should only include ASCII characters. Valid non-ASCII
     * characters that would normally be left unencoded are encoded.
     */
    private final boolean _asciiOnly;

    /**
     * Constructs a new JavaScriptEncoder for the specified contextual mode.
     *
     * @param mode the mode of operation
     * @param asciiOnly true if only ASCII characters should be included in the
     * output (all code-points outside the ASCII range will be encoded).
     */
    JavaScriptEncoder(Mode mode, boolean asciiOnly) {
        // TODO: after some testing it appears that an array of int masks
        // is faster than two longs, an array of longs, or an array of bytes.
        // The other mask-based encoders should be switched to ints.
        // (To be clear, it's much faster on 32-bit VMs, and just slightly
        // faster on 64-bit VMs.)
        _mode = mode;

        // Note: this probably needs to be repeated everywhere this trick is
        // used, but here seems like as good a place as any.  According to
        // the Java spec, (x << y) where x and y are ints is evaluated
        // as (x << (y & 31)).  Put another way, only the lower 5 bits
        // of the shift amount are considered.  For example, '\'' is 39 and
        // 39 & 31 == 7, so (1 << '\'') sets bit 7.
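        _validMasks = new int[]{
            0,                                    // 0..31: no character in this range is valid
            -1 & ~((1 << '\'') | (1 << '\"')),    // 32..63: all valid except ' and "
            -1 & ~((1 << '\\')),                  // 64..95: all valid except backslash
            asciiOnly ? ~(1 << Unicode.DEL) : -1  // 96..127: DEL is invalid in ASCII-only mode
        };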

        if (mode == Mode.BLOCK || mode == Mode.HTML) {
            // in " is escaped as "<\/script>" and "