6. Encoding

Encoding is an essential aspect of building widgets and web development in general for several reasons, primarily related to security and data integrity. If you are not careful, your widget can pass along incorrect values to a JavaScript library, resulting in a broken layout or even making it vulnerable to the injection of malicious scripts.

The print statement handles the encoding for you once you tell it which encoding to apply; however, you still need to choose the right one for the context. Common scenarios requiring encoding include displaying user input from a text field or configuring a JavaScript instance with the property values of the widget.

WEMscript offers three methods of encoding: HTML encoding, HTML attribute encoding, and JavaScript encoding. You select one by passing html, attr, or js, respectively, as the first argument of the print statement. There is also a fourth option: printing without any encoding, often referred to as printing raw.

<p title="<? print attr "HTML attribute encoded text" ?>">
	<? print html "HTML encoded text" ?>
</p>
<? startupscript ?>
	const text = <? print js "JavaScript encoded text" ?>;
<? end ?>
<?	
	/* Printing raw can be very dangerous. */
	print "This can be dangerous"
?>

Shorthand Notations

Since you will frequently use the print statement, WEMscript provides a few shorthand notations to simplify your work. We have already used some of these, and you should be familiar with them:

Statement             Shorthand
print "value"         <?raw "value" ?>
print html "value"    <?= "value" ?>
print js "value"      <?js "value" ?>
print attr "value"    <?attr "value" ?>

The WEMscript above can be rewritten more concisely as follows:

<p title="<?attr "HTML attribute encoded text" ?>">
	<?= "HTML encoded text" ?>
</p>
<? startupscript ?>
	const text = <?js "JavaScript encoded text" ?>;
<? end ?>
<?/* Printing raw can be very dangerous. */?>
<?raw "This can be dangerous" ?>

From now on, we will use shorthand notation whenever possible.

Raw

The simplest yet most dangerous way to print output is to do so without any encoding. This method should be handled with extreme caution. Here is an example of how not to use raw printing:

Your name is: <?raw @Name ?>

The above example is very risky. Using raw printing can lead to errors if you are not careful, not to mention the security risks! For instance, if the property @Name is set to the value: Bob<script> runMaliciousScript(); </script>, the output would be vulnerable to XSS.

<!-- Whoops! -->
Your name is: Bob<script> runMaliciousScript(); </script>

We can mitigate this risk by using HTML encoding.

HTML

HTML reserves certain characters for markup (e.g., < and > delimit tags, and & starts an entity). If these characters appear in user input without encoding, they can disrupt the structure of the HTML document. We need to convert them into their corresponding HTML entities (e.g., < becomes &lt;, > becomes &gt;), so that the browser displays them as text instead of interpreting them as markup. We can make the previous code safer by changing <?raw to <?=, as shown below:

Your name is: <?= @Name ?>

If a hacker attempts to insert <script> blocks, they will be printed as &lt;script&gt; and will not be parsed as script blocks but rather as the literal text "<script>".

<!-- Although a hacker tried to insert malicious code, it has been transformed into HTML entities that will be parsed as literal text by the browser. -->
Your name is: Bob&lt;script&gt; runMaliciousScript(); &lt;/script&gt;
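
To make the transformation concrete, here is a minimal JavaScript sketch of what HTML encoding does to a value. The function name escapeHtml and the set of replaced characters are assumptions for illustration; this is not WEMscript's implementation, and the exact entities WEMscript emits may differ.

```javascript
// Hypothetical helper for illustration only; not part of WEMscript.
// Replaces the characters that have a special meaning in HTML text.
function escapeHtml(value) {
	return String(value)
		.replace(/&/g, "&amp;") // must run first so later entities are not encoded twice
		.replace(/</g, "&lt;")
		.replace(/>/g, "&gt;")
		.replace(/"/g, "&quot;");
}

const name = "Bob<script> runMaliciousScript(); </script>";

// Prints: Your name is: Bob&lt;script&gt; runMaliciousScript(); &lt;/script&gt;
console.log("Your name is: " + escapeHtml(name));
```

The ampersand is replaced first on purpose: doing it last would turn the & of an already produced &lt; into &amp;lt;.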

HTML Attribute

Values printed inside HTML attributes need the same care as HTML content and should always be encoded using either print attr "value" or <?attr "value" ?>. In the following example, we encode the title attribute:

<? var @title := "The title of this paragraph" ?>
<p title="<?attr @title ?>">I am a paragraph.</p>
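
Attribute values are a separate case because an unencoded quote character ends the attribute early. The sketch below builds the markup by hand in JavaScript to show the difference. The helper escapeAttr is hypothetical and is not WEMscript's implementation; the characters it replaces are an assumption.

```javascript
// Hypothetical helper for illustration only; not part of WEMscript.
function escapeAttr(value) {
	return String(value)
		.replace(/&/g, "&amp;")
		.replace(/"/g, "&quot;")
		.replace(/'/g, "&#39;")
		.replace(/</g, "&lt;")
		.replace(/>/g, "&gt;");
}

// A value that tries to close the title attribute and add an event handler.
const title = 'x" onmouseover="alert(1)';

const unsafe = '<p title="' + title + '">I am a paragraph.</p>';
const safe = '<p title="' + escapeAttr(title) + '">I am a paragraph.</p>';

console.log(unsafe); // the injected onmouseover becomes a real attribute
console.log(safe);   // the whole value stays inside the title attribute
```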

RichText

Soon

JavaScript

Encoding in JavaScript is somewhat unique because you also have to consider the language settings of the portal in which the application is running. For instance, the number 3.14159 is written as 3,14159 in Dutch (note the comma). If we print a boolean value without encoding, we will receive the values yes and no in English, while in Dutch, we get ja and nee, respectively. All of these are invalid boolean values in JavaScript. Therefore, it is crucial to encode values correctly in JavaScript. Here is a bad example that demonstrates the consequences of not encoding JavaScript values.

<? startupscript ?>
	/* Printing raw inside JavaScript is inadvisable. */
	const a = <?raw "text value" ?>;
	const b = <?raw 3.14159 ?>;
	const c = <?raw true ?>;
<? end ?>

This will translate to the following JavaScript code in the runtime if we run it with Dutch as the portal language setting.

	// Syntax error.
	const a = text value; 

	// Syntax error: in a const declaration the comma separates declarators, and 14159 is not a valid name.
	const b = 3,14159; 

	// Incorrectly assigning the variable `c` to the value of (possibly undefined) variable `ja`.
	const c = ja; 

As you can see, many issues arise in this code. Note the comma in the line that assigns the variable b. In a const declaration the comma separates declarators, so this line fails to parse. In other positions the same output can go unnoticed: in a plain assignment such as b = 3,14159; the rarely used comma operator applies, and b silently becomes 3.
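
The comma operator can be checked in isolation with plain JavaScript. The sketch below is not WEMscript; it evaluates a plain assignment and a function call that each received a comma-formatted number, two positions where the mistake raises no error at all. The function name received is made up for this example.

```javascript
// A plain assignment: the comma operator evaluates `b = 3` first,
// then evaluates 14159 and discards the result.
let b;
b = 3,14159;
console.log(b); // 3

// A function call: the comma separates arguments, so one number becomes two.
function received(...args) {
	return args;
}
console.log(received(3,14159).length); // 2
```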

Now, let’s consider what happens if we change the portal language to English.

	// Same syntax error.
	const a = text value; 

	// Edge case where the variable is correctly set.
	const b = 3.14159; 
	
	// Incorrectly assigning the variable `c` to the value of (possibly undefined) variable `yes`.
	const c = yes; 

Once again, we encounter multiple errors. However, this time the number variable b is set correctly by accident. Let’s fix this by properly encoding the JavaScript values:

<? startupscript ?>
	const a = <?js "text value" ?>;
	const b = <?js 3.14159 ?>;
	const c = <?js true ?>;
<? end ?>

This translates to:

	// Value correctly printed as a string.
	const a = "text value";

	// Value correctly printed as a number.
	const b = 3.14159;

	// Value correctly printed as a boolean.
	const c = true;

The values are now set correctly in JavaScript, regardless of the portal language setting. Note that you do not need to add quotes around text values yourself; the JavaScript encoding prints them for you.
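
The idea behind JavaScript encoding is to emit a literal whose form does not depend on the display language. As a rough analogy in plain JavaScript, JSON.stringify produces such literals for text, numbers, and booleans. This is an assumption for illustration, not a description of how WEMscript implements <?js, and the helper name toJsLiteral is hypothetical.

```javascript
// Hypothetical helper for illustration only; not part of WEMscript.
// JSON.stringify yields valid JavaScript literals for strings, numbers and booleans.
// Escaping "</" keeps a text value from ending the surrounding <script> block early.
function toJsLiteral(value) {
	return JSON.stringify(value).replace(/<\//g, "<\\/");
}

console.log("const a = " + toJsLiteral("text value") + ";"); // const a = "text value";
console.log("const b = " + toJsLiteral(3.14159) + ";");      // const b = 3.14159;
console.log("const c = " + toJsLiteral(true) + ";");         // const c = true;
```

The quotes around the text value come from the encoder, which is why the template itself contains none.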

CSS (Hack)

You might expect a similar method for CSS, such as <?css ?>, but unfortunately WEMscript does not provide one, because encoding values for CSS is far from trivial. Although it may seem hacky, there are cases where you can use <?js and <?= instead, but you need to be cautious. Additionally, since CSS is heavily unit-based (e.g., 20px, 4vw), you often need to add the units manually after the print statement.

Let’s look at the following example, assuming that @Width and @Height are both widget number properties:

<style>
	#example1 {
		width: <?js @Width ?>px;
		height: <?js @Height ?>px;
	}
</style>

Let’s set the @Width and @Height properties to 320 and 200, respectively. This will translate to:

<style>
	#example1 {
		width: 320px;
		height: 200px;
	}
</style>

This works. But what if the properties hold the value unknownnumber? In that case, both values would be printed as nullpx, which is an invalid CSS property value. Depending on the widget, you may need to handle this case explicitly.
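
One way to handle the unknown case is to leave the declaration out when there is no usable number. The sketch below shows the idea in plain JavaScript rather than WEMscript. The helper name pxDeclaration and the decision to emit nothing are assumptions; a widget may prefer a default value instead.

```javascript
// Hypothetical helper for illustration only; not part of WEMscript.
// Returns a CSS declaration, or an empty string when the value is not a finite number.
function pxDeclaration(property, value) {
	if (typeof value !== "number" || !Number.isFinite(value)) {
		return "";
	}
	return property + ": " + value + "px;";
}

console.log(pxDeclaration("width", 320));   // width: 320px;
console.log(pxDeclaration("height", null)); // prints an empty line: the declaration is skipped
```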

What if you want to set a more flexible value for a CSS property? For instance, a margin value like auto or 8px 16px 8px? We can use a text property for this case, but make sure to HTML encode it. Let’s consider the following example, assuming we have a @Margin widget text property:

<style>
	#example2 {
		margin: <?js @Margin ?>;
	}

	#example3 {
		margin: <?= @Margin ?>;
	}

	#example4 {
		margin: <?raw @Margin ?>;
	}
</style>

If we have the text value 16px 8px for our margin, it will translate to:

<style>
	#example2 {
		/* Invalid property value */
		margin: "16px 8px";
	}

	#example3 {
		/* Works */
		margin: 16px 8px;
	}

	#example4 {
		/* Works, but... */
		margin: 16px 8px; 
	}
</style>

Now, what if we think like a hacker and set the text value to auto;}</style><script>console.log('Hacked!')</script> as our margin?

<style>
	#example2 {
		/* Invalid property value */
		margin: "auto;}</style><script>console.log('Hacked!')</" + "script>";
	}

	#example3 {
		/* This will result in a CSS syntax error */
		margin: auto;}&lt;/style&gt;&lt;script&gt;console.log(&quot;Hacked!&quot;)&lt;/script&gt; 
	}

	#example4 {
		/* This is an XSS attack! */
		margin: auto;}</style><script>console.log('Hacked!')</script>;
	}
</style>

As you can see, once again, the <?raw method is very dangerous and must be used with caution. In this case, using <?= is the better option. While it can result in CSS syntax errors if an incorrect value is set, it is far preferable to leaving the widget vulnerable to XSS attacks!
