6. Encoding
Encoding is an essential part of building widgets, and of web development in general, mainly for reasons of security and data integrity. If you are not careful, your widget can pass incorrect values to a JavaScript library, resulting in a broken layout, or it can become vulnerable to the injection of malicious scripts.
Encoding is automatically handled for you with the print statement; however, you still need to exercise caution. Common scenarios requiring encoding include displaying user input from a text field or configuring a JavaScript instance with the property values of the widget.
WEMscript offers three methods of encoding: HTML encoding, HTML attribute encoding, and JavaScript encoding. You select one by passing html, attr, or js, respectively, as the first argument of the print statement. There is also a fourth method: printing without encoding, often referred to as printing raw.
Shorthand Notations
Since you will frequently use the print statement, WEMscript provides a few shorthand notations to simplify your work. We have already used some of these, and you should be familiar with them:
Print statement        Shorthand
print "value"          <?raw "value" ?>
print html "value"     <?= "value" ?>
print js "value"       <?js "value" ?>
print attr "value"     <?attr "value" ?>
The WEMscript above can be rewritten more concisely as follows:
From now on, we will use shorthand notation whenever possible.
Raw
The simplest yet most dangerous way to print output is to do so without any encoding. This method should be handled with extreme caution. Here is an example of how not to use raw printing:
The above example is very risky. Raw printing can lead to errors if you are not careful, not to mention the security risks! For instance, if the property @Name is set to the value Bob<script> runMaliciousScript(); </script>, the script block is emitted verbatim and the output is vulnerable to XSS (cross-site scripting).
We can mitigate this risk by using HTML encoding.
HTML
HTML has special characters that are reserved for specific meanings (e.g., < opens a tag and > closes it). If these characters are included in user input without encoding, they can disrupt the structure of the HTML document. We need to convert these characters into their corresponding HTML entities (e.g., < becomes &lt; and > becomes &gt;), preserving the intended content and ensuring it is displayed correctly. We can make the previous code safer by changing <?raw to <?=, as shown below:
If a hacker attempts to insert <script> blocks, they will be printed as &lt;script&gt; and will not be parsed as script blocks but rendered as the literal text "<script>".
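The difference between raw and HTML-encoded output can be sketched outside WEMscript as well. The snippet below is a minimal Python illustration, not WEMscript: it assumes that Python's html.escape behaves roughly like the HTML encoder behind <?=, and the function names and the surrounding <p> markup are hypothetical.

```python
import html


def render_greeting_raw(name: str) -> str:
    # Raw printing: the value is copied into the markup unchanged.
    return "<p>Hello, " + name + "!</p>"


def render_greeting_encoded(name: str) -> str:
    # HTML encoding: &, <, > and quotes are replaced by entities,
    # so the browser shows the text instead of parsing it as markup.
    return "<p>Hello, " + html.escape(name) + "!</p>"


malicious = "Bob<script> runMaliciousScript(); </script>"

print(render_greeting_raw(malicious))
# <p>Hello, Bob<script> runMaliciousScript(); </script>!</p>

print(render_greeting_encoded(malicious))
# <p>Hello, Bob&lt;script&gt; runMaliciousScript(); &lt;/script&gt;!</p>
```

Only the second result is safe to send to a browser: the script block arrives as text and is never executed.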
HTML Attribute
HTML attribute values are encoded in much the same way as HTML content and should always be encoded using either print attr "value" or <?attr "value" ?>. In the following example, we encode the title attribute:
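To see why attribute values need their own care, here is a minimal Python sketch (not WEMscript) of a value that tries to break out of a quoted title attribute. It assumes that escaping quotes, as html.escape does with quote=True, approximates what attr encoding does; the exact entities WEMscript emits may differ, and the function name and <span> markup are hypothetical.

```python
import html


def render_title_attr(title: str) -> str:
    # quote=True also escapes " and ', so the value cannot
    # terminate the surrounding title="..." attribute.
    return '<span title="' + html.escape(title, quote=True) + '">hover me</span>'


# A value that tries to close the attribute and add an event handler.
payload = '" onmouseover="alert(1)'

print(render_title_attr(payload))
# <span title="&quot; onmouseover=&quot;alert(1)">hover me</span>
```

The encoded result still contains exactly one attribute; the injected onmouseover text stays inside the title value.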
RichText
Soon
JavaScript
Encoding in JavaScript is somewhat unique because you also have to consider the language settings of the portal in which the application is running. For instance, the number 3.14159 is written as 3,14159 in Dutch (note the comma). If we print a boolean value without encoding, we will receive the values yes and no in English, while in Dutch, we get ja and nee, respectively. All of these are invalid boolean values in JavaScript. Therefore, it is crucial to encode values correctly in JavaScript. Here is a bad example that demonstrates the consequences of not encoding JavaScript values.
This will translate to the following JavaScript code in the runtime if we run it with Dutch as the portal language setting.
As you can see, many issues arise in this code, and some errors may even go unnoticed! Note the comma operator (,) when trying to assign the variable b. This is not a syntax error; it is a valid operator that is rarely used in JavaScript.
Now, let’s consider what happens if we change the portal language to English.
Once again, we encounter multiple errors. However, this time the number variable b is set correctly by accident. Let’s fix this by properly encoding the JavaScript values:
This translates to:
The values are now correctly set, regardless of the portal language settings in the context of JavaScript. Note that you do not need to include quotes for text values.
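The contrast between display formatting and JavaScript encoding can be sketched in Python (not WEMscript). This is an illustration under assumptions: format_for_dutch_portal is a hypothetical stand-in for the portal's locale-aware output, and json.dumps stands in for the js encoder, whose exact output may differ.

```python
import json


def format_for_dutch_portal(value) -> str:
    # Hypothetical stand-in for locale-aware display formatting.
    if isinstance(value, bool):
        return "ja" if value else "nee"
    if isinstance(value, float):
        return str(value).replace(".", ",")
    return str(value)


def js_encode(value) -> str:
    # JSON literals for booleans, numbers and strings are also valid
    # JavaScript literals, independent of any language setting.
    return json.dumps(value)


values = {"a": True, "b": 3.14159, "c": 'Hello "World"'}
for name, value in values.items():
    print(f"var {name} = {format_for_dutch_portal(value)};  // display-formatted")
    print(f"var {name} = {js_encode(value)};  // JavaScript-encoded")
```

Note that the encoder adds the quotes around the text value itself, which is why you do not write them in the template. One caveat of this stand-in: json.dumps does not escape the < character, so it is not by itself sufficient for values placed inside an inline script block.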
CSS (Hack)
You might expect that there is a similar method for CSS, such as <?css ?>, but unfortunately, WEMscript does not provide one, because encoding CSS values correctly is non-trivial. Although it may seem hacky, there are cases where you can use <?js and <?= instead, but you need to be cautious. Additionally, since CSS is heavily unit-based (e.g., 20px, 4vw), you usually need to add the units manually in your print statements.
Let’s look at the following example, assuming that @Width and @Height are both widget number properties:
Let’s set the @Width and @Height properties to 320 and 200, respectively. This will translate to:
This works. But what if the properties are of type unknownnumber? In that case, both values would be nullpx, resulting in an invalid CSS property value. This may need to be fixed, depending on the widget.
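One way to avoid emitting nullpx is to check for an unknown value before appending the unit. The following Python sketch (not WEMscript) shows the idea; the helper name css_length and the fallback value auto are assumptions chosen for illustration, and the right fallback depends on the widget.

```python
from typing import Optional


def css_length(value: Optional[float], unit: str = "px", fallback: str = "auto") -> str:
    # An unknown number (None here) must not be printed with a unit,
    # because "nullpx" is not a valid CSS value.
    if value is None:
        return fallback
    # Print whole numbers without a trailing ".0".
    number = int(value) if float(value).is_integer() else value
    return f"{number}{unit}"


print(css_length(320))    # 320px
print(css_length(12.5))   # 12.5px
print(css_length(None))   # auto
```

The same check in a widget would wrap the print statement in a condition on the property, so that the CSS declaration is either valid or omitted.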
What if you want to set a more flexible value for a CSS property? For instance, a margin value like auto or 8px 16px 8px? We can use a text property for this case, but make sure to HTML encode it. Let’s consider the following example, assuming we have a @Margin widget text property:
If we have the text value 16px 8px for our margin, it will translate to:
Now, what if we think like a hacker and set the text value to auto;}</style><script>console.log('Hacked!')</script> as our margin?
As you can see, once again, the <?raw method is very dangerous and must be used with caution. In this case, using <?= is the better option. While it can result in CSS syntax errors if an incorrect value is set, it is far preferable to leaving the widget vulnerable to XSS attacks!
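The effect of both choices on the hostile margin value can be reproduced with a short Python sketch (not WEMscript). It assumes html.escape approximates the encoder behind <?=; the function name, the encode flag, and the .box selector are hypothetical.

```python
import html


def render_margin_style(margin: str, encode: bool = True) -> str:
    # encode=False mimics raw printing; encode=True mimics HTML encoding.
    value = html.escape(margin) if encode else margin
    return "<style>.box { margin: " + value + "; }</style>"


payload = "auto;}</style><script>console.log('Hacked!')</script>"

print(render_margin_style("16px 8px"))
# <style>.box { margin: 16px 8px; }</style>

print(render_margin_style(payload, encode=False))
# The payload closes the style block early and opens a script block.

print(render_margin_style(payload, encode=True))
# The payload stays inside the style block as invalid, but harmless, CSS.
```

The encoded variant produces a declaration the browser ignores as a syntax error, which is the trade-off described above: a broken style rule instead of an executed script.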