blog.kdgregory.com: jsp

Showing posts with label jsp. Show all posts

Monday, October 8, 2012

Scriptlets Are Not (Inherently) Evil

JSP scriptlets have a bad reputation; everyone can point to their favorite example of scriptlet abuse. In my case, it was a 12,000 line (yes, three zeros) monstrosity that held a “mere” 1,000 lines or so of markup. The rest was a big if-else construct with plenty of embedded logic. But that example, and ones like it, are examples of bad programming, not indictments of scriptlets as a programming medium. They're no different than Java classes that have a single method with thousands of lines.

Monstrosities aside, I don't think there's a valid argument against scriptlets. Actually, when I Googled for “why are scriptlets bad,” most of the top-ranked pages defended them. As far as I can tell, the main arguments are that scriptlets encourage bad programmers to put business logic in the page, that they make the page untestable, that they limit reuse, and that web developers won't understand pages with embedded Java code. All of which seem to me like red, rotted herrings.

Don't get me wrong, I believe in the separation of concerns that underlies the MVC model. Actually, I believe in the separation of concerns found in what I call the “CSV” model: a lightweight Controller that interacts with business logic via a Service layer, and passes any returned data to a View for rendering. But after working with several alternative technologies and languages, I'm convinced that scriptlets are the best way to implement any programmatic constructs involved in view rendering.

And some amount of programmatic rendering resides in almost every view. One common example is populating a <select> element. On the surface this is an easy task: iterate over a list of values, and emit <option> elements for each. In the real world, it's more complex: the option value will come from a different field than the option text, you probably have a (possibly null) value that should be selected, and maybe you'll decorate different options with different classes or IDs. To handle this, you need a language that, if not Turing-complete, is very close.

I'm going to work through just such an example, comparing scriptlets and JSTL. Both examples use the same Spring controller, which stores two values in the request context: a list of employees in allEmployees, and the desired selection in selectedEmployee (which may be null). If you want to walk through the code, it's available here, and builds with Maven.

First up is the JSTL version. Nothing complex here: a forEach over the list of employees, and some EL to extract the bean values. Perhaps the worst thing about this code is that the HTML it generates looks ugly: line breaks carry through from the source JSP, so the <option> element gets split over three lines. You can always tell a JSP-generated page by the amount of incongruous whitespace it contains (if you're security-conscious, that might be an issue).

<select name="employeeSelector">
<c:forEach var="emp" items="${employees}">
    <option value="${emp.id}"
        <c:if test="${!(empty selectedEmployee) and (selectedEmployee eq emp)}"> selected </c:if>
    >${emp}</option>
</c:forEach>
</select>

Now for the scriptlet version. It's a few lines longer than the JSTL version, partly because I have to retrieve the two variables from the request context (something EL does for me automatically). I also chose to create a temporary variable to hold the “selected” flag; I could have used an inline “if” that matched the JSTL version, or I could have changed the JSTL version to use a temporary variable; I find that I have different programming styles depending on language.

<select name="employeeSelector">
    <%
    List<Employee> allEmployees = (List<Employee>)request.getAttribute(Constants.PARAM_ALL_EMPLOYEES);
    Employee selectedEmployee  = (Employee)request.getAttribute(Constants.PARAM_SELECTED_EMPLOYEE);
    for (Employee employee : allEmployees)
    {
        String selected = ObjectUtils.equals(employee, selectedEmployee) ? "selected" : "";
        %>
        <option value="<%=employee.getId()%>" <%=selected%>> <%=employee%> </option>
        <%
    }
    %>
</select>

Even though the scriptlet version has more lines of code, it seems cleaner to me: the code and markup are clearly delineated. Perhaps there's a web developer who will be confused by a Java “for” statement in the middle of the page, but is the JSTL “forEach” any better? Assuming that your company separates the roles of web developer and Java developer, it seems like a better approach to say “anything with angle brackets is yours, everything else is mine.”

Putting subjective measures aside, I think there are a few objective reasons that the scriptlet is better than JSTL. The first of these is how I retrieve data from the request context: I use constants to identify the objects, the same constants that I use in the controller to store them. While I have no inherent opposition to duck typing, there's a certain comfort level from knowing that my page won't compile if I misspell a parameter. Sometimes those misspellings are obvious, but if you happen to write your JSTL with “selectedEmploye” it will take rigorous testing (either manual or automated) to find the problem.

Another benefit to using scriptlets is that you can transparently call out to Java. Here, for example, I use Jakarta Commons to do a null-safe equality check.

A better example would be formatting: here I rely on Employee.toString() to produce something reasonable (and if you look at the example you may question my choice). But let's say the users want to see a different format, like “Last, First”; or they want different formats in different places. In JSTL, you'll be required to manually extract and format the fields, and that code will be copy/pasted everywhere that you want to display the employee name. Then you'll have to track down all of that copy/paste code when the users inevitably change their mind about what they want to see. Or you could add different formatting functions to the Employee object itself, breaking the separation of model and view. With a scriptlet, you can call a method like EmployeeFormattingUtils.selectionName().

OK, before I get comments, I should note that there are easier ways to implement the JSTL version, using 3rd-party tag libraries. The Spring Form Tags, for example, would reduce my code to a single line, assuming that my controller followed the Spring “command” pattern; But those libraries limit you to common cases, restricting your ability to customize your HTML. And although you can write your own taglibs, I haven't seen many projects that do that (in fact, the only ones I've seen are ones where I did it).

I want to finish with a punchline: everything I've just written is moot, because the web-app world is transitioning from server-side view rendering to client-side, using technologies such as Backbone and Spine. That change will come with its own perils, and history tends to repeat itself. I worry that five years from now we'll be dealing with multi-thousand-line JavaScript monstrosities, mixing business logic with view rendering.

Wednesday, August 17, 2011

Meta Content-Type is a Bad Idea

Following last week's posting about “text files,” I wanted to look at one of the most common ways to deliver text: the web. The HTTP protocol defines a Content-Type header, which specifies how a user agent (read: browser) should interpret the response body. The content type of an HTML document is text/html; breaking from other “text” types, its default character set is ISO-8859-1. However, you can specify the document's encoding as part of the Content-Type, and most websites do.

All well and good, except that an HTML document can specify its own encoding, using the http-equiv meta tag:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="fr" dir="ltr" xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Wikipédia, l'encyclopédie libre</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

Wikipedia does “meta” Content-Type about as well as you can: the page is delivered with a Content-Type header specifying UTF-8, and it's an XHTML document (which implies UTF-8 encoding in the absence of a prologue). The only questionable practice with this page is the location of the <title> tag: it contains UTF-8 content, but appears before the in-document Content-Type. But in this case the in-document content type specification is superfluous.

Not all non-English pages do as well. The Montreal Craigslist page, for example, specifies ISO-8859-1 in the HTTP response, but UTF-8 in the meta tag.^* It is a testament to browser developers adhering to Postel's Law that you can read the site at all.

From a “layered architecture” perspective, the embedded content-type declaration is ugly. You could argue that it self-describes a stand-alone document, much like the prologue in an XML document. But there's an important difference: the bytes of an XML prologue are rigidly specified; the parser doesn't need to know the encoding to read them. The <meta> tag can appear anywhere in the <head> of an HTML document. Including, as shown by Wikipedia, after content that requires knowledge of the encoding.

While writing this post, I did a quick search for a history of the embedded Content-Type specification. I turned up a W3C page that recommended always using it, but did not give a rationale. And I found a page that claimed specifying a character set in the HTTP response would “break older browsers.” As the page did not list those browsers, and did not appear to be written by someone involved in browser development, I'm not sure that I believe it.

For my personal website, I rely on the HTTP header, and don't use the meta tag. But I also limit myself to US-ASCII text, with HTML or numeric entities for anything that isn't ASCII. I'm not going to suggest that you remove the tag from your website (who knows, your biggest customer might have an “older browser”). But if you do use it, it should be the first thing in your <head>.

More important than whether the <meta> tag is present is that you actually get the encoding right, both in the page and in the HTTP headers.

With servlets, it's easy: the first line of your service method should be a call to ServletResponse.setContentType().

response.setContentType("text/html;charset=UTF-8");

This will set the Content-Type header and also configure the object returned by ServletResponse.getWriter(). Don't, under any circumstances, write HTML data via the object returned by ServletResponse.getOutputStream(); it exists for servlets that produce binary content.

With JSP, put the following two directives at the top of each page.

<%@page contentType="text/html"%>
<%@page pageEncoding="UTF-8"%>

These are translated into a call to ServletResponse.setContentType(), and are also used by the JSP container itself to parse the page. If, after reading this posting, you don't feel comfortable writing self-describing files, you can also use a JSP property group in your web.xml.

One final thing: if you do choose to specify content type via http-equiv, make sure that it matches what your server is putting in the HTTP response. Otherwise, you risk having your site used as an example by someone writing about encodings.

* The Paris Craigslist omits the <meta> declaration, but retains ISO-8859-1 in the HTTP response. Which explains why all of the ads say “EUR” rather than €.

Tuesday, October 27, 2009

Tag Reuse in JSPs

One thing that's always puzzled me about JSP taglibs is the mechanics of tag reuse — what happens when you invoke the same tag twice on a page. The Tag JavaDoc says the following:

After the doEndTag invocation, the tag handler is available for further invocations (and it is expected to have retained its properties).

This statement sent up a bunch of red flags when I first read it. And it's done the same for others: I've seen tag implementations that explicitly reset all attributes in doEndTag(). But if this was the required implementation, I'd expect to see it mentioned in the J2EE tutorial or sample code, and it isn't.

Moreover, the description makes no sense: you can't build a software system on components that will render arbitrary values depending on where they're invoked. My first hint came from examining the Java code generated by Tomcat:

private org.apache.jasper.runtime.TagHandlerPool _jspx_tagPool_prodList_form_visible_userId_operation_name_listName_htmlId_format;
private org.apache.jasper.runtime.TagHandlerPool _jspx_tagPool_prodList_form_visible_userId_timeout_operation_name_listName_htmlId_format;

Those are pretty long names. And interestingly, they include the parameters used in the tags. And they're different. That led me back to the JSP specifications, with a more focused search. And on page 2-51, we see the rules:

The JSP container may reuse classic tag handler instances for multiple occurrences of the corresponding custom action, in the same page or in different pages, but only if the same set of attributes are used for all occurrences.

I'd love to know the rationale behind this — and whether it was intentional or a “retro-specification” to cover decisions made in the reference implementation (Tomcat). To me, it seems like premature optimization: assume that it's expensive to instantiate a tag handler or set its attributes, so go to extra effort to pool those instances. If, instead, the specification required handlers to be explicitly instantiated per use it would drive developers toward lightweight handlers that collaborate with external services. Which in my opinion is a far better way to write them.

Tuesday, April 7, 2009

JSP tag handlers can't have an attribute named “class”

Recently I've been implementing some tag handlers that emit HTML. In them, I want to carry through some of the standard HTML attributes such as id and class. And I've discovered that Tomcat is quite happy to accept the following TLD entry, but at runtime complains that it's unable to find a setter method for the attribute:

<attribute>
  <name>class</name>
  <rtexprvalue>true</rtexprvalue>

  <type>java.lang.String</type>
</attribute>

Sun provides a schema definition for the JSP taglib deployment descriptor, and it defines the type tld-attributeType for attribute declarations. Looking at that type definition, we see this:

<xsd:element name="name"
   type="j2ee:java-identifierType"/>

And jumping to the common definitions schema, here's the definition of java-identifierType:

<xsd:complexType name="java-identifierType">
  <xsd:annotation>
    <xsd:documentation>

        The java-identifierType defines a Java identifier.
        The users of this type should further verify that
        the content does not contain Java reserved keywords.

    </xsd:documentation>
  </xsd:annotation>

  <xsd:simpleContent>
    <xsd:restriction base="j2ee:string">
      <xsd:pattern value="($|_|\p{L})(\p{L}|\p{Nd}|_|$)*"/>
    </xsd:restriction>
  </xsd:simpleContent>

</xsd:complexType>

Did you read the comment? That users of the type should perform keyword validation?!? I couldn't believe it, until I turned to the XML Schema docs and discovered that there's no way to exclude values: the enumeration facet enumerates legal values only.

So, two lessons to draw from this: first, XML schema is not only complex but incomplete, and second, you can't use class as a JSP tag attribute. Or any other Java keyword for that matter. At least not in a servlet container developed by people who have read the schema docs, because this restriction is not mentioned in the JSP specification text.

My solution? The ugly but descriptive htmlClass.

(originally posted on my website 19 Nov 08)

blog.kdgregory.com