Sunday 30 March 2014

Servlet API - Part 1

As you may know, Java has strong support for network programming. You can work with lower-level protocols like TCP/UDP or higher-level protocols like HTTP. In this series of articles, I will only cover the Java API for the HTTP protocol.

Background

TCP/IP and HTTP

As usual, I will start with my favourite part first. HTTP is a protocol on top of TCP. This means it is perfectly possible to use the Java API for TCP (socket programming) to implement an HTTP server. However, no one does that except for study purposes or when they have too much time to burn. Using a higher-level API makes coding simpler and much faster.

Still, if you need to deal with other protocols that have no built-in support, like FTP, you will need to use the low-level API to implement your application.
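
To give a feeling for what the low-level route involves, here is a minimal sketch of an HTTP reply written by hand over a TCP socket. The class name TinyHttpServer, the port 8080 and the response text are my own assumptions for illustration; it answers a single request, ignores the headers and is nowhere near a real server.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical toy server: answers exactly one HTTP request on a raw TCP socket. */
class TinyHttpServer {

    /** Builds a complete HTTP/1.1 response: status line, headers, blank line, body. */
    static String buildResponse(String body) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        return "HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/plain; charset=utf-8\r\n"
                + "Content-Length: " + bytes.length + "\r\n"
                + "Connection: close\r\n"
                + "\r\n"
                + body;
    }

    /** Accepts one connection, reads the request head and writes a fixed reply. */
    static void serveOnce(ServerSocket server) throws IOException {
        try (Socket client = server.accept()) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(client.getInputStream(), StandardCharsets.US_ASCII));
            String requestLine = in.readLine();   // e.g. "GET /hello HTTP/1.1"
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // Skip the request headers; a real server would have to parse them.
            }
            OutputStream out = client.getOutputStream();
            out.write(buildResponse("You sent: " + requestLine + "\n")
                    .getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 8080 is an arbitrary choice for this sketch.
        try (ServerSocket server = new ServerSocket(8080)) {
            serveOnce(server);
        }
    }
}
```

Even this toy has to build the status line, the headers and the blank line by hand, which is exactly the work a higher-level API takes away.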

Starting from Java 1.2, Java was split into 3 editions, J2SE, J2ME and J2EE, each serving a special purpose. J2SE is the core of Java and can be used to develop desktop applications. J2ME is the special library for developing mobile applications, and it is all but extinct by now. J2EE, our main focus today, is the library for building web applications.

J2EE

When you download Java to your system, the package always includes the JRE (the runtime). That is J2SE.

In contrast, J2EE was not part of Java in the beginning. After seeing developers struggle to build their own tools so that their applications could meet industry requirements, Sun combined the most popular ideas and APIs in the market to create J2EE. It includes, among others, these technologies:
  • JavaServer Pages
  • Enterprise JavaBeans
  • JDBC
  • JMS
  • JNDI
  • Java Transaction API
  • JavaMail

J2EE is a set of APIs and standards rather than a concrete implementation. Most of its contents are interfaces rather than classes. If a server implements all the J2EE APIs, it is a compliant J2EE application server. Today, the border between J2EE and J2SE is no longer so clear. For example, the default J2SE package already includes the JDBC and JNDI interfaces. You can see them in the rt.jar file (the runtime library).

For the other J2EE APIs, you need to add them to the classpath before you can use them.

Session

HTTP is a stateless protocol. This means the server generally does not remember who you are and what you have done. However, remembering the user is critical if you implement authorization or if business requirements call for it. To achieve that, the container normally includes a session cookie in the first response. That helps the container identify the user and create a server-side session. The session cookie for Java is normally named JSESSIONID. To save memory, the container deletes the server-side session if no request arrives within a certain amount of time. If this happens, the session cookie is no longer recognized and the server assigns a new session cookie and session object.
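
The idea can be sketched in plain Java. This is not how any particular container implements it; SessionStore, its methods and the use of a random UUID as the cookie value are assumptions I made up to show the lookup and the idle timeout.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory session store, loosely modelled on what a container does. */
class SessionStore {

    /** One server-side session: attributes plus the time of the last request. */
    static final class Session {
        final Map<String, Object> attributes = new ConcurrentHashMap<>();
        volatile long lastAccessedMillis;
    }

    private final Map<String, Session> sessions = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    SessionStore(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Creates a session and returns the id that would be sent as the cookie value. */
    String create(long nowMillis) {
        String id = UUID.randomUUID().toString();
        Session session = new Session();
        session.lastAccessedMillis = nowMillis;
        sessions.put(id, session);
        return id;
    }

    /** Returns the live session for a cookie value, or null if unknown or expired. */
    Session find(String id, long nowMillis) {
        Session session = (id == null) ? null : sessions.get(id);
        if (session == null) {
            return null;
        }
        if (nowMillis - session.lastAccessedMillis > timeoutMillis) {
            sessions.remove(id);   // idle for too long: free the memory
            return null;
        }
        session.lastAccessedMillis = nowMillis;   // every request renews the session
        return session;
    }
}
```

When find returns null, the caller would call create again and send the new id back as the cookie, which matches the "new session cookie and session object" behaviour described above.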

Servlet API

HttpServlet is not part of core Java. Hence, to do servlet programming, you need to add the Servlet API to the project classpath. The most common way is to add the server runtime to your project. Any Java server should ship both the Servlet API and its implementation. If you do not want any server runtime on your project classpath, you can add the Servlet API jar to the classpath manually.

As mentioned above, J2EE was born with the goal of providing a common interface for various vendor implementations. That is why the Servlet API has nothing but a few interfaces, XML schemas and some specific requirements. The Servlet API joined J2EE with version 2.2, was gradually upgraded to 2.3, 2.4 and 2.5, and was substantially revamped in version 3.0.

Servlet is a very primitive API, which is why it is not so convenient to use. You rarely see anyone using Servlet directly to render a webpage unless the application is super simple. Any developer working with Java EE should be familiar with frameworks built on top of the Servlet API, like Spring MVC, Struts or JSF.

After developers got tired of using an OutputStream to render HTML content, JSP was introduced as the Java version of a PHP script. JSP makes HTML content much simpler to write. However, as Java is not a dynamic language, a JSP file is converted to a Servlet before it serves the first request. As the Java world slowly adopted Ajax and REST APIs, content came to be delivered in JSON format more often, and server-side rendered HTML is used less often.

Servlet API 2.5 

You can Google and download servlet-api-2.5.jar to take a look at the content of the API. The jar file includes 3 packages: javax.servlet, javax.servlet.http and javax.servlet.resources. Within the scope of a single article, I will cover the major interfaces that developers usually use to develop web applications.

Filter and Servlet

The two most important interfaces for handling HTTP requests are Servlet and Filter.


As an HTTP request comes to the web container from the internet, the container generates a ServletRequest or HttpServletRequest object that contains the information of the request. Later, it uses the returned object of type ServletResponse or HttpServletResponse to render the HTTP response. In the above example, the container is supposed to generate an HttpServletRequest because we use HttpServlet as the handler.

The main motivation for splitting Filter and Servlet is to have the Servlet focus on business logic and the Filter handle general concerns like logging or security. HttpServlet includes support for the common HTTP methods such as GET, POST, HEAD, PUT and DELETE. However, most people only implement GET and POST. This is no surprise though, because forms in HTML 4.0 and XHTML 1.0 only support GET and POST (which means you cannot send other kinds of requests from a plain form in old browsers like IE7).

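The container uses a single servlet instance to serve all requests to the same URL, which is why you need to ensure thread safety when implementing a Servlet. The Servlet API gives you HttpServletRequest and HttpServletResponse as method parameters, so each request thread works with its own pair. HttpSession is different: concurrent requests from the same user share one session object, so treat it with the same care as shared state. If you choose to have other field variables, these variables must be thread safe as well.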
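
Here is a small plain-Java sketch of the problem. HitCounterHandler is a hypothetical stand-in for a servlet, not a class from the Servlet API: one instance is shared by several threads, its handle method plays the role of doGet, and the only field is a thread-safe counter.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a servlet: one instance, many request threads. */
class HitCounterHandler {

    // Shared by every thread, so it must be thread safe. A plain "long hits"
    // incremented with hits++ could lose updates under concurrent requests.
    private final AtomicLong hits = new AtomicLong();

    /** Plays the role of doGet(): everything else it touches is a local variable. */
    String handle(String userName) {
        long count = hits.incrementAndGet();
        return "Hello " + userName + ", you are visitor " + count;
    }

    long totalHits() {
        return hits.get();
    }

    /** Simulates the container: several threads calling the same instance. */
    static long simulate(int threads, int requestsPerThread) {
        HitCounterHandler handler = new HitCounterHandler();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int r = 0; r < requestsPerThread; r++) {
                    handler.handle("guest");
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handler.totalHits();
    }
}
```

With the AtomicLong, simulate(8, 1000) always counts 8000 hits. Swap the field for a plain long and the total can come out lower, because two threads may read the same old value before either writes back.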

The container makes use of an HTTP thread pool to serve requests. If the container runs out of threads, it holds the incoming requests until a thread from the pool becomes available. For example, the default maximum number of HTTP threads for Tomcat is 200, so by default Tomcat can process up to 200 requests concurrently.

As of Servlet API 2.5, an HTTP thread stays occupied until the servlet and filters complete processing. Even if the thread sleeps while waiting for some resource, it is still not available for other requests. Hence, if you let HTTP threads hang, you effectively reduce the throughput of the system.
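
The effect can be reproduced with an ordinary executor. The sketch below is a model under my own assumptions, not container code: BlockedPoolDemo and its method are hypothetical names, the "requests" are plain Runnables, and the slow resource is a latch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of a container's HTTP thread pool, using a plain executor. */
class BlockedPoolDemo {

    /**
     * Fills a pool of poolSize threads with "requests" that wait on a slow
     * resource, submits one more, and returns how many requests were left queued.
     */
    static int queuedWhilePoolIsBlocked(int poolSize) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        CountDownLatch allStarted = new CountDownLatch(poolSize);
        CountDownLatch resourceReady = new CountDownLatch(1);
        try {
            for (int i = 0; i < poolSize; i++) {
                pool.execute(() -> {
                    allStarted.countDown();
                    try {
                        resourceReady.await();   // the thread is idle but still occupied
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            allStarted.await();                  // every pool thread is now busy
            pool.execute(() -> { });             // one more request arrives
            return pool.getQueue().size();       // it has to wait in the queue
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            resourceReady.countDown();           // release the waiting "requests"
            pool.shutdown();
        }
    }
}
```

The extra request does no work at all, yet it cannot start until one of the waiting threads is released. That is the throughput loss in miniature.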

Deployment Descriptor 

Deployment Descriptor is the fancy name for web.xml. Under Servlet 2.5, a Java web application is expected to have this file at WEB-INF/web.xml. The container looks for this file to know how to load the webapp. There are two other optional folders, WEB-INF/lib and WEB-INF/classes. Any jar file dropped into WEB-INF/lib is added to the webapp classpath. Project source code is compiled and, together with the resources, dropped inside the WEB-INF/classes folder. Hence, if you are worried about the deployment process, this is the place to check.

Here is an empty deployment descriptor:

<web-app xmlns="http://java.sun.com/xml/ns/javaee"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
       http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
       version="2.5">
  <display-name>Servlet 2.5 Web Application</display-name>
</web-app>

Each version of the Servlet API has a different template for the deployment descriptor, and it is important to get this right. The package javax.servlet.resources contains all the schema definitions for the Servlet API. You can use these schemas to validate the content of the deployment descriptor file. In real life, developers rarely need to deal with these steps, as popular IDEs include schema definitions for all versions of the Servlet API. If you use Eclipse, you can check this at Window/Preferences/XML/XML Catalog.

Normally, a standard deployment descriptor contains important information like declarations of filters, servlets, welcome files, context listeners, error pages, security constraints, context params... It is essential that you know all of these concepts very well, as they are the fundamentals for building Java web applications.

Frameworks

The Servlet API is pretty simple. It is easy to understand but not so convenient to build applications on top of. For example, I once wrote this code for a very simple purpose:

String path = (request.getPathInfo()==null) ? "" : request.getPathInfo().replaceFirst("/", "");

This is just to handle URLs like:

localhost:8080/twits/

and

localhost:8080/twits

In the above example, request.getPathInfo() returns "/" for the first URL and null for the second URL, but for my application both URLs should be identical. So this line of code is simply boilerplate that I need to add just to cope with a limitation of the API. Another well-known limitation concerns reading request parameters and the request body.
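
One way to keep that boilerplate in a single place is a tiny helper. PathInfos and normalize are names I made up for this sketch. It takes the value of request.getPathInfo() as a plain String and strips only a leading slash, which is what the one-liner above does for path info values, since they always start with "/".

```java
/** Hypothetical helper that hides the null versus "/" difference of getPathInfo(). */
class PathInfos {

    /** Returns the path info without its leading slash, or "" when there is none. */
    static String normalize(String pathInfo) {
        if (pathInfo == null) {
            return "";
        }
        return pathInfo.startsWith("/") ? pathInfo.substring(1) : pathInfo;
    }
}
```

With this in place, both localhost:8080/twits/ and localhost:8080/twits end up as the empty string, and a URL like localhost:8080/twits/42 would give "42".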

Because of this, developers very soon started building frameworks on top of the Servlet API to cope with these limitations. All the MVC frameworks deviate from the Servlet API by letting a single servlet serve all requests for all URLs. This servlet then invokes some services to serve the request.
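
The single-servlet idea, often called a front controller, can be sketched without any framework. FrontController, register and dispatch are hypothetical names, and handlers here are plain functions from request parameters to a response body. Real frameworks do far more, so take this only as an illustration of the routing step.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical front controller: one entry point that routes by request path. */
class FrontController {

    // path -> service that turns request parameters into a response body
    private final Map<String, Function<Map<String, String>, String>> handlers =
            new HashMap<>();

    /** Registers the service responsible for one path. */
    void register(String path, Function<Map<String, String>, String> handler) {
        handlers.put(path, handler);
    }

    /** What the single servlet would do for every incoming request. */
    String dispatch(String path, Map<String, String> params) {
        Function<Map<String, String>, String> handler = handlers.get(path);
        if (handler == null) {
            return "404 Not Found: " + path;
        }
        return handler.apply(params);
    }
}
```

A service is then registered once, for example controller.register("/twits", p -> "twits of " + p.get("user")), and the container only ever needs to know about the one dispatching servlet.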

By the time I graduated, the most popular framework in the market was Struts. It introduced the mapper concept. The mapper automatically maps request parameters to a Form object, which is a Java bean. The form is then passed as a parameter to the service's methods. This is a pretty cool idea.
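
To show the flavour of such a mapper, here is a rough reflection-based sketch. It is not Struts code: SignUpForm and FormBinder are names I invented, and only String and int properties are handled.

```java
import java.lang.reflect.Method;
import java.util.Map;

/** Hypothetical form bean, in the spirit of a Struts form. */
class SignUpForm {
    private String name;
    private int age;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}

/** Hypothetical mapper: copies request parameters into matching bean setters. */
class FormBinder {

    static <T> T bind(Class<T> formType, Map<String, String> params) {
        try {
            T form = formType.getDeclaredConstructor().newInstance();
            for (Method setter : formType.getMethods()) {
                String methodName = setter.getName();
                if (!methodName.startsWith("set") || methodName.length() <= 3
                        || setter.getParameterCount() != 1) {
                    continue;
                }
                // setName -> "name", setAge -> "age"
                String property = Character.toLowerCase(methodName.charAt(3))
                        + methodName.substring(4);
                String raw = params.get(property);
                if (raw == null) {
                    continue;   // no request parameter with that name
                }
                Class<?> type = setter.getParameterTypes()[0];
                if (type == String.class) {
                    setter.invoke(form, raw);
                } else if (type == int.class) {
                    setter.invoke(form, Integer.parseInt(raw));
                }
                // Other property types are ignored in this sketch.
            }
            return form;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot bind " + formType.getName(), e);
        }
    }
}
```

Given the parameters name=Ann and age=30, FormBinder.bind(SignUpForm.class, params) returns a form whose getters give back "Ann" and 30, so the service method never touches the raw request.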

Spring MVC simplifies things even further by letting you choose any method parameter name and type. It tries to match each method parameter with a request parameter and assigns the value automatically. If the request parameter has a different name, you can override the mapping with an annotation.
