2009-11-21

Receiving email in Google App Engine + Spring

Here's my Spring controller which, at the moment, receives an email and prints various elements of the message to the log. The instructions here are very good, and I only adapted them slightly for Spring.

Firstly I added the inbound-services section to my appengine-web.xml file. Next I added a url mapping to my Spring config:

<bean id="publicUrlMapping" 
  class="org.springframework.web.servlet.handler.SimpleUrlHandlerMapping">
    <property name="mappings">
        <props>
            <prop key="/_ah/mail/*">mailController</prop>
        </props>
    </property>
</bean>

Here is the MailController class:

import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.util.Properties;

import javax.activation.DataHandler;
import javax.activation.DataSource;
import javax.mail.Address;
import javax.mail.Part;
import javax.mail.Session;
import javax.mail.internet.MimeMessage;
import javax.mail.internet.MimeMultipart;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.commons.io.IOUtils;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.web.servlet.ModelAndView;
import org.springframework.web.servlet.mvc.AbstractController;

public class MailController extends AbstractController {
    protected final Log logger = LogFactory.getLog(getClass());

    @Override
    protected ModelAndView handleRequestInternal(HttpServletRequest request, HttpServletResponse response) throws Exception {
        Properties props = new Properties(); 
        Session session = Session.getDefaultInstance(props, null); 
        MimeMessage message = new MimeMessage(session, request.getInputStream());

        Address[] from = message.getFrom();
        String fromAddress = "";
        // getFrom() can return null when the message has no From header
        if (from != null && from.length > 0) {
            fromAddress = from[0].toString();
        }

        String contentType = message.getContentType();
        InputStream inContent = null;
        logger.info("Message ContentType: ["+contentType+"]");

        if (contentType.indexOf("multipart") > -1) {
            //Need to get the first part only
            DataHandler dataHandler = message.getDataHandler();
            DataSource dataSource = dataHandler.getDataSource();
            MimeMultipart mimeMultipart = new MimeMultipart(dataSource);
            Part part = mimeMultipart.getBodyPart(0);            
            contentType = part.getContentType();
            logger.info("Part ContentType: ["+contentType+"]");
            inContent = (InputStream)part.getContent();
        } else {
            //Assume text/plain
            inContent = (InputStream)message.getContent();
        }
        
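        String encoding = "";
        if (contentType.indexOf("charset=") > -1) {
            // Keep only the charset value: drop any trailing parameters and surrounding quotes
            encoding = contentType.split("charset=")[1].split(";")[0].replace("\"", "").trim();
        }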
        logger.info("Encoding: ["+encoding+"]");
        
        String content;
        try {
            content = IOUtils.toString(inContent, encoding.toUpperCase());
        } catch (UnsupportedEncodingException e) {
            content = IOUtils.toString(inContent);
        }
        
        logger.info("Received email from=["+fromAddress+"] subject=["+message.getSubject()+"]");
        logger.info("Content: "+content);
        
        return null;
    }
}

Note that the Google page says:
"The getContent() method returns an object that implements the Multipart interface. You can then call getCount() to determine the number of parts and getBodyPart(int index) to return a particular body part."
This doesn't appear to be quite true. If you call getContent() on MimeMessage it returns a ByteArrayInputStream of the raw bytes for the whole message. According to the documentation here an input stream should only be returned if the content type is unknown. I think this is a bug.

You can get around this by parsing the content type yourself as I have done in my example. If message.getContentType() returns a string containing "multipart" then I parse it as a multipart message, otherwise I assume it is "text/plain".
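The content-type check described above can be pulled out into a small helper. This is only a sketch on plain strings: the names ContentTypeCheck and isMultipart are hypothetical and not part of JavaMail or App Engine. It lowercases the header value before comparing, because media type names are case-insensitive, and it uses startsWith rather than indexOf so that a parameter value which merely contains the word "multipart" is not mistaken for a multipart message.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of JavaMail or App Engine. */
final class ContentTypeCheck {
    private ContentTypeCheck() {}

    /** Returns true when the Content-Type header value names a multipart media type. */
    static boolean isMultipart(String contentType) {
        if (contentType == null) {
            return false; // assumption: a missing header is handled like text/plain
        }
        // Media type names are case-insensitive, so normalize before comparing.
        return contentType.trim().toLowerCase(Locale.ROOT).startsWith("multipart/");
    }

    public static void main(String[] args) {
        System.out.println(isMultipart("multipart/alternative; boundary=abc"));
        System.out.println(isMultipart("text/plain; charset=ISO-8859-1"));
    }
}
```

In the controller, the value passed in would be the result of message.getContentType().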

In order to extract a single part of the Multipart content you have to pass through a MimeMultipart object. It's here that you can call getCount() and extract the Parts that you want. In my example I just get the first part.

Calling getContent() on the Part still returns a stream of bytes so you have to convert it with the correct encoding. You can extract the encoding from the ContentType of the Part. I added a try..catch block around the conversion to a string in case the encoding was not recognized - in which case it falls back to the default.
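The charset extraction and fallback described above could look roughly like the following. It is a sketch using only the JDK, with hypothetical names (CharsetSketch, charsetParam, decode), and it assumes the body has already been read into a byte array. Unlike a plain split on "charset=", it copes with a quoted value and with parameters that follow the charset. It does not handle a semicolon inside a quoted parameter value.

```java
import java.nio.charset.Charset;
import java.util.Locale;

/** Hypothetical helper for illustration; names are not from JavaMail. */
final class CharsetSketch {
    private CharsetSketch() {}

    /** Returns the charset parameter of a Content-Type value, or null if there is none. */
    static String charsetParam(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // The value may be quoted, e.g. charset="utf-8"
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    /** Decodes the body with the declared charset, falling back to the platform default. */
    static String decode(byte[] body, String contentType) {
        Charset charset = Charset.defaultCharset();
        String name = charsetParam(contentType);
        if (name != null) {
            try {
                charset = Charset.forName(name);
            } catch (IllegalArgumentException e) {
                // Covers both illegal and unsupported charset names: keep the default
            }
        }
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute, encoded as ISO-8859-1
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9};
        System.out.println(decode(latin1, "text/plain; charset=\"ISO-8859-1\""));
    }
}
```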

It is vital that you determine whether you have multipart content or not. If you try to parse a "text/plain" message as a multipart then you may well encounter an error like this:
Nested in org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.OutOfMemoryError: Java heap space:
java.lang.OutOfMemoryError: Java heap space
 at java.util.Arrays.copyOf(Unknown Source)
 at java.io.ByteArrayOutputStream.write(Unknown Source)
 at javax.mail.internet.MimeMultipart.readTillFirstBoundary(MimeMultipart.java:244)
 at javax.mail.internet.MimeMultipart.parse(MimeMultipart.java:181)
 at javax.mail.internet.MimeMultipart.getBodyPart(MimeMultipart.java:114)
The readTillFirstBoundary method fails because in a "text/plain" message there are no boundaries!

Note that the development server always sends a multipart message with two parts: text/plain and text/html. GMail also sends emails in this format but lots of other servers just send text/plain.
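Since a multipart message may carry both a text/plain and a text/html part, one option is to look for the plain-text part by content type instead of always taking index 0. The sketch below works on a list of content-type strings; PartChooser and indexOfType are hypothetical names, and collecting the strings from the real message (for example via getBodyPart(i).getContentType() for each i below getCount()) is left out so that the example needs only the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper for illustration; operates on content-type strings only. */
final class PartChooser {
    private PartChooser() {}

    /** Returns the index of the first part whose content type starts with wantedType, or -1. */
    static int indexOfType(List<String> partContentTypes, String wantedType) {
        String wanted = wantedType.toLowerCase(Locale.ROOT);
        for (int i = 0; i < partContentTypes.size(); i++) {
            String type = partContentTypes.get(i);
            // Media type names are case-insensitive, so compare in lower case
            if (type != null && type.trim().toLowerCase(Locale.ROOT).startsWith(wanted)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> types = Arrays.asList(
                "text/plain; charset=UTF-8",
                "text/html; charset=UTF-8");
        System.out.println(indexOfType(types, "text/plain"));
    }
}
```

A result of -1 would mean there is no plain-text part, in which case falling back to the first part, as the controller does, is one reasonable choice.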

2009-11-15

Converting from JAXB to Simple XML

Since JAXB doesn't work on Google App Engine but Simple XML does, I have been converting my application. I'm using XML in quite a loose way; I have only annotated my classes for the elements and attributes which I want to extract from the larger XML document. JAXB is more forgiving when used in this way; however, with a few extra parameters the same can be achieved with Simple XML. Here is a table of a few things I had to convert:

Each entry gives a description, then the JAXB version, then the Simple XML version.

Description: The RSS element that is being deserialized has a version attribute on it but I do not wish to model this in the Java class. I only want to deserialize the channel element.

JAXB:
@XmlRootElement(name="rss")
static class Rss {    
 @XmlElement(name="channel")
 Channel channel;
}

Simple XML:
@Root(name="rss", strict=false)
static class Rss {    
 @Element(name="channel")
 Channel channel;
}

Description: The channel element contains a list of item elements without any wrapper element enclosing the whole list.

JAXB:
static class Channel {
 @XmlElement(name="item")
 List<Item> items;
}

Simple XML:
@Root(strict=false)
static class Channel {
 @ElementList(entry="item", inline=true)
 List<Item> items;
}

Description: A List with a wrapper element.

JAXB:
@XmlElementWrapper(name="customfieldvalues")
@XmlElement(name="customfieldvalue")
List<String> customfieldvalues;

Simple XML:
@ElementList(entry="customfieldvalue", 
  required=false)
List<String> customfieldvalues;

Description: Deserialize

JAXB:
JAXBContext jaxbContext = 
  JAXBContext.newInstance(Rss.class);
Unmarshaller unmarshaller = 
  jaxbContext.createUnmarshaller();
Rss rss = 
  (Rss) unmarshaller.unmarshal(in);

Simple XML:
Serializer serializer = new Persister();
Rss rss = 
  serializer.read(Rss.class, in);

2009-11-08

Using Simple XML instead of JAXB on Google App Engine

In my previous post I said that I was going to try using a hack to get around the lack of support for JAXB in Google App Engine. Not only did this feel bad but also it didn't work for me, despite all the changes that had been made to circumvent the non-whitelisted classes. I still got various java.lang.NoClassDefFoundError exceptions.

So I decided to try Simple XML Serialization instead. This worked really well and caused no problems in the Local environment. However, when I deployed this to Google I was hit by the Sandbox limitations for Reflection. When Simple scans your classes to build its "schema" it calls setAccessible(true) on every method and constructor it finds all the way up the hierarchy to Object. This violates the sandbox restriction: "An application cannot reflect against any other classes not belonging to itself, and it can not use the setAccessible() method to circumvent these restrictions." App Engine throws a SecurityException when you try to call setAccessible(true) on one of its classes.

For my purposes, and probably the majority case, I do not need to serialize or deserialize to any non-public method of any superclasses other than my own. Given this, I decided to absorb any SecurityExceptions thrown during the scanning process, thus leaving those methods out of the "schema". Two minor changes are required to the source. The scan methods in org.simpleframework.xml.core.ClassScanner and org.simpleframework.xml.core.ConstructorScanner both need a try/catch block added like so:
//ClassScanner
   private void scan(Class real, Class type) throws Exception {
      Method[] method = type.getDeclaredMethods();

      for(int i = 0; i < method.length; i++) {
         Method next = method[i];

         try {
            if(!next.isAccessible()) {
               next.setAccessible(true);
            }
            scan(next);
         } catch (SecurityException e) {
            // Absorb this
         }
      }
   }

//ConstructorScanner
   private void scan(Class type) throws Exception {
      Constructor[] array = type.getDeclaredConstructors();

      for(Constructor factory: array){
         ClassMap map = new ClassMap(type);

         try {
            if(!factory.isAccessible()) {
               factory.setAccessible(true);
            }
            scan(factory, map);
         } catch (SecurityException e) {
            // Absorb this
         }
      }
   }
This now works well in both the Local and Deployed environments. Perhaps Simple would benefit from a mode or an option on the @Root annotation to specify the depth of class scanning as an alternative to this workaround. I will post this on the Simple mailing list and report back.
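The same idea can be tried outside Simple with a few lines of plain reflection. The sketch below is not Simple's code: TolerantScanner, accessibleMethods and Example are hypothetical names. It walks the class hierarchy up to Object, tries setAccessible(true) on each declared method, and skips any method that refuses. It catches RuntimeException rather than only SecurityException, on the assumption that it may be run on a newer JDK, where module encapsulation reports a refusal with a different unchecked exception type.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of scanning that tolerates inaccessible members. */
final class TolerantScanner {
    private TolerantScanner() {}

    /** Collects every declared method, up to Object, that can be made accessible. */
    static List<Method> accessibleMethods(Class<?> type) {
        List<Method> usable = new ArrayList<Method>();
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Method m : c.getDeclaredMethods()) {
                try {
                    m.setAccessible(true);
                    usable.add(m);
                } catch (RuntimeException e) {
                    // Absorb this: SecurityException in a sandbox, or another
                    // unchecked refusal on newer JDKs. The method is left out.
                }
            }
        }
        return usable;
    }

    /** Small example class with a private method that should still be found. */
    static class Example {
        private String secret() {
            return "s";
        }
    }

    public static void main(String[] args) {
        for (Method m : accessibleMethods(Example.class)) {
            System.out.println(m.getDeclaringClass().getSimpleName() + "." + m.getName());
        }
    }
}
```

Which of Object's methods end up in the list depends on the runtime, so the output is not fixed; the point is only that a refusal on one method does not abort the whole scan.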

UPDATE:
The author of Simple has responded to my post on the mailing list saying that my fix above will be added to the next release.