Right Outer Join

2 September 2011

Cool Reporting on MongoDB

Filed under: iReport, JasperReports — Tags: , , , — mdahlman @ 16:24

Interesting Reporting on MongoDB

Before you can do interesting reporting, you need to do simple reporting. You have to walk before you can run. Please start with the article Simple Reporting on MongoDB. That article explains where to get the MongoDB connector for iReport and JasperReports Server. When you have completed that, then read this article more advanced and more interesting reporting techniques.

Simple Filters

Let’s start with a simple hard-coded filter.

{ 
  'collectionName' : 'accounts',
  'findQuery' : { 'name' : 'M & Y Takemura Communications, Ltd'  }
}

Filtering on a single value makes sense when that value is a complex document. There could be lots of information in that document to create a report. To make this truly interesting though, we need this query to be parameterized. So our first truly realistic sample query is the following:

{
  'collectionName' : 'accounts',
  'findQuery' : { 'name' : '$P{CUSTOMER_NAME}'  }
}

Field Selection and Dot Notation

Documents in MongoDB can be big. The queries above will return every field in the document. When this is not appropriate, it’s easy to create a query to return only the fields that you need.

Imagine the case where we have lots of users and we want a report showing which users had logged certain events. There could be lots and lots of events, but there’s only one particular event that I care about in this report. Rather than return the complete document for each user we instead return only the fields we care about.

{ 
  'collectionName': 'active_users',
  'findFields' : { 'id':1, 'date':1, 'events.ACHIEVEMENT_Minor':1 }
}

Enter this query into iReport’s query field, and then iReport is able to execute the query and return only the fields we’re interested in.

iReport Query Editor

The query editor makes it easy to test query syntax and to see query results

We could then combine this with a filter if we only want certain users returned.

{ 
  'collectionName': 'active_users',
  'findFields' : { 'id':1, 'date':1, 'events.ACHIEVEMENT_Minor':1 },
  'findQuery' : { 'events.ACHIEVEMENT_Major' : 1 }
}

Joins!

Joins? MongoDB doesn’t support joins. In general your data should be modeled so that you don’t need joins. If you need them, then perhaps you should be using a traditional RDBMS instead of MongoDB. But what about simple joins that are required in some types of reports? Subreports can be used in JasperReports to effectively join two collections together.

Imagine a collection that stores summary data by country and a collection that stores customers. I want to query the country collection to find my top countries for some time period and then query my customers collection to find the top customers in the those countries.

It’s simple in JasperReports. We create a query in the top level report to return the countries that I want. Then for each country I run a subreport that returns the customers from that country.  Here’s a simple version.

Parent query:

{ ‘collectionName’:’countries’ }

The subreport query finds all accounts that are in that country and whose name starts with “B”. (It’s a silly example. Why would I want companies that start with “B”? But I wanted to show the syntax for Regular Expressions in queries.)

{
	'collectionName':'accounts',
	'findFields':{'name':1,'phone_office':1,'billing_address_city':1,'billing_address_street':1,'billing_address_country':1},
	'sort':{'billing_address_city':1,'name':1},
	'findQueryRegEx':{'billing_address_country':'/$P{AccountCountry}/','name':'/^B /'}
}

The above set of two queries gives me exactly what I want. It has the drawback of running “N + 1″ queries. By that I mean that runs the subreport query once for each country. In some cases that could be bad. In these cases it could be even better to use the “IN” syntax so that I only need to run each query once.

I can use the same parent query. But rather than pass the countries one at a time to the subreport we can pass the complete list of countries as a Collection. Then the subreport can run a single query to get all of the accounts that I want. The working reports are attached below, but here’s the query to show the idea.

{
	'collectionName':'accounts',
	'findFields':{'name':1,'phone_office':1,'billing_address_city':1,'billing_address_street':1,'billing_address_country':1},
	'sort':{'billing_address_country':1,'billing_address_city':1,'name':1},
	'findQuery':{ 'billing_address_country': {'$in':$P{AccountCountryCollection}} }
}

Summary

MongoDB has more flexibility than most of today’s Big Data databases in its query syntax. That’s one thing that made it attractive as a reporting target for Jaspersoft. Hopefully this article gives a good idea of some of the possibilities that exist for reporting directly against a MongoDB data source.

Sample reports using all of the above queries are available here.

About these ads

14 Comments »

  1. How do i make “in between” date queries ?

    Like this mongo cmd line example

    db.accounts.find({‘name’ : ‘hugo’, ‘lastPayIn’: { $gte: new ISODate(’2011-09-15T00:00:00.000Z’) }, ‘firstPayIn’: { $lte: new ISODate(’2011-09-16T23:59:00.999Z’) } } ).sort( { lastPayIn : -1 })

    I try the following:

    { ‘collectionName’:’accounts’,
    ‘findQuery’:{
    ‘name’ : ‘hugo’,
    ‘lastPayIn’: { ‘$gte’: new ISODate(’2011-09-15T00:00:00.000Z’) },
    ‘firstPayIn’: { ‘$lte’: new ISODate(’2011-09-16T23:59:00.999Z’) }
    }
    }

    But this ending only in a JSONParseException……Could you help ?

    Comment by Norman — 16 September 2011 @ 08:35

    • Norman,
      The short answer is that there isn’t a quick solution to this. The text “new ISODate(’2011-09-15T00:00:00.000Z’)” needs to be quoted. Without the quotes you would expect a JSON parsing exception (and you get one!). But with the quotes you get a String comparison which is not what you want. Take a look at this thread:
      http://groups.google.com/group/mongodb-user/browse_thread/thread/4e899b576203592?pli=1

      The good news is that it’s clearly possible to run such a query using the Java driver. But it’s more difficult than doing it on the command line. We will need to expand the query wrapper in the MongoDB connector to handle other data types. I’ll take a look. I expect that we can add this soon.

      Regards,
      Matt

      Comment by mdahlman — 16 September 2011 @ 16:35

  2. Hi Matt,

    thanks for quick solution……

    ….in the meantime i build a solution with an intermediate step as i read my data from mongodb first and use a JRDataSource or a JRCsvDataSource as a report input data source.

    But for direct db access i will check your solution…….

    thanks
    Norman

    Comment by Norman — 22 September 2011 @ 11:43

  3. Hi,

    Does anyone knows what happened to recent version of MongoDB connector?

    v.0.5 was was working, but parameter support was limited.
    Now I can download recent v.0.9 and it doesn’t work with iReport.

    When I try even a simple MongoDB query I get error (when trying to preview the data):
    net.sf.jasperreports.engine.JRException: java.lang.NoClassDefFoundError: Could not initialize class org.apache.log4j.LogManager
    at com.jaspersoft.ireport.designer.data.ReportQueryDialog.populateDataPreview(ReportQueryDialog.java:2373)

    The next time I press preview I get another exception:
    net.sf.jasperreports.engine.JRException: java.lang.NoClassDefFoundError: Could not initialize class com.jaspersoft.mongodb.MongoDbQueryExecuter
    at com.jaspersoft.ireport.designer.data.ReportQueryDialog.populateDataPreview(ReportQueryDialog.java:2373)

    Any clues what happened to recent version of connector and how this can be solved?

    Regards,
    Taras

    Comment by Taras — 18 January 2012 @ 08:39

    • Yes, it’s a known problem. The log4j jar was removed because it wasn’t needed.

      But it turns out that it was needed. Sometimes.

      The bug sneaked through because that release of the connector was tested only in the iReport commercial edition, and it works fine there. We have updated things so that it once again works correctly in all versions of iReport. The updated release should be posted later today. Version 0.11 solves this issue (and adds MapReduce functionality!), and it has been posted to jasperforge.org.

      Comment by mdahlman — 18 January 2012 @ 11:21

  4. Hi Matt,

    We’ve been using the MongoDB connector, and loving it!! It’s been really helpful, and we are looking at rolling it out in a wider implementation.

    The one issue we see with the iReport interface is dealing with mongodb & arrays. Arrays are treated as fields within the same record, rather than a separate record.

    Example Document

    {‘category’: ‘mammals’,
    ‘animals’: ['dogs,'cats','hamsters']},
    {‘category’: ‘foods’,
    ‘fruits’: ['Apple','Pear','Banana','Grapes']}
    }

    in iReport

    {‘collectionName’: ‘Example’}

    the choice of fields would be something like.

    category
    animals.0.
    animals.1
    animals.2

    One obvious issue is the second document has 4 elements, Is there a better way to handle reporting on arrays in iReport? Would we have to write a jscript to re-organize the arrays so they appear more friendly?

    Thanks for your help.

    Erik

    Comment by Erik — 18 January 2012 @ 14:13

    • Yes, it’s an excellent question. In fact, I did a report like this recently. I’m planning to post it.

      The short answer is: yes, you can do it. There are a few different ways to handle it. Look for a new article covering this very soon.

      Comment by mdahlman — 18 January 2012 @ 15:04

      • Hi Matt,

        I am working with MongoDB and ireports now — and with array handling (counting values in an array to be exact). In the MongoDB shell 2.1.1 it is somewhat straight forward. I can return a count of processe[] per key by: db .rpt1.group({key: { feature: true, value:true },reduce: function(obj,prev) {var count = 0;for(k in obj.processes) count++;prev.count = count;},initial: { count: 0}}); or db.rpt1.find().forEach( function(feature) { print(feature.feature + ” ” + feature.value + ” ” + feature.processes.length); });

        translating that to ireports query is a bit more involved. If you have the example you refer to in your 1/18/12 comment — could post a link?

        Comment by Renea — 14 May 2012 @ 11:11

      • My comment on 18 Jan 2012 was referring to how to handle collections. I wrote an article about collections in February 2012.

        Your example calculates the value using MapReduce. That is supported in the Jaspersoft MongoDB connector as well. Refer to the online documentation for MongoDB; you can skip down to the mapReduce section.

        Comment by mdahlman — 14 May 2012 @ 14:25

  5. Matt,
    Thanks for the articles – great stuff.
    Q. Where is the nice query editor you have shown here?

    Comment by Roy — 24 February 2012 @ 15:14

    • Silly me. Right click on the designer -> Edit Query…

      Comment by Roy — 24 February 2012 @ 15:36

  6. Hello by seeing the title i was hoping i could see something on grouping and stuffs. i’m having challenges using group function in a client like mviewer or rockmongo.
    is it possible to show how to write report like that using group or other aggregate functions

    Comment by highjo — 21 June 2012 @ 11:20

    • I have found this article about the MongoDB Aggregation Framework to be among the most useful. I would love to write an article walking through a couple of grouping examples. But until I have time to do that, you should be able to use the linked aggregation framework article to see how to write a query. Then you can use that query in the Jaspersoft MongoDB connector.

      Comment by mdahlman — 21 June 2012 @ 12:43


RSS feed for comments on this post. TrackBack URI

Go on... leave a reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Silver is the New Black Theme Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 36 other followers

%d bloggers like this: