Dec 05 2009

Making Up Data Part II: Using Universe Data

In Part I of this series I talked briefly about the need for report writers to sometimes “make up” data. In that post I showed how I could use the Web Intelligence Rich Client (or alternatively Desktop Intelligence) to import data from a spreadsheet in order to fill out holes in data. In this post I am going to show an equivalent solution using multiple data providers from a universe instead. I will redo the same example shown before (with a lot fewer screenshots since quite a bit of the process is the same). Because I am using a universe I can show two different possible solutions.

Filling Out A Calendar

My report requirement states that all months – whether they have sales data or not – must appear in my output. Because Web Intelligence cannot make up data to fill out the missing months, I need to have an alternative source. In the first post in this series I showed how to do that with a personal data provider as I built out the data in a spreadsheet. This time I am going to show how to do the same thing using only a universe. In order to provide a complete (clean) solution I am going to have to make a new object in my universe.

The first step is to create a query that returns some data. I am going to use the same data that I started with on the last post, which is six months of sales revenue data from the eFashion database. I am also limiting the data to the year 2001 in order to match the example from the last post.

Next I will create a second data provider from the calendar table that includes all of my month values. Since this query has no restrictions (conditions) and no measures (no joins to any fact table) it will be guaranteed to return every month. At this point in the last post I detailed how to merge (link) two data providers together. This time that step is not required. That’s the first advantage of using universe data to perform this step: the dimension objects that are shared across my two queries are already merged (linked). When I used personal data, I had to do the merge step myself.

Once I drag the merged month into my report block I will see all 12 month names. As I did last time I will also replace the existing Month Name detail object with the one that comes from my calendar data provider. The results from these operations are shown in the screenshot below. The “Discontinued” value comes from a custom format used in the eFashion universe. The universe was designed with the assumption that if any cell was empty (as opposed to zero) then there were no sales for that item, and it must have been discontinued.

I would like to insert a zero value into those “Discontinued” cells. I could make it look like it has a zero by changing the format. But if I want an average of that column to be impacted by the fact that I had zero sales for six months out of the year, I need to have an actual zero in the cell, not just a custom format. It’s at this point that I have gone as far as I can without talking to my universe designer.

Creating and Using Placeholder Objects

As a universe designer I can help solve this problem. I might create a special class called “Placeholder Objects” that includes new objects used to solve this type of problem. (I have also seen these objects called “dummy” objects because they are only going to work in very specific situations. They’re not “smart enough” to be used by themselves.) When I used personal data to solve this issue I created a new column in my spreadsheet and put a zero in for every month. In my universe I will create just one object with a zero value in the select clause and nothing in the tables. Note that this object will not parse; that is expected.

Since this object does not have a table associated with it I will need to make sure that it’s used with at least one “real” object in a query. As a universe designer I can’t enforce this. But if a report writer tried to use this object by itself the query would not run. I have seen some folks suggest that objects like this should be pointed to the DUAL table (in Oracle) or something similar. I disagree. The only benefit to doing this is that the object will parse. But at the same time I am fixing that issue I would be introducing an entirely new set of issues because the DUAL table isn’t typically going to join to any other table. It’s much more effective to leave this object in a state where it won’t parse than to deal with all of that mess.

Rebuilding the Query

Once I have created my dummy object and exported the updated universe I can fix the problem. In fact I can fix it a couple of different ways. First, I could mimic the solution from the last post by adding the zero object to my calendar-only data provider. Since I have already covered all of those steps I thought I would show a different technique for this post. I am going to use a UNION query.

A union requires that all of the queries (you can have two or more) match. Each query has to include the exact same number of objects, and the object types (character, date, number) have to match as well. For the scenario I am using for this post each query will have three objects. Both queries will include the Month and Month Name. The first query will also include the new Zero object. The second query will include the regular revenue object. Here are the two screen shots showing what this query looks like.

Query Panel Union Part 1

Query Panel Union Part 2

Notice that the first query does not have any conditions at all? The second query includes the condition that I created to drop out months (it only shows months with an “e” in their name) as well as a limit to show data only for 2001 as mentioned earlier. When I run the combined queries I see this output.

At this point the data looks great, and it was obtained with a minimal amount of fuss. All of the months show up as required. There are no merged dimensions here since everything is returned in a single data provider. However, there is one item that needs to be fixed. Check out the column headers… they’re not exactly “user friendly” at this point. What happened? It turns out that column headers from a union query are taken from the first pass of the data provider. As a result my column header says “Zero” instead of “Sales revenue” as I would expect. Fortunately Web Intelligence (the Rich Client in this case) makes this process extremely simple. All I have to do is open the query panel and drag “Query 2” in front of “Query 1” in the union area of the query panel and rerun. After rerunning the query I am left with a report block that is missing a measure. 😯 This is okay, because the object named “Zero” has been removed from the query, and a new object “Sales revenue” has appeared. All I have to do is drag that new measure onto the block and I see these results.

Notice that I have added an Average calculation in the footer to show that the zero values are being included during the calculation. The result obtained via this process is now essentially identical to that shown in the first post about making up data. 😎

Conclusion

So far I have shown two different techniques that can be used to “make up” data for a report: personal data and universe data. Which is better? As with many questions the answer is “It depends.” πŸ˜† Personal data has an advantage in that it can be built on the fly by a report user and does not require any universe maintenance. Universe data has an advantage in that the extra objects are reusable and are available to anyone. With proper training users can build queries that can include extra data that does not exist in the database.

Personal data requires me to know how to merge the results. With universe data the merge step (as long as I use the same objects) is performed automatically.

In this post I also showed two different techniques for getting the “filler data” into my report. With personal data I have to create two data providers and merge the results. With universe objects I have the option of creating a single data provider using a Union technique and my placeholder object.

The placeholder measure object that I created is not limited to this particular report. I can use (and reuse) it for any other report with similar requirements. Since the object is not tied to the calendar (or any other dimension) I can use it anywhere. It can be used to merge data providers or in a union query. I can create similar “dummy” objects for dates or text fields as needed. Ultimately I can decide if I want these special objects to be visible to all users or to a selected subset of “power user” report writers using the standard security process.

Next Time

For Part III of this post I am going to create a more complex example where data might be missing from more than one dimension (instead of just month as I have used in the last two posts.) I am also going to show cross tab and a section output because each of those can cause some interesting challenges as well. Maybe I’ll even do a sectioned cross tab. πŸ˜‰

Related Posts

14 Responses to “Making Up Data Part II: Using Universe Data”

  1. Comment by Thani

    Nicely placed post Dave. Somehow I feel merged dimension is better than Union Query method. Yes.., Avg doesn’t work well with first, but I think we can calculate it explicitly by (Sum/Count).And also for complex quires, I feel first method would work better. Just my thought, I may be wrong also :).

  2. Comment by stoppos

    For the parse part. I’ve run at the same problem some months back. They wanted a clear universe, without parsing error. My solution in ORACLE was:
    decode(calendar.month,’xxx’,0,0)
    Always 0, and no parsing error. Since it is the month it won’t slow down the query.

  3. Comment by Dave Rathbun

    Hi, stoppos, and thank you for your comment. Your solution works as long as the calendar able is already included in the query. Making placeholder objects “generic” means they can be used for any query, not just one that references a specific table. So for the specific example in this post your solution will provide an object that does parse.

    If my report writers are going to be using union queries a lot, however, I prefer the generic solution. True, it won’t parse, but it’s much more flexible.

  4. Comment by Roland

    Dave

    In my office we are follwing this post very closely. Our requirement is a sectioned crosstab and *NOTHING* we have tried will work.

    Do you know when you may be able to provide the steps to accomplish that, or is there another site of which you are aware that may provide this information?

    My director is really anxious to get this information and to date, I have been unable to provide the data he requires.

    Any assistance will be greatly appreciated.

    Roland
    aka REB01 (from BOB Board)

  5. Comment by Dave Rathbun

    Roland, I have been unable to finish a number of posts that are on the calendar due to work issues. I hope to get some new content posted soon but cannot commit to a date. I will push the follow up post to this one up on the schedule though.

  6. Comment by Roland

    Thank you sir. I appreciate all you do for the BOB community and certainly do not want you to feel pressured. That is not my intent. Just knowing this Blog is available in addition to the BOB board is a blessing.

    πŸ™‚

  7. Comment by Ravi

    Thanks Dave, this is a total different technique to which I’ve used in my report. Thanks for pointing me to this post.

    Ravi πŸ™‚

  8. Comment by Ravi

    Hi Dave

    This is the case when object is coming from Universe(like month obj). But if I have no option other than creating report level obj in WEBI and a ‘Grouping object’ is created like ‘0-3 months’, ‘4-6 months’,’7-9 months’,’10-12 months’ and ’13-24 months’

    If I get NULL values for some group say ‘7-9′ and ’10-12’ in the report and it isn’t showing that groups. I can see only the groups with values.

    How can I achieve that. Please advice.

    Thanks

    Ravi

  9. Comment by Jim Waterbury

    Hi Dave,

    As we can all see from your schedule, you’re a real busy guy! However, your closing statement in this post has really gotten my attention: “For Part III of this post I am going to create a more complex example where data might be missing from more than one dimension (instead of just month as I have used in the last two posts.) I am also going to show cross tab and a section output because each of those can cause some interesting challenges as well.”

    I’m losing my mind trying to work with a sectioned report that refuses to display my “missing” dates. It’s all the worse that the non-sectioned parts of the report are displaying all dates just fine.

    Ahy hope for Part III soon??

    Thanks,
    Jim

  10. Comment by Dave Rathbun

    Jim, I have some bad news. πŸ™‚ I never even started Part III of this post, primarily because I had forgotten that I even considered it to begin with. I am really trying hard to complete my recursive data series that documents my presentation from last fall… especially so I can get all of those posts done before the next conference comes around this fall.

    I’ll stick a stub post in my draft section so I don’t lose track of this, but I don’t have any idea when I might get back to it. Hope you can be patient. In the meantime if you have a very specific question you may try (if you have not already) posting it on BOB or SCN.

  11. Comment by Maruthi

    Awesome Post. πŸ™‚ thank you so much Dave.

  12. Comment by James

    Did you ever create a PART 3 to this post?

  13. Comment by Dave Rathbun

    No, I don’t think I got around to it yet.

  14. Comment by Spencer

    Hi Dave. Sorry to revive an old post. Do you have any tips on how to work with multiple dimensions on a WebI report?