<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>Dave's Adventures in Business Intelligence &#187; Universe Design</title>
	<atom:link href="http://www.dagira.com/category/design/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dagira.com</link>
	<description>...you are in a twisty maze of passageways, all different...</description>
	<lastBuildDate>Wed, 01 Feb 2012 17:26:45 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<!-- podcast_generator="podPress/8.8" -->
		<copyright>&#xA9; </copyright>
		<managingEditor>blogmaster@dagira.com ()</managingEditor>
		<webMaster>blogmaster@dagira.com()</webMaster>
		<category></category>
		<ttl>1440</ttl>
		<itunes:keywords></itunes:keywords>
		<itunes:subtitle></itunes:subtitle>
		<itunes:summary>...you are in a twisty maze of passageways, all different...</itunes:summary>
		<itunes:author></itunes:author>
		<itunes:category text="Society &amp; Culture"/>
		<itunes:owner>
			<itunes:name></itunes:name>
			<itunes:email>blogmaster@dagira.com</itunes:email>
		</itunes:owner>
		<itunes:block>No</itunes:block>
		<itunes:explicit>no</itunes:explicit>
		<itunes:image href="http://www.dagira.com/wp-content/plugins/podpress/images/powered_by_podpress_large.jpg" />
		<image>
			<url>http://www.dagira.com/wp-content/plugins/podpress/images/powered_by_podpress.jpg</url>
			<title>Dave's Adventures in Business Intelligence</title>
			<link>http://www.dagira.com</link>
			<width>144</width>
			<height>144</height>
		</image>
		<item>
		<title>Why Can&#8217;t I Validate Prompts?</title>
		<link>http://www.dagira.com/2011/12/21/why-cant-i-validate-prompts/</link>
		<comments>http://www.dagira.com/2011/12/21/why-cant-i-validate-prompts/#comments</comments>
		<pubDate>Wed, 21 Dec 2011 14:23:37 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Prompts]]></category>
		<category><![CDATA[Rants]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=436</guid>
		<description><![CDATA[One of the possible enhancements we have been requesting for years is the ability to validate prompts. (We&#8217;ve also been looking for the ever-so-popular ability to use a formula such as &#8220;Today()&#8221; as a default for a prompt but this is different.) If we had true cascading prompts in Web Intelligence that would eliminate one [...]]]></description>
			<content:encoded><![CDATA[<p>One of the possible enhancements we have been requesting for years is the ability to validate prompts. (We&#8217;ve also been looking for the ever-so-popular ability to use a formula such as &#8220;Today()&#8221; as a default for a prompt but this is different.) If we had true cascading prompts in Web Intelligence that would eliminate one use case for validated prompts but not all. I had someone comment on my blog recently asking about how to validate one prompt selection against another and that started me thinking&#8230; what would something like this look like if we did get it? <span id="more-436"></span></p>
<p>Here&#8217;s a really simple example: I want to ensure that a user enters an end date that is at or beyond the entered start date value. This validation rule is designed to avoid confusion since many databases will not return any rows for a &#8220;backwards&#8221; between operation. (It&#8217;s also a fairly common request, found as far back as 2006 in <a href="http://www.forumtopics.com/busobj/viewtopic.php?t=68674">this topic on BOB</a> and again in <a href="http://www.forumtopics.com/busobj/viewtopic.php?t=102181">2008</a>.) I can think of two ways to make this work. First, enforce the prompt entry order so that the user cannot enter the second date before they enter the first date, and give me some way to reference their first entered date in the list of values (LOV) for the second date. By making the second prompt constrained I would force the user to pick from the list of dates, and by referencing the first date in the LOV I can ensure that the only dates that show up are on or after the first date. This is probably technically easier to implement, but it is not my preferred solution.</p>
<p>A second option would be to allow me to create validation rules that fire after all of the prompts have been entered. This is far more flexible as it allows the user to respond to prompts in any order they would like (as it works today) but the validation is done before the query is sent to the database. This could be far harder to implement, primarily because I envision some form of validation language (VB? Crystal?) would be required. Do they invent some new language, or try to implement something that already exists? What if the language is not supported on all platforms? The date example that I have used so far seems fairly trivial: the rule would simply be <code>end_date >= start_date</code> which doesn&#8217;t look that complicated. It looks like an expression rather than a language, but a language is more than that. A full-blown language can have a grammar, reserved words, and all sorts of rules that specify how the various components can be compared.</p>
<p>Does my prompt validation language allow looping structures? I might want to be able to loop through a list of selected items in the case where a prompt offers more than one value. &#8220;Make sure <strong>all</strong> of the end dates are greater than or equal to the latest selected start date&#8221; would be one example. First I have to parse the list of start date values to find the largest entry, and then I have to process the list of end dates to make sure that 100% of them are at or beyond the largest start date.</p>
<p>Doesn&#8217;t look so simple any more, does it? <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
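<p>As a rough illustration of the two passes just described, here is a minimal sketch. Python is used only to show the logic; it is not a proposal for the actual validation language, and the function name <code>all_end_dates_valid</code> and the sample dates are made up for this example.</p>

```python
from datetime import date


def all_end_dates_valid(start_dates, end_dates):
    """Return True when every end date is on or after the latest start date."""
    if not start_dates or not end_dates:
        # Assumption: with nothing selected on one side the rule cannot pass.
        return False
    # First pass: find the largest selected start date.
    latest_start = max(start_dates)
    # Second pass: every selected end date must be at or beyond that value.
    return all(end >= latest_start for end in end_dates)


# Hypothetical multi-value prompt answers.
starts = [date(2011, 1, 1), date(2011, 3, 1)]
ends_ok = [date(2011, 3, 1), date(2011, 6, 30)]
ends_bad = [date(2011, 2, 15), date(2011, 6, 30)]
```

<p>With these sample values, <code>ends_ok</code> passes and <code>ends_bad</code> fails because one end date falls before the latest start date.</p>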
<p>Any prompt validation logic would also have to ensure that validation loops don&#8217;t exist. Here I have defined validation rules on each entry that can be true individually but can never be true collectively. This is actually a data entry or coding error but I need to be able to check and react to that.</p>
<ul>
<li>Rule 1 on the start date: Make sure that the start date is greater than the end date.</li>
<li>Rule 2 on the end date: Make sure that the end date is greater than the start date.</li>
</ul>
<p>Can the start date be greater than the end date at the same time the end date is greater than the start date? Probably not. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Since only one of those two rules could ever be true at a given time the query would never run. With three (or even more) prompts the potential complexity for validation loops only gets worse.</p>
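<p>One way a designer tool might catch a pair of rules like this is a brute-force check over a handful of sample values. The sketch below is only a heuristic under that assumption (it proves nothing in general), and the rule format of <code>(left, operator, right)</code> tuples is invented for the example.</p>

```python
import operator
from itertools import product

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
       "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def satisfiable(rules, names, sample_values):
    """Try every assignment of sample values to the prompt names and report
    whether at least one assignment makes all of the rules true together."""
    for combo in product(sample_values, repeat=len(names)):
        values = dict(zip(names, combo))
        if all(OPS[op](values[left], values[right])
               for left, op, right in rules):
            return True
    return False


names = ["start_date", "end_date"]
# The two rules from the list above: each can hold alone, never together.
conflicting = [("start_date", ">", "end_date"),
               ("end_date", ">", "start_date")]
# A single sensible rule for comparison.
sensible = [("start_date", "<=", "end_date")]
```

<p>Checking <code>conflicting</code> against a few integers standing in for dates finds no assignment that works, while <code>sensible</code> is satisfied immediately.</p>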
<h3>How Might Simple Validation Logic Work?</h3>
<p>As a universe designer I am charged with creating components of code that ultimately will be placed together in some random fashion by a query writer. I have no idea what sort of questions they&#8217;re going to want to answer, nor should I be constrained by that concern. I should be able to make prompt objects that function perfectly fine by themselves, but also in combination with other prompts. The &#8220;start date / end date&#8221; example I started with is very simple. Consider this syntax:</p>
<p><code>some_table.some_date between @prompt('Enter Start Date','D',,mono,validate:&#038;1&lt;&#038;2) and @prompt('Enter End Date','D',,mono,validate:&#038;1&lt;&#038;2)</code></p>
<p>What I have done here is supplement the "free/constrained/primary_key" option with a new feature: validate. The syntax breaks down like this:</p>
<table class="blogtable">
<tr>
<th>Component</th>
<th>Function</th>
</tr>
<tr class="alt">
<td>validate</td>
<td>supplements the current free/constrained/primary_key option and is followed by a simple validation rule</td>
</tr>
<tr>
<td>:</td>
<td>delimiter that denotes validation logic follows</td>
</tr>
<tr class="alt">
<td>&#038;1</td>
<td>references the first argument within the scope of this prompt object</td>
</tr>
<tr>
<td>&#038;2</td>
<td>references the second argument within the scope of this prompt object</td>
</tr>
</table>
<p>The &lt; character is providing a simple validation expression "less than" using a standard mathematical symbol. Other options might include &gt;, &gt;=, &lt;=, ==, and !=. These are all single-value operators that would not support a multi-selection prompt, and the scope of the &amp;1 and &amp;2 doesn't extend beyond this particular prompt object, but it's a start.</p>
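<p>To make the proposed syntax a little more concrete, here is a small sketch of how a rule such as <code>validate:&amp;1&lt;&amp;2</code> might be parsed and evaluated. Nothing like this exists in the product; the regular expression, the <code>check</code> function, and the idea of passing the prompt answers as a list are all assumptions made for illustration.</p>

```python
import operator
import re
from datetime import date

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

# Two-character operators are listed first so "<=" is not read as "<".
RULE = re.compile(r"^validate:&(\d+)(<=|>=|==|!=|<|>)&(\d+)$")


def check(rule, args):
    """Evaluate a rule like 'validate:&1<&2' against the prompt answers.

    args holds the answers in prompt order, so &1 is args[0]."""
    match = RULE.match(rule)
    if match is None:
        raise ValueError(f"unsupported rule: {rule!r}")
    left, op, right = int(match.group(1)), match.group(2), int(match.group(3))
    for position in (left, right):
        if not 1 <= position <= len(args):
            raise ValueError(f"no answer for argument &{position}")
    return OPS[op](args[left - 1], args[right - 1])


# Hypothetical answers to the start date and end date prompts.
good = [date(2011, 1, 1), date(2011, 12, 31)]
backwards = [date(2011, 12, 31), date(2011, 1, 1)]
```

<p>Here <code>check("validate:&amp;1&lt;&amp;2", good)</code> passes and the same rule applied to <code>backwards</code> fails, which is the behavior the single-value operators above are meant to provide.</p>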
<h3>Named Prompt Components</h3>
<p>In order to extend my validation scope beyond a single prompt I need to ensure that I have a unique name that I can reference. <em>(As an aside, this is why class names must be unique within a universe structure. The class name\object name combination must be unique within the universe in order to support the @Select() functionality.)</em> To do this, I might extend my new prompt syntax with the following:</p>
<table class="blogtable">
<tr>
<th>Component</th>
<th>Function</th>
</tr>
<tr>
<td>validate</td>
<td>supplements the current free/constrained/primary_key option and is followed by a more complex validation rule</td>
</tr>
<tr class="alt">
<td>:</td>
<td>delimiter that denotes validation logic follows</td>
</tr>
<tr>
<td>&#038;0</td>
<td>name of this prompt component</td>
</tr>
<tr class="alt">
<td>&#038;1</td>
<td>references the first argument within the scope of this prompt object</td>
</tr>
<tr>
<td>&#038;2</td>
<td>references the second argument within the scope of this prompt object</td>
</tr>
<tr class="alt">
<td>&#038;n:name</td>
<td>references the argument denoted by "name" that occurs somewhere else in the universe</td>
</tr>
</table>
<p>I have introduced two new arguments. The extra argument &amp;0 will allow me to define a unique name for this prompt, and &#038;n:name allows me to reference that value in another prompt. Now I can create two separate prompts that look like this:</p>
<p>Start date check<br />
<code>some_table.some_date >= @prompt('Enter Start Date','D',,mono,validate:&#038;0:start_date&#038;n:start_date&lt;=&#038;n:end_date)</code></p>
<p>End date check<br />
<code>some_table.some_date &lt;= @prompt('Enter End Date','D',,mono,validate:&#038;0:end_date&#038;n:start_date&lt;=&#038;n:end_date)</code></p>
<p>Keeping in mind that it is entirely likely that I would have more than one start date or end date in my universe, I would have to use more verbose prompt names like <code>account_start_date</code> or <code>invoice_range_start_date</code> and so on. In the above example I have two prompts, each has a name, and each has a validation rule. By allowing prompts to have names I can reference the result of one prompt inside of another prompt.</p>
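<p>The naming requirement can be sketched as a registry that refuses duplicate names and evaluates rules by name. This is purely hypothetical: the <code>PromptRegistry</code> class and its methods are invented here to show why unique names matter, not to describe anything the universe does today.</p>

```python
import operator
from datetime import date

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


class PromptRegistry:
    """Holds prompt answers under names that must be unique universe-wide."""

    def __init__(self):
        self._answers = {}

    def register(self, name, value):
        # A second prompt with the same name would make references ambiguous.
        if name in self._answers:
            raise ValueError(f"prompt name already in use: {name}")
        self._answers[name] = value

    def check(self, left, op, right):
        """Compare the answers of two named prompts."""
        return OPS[op](self._answers[left], self._answers[right])


registry = PromptRegistry()
registry.register("invoice_range_start_date", date(2011, 1, 1))
registry.register("invoice_range_end_date", date(2011, 12, 31))
```

<p>With both names registered, a rule such as <code>invoice_range_start_date &lt;= invoice_range_end_date</code> can be checked from either prompt, and an attempt to register a second <code>invoice_range_start_date</code> is rejected.</p>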
<h3>Event Triggers</h3>
<p>The next challenge could be to determine when the validation logic fires. Does it fire at the end of each prompt selection? I can't really see that working because all of the required values might not yet be defined (selected). Does the validation logic fire when the query is executed? That also might not make sense because if I have a chain of three prompts where 3 depends on 2 and 2 depends on 1 I should be able to trigger the validation as soon as any two values are present. In my simple "start date / end date" example it would be easy to say that the validation logic fires as soon as both values are present. I also haven't addressed how to create the message that is delivered to the user if the prompt validation fails...</p>
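<p>One possible answer to the timing question is to fire each rule as soon as every prompt it references has a value, and to leave the rest waiting. The sketch below assumes that policy; the function names and the tuple rule format are made up for the example.</p>

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def ready_rules(rules, answers):
    """Split rules into those that can fire now and those still waiting
    because one of the referenced prompts has not been answered."""
    ready, waiting = [], []
    for rule in rules:
        left, _op, right = rule
        if left in answers and right in answers:
            ready.append(rule)
        else:
            waiting.append(rule)
    return ready, waiting


def failed_rules(rules, answers):
    """Return the rules that can already be evaluated and do not hold."""
    ready, _waiting = ready_rules(rules, answers)
    return [(left, op, right) for left, op, right in ready
            if not OPS[op](answers[left], answers[right])]


# A chain of three prompts: 3 depends on 2, and 2 depends on 1.
chain = [("p1", "<=", "p2"), ("p2", "<=", "p3")]
```

<p>Once <code>p1</code> and <code>p2</code> are answered the first rule can be checked immediately, even though the second rule is still waiting on <code>p3</code>. How the failure message reaches the user is left open here, just as it is above.</p>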
<p>Most of the complexity in this process comes from the fact that I'm trying to design a fully generic solution. I don't want to have to write new code for each new prompt screen that I might design; I want to create reusable logic and syntax that works across the entire universe. I think it's easy to see why this is such a complex question, and that perhaps indicates why we don't have anything like it so far within the universe.</p>
<p>But being able to set up a date value like "today" as a dynamic default value... that should be easier to implement. I hope we see that... and soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2011/12/21/why-cant-i-validate-prompts/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Handling Conditions on Outer Joins</title>
		<link>http://www.dagira.com/2010/08/17/handling-conditions-on-outer-joins/</link>
		<comments>http://www.dagira.com/2010/08/17/handling-conditions-on-outer-joins/#comments</comments>
		<pubDate>Tue, 17 Aug 2010 21:25:44 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Join Techniques]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=310</guid>
		<description><![CDATA[I don&#8217;t like outer joins in my reporting universes. Never have. Sure, if I am creating a universe against an application system I might consider using outer joins because of the normalized nature of the data. But if I am reporting against a warehouse schema of some kind, I really prefer to use inner joins. [...]]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t like outer joins in my reporting universes. Never have. Sure, if I am creating a universe against an application system I might consider using outer joins because of the normalized nature of the data. But if I am reporting against a warehouse schema of some kind, I really prefer to use inner joins. That way I avoid any potential performance issues caused by outer joins, but more importantly I avoid questions about report data. That being said, outer joins do have a specific purpose, and if I need to use them in my universe I certainly can.</p>
<p>One of the biggest challenges with outer joins (other than potential performance issues) is explaining to a user why their query results changed because they added a condition to their query. Remember that users don&#8217;t (typically) look at the SQL, so they won&#8217;t know that I have created an outer join. It can be confusing. Fortunately I have options as to how my outer joins are executed, so once I determine their usage requirements I can change the way my universe behaves.</p>
<h3>Defining the Problem</h3>
<p>For this post I am going to use a very simple universe with only three tables, shown here.</p>
<p><img src="/tips/outer_join_filters/ssg_universe.png" width="496" height="218" border="0" alt="Summit Sporting Goods Universe screen shot" title="Summit Sporting Goods screen shot" /></p>
<p>This universe joins a customer to an order, and an order to order lines. In my database I have one customer that does not yet have any orders. If I run a query against the current universe structure, this new customer will not show up. My requirement is to show all customers, whether they have orders or not. This must be true even if I put a condition on the order table. That&#8217;s where it gets tricky. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  <span id="more-310"></span></p>
<h3>ANSI_92 Parameter Setting</h3>
<p>In order to make this work, I have to make sure that my universe parameter setting for <code>ANSI_92</code> is set properly. From the <strong>File</strong> menu I will select <strong>Parameters</strong>, and on the tabbed dialog box that appears I will click on the <strong>Parameters</strong> tab. I need to set the <code>ANSI_92</code> parameter to &#8220;Yes&#8221; in order for this technique to work. Without this setting, my joins will be created using Oracle-specific syntax and that isn&#8217;t what I want here.</p>
<p><img src="/tips/outer_join_filters/parameters.png" width="517" height="366" border="0" alt="Universe parameters screen shot" title="Setting ANSI_92 parameter in a BusinessObjects universe" /></p>
<p>Once I have verified this setting I can close the parameter screen. Next I need to define my outer join.</p>
<h3>Defining an Outer Join</h3>
<p>The outer join definition is done the same way with all databases, but it&#8217;s not always done in the same direction. In some databases I have to check the primary table and in others I need to check the secondary or optional table. Rather than trying to remember which database does what, I will pick a side and check to make sure that the tiny &#8220;o&#8221; appears on the expected side of my join. For example, I want to reconfigure my existing inner join between customers and orders and make it an outer join as shown here.</p>
<p><img src="/tips/outer_join_filters/outer_join_checkbox.png" width="512" height="483" border="0" alt="Outer join checkbox screen shot" title="Marking an outer join in a BusinessObjects universe" /></p>
<p>The structure window looks a bit different now.</p>
<p><img src="/tips/outer_join_filters/outer_join_active.png" width="496" height="218" border="0" alt="Outer join screen shot" title="Outer join definition in a BusinessObjects universe" /></p>
<p>As mentioned earlier, the tiny &#8220;o&#8221; on the right side of the join denotes the optional side. If the &#8220;o&#8221; is on the wrong side, I will open the join properties window and select the other checkbox. </p>
<p>That&#8217;s how I create an outer join in my universe. Let me show how it works first, then I will show how I can break it.</p>
<h3>Outer Join Queries</h3>
<p>I have exported my outer join universe and created a very simple query. I have included the customer ID, name, and their order total. There is one customer (appropriately named &#8220;New Customer&#8221;) that has no orders and their order amount shows up blank because of the outer join setting.</p>
<p><img src="/tips/outer_join_filters/outer_join_results.png" width="491" height="265" border="0" alt="Query results from Web Intelligence using an outer join" title="Query results from Web Intelligence using an outer join" /></p>
<p>So far everything is working as I expected.</p>
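<p>The same behavior can be reproduced outside the universe with a few lines of SQL. The sketch below uses Python with an in-memory SQLite database in place of Oracle; the table names, column names, and sample rows are invented and are not the actual Summit Sporting Goods schema.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customer (cust_id integer primary key, name text);
    create table orders (order_id integer primary key, cust_id integer,
                         order_date text, total real);
    insert into customer values (1, 'Existing Customer'), (2, 'New Customer');
    insert into orders values (10, 1, '2010-06-15', 250.0),
                              (11, 1, '2010-08-01', 100.0);
""")

# Inner join: the customer without orders is dropped.
inner_rows = conn.execute("""
    select c.cust_id, c.name, sum(o.total)
    from customer c
    join orders o on o.cust_id = c.cust_id
    group by c.cust_id, c.name
    order by c.cust_id
""").fetchall()

# Outer join: every customer is kept, with a NULL total when no order exists.
outer_rows = conn.execute("""
    select c.cust_id, c.name, sum(o.total)
    from customer c
    left outer join orders o on o.cust_id = c.cust_id
    group by c.cust_id, c.name
    order by c.cust_id
""").fetchall()
```

<p>In this sketch <code>inner_rows</code> contains only the existing customer, while <code>outer_rows</code> also includes the new customer with an empty (NULL) order total.</p>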
<h3>Breaking an Outer Join</h3>
<p>An outer join is designed to allow rows that don&#8217;t completely match to still show up on my report. What happens if I apply a condition to the outer relationship? For example, I might want to add a condition to limit orders to a particular date range.</p>
<p><img src="/tips/outer_join_filters/query_conditions.png" width="452" height="266" border="0" alt="Query panel screen shot" title="Web Intelligence query panel with condition on outer join" /></p>
<p>Here are the results after running this query.</p>
<p><img src="/tips/outer_join_filters/condition_results.png" width="491" height="217" border="0" alt="Query results from conditions" title="Query results after placing a condition on an outer join object" /></p>
<p>What went wrong? (Answered on next page&#8230;)</p>
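<p>For readers who want to experiment before moving on, the general SQL behavior can be sketched with SQLite. This is standard outer join behavior rather than anything specific to the universe, the tables and rows are invented, and the <code>on</code> clause variant is shown only as one common ANSI SQL remedy; it is an assumption, not necessarily the approach taken in the rest of this post.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customer (cust_id integer primary key, name text);
    create table orders (order_id integer primary key, cust_id integer,
                         order_date text, total real);
    insert into customer values (1, 'Existing Customer'), (2, 'New Customer');
    insert into orders values (10, 1, '2010-06-15', 250.0),
                              (11, 1, '2010-08-01', 100.0);
""")

# Condition in the WHERE clause: it is applied after the outer join, so the
# NULL order_date of the customer without orders fails the test.
where_rows = conn.execute("""
    select c.name, o.total
    from customer c
    left outer join orders o on o.cust_id = c.cust_id
    where o.order_date >= '2010-08-01'
    order by c.cust_id
""").fetchall()

# Condition in the ON clause: it only limits which orders are matched, so
# every customer still appears.
on_rows = conn.execute("""
    select c.name, o.total
    from customer c
    left outer join orders o
      on o.cust_id = c.cust_id and o.order_date >= '2010-08-01'
    order by c.cust_id
""").fetchall()
```

<p>In this sketch <code>where_rows</code> loses the customer without orders, while <code>on_rows</code> keeps that customer with a NULL total.</p>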
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/08/17/handling-conditions-on-outer-joins/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Too Many Objects? Too Many Rows? Try Prompting For Level of Detail</title>
		<link>http://www.dagira.com/2010/08/04/too-many-objects-too-many-rows-try-prompting-for-level-of-detail/</link>
		<comments>http://www.dagira.com/2010/08/04/too-many-objects-too-many-rows-try-prompting-for-level-of-detail/#comments</comments>
		<pubDate>Wed, 04 Aug 2010 23:06:39 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Prompts]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=308</guid>
		<description><![CDATA[A while back I was on a project where the users wanted to set up reports that initially displayed about six different dimension objects and a bunch of measures. They also wanted to have the flexibility of dragging a different set of dimension objects on the report and either adding to or replacing an existing [...]]]></description>
			<content:encoded><![CDATA[<p>A while back I was on a project where the users wanted to set up reports that initially displayed about six different dimension objects and a bunch of measures. They also wanted to have the flexibility of dragging a different set of dimension objects on the report and either adding to or replacing an existing dimension. The idea was good. The amount of data brought back was a problem. I was able to fix that with some interesting prompt objects in the universe.</p>
<h3>The Problem Definition</h3>
<p>For the example I will present in this post I will once again use my version of the Island Resorts Marketing universe which I have converted to Oracle. I will create a report that initially shows the Resort and (for simplicity) a single measure (Revenue). The report will be designed to let the user drag on additional details like Service Line and Service. But I will design my objects in such a way that if the user doesn&#8217;t want to see the information at that level of detail they don&#8217;t incur the overhead (row count) simply because the object is present in the query. In order to accomplish this, I will prompt the user with a list that includes the tokens &#8216;Resort&#8217;, &#8216;Service Line&#8217;, and &#8216;Service.&#8217; The user will select the lowest level of detail they expect to use on the report. In this particular example the selections are hierarchical, meaning that selecting &#8216;Service Line&#8217; implies that the Resort data will also be present. There is another option &#8216;None&#8217; that can be selected if they want to deactivate the entire list.</p>
<p><em>Note that XI 3.1 offers a new feature called Query Stripping (in service pack 3) that works for BW and other OLAP queries and does this process automatically. It is not (yet) available for relational databases.</em> <span id="more-308"></span></p>
<p>My steps are:</p>
<ul>
<li>Create a derived table to provide values for my prompt</li>
<li>Create a custom LOV query to display the prompt values in the desired order</li>
<li>Create a prompt object that allows a user to select the desired level of detail</li>
<li>Create custom Level of Detail (LOD) versions of the impacted dimension objects</li>
<li>Build my report</li>
</ul>
<h3>Creating the Level of Detail Prompt</h3>
<p>First I have to create some data in my universe to feed my list of values (LOV) query. I have detailed this technique before. It&#8217;s quite simple to use a derived table and select against the DUAL table (in the case of Oracle) or equivalent and build any sort of list. Something like this will give me a list of values for the Resort level of detail (LOD) prompt.</p>
<pre>select 1 as LOD_Order, 'Resort'  as LOD_Resort from dual
union
select 2, 'Service Line' from dual
union
select 3, 'Service' from dual
union
select 4, 'None' from dual</pre>
<p>I have defined two columns for my derived table. The first is called LOD_Order and it will be used to order the prompt items in the way I expect to see them. The second column is the value that will populate my LOV. After creating this derived table I checked to see what the values were, and they came up as expected.</p>
<p><img src="/tips/level_of_detail/derived_table.png" width="224" height="198" border="0" alt="level of detail table rows" title="Level of Detail derived table rows" /></p>
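<p>The derived table is easy to try outside of Designer. The sketch below runs an equivalent query through Python and SQLite; SQLite has no DUAL table, so the <code>from dual</code> clause is simply left out, and <code>union all</code> is used because the rows are already distinct. Both are adaptations for this example rather than changes to the Oracle version above.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same four rows as the Oracle derived table, without the DUAL table.
DERIVED_TABLE = """
    select 1 as LOD_Order, 'Resort' as LOD_Resort
    union all select 2, 'Service Line'
    union all select 3, 'Service'
    union all select 4, 'None'
"""

# Query the derived table as an inline view, ordered by the order column.
rows = conn.execute(
    f"select LOD_Order, LOD_Resort from ({DERIVED_TABLE}) order by LOD_Order"
).fetchall()
```

<p>The result is four rows, numbered 1 through 4, in the same order as the union branches.</p>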
<h3>Creating the Custom LOV Query</h3>
<p>The next step is to build the objects I need to create the list of values (LOV) that will appear in my prompt definition. I will create a class called &#8220;LOV&#8221; that includes objects for both the LOD_ORDER and the LOD_RESORT columns. Eventually these objects will be hidden, but I will need them to be visible in order to create my custom LOV query. The LOV will be built on the Resort LOD object, and initially it won&#8217;t look too special. It includes only the selected object. The trick is that I want to see the values in a specific order; that&#8217;s why I added the &#8220;order&#8221; column to my derived table. In order to create a sort on this object, I need to click on the Manage Sorts button shown here.</p>
<p><img src="/tips/level_of_detail/lov_query.png" width="542" height="163" border="0" alt="query panel screen shot" title="Sort Manager for the query panel definition of the custom LOV object" /></p>
<p>That button is only active if my database supports an order by clause that includes objects that do not appear in the select clause. If that button does <strong>not</strong> appear and I know that technique will work, I can update my designer parameter file as detailed in an earlier post titled, <a href="http://www.dagira.com/2010/03/04/sort_by_nono-very-confusing/">&#8220;SORT_BY_NO=NO? Very Confusing…&#8221;</a> which shows how to accomplish that. Clicking that button allows me to define a custom sort based on the LOD Order column.</p>
<p><img src="/tips/level_of_detail/lov_query_order.png" width="561" height="405" border="0" alt="Screen shot of Designer LOV screen" title="Adding a custom order object to my LOV query definition" /></p>
<p>When I run the query I see the rows in the order I expect.</p>
<p><img src="/tips/level_of_detail/lov_query_results.png" width="351" height="378" border="0" alt="Screen shot of Designer LOV results" title="Results of my custom LOV in the desired order" /></p>
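<p>The effect of sorting by a column that is not in the select list can be seen in a small sketch. It uses SQLite through Python and a plain table named <code>lod</code> (a stand-in invented for this example) instead of the derived table in the universe.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table lod (LOD_Order integer, LOD_Resort text);
    insert into lod values (1, 'Resort'), (2, 'Service Line'),
                           (3, 'Service'), (4, 'None');
""")

# Sorting on the displayed column gives alphabetical order.
alphabetical = [row[0] for row in conn.execute(
    "select LOD_Resort from lod order by LOD_Resort")]

# Sorting on the hidden order column gives the hierarchy order instead.
custom = [row[0] for row in conn.execute(
    "select LOD_Resort from lod order by LOD_Order")]
```

<p>The alphabetical version starts with None and puts Service ahead of Service Line; the custom sort returns Resort, Service Line, Service, None.</p>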
<h3>Defining the Prompt</h3>
<p>I am going to reuse the prompt logic in multiple objects (Resort, Service Line and Service) so I am going to define it once and reference it using the @Select() function. Here&#8217;s the prompt:</p>
<pre>@Prompt('Please select desired Level of Detail','A','LOV\Resort LOD',mono,constrained)</pre>
<p>This is fairly standard syntax. The prompt text is defined, and the type of response is &#8216;A&#8217; for character data. The LOV definition points to the custom query I created in the last step. I only want to allow a user to select a single value, and because the data only works if it comes in as I expect I am constraining the user. They <strong>have</strong> to pick from my list in order to proceed.</p>
<h3>Building the Level of Detail Objects</h3>
<p>I will be able to save some effort by reusing the prompt definition using the @Select() function. Here&#8217;s what my new Resort object looks like:</p>
<pre>case @Select(Level of Detail\LOD Prompt)
when 'None' then 'Resort N/A'
else RESORT.resort
end</pre>
<p>Resort is the top of the list. That means if I pick Resort, Service Line, or Service, I want to see the resort values populated. This means I can take a shortcut. If the user selects &#8216;None&#8217; then I won&#8217;t show the resort values. If they pick anything else, I will.</p>
<p>The rest of my objects look very similar.</p>
<p>Service Line</p>
<pre>case @Select(Level of Detail\LOD Prompt)
when 'None' then 'Service Line N/A'
when 'Resort' then 'Service Line N/A'
else SERVICE_LINE.service_line
end</pre>
<p>Service</p>
<pre>case @Select(Level of Detail\LOD Prompt)
when 'None' then 'Service N/A'
when 'Resort' then 'Service N/A'
when 'Service Line' then 'Service N/A'
else SERVICE.service
end</pre>
<p>What I did was add a line of code to each of my objects, since each was at a lower level of detail. Note that I could have taken the opposite approach and defined my Service object like this:</p>
<pre>case @Select(Level of Detail\LOD Prompt)
when 'Service' then SERVICE.service
else 'Service N/A'
end</pre>
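<p>The pattern behind all of these case expressions is the same: an object returns its real value only when the selected level of detail is at or below the level of that object. A compact sketch of that rule follows; the <code>lod_value</code> function and the sample values passed to it are hypothetical and exist only to show the logic.</p>

```python
# Levels from the top of the hierarchy down. 'None' switches everything off.
LEVELS = ["Resort", "Service Line", "Service"]


def lod_value(object_level, selected_level, actual_value):
    """Mimic the case expressions: return the real value only when the
    selected level of detail is at or below the level of this object."""
    if selected_level == "None":
        return f"{object_level} N/A"
    if LEVELS.index(selected_level) >= LEVELS.index(object_level):
        return actual_value
    return f"{object_level} N/A"
```

<p>For example, a Service Line object returns its value when the user picks Service Line or Service, and returns the N/A token when the user picks Resort or None.</p>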
<h3>Building the Report</h3>
<p>I can now create a very simple report that includes all three of my new LOD objects and the Revenue. When I run the report at the Service level of detail it will show all of the detailed data, just as it would have before I created these custom objects. The results from Island Resorts end up being 27 rows of data.</p>
<p><img src="/tips/level_of_detail/report_detail.png" width="433" height="619" border="0" alt="Web Intelligence report at the lowest level of detail" title="Web Intelligence report at the lowest level of detail" /></p>
<p>If I rerun the exact same report at the Resort level of detail I get three rows. </p>
<p><img src="/tips/level_of_detail/report_summary.png" width="433" height="91" border="0" alt="Web Intelligence report at the resort level of detail" title="Web Intelligence report at the resort level of detail" /></p>
<p>The objects that I dropped from my level of detail are showing &#8220;N/A&#8221; values in the block. The prompt shows my four choices in the desired order because of the customizations I did for the LOV earlier.</p>
<p><img src="/tips/level_of_detail/report_prompt.png" width="511" height="444" border="0" alt="Report prompt screen" title="Web Intelligence report prompt screen" /></p>
<p>What is the true impact here? What have I really accomplished?</p>
<h3>Impact Analysis</h3>
<p>Sample databases like Island Resorts are great for fooling around because they&#8217;re small. They&#8217;re not so great for demonstrating techniques like this because the true impact is hard to determine. I went from 27 rows down to 3, not really a big deal, right? On a larger database with more combinations of dimension values, the impact could be far more significant. I might drop from 50,000 rows down to several hundred. And the beauty of it is that if the user doesn&#8217;t need the level of detail, they don&#8217;t pay the penalty of downloading 50,000 rows only to roll it up (via projection) on the report. If they later drag on one of the suppressed objects and see the &#8220;N/A&#8221; result, all the user has to do is refresh the report and select the new level of detail to get more rows.</p>
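<p>The reason the row count drops is that the masked objects return a constant, so the group by collapses every combination underneath them. The sketch below shows this with SQLite and a tiny invented <code>sales</code> table; the data and the bind parameter <code>:lod</code> (standing in for the prompt answer) are assumptions for the example, not the Island Resorts data.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table sales (resort text, service_line text, service text,
                        revenue real);
    insert into sales values
        ('Resort A', 'Line 1', 'Service 1', 100),
        ('Resort A', 'Line 1', 'Service 2', 200),
        ('Resort A', 'Line 2', 'Service 3', 300),
        ('Resort B', 'Line 1', 'Service 1', 400),
        ('Resort B', 'Line 2', 'Service 3', 500);
""")

# Each dimension is wrapped in a case expression driven by the prompt answer.
QUERY = """
    select case :lod when 'None' then 'Resort N/A'
                     else resort end as resort_lod,
           case :lod when 'None' then 'Service Line N/A'
                     when 'Resort' then 'Service Line N/A'
                     else service_line end as service_line_lod,
           case :lod when 'Service' then service
                     else 'Service N/A' end as service_lod,
           sum(revenue) as revenue
    from sales
    group by 1, 2, 3
    order by 1, 2, 3
"""


def run(lod):
    """Run the report query for one level of detail."""
    return conn.execute(QUERY, {"lod": lod}).fetchall()
```

<p>With this sample data the Service level returns five rows, the Resort level returns two, and None returns a single grand total row, all from the same query text.</p>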
<p>My sample report screen shots included all of the objects at the same time in order to show what was really happening. When I used this technique for a real report, I only included objects that I wanted to see from the beginning. The other LOD objects were listed in the available objects. </p>
<h3>Alternate Solution</h3>
<p>The solution I outlined here works as desired if there is a rigorous hierarchy and the user is just selecting the lowest level of detail desired on the report. What I actually had to implement was a bit more complex and I will provide a sketch of the solution here. The actual requirement was based on different combinations of dimension objects. A matrix of choices might look like this:</p>
<pre>Resort (by itself)
Service Line (also by itself)
Service
Resort + Service Line
Resort + Service
Service Line + Service
Resort + Service Line + Service
None</pre>
<p>Instead of the simple case statement objects I showed earlier, I had to use a pattern matching function and look for the string &#8220;Resort&#8221; anywhere in the prompt result. If the user picked anything that included the Resort as an option, then the Resort object would return a value. Ultimately I created prompt &#8220;bit&#8221; objects in order to simplify the logic.</p>
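<p>A sketch of the &#8220;bit&#8221; idea follows. The actual universe objects are not shown in this post, so everything here is an assumption: the bit values, the function names, and the choice to split the answer on <code>" + "</code>. Splitting into whole tokens sidesteps one trap with naive pattern matching, namely that the string Service also occurs inside Service Line.</p>

```python
# Hypothetical bit values, one per dimension.
RESORT, SERVICE_LINE, SERVICE = 1, 2, 4
BITS = {"Resort": RESORT, "Service Line": SERVICE_LINE, "Service": SERVICE}


def prompt_bits(choice):
    """Turn a prompt answer such as 'Resort + Service' into a bit mask."""
    if choice == "None":
        return 0
    mask = 0
    # Split into whole tokens so 'Service Line' is never read as 'Service'.
    for token in choice.split(" + "):
        mask |= BITS[token]
    return mask


def shows(choice, level):
    """Should the object for this level return real values?"""
    return bool(prompt_bits(choice) & BITS[level])
```

<p>With this approach, picking Service Line by itself switches on only the Service Line object, while Resort + Service switches on the Resort and Service objects and leaves Service Line masked.</p>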
<h3>Conclusion</h3>
<p>I thought this was an interesting technique to share at this point, what with the recent release of SP3 with the Query Stripping feature for cube data. It worked well for the project. In some cases this strategy allowed us to reduce the number of rows by an order of magnitude (10,000s of rows down to 1,000s or even hundreds) while retaining a large number of objects in the report. The biggest drawback of this solution is that my SQL code will still hit the requested tables, even if I am not really requesting any data. Going back to my earlier example, a true query stripping process would eliminate the resort, service line, and service tables if I selected &#8216;None&#8217; from the prompt. In my &#8220;level of detail&#8221; solution all of those tables are still present and therefore have an impact on the query performance. </p>
<p>The issue I was trying to solve in this case was related to the number of rows downloaded to the report. The queries ran fast enough that query performance was not as big a concern. It was my goal to provide a way to reduce the row counts without losing the flexibility to add more dimension objects to my report on the fly. This technique accomplished that.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/08/04/too-many-objects-too-many-rows-try-prompting-for-level-of-detail/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>Universe Models For Recursive Data Part III: Alias Versus Flattened</title>
		<link>http://www.dagira.com/2010/07/02/universe-models-for-recursive-data-part-iii-alias-versus-flattened/</link>
		<comments>http://www.dagira.com/2010/07/02/universe-models-for-recursive-data-part-iii-alias-versus-flattened/#comments</comments>
		<pubDate>Fri, 02 Jul 2010 11:45:03 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[2009 GBN - Dallas]]></category>
		<category><![CDATA[2010 Mastering ... Melbourne]]></category>
		<category><![CDATA[Recursive Data]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=284</guid>
		<description><![CDATA[This is the third of several posts that will review my presentation “Universe Models For Recursive Data” which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. As with my other presentations there is a PDF file [...]]]></description>
			<content:encoded><![CDATA[<p>This is the third of several posts that will review my presentation “Universe Models For Recursive Data” which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. As with my other presentations there is a PDF file that can be downloaded from my <a href="http://www.dagira.com/conference-presentations/">conference presentations page</a>. The first post introduced the concepts of recursive (as opposed to hierarchical) data and provided a couple of examples. The second post reviewed some of the different design challenges that I have seen in working with recursive data models. In this post I will introduce four different possible solutions and present a scorecard for each, showing how well it solves the issues presented in the prior post in this series. Links to both prior posts are presented at the end of this entry. I have also included Oracle SQL scripts that can be used to create and populate the tables used in this post.</p>
<p><em>This post will cover slides 22 through 30 from the presentation and will describe the first two solutions (one with two variations) outlined in the presentation.</em> <span id="more-284"></span></p>
<h3>Solution Options</h3>
<p>The four different solutions that I included in my presentation were: Universe aliases, Flattened structures (column or snowflake), Ancestor Model, and Depth First Tree Traversal. All of them work fine on a clean recursive hierarchy. Each of them partially works for at least some of the other challenges. Some of them present unique challenges (extra disk space requirements or lack of native drilling functionality) that will also be addressed. I am presenting the solutions in increasing order of complexity. This post will cover aliases and flattened structures (both versions). In the next post I plan to cover the ancestor model, and finally I will cover the depth first tree traversal in its own post. </p>
<h3>Universe Aliases</h3>
<p>This solution is the only one that can be completely self-contained within the universe. No DBA or ETL work is required. There are any number of ways to create an alias. I can:</p>
<ul>
<li>Right-click on a table and select Insert Alias</li>
<li>Select an existing table in my structure, then select Insert + Alias from the menu</li>
<li>Open my table browser and insert an existing table. An alias will automatically be created for me.</li>
<li>Select an existing table in my structure and click the &#8220;Insert Alias&#8221; toolbar button</li>
</ul>
<p>&#8230; and there are other ways to get aliases in my universe, especially if I have loops to resolve. The bottom line is that the process is quite simple.</p>
<p>Here&#8217;s what an alias looks like after it has been created and joined to an existing table in my structure.</p>
<p><img src="/tips/recursive_data/part_03_alias_flat/alias_implementation.png" border="0" width="443" height="324" alt="screen shot of alias implementation in a BusinessObjects universe" title="Alias implementation in a BusinessObjects universe" /></p>
<p>The join can be a bit tricky. In this case, the employee row MGR_ID is joined to the manager row EMP_ID in order to make the relationship work. It might help to look at the raw data again from an earlier post.</p>
<p><img src="/tips/recursive_data/part_01_recursion_definition/pm_data.png" width="286" height="250" border="0" alt="raw data used to demonstrate recursion in a BusinessObjects universe" title="Raw data used to demonstrate recursion in a BusinessObjects universe" /></p>
<p>See how the recursive relationship is going to work after establishing this join? Field works for Ferrerez, and Ferrerez works for Noakes. Who does Noakes work for? His MGR_ID column is empty (NULL) implying that he does not have a manager. He owns the company. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
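<p>Expressed as plain SQL rather than as a universe join, the relationship looks roughly like the following sketch. It uses the <code>employee</code> table from the scripts at the end of this post; the alias names <code>emp</code> and <code>mgr</code> are only illustrative.</p>
<pre>select emp.emp_lastname as employee
,      mgr.emp_lastname as manager
from employee emp
,    employee mgr
where emp.emp_mgr_id = mgr.emp_id(+)
order by emp.emp_id;</pre>
<p>The outer join marker on the manager side keeps Noakes in the result with an empty manager column.</p>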
<h3>Pros of Alias Solution</h3>
<p>The primary advantage of this solution is that it is completely self-contained in the universe. No DBA or ETL work is required. That&#8217;s about it.</p>
<h3>Cons of Alias Solution</h3>
<p>There are several cons to this solution. It does not represent lateral relationships at all. I have to use outer joins in order to preserve those rows with missing keys (Noakes in this example). Both of these are important, but the most substantial drawback to this solution is that the depth is determined by the number of aliases that the universe designer creates. In the image shown above there is only one link: from manager to direct employee. How can I determine my indirect reports in one step? With only one alias, I can only report two levels of my hierarchy. How many can I report with this structure?</p>
<p><img src="/tips/recursive_data/part_03_alias_flat/multi_level_aliases.png" width="530" height="138" border="0" alt="screen shot of multi-level alias implementation in a BusinessObjects universe" title="Multi-level alias implementation in a BusinessObjects universe" /></p>
<p>With that structure I now have two outer joins, but I can report on three levels instead of just two.</p>
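<p>In SQL terms, a two-alias structure corresponds to something like this sketch, again written against the <code>employee</code> table from the scripts at the end of this post. The alias names are illustrative, and the top of the tree is found by looking for an empty <code>emp_mgr_id</code>.</p>
<pre>select lvl1.emp_lastname as level_1
,      lvl2.emp_lastname as level_2
,      lvl3.emp_lastname as level_3
from employee lvl1
,    employee lvl2
,    employee lvl3
where lvl1.emp_id = lvl2.emp_mgr_id(+)
and   lvl2.emp_id = lvl3.emp_mgr_id(+)
and   lvl1.emp_mgr_id is null;</pre>
<p>Every additional level needs another alias and another outer join, and anyone deeper than the third level is simply not returned.</p>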
<p>How many alias levels do I create? Generally when I have seen this solution used (or used it myself) we resort to asking how many levels are required and then creating some number above that. If I need five, I will create seven. If I need seven, I will create ten.</p>
<p>That means, of course, if I have created ten levels and all of a sudden we have twelve I have to update my universe. That&#8217;s not a problem (as long as I keep up with things) but it&#8217;s certainly not desirable.</p>
<h3>Alias Scorecard</h3>
<p>Here&#8217;s the scorecard for the alias solution for each of the four scenarios I outlined earlier.</p>
<p><img src="/tips/recursive_data/scorecard_alias.png" width="600" height="297" border="0" alt="alias scorecard for handling recursive data" title="Alias scorecard for handling recursive data challenges in a BusinessObjects universe" /></p>
<p>Aliases are the easiest solution to implement but they don&#8217;t score well. Let&#8217;s move on to the next solution.</p>
<h3>Flattened Structure &#8211; Single Table Columns</h3>
<p>The next solution involves running either a SQL script or some form of ETL. I need to take the recursive table relationship and flatten it out much like I did with aliases, but this time in the database itself. The net result is that I will take data going down in rows:</p>
<p><img src="/tips/recursive_data/part_03_alias_flat/table_rows.png" width="432" height="146" border="0" alt="data in tables is presented as rows" title="Data in relational tables consists of rows" /></p>
<p>and pivot it into columns in a table.</p>
<p><img src="/tips/recursive_data/part_03_alias_flat/table_columns.png" width="456" height="144" border="0" alt="data for a hierarchy can be pivoted into columns" title="Data in a hierarchy can be pivoted into columns" /></p>
<p>The net result is all of my recursion is done during the script process and I end up with one table that contains everything (or every person in my case) stored at their specific level in the hierarchy. It easily allows me to drill because it creates a very natural hierarchy.</p>
<h3>Pros of Flattened Table Solution</h3>
<p>It handles unbalanced hierarchies much better than aliases because missing lower nodes are simply NULL in the table. That&#8217;s fine. This solution can also handle ragged hierarchies with a proper &#8220;plug node&#8221; strategy. If I have a lower level value (Divisional Director) that reports directly to the president (top level) then level 2 (Vice President) will be empty. I need to fill something in so I can drill properly. More important, that plug node has to tell me what the path is or else I cannot drill up properly. Suppose I had a director named Smith who reported directly to Noakes. The first column in my table would include Noakes. The third column would include Smith. The second column (the missing value due to the raggedness of my data) would contain Smith VP Not Assigned or something like that.</p>
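<p>To make the plug node concrete, here is a hypothetical row for the Smith example, written against the <code>emp_flat</code> table from the scripts at the end of this post. Smith is not part of the sample data, and the plug text is shortened here so that it fits the <code>varchar2(20)</code> columns.</p>
<pre>insert into emp_flat (emp_lvl_1, emp_lvl_2, emp_lvl_3, emp_lvl_4)
values ('Noakes', 'Smith VP Unassigned', 'Smith', NULL);</pre>
<p>Because the plug value carries the name Smith, drilling up from level 3 lands on a level 2 value that belongs to that path alone rather than on a shared empty bucket.</p>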
<p>Flattened tables cannot handle lateral hierarchies at all because I can&#8217;t store two values in a single column.</p>
<h3>Cons of Flattened Table Solution</h3>
<p>As already mentioned, this solution cannot handle lateral hierarchies at all. My column names should reflect the position (node type) in the hierarchy. That&#8217;s not a problem unless my hierarchy levels change, in which case I might want to rename my columns.</p>
<p>But by far the most critical issue with this solution is the fact that it requires DBA or ETL work if the number of levels ever changes. Much like aliases, when I have seen this solution implemented I generally see extra columns at the end of the table just to allow for future expansion.</p>
<h3>Flattened Table Scorecard</h3>
<p>Here is my scorecard for the Flattened Table solution.</p>
<p><img src="/tips/recursive_data/scorecard_flat_columns.png" width="600" height="297" border="0" alt="flattened columns scorecard for handling recursive data" title="Flattened columns scorecard for handling recursive data challenges in a BusinessObjects universe" /></p>
<h3>Flattened Structure &#8211; Snowflake Tables</h3>
<p>One thing that I noticed about the data for the flattened structure is that I repeat a lot of values. For example, Noakes is the &#8220;level 1 mgr&#8221; for every person in the company. It might seem to be more efficient to use a structure like this:</p>
<p><img src="/tips/recursive_data/part_03_alias_flat/snowflake_structure.png" width="529" height="67" border="0" alt="screen shot of snowflake structure in a BusinessObjects universe" title="Snowflake structure for handling recursive data in a BusinessObjects universe" /></p>
<p>This would reduce my overall storage requirements because I would end up with a single row for the highest level table.</p>
<p>However, it also reintroduces the need for outer joins, which the initial flattened structure avoided. </p>
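<p>A query across the snowflake might look like the following sketch, which assumes the four <code>emp_level</code> tables created by the scripts at the end of this post.</p>
<pre>select l1.emp_lvl_1
,      l2.emp_lvl_2
,      l3.emp_lvl_3
,      l4.emp_lvl_4
from emp_level_01 l1
,    emp_level_02 l2
,    emp_level_03 l3
,    emp_level_04 l4
where l1.emp_id = l2.emp_mgr_id(+)
and   l2.emp_id = l3.emp_mgr_id(+)
and   l3.emp_id = l4.emp_mgr_id(+);</pre>
<p>Without the outer joins, any branch that stops before level 4 would drop out of the result.</p>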
<h3>Pros of Flattened Snowflake Solution</h3>
<p>Because the tables get smaller as I get further up the tree (ultimately to a single-row table in my simple example) my overall storage requirement should be smaller as well. If I only need the top one or two levels, my queries should be very efficient. Finally, I think it would be easier to maintain as well. If a new level appears, I add a new table to my chain with the proper restrictions on the ETL for proper table population. </p>
<h3>Cons of Flattened Snowflake Solution</h3>
<p>Each of the solutions defined so far suffers from some form of this issue: I have to define a table (or column) for every possible level of my hierarchy. If I do not know what the total number of levels will be, I can try to anticipate and create extra tables to support future expansion. But that is not the best solution. Because these tables are maintained in the database, I have to talk to my DBA or ETL team when changes are required. Because the tables are joined, I have to consider whether to use outer joins to preserve depth on unbalanced hierarchies. And finally, the &#8220;plug node&#8221; strategy I outlined earlier becomes a &#8220;plug row&#8221; strategy in this case, and that&#8217;s substantially more complicated.</p>
<h3>Flattened Snowflake Scorecard</h3>
<p>Here is the scorecard for the flattened snowflake solution. In my opinion, it&#8217;s a slightly worse solution than the flattened table solution simply because of the join issue and the plug row concern.</p>
<p><img src="/tips/recursive_data/scorecard_flat_snowflake.png" width="600" height="297" border="0" alt="snowflake scorecard for handling recursive data" title="Snowflake scorecard for handling recursive data challenges in a BusinessObjects universe" /></p>
<h3>Next Time</h3>
<p>The solutions covered in this post are the least complex and therefore offer the least flexibility. They are easy to set up; in the case of aliases the entire solution can be built within the universe designer application. All of the other solutions require some sort of database scripting. In the next post I will talk about the ancestor model and how we used it at a manufacturing client. It has some definite advantages, and it handles just about all of the different challenges I have outlined. I don&#8217;t have to worry about plug nodes, and it handles both ragged and unbalanced hierarchies quite well. However it has an impact on disk usage and it can&#8217;t be drilled using the native functionality provided by BusinessObjects. Do the pros outweigh the cons? Come back soon and see for yourself. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_cool.gif' alt='8-)' class='wp-smiley' /> </p>
<p><strong>Related Links</strong></p>
<ul>
<li><a href="http://www.dagira.com/2010/06/16/universe-models-for-recursive-data-part-i-introduction/">Universe Models for Recursive Data Part I: Introduction</a></li>
<li><a href="http://www.dagira.com/2010/06/25/universe-models-for-recursive-data-part-ii-design-challenges/">Universe Models for Recursive Data Part II: Design Challenges</a></li>
</ul>
<p><strong>Supplemental Material</strong><br />
Scripts to create and populate the basic HR table used for this presentation.</p>
<ul>
<li>Create table
<pre>create table employee
(emp_id number(5) not null
,emp_lastname varchar(20)
,emp_firstname varchar(15)
,emp_dob date
,emp_address varchar(40)
,emp_area_code varchar(7)
,emp_town varchar(15)
,emp_phone varchar(18)
,showroom_id number(4)
,emp_start date
,emp_mgr_id number(5)
,emp_sex varchar(1)
,job_id number(4));

alter table employee add constraint emp_pk primary key (emp_id);
create index emp_showroom on employee(showroom_id);
create index emp_mgr on employee(emp_mgr_id);
</pre>
</li>
<li>Populate table
<pre>
insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (101, 'Noakes', 'Nicholas', '12-MAR-48', '2356, Melrose Street', '30190', 'San Jose', '12-00-00-01', '01-JAN-91', NULL, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (102, 'Ferrerez', 'Ferdinand', '10-FEB-64', '25 Arcadia Avenue', '75897', 'Los Angeles', '22-55-56-32', '30-MAR-96', 101, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (103, 'Field', 'Felicity', '15-DEC-60', '12 Brasilia Street', '12014', 'Santa Barbara', '14-46-54-22', '26-MAR-95', 102, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (104, 'Fraser', 'Frank', '13-MAR-67', '45 Seaside Avenue', '75016', 'Los Angeles', '22-55-18-33', '13-DEC-91', 101, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (105, 'Snow', 'Sara', '03-OCT-65', 'Square Woodstock', '18000', 'San Jose', '14-34-34-30', '01-MAY-93', 101, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (106, 'Speed', 'Sonya', '03-DEC-70', '5, The Vale', '22000', 'San Jose', '14-32-39-43', '04-JUL-96', 105, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (107, 'Spencer', 'Steve', '01-NOV-64', 'Square Osaka', '33010', 'Los Angeles', '22-24-25-89', '16-APR-91', 105, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (108, 'Helen', 'Harrison', '01-AUG-66', 'Via Firenze', '38200', 'Los Angeles', '22-34-31-11', '13-MAY-94', 101, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (109, 'Thomas', 'Tom', '01-DEC-68', '11 Over Way', '24000', 'San Jose', '22-45-67-45', '20-DEC-95', 101, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (110, 'Thatcher', 'Terry', '03-OCT-50', 'Stars Parkway', '21000', 'San Jose', '12-11-11-09', '06-DEC-92', 109, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (111, 'Davis', 'Diana', '12-AUG-64', 'Rue Opera Sauvage', '92100', 'Los Angeles', '14-54-11-10', '22-SEP-92', 101, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (201, 'Pickworth', 'Paul', '12-FEB-51', '23 Las palmas road', '00316', 'New York', '12-24-26-44', '12-JAN-93', 101, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (202, 'Forest', 'Florence', '10-OCT-32', 'Rue des Lombards', '75100', 'New York', '22-54-11-10', '23-DEC-94', 201, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (203, 'Brown', 'Bella', '12-APR-59', 'Hollywood Blv', '36020', 'New York', '22-36-25-50', '03-FEB-92', 202, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (204, 'Porter', 'Pete', '15-NOV-57', 'Avd Torre De Embarra', '34100', 'New York', '14-44-11-66', '13-APR-92', 201, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (205, 'Irving', 'Ira', '12-FEB-64', '44 Beach avenue', '13000', 'New York', '12-56-55-20', '18-JUN-95', 204, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (206, 'Bailey', 'Ben', '12-JUN-57', '4 Palisades Drive', '75090', 'Long Island', '12-33-51-29', '01-DEC-90', 204, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (207, 'Duckworth', 'Dave', '09-SEP-66', 'Rue du grand temps', '75018', 'New York', '12-85-01-61', '04-NOV-93', 201, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (208, 'Ince', 'Ian', '10-AUG-53', 'Sunset Blvd', '31061', 'New York', '22-52-22-00', '04-DEC-95', 207, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (209, 'Hilary', 'Hibbs', '01-FEB-60', 'Sand Hill Road', '92800', 'New York', '12-54-11-10', '08-JUN-95', 202, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (301, 'Dagmar', 'Davinda', '12-APR-58', '12, The Crescent', 'SL1 1HG', 'Slough', '01628-764234', '24-JUN-95', 101, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (302, 'Presley', 'Percy', '30-OCT-62', '1 Jubilee Close', 'SL5 23F', 'Maidenhead', '01628-834582', '15-JUL-95', 301, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (303, 'Perry', 'Philippa', '24-FEB-71', '23 Rice Hill', 'SL3 12S', 'Maidenhead', '01628-567231', '28-SEP-96', 302, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (304, 'Hubert', 'Henri', '13-DEC-69', '5 Grand Lane', 'SL3 12S', 'Maidenhead', '01628-243535', '17-APR-96', 302, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (305, 'Adamson', 'Anita', '12-OCT-69', '24 Loose Lane', 'SL4 23D', 'Cookham', '01628-782364', '15-FEB-96', 301, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (306, 'Beaver', 'Bertie', '12-MAR-72', '223 Grange Hill', 'SL2 67E', 'Windsor', '01628-187632', '13-JAN-96', 305, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (307, 'Motson', 'Mervin', '22-DEC-74', '67 Blows Down', 'SL5 45G', 'Cookham', '01628-198371', '17-JUN-96', 305, 'M');
</pre>
</li>
</ul>
<p>Scripts to create and populate the flattened version of the HR table, Oracle syntax</p>
<ul>
<li>Create flattened table
<pre>create table emp_flat
(emp_lvl_1 varchar2(20)
,emp_lvl_2 varchar2(20)
,emp_lvl_3 varchar2(20)
,emp_lvl_4 varchar2(20)
);</pre>
</li>
<li>Populate flattened table.<br />
Only four levels are supported.<br />
Starting point (Noakes) is hard-coded.
<pre>insert into emp_flat (emp_lvl_1, emp_lvl_2, emp_lvl_3, emp_lvl_4)
select a.emp_lastname
,      b.emp_lastname
,      c.emp_lastname
,      d.emp_lastname
from employee a
,    employee b
,    employee c
,    employee d
where a.emp_id = b.emp_mgr_id(+)
and b.emp_id = c.emp_mgr_id(+)
and c.emp_id = d.emp_mgr_id(+)
and a.emp_id = 101;</pre>
</li>
<li>Create Snowflake Tables
<pre>create table emp_level_01
(emp_id number(5)
,emp_lvl_1 varchar2(20));

create table emp_level_02
(emp_id number(5)
,emp_mgr_id number(5)
,emp_lvl_2 varchar2(20));

create table emp_level_03
(emp_id number(5)
,emp_mgr_id number(5)
,emp_lvl_3 varchar2(20));

create table emp_level_04
(emp_id number(5)
,emp_mgr_id number(5)
,emp_lvl_4 varchar2(20));</pre>
</li>
<li>Populate snowflake tables<br />
Only four levels are built, each starting from the prior table.<br />
Starting point (Noakes) is hard-coded.
<pre>insert into emp_level_01 (emp_id, emp_lvl_1)
select emp_id, emp_lastname
from employee
where emp_id = 101;

insert into emp_level_02 (emp_id, emp_mgr_id, emp_lvl_2)
select e.emp_id, e.emp_mgr_id, e.emp_lastname
from employee e, emp_level_01 e1
where e.emp_mgr_id = e1.emp_id;

insert into emp_level_03 (emp_id, emp_mgr_id, emp_lvl_3)
select e.emp_id, e.emp_mgr_id, e.emp_lastname
from employee e, emp_level_02 e2
where e.emp_mgr_id = e2.emp_id;

insert into emp_level_04 (emp_id, emp_mgr_id, emp_lvl_4)
select e.emp_id, e.emp_mgr_id, e.emp_lastname
from employee e, emp_level_03 e3
where e.emp_mgr_id = e3.emp_id;
</pre>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/07/02/universe-models-for-recursive-data-part-iii-alias-versus-flattened/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Universe Models For Recursive Data Part II: Design Challenges</title>
		<link>http://www.dagira.com/2010/06/25/universe-models-for-recursive-data-part-ii-design-challenges/</link>
		<comments>http://www.dagira.com/2010/06/25/universe-models-for-recursive-data-part-ii-design-challenges/#comments</comments>
		<pubDate>Sat, 26 Jun 2010 00:38:35 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[2009 GBN - Dallas]]></category>
		<category><![CDATA[2010 Mastering ... Melbourne]]></category>
		<category><![CDATA[Recursive Data]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=283</guid>
		<description><![CDATA[This is the second of several posts that will review my presentation “Universe Models For Recursive Data” which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. As with my other presentations there is a PDF file [...]]]></description>
			<content:encoded><![CDATA[<p>This is the second of several posts that will review my presentation “Universe Models For Recursive Data” which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. As with my other presentations there is a PDF file that can be downloaded from my <a href="http://www.dagira.com/conference-presentations/">conference presentations page</a>. The first post introduced the concepts of recursive (as opposed to hierarchical) data and provided a couple of examples. In this post I will review some of the different design challenges that I have seen in working with recursive data. </p>
<p>I decided to identify and cover four different examples of recursive data configurations. These included Clean, Unbalanced, Ragged, and Lateral. As I mentioned in the first post, I am going to use some basic human resources (HR) data for my examples. For this post, in order to show samples of each of the four challenges, I am going to represent my recursive data using a tree. The branches of the tree show the relationships between people. The nodes of the tree contain the information about each person. The data might include their name, hire date, and position (title) within the company. In order to properly interact with my recursive data I have to be able to work with both types of information: relationships and node data. If you are not sure what I mean, please continue reading; this will make more sense later on.</p>
<p><em>This post will cover slides 14 through 21 from the presentation and will describe each of the different recursive challenges that I identified.</em> <span id="more-283"></span></p>
<h3>Clean Hierarchy</h3>
<p>In my first example everything is very clean. Each branch of the tree has the same depth. Each branch follows the same path. There are no real challenges encountered in this hierarchy, pictured below.</p>
<p><img src="/tips/recursive_data/part_02_design_challenges/tree_clean.png" width="537" height="320" border="0" alt="image of clean recursive hierarchy" title="Clean recursive hierarchy" /></p>
<p>Imagine that the top of the tree is the company president. The second level (the &#8220;B&#8221; nodes) represents vice presidents, and the third level (&#8220;C&#8221; nodes) represents divisional directors. When a hierarchy definition is very rigorous this is the type of tree I expect. For a very simple example let me suggest a product hierarchy instead of an HR chart for the moment. A product hierarchy for a food company might include a Brand Owner, the Brand, the Size, and finally the Flavor. The brand owner could be Beverages-R-Us, the brand could be Super Sports Drinks, the size is two liter bottle, and finally the flavor is Orange. Every product in the system is guaranteed to have all four of these attributes assigned, and they will all be in that exact order. </p>
<p>On the other hand, a human resources hierarchy is rarely as clean. Let me move on to some more interesting examples.</p>
<h3>Unbalanced Hierarchy</h3>
<p>An unbalanced hierarchy is one where the nodes are at inconsistent depths. Please review the tree shown below. </p>
<p><img src="/tips/recursive_data/part_02_design_challenges/tree_unbalanced.png" width="455" height="320" border="0" alt="image of an unbalanced recursive hierarchy" title="Unbalanced recursive hierarchy with nodes at inconsistent depths" /></p>
<p>In the example shown above, there is one node (B1 in this case) that does not have any children while the rest of the nodes at that level (B2) do. If the A node is the company president, and the B nodes are vice presidents, it is entirely possible to have a position (perhaps &#8220;VP of Special Projects&#8221;) that does not have any additional people that report up to him or her. In that case the tree stops at the VP level and does not go down to the Divisional Director position.</p>
<p>Why is this a challenge? As will be seen later, one of the possible solutions to a recursive data scenario is to pivot (flatten) the data into different columns. What happens to the missing nodes in this case?</p>
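<p>One way to see whether a data set has this problem, sketched here in Oracle syntax, is to count the leaf nodes at each depth. The query assumes an <code>employee</code> table with <code>emp_id</code> and <code>emp_mgr_id</code> columns, where the person at the top of the tree has an empty <code>emp_mgr_id</code>.</p>
<pre>select level as depth
,      count(*) as leaf_nodes
from employee
where connect_by_isleaf = 1
start with emp_mgr_id is null
connect by prior emp_id = emp_mgr_id
group by level
order by level;</pre>
<p>If leaf nodes show up at more than one depth, the branches of the tree do not all have the same length.</p>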
<h3>Ragged Hierarchy</h3>
<p>In the last example I suggested that there could be a VP of the company that does not have any direct employees. In the case of a Ragged hierarchy it&#8217;s slightly different. In this case I might see a Divisional Director who is reporting straight up to the company president without going through a VP.</p>
<p><img src="/tips/recursive_data/part_02_design_challenges/tree_ragged.png" width="455" height="320" border="0" alt="image of a ragged recursive hierarchy" title="Ragged recursive hierarchy with nodes of inconsistent paths" /></p>
<p>Note that in the image above I am showing both an unbalanced node (B1) and a ragged node (C2). Let me focus on C2 for a moment. As I already mentioned, there is a relationship from that director position straight up to the president. It does not go through a vice president position. Why is this a challenge? Remember that earlier I mentioned there are two parts that I need to account for: the relationship and the position or node type. In this case the relationship only goes one step, but descends two levels (from president to director). I need to be able to represent both parts properly in whatever data model I come up with.</p>
<h3>Lateral Hierarchy</h3>
<p>If you have spent any time reviewing company organization charts you may have seen this type of relationship before: I am calling it a lateral (sideways) relationship.</p>
<p><img src="/tips/recursive_data/part_02_design_challenges/tree_lateral.png" width="537" height="320" border="0" alt="image of a recursive hierarchy with lateral relationships" title="Recursive hierarchy with lateral relationships" /></p>
<p>It&#8217;s not uncommon to see a lateral relationship from one director to another director (C2 reporting to C1 in this example). This is one of the biggest challenges to most of the design ideas I will be sharing in my next post, because there are two things (people) occupying the same space (node type) on the tree.</p>
<h3>Merge / Diverge</h3>
<p>As I mentioned toward the beginning of the post, some scenarios are inherently cleaner than others because the relationships all have to exist. Unfortunately, it is quite common to see a combination of issues. I have even seen challenges where a hierarchy includes a merge / diverge relationship such as this:</p>
<p><img src="/tips/recursive_data/part_02_design_challenges/tree_merge_diverge.png" width="455" height="443" border="0" alt="image of a recursive hierarchy with merge diverge relationships" title="Recursive hierarchy with relationships that merge and then diverge" /></p>
<p>SAP and other ERP vendors generally allow this sort of hierarchy to be built in order to provide the maximum flexibility to the client company. I have never tried to implement this in BusinessObjects because it simply does not work. There is no clear drill path. Suppose I drill from node B2 to C3, and then from C3 to D2. Now when I drill up, which path do I take? I can drill from D2 up to C3, and then from C3 I can drill up to either node A1 or B2. It&#8217;s ambiguous, and therefore our project team decided that we would not attempt to handle this at all. We instituted a business rule (an exception) that would kick out any hierarchy that included this sort of path.</p>
<p><em>This particular example was dropped from the presentation in the interest of time but I wanted to mention it here.</em></p>
<h3>Combinations</h3>
<p>Even without the merge / diverge issue, there are still plenty of challenges. For our project, a typical tree was both ragged and unbalanced. That meant that the solutions we discussed had to be able to handle both. We also had a number of lateral relationships that we needed to address. Our users wanted to be able to enter the tree by node type and drill by level. They wanted to see the entire tree presented as part of a prompt. And they wanted to be able to multi-select from those prompts&#8230; for any node at any level.</p>
<h3>Next Time</h3>
<p>Which solutions work the best? Do any solutions work for all of these different scenarios? My next post in this series will review each of the four solutions I outlined in my presentation and present a scorecard for each.</p>
<p><strong>Related Links</strong></p>
<ul>
<li><a href="http://www.dagira.com/2010/06/16/universe-models-for-recursive-data-part-i-introduction/">Universe Models for Recursive Data Part I: Introduction</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/06/25/universe-models-for-recursive-data-part-ii-design-challenges/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Universe Models For Recursive Data Part I: Introduction</title>
		<link>http://www.dagira.com/2010/06/16/universe-models-for-recursive-data-part-i-introduction/</link>
		<comments>http://www.dagira.com/2010/06/16/universe-models-for-recursive-data-part-i-introduction/#comments</comments>
		<pubDate>Wed, 16 Jun 2010 18:38:36 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[2009 GBN - Dallas]]></category>
		<category><![CDATA[2010 Mastering ... Melbourne]]></category>
		<category><![CDATA[Recursive Data]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=282</guid>
		<description><![CDATA[This is the first of several posts that will review my presentation &#8220;Universe Models For Recursive Data&#8221; which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. After presenting it three times it seemed like an appropriate [...]]]></description>
			<content:encoded><![CDATA[<p>This is the first of several posts that will review my presentation &#8220;Universe Models For Recursive Data&#8221; which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. After presenting it three times it seemed like an appropriate time to (finally) get started writing up the blog posts. As with my other presentations there is a PDF file that can be downloaded from my <a href="http://www.dagira.com/conference-presentations/">conference presentations page</a>.</p>
<p><em>This post will cover slides 6 through 13 as a basic introduction of recursive data and challenges presented to universe designers.</em></p>
<h3>Defining Recursive Data</h3>
<p>Sometimes there is confusion about the distinction between hierarchical and recursive data. Hierarchical data does not present a big challenge for BusinessObjects. It can be something related to time (Year, Quarter, Month, Day), geography (Country, Region, State, City), or something more specific like an accounting structure (Business Unit, Account, Sub-Account). What makes this hierarchical structure work easily is that each element is stored in a different place. It could be in a different column in the same table (flattened) or even in different tables (snowflake). As long as I can drill from one column to another in the hierarchy everything works fine.</p>
<p>Self-referencing or recursive data may initially look like a hierarchy. The key difference is that all of the elements are stored in the same place. There are keys that relate one row in a table back to a different row in the same table. That&#8217;s how recursive data is different from hierarchical data.</p>
<p>Why is recursion a problem for BusinessObjects? The language used &#8220;behind the curtain&#8221; is SQL, and an ordinary SQL query has no built-in way to keep following a relationship for an unknown number of levels. Some database vendors offer extensions (for example the CONNECT BY PRIOR structure in Oracle) but these are not used by BusinessObjects.</p>
<p>How common is recursive data? It is certainly not unusual. Consider any of the following:</p>
<ul>
<li>Company organizational structure<br />
Object levels: President &#8211; Vice President &#8211; Director<br />
Object type: Person</li>
<li>Inventory BOM (Bill of Materials)<br />
Object levels: Product &#8211; Assembly &#8211; Sub-Assembly &#8211; Component<br />
Object type: Inventory item</li>
<li>Project Management<br />
Object levels: Project &#8211; Task &#8211; Sub-Task<br />
Object type: Project entry</li>
<li>Multi-Level Marketing (MLM)<br />
Object levels: Founder &#8211; Recruit &#8211; Recruit Level 2<br />
Object type: Person</li>
</ul>
<p>In each of the above examples the type of object (or node) is the same at any level. For example, a company organization chart is made up of people. Some people are at different levels, and there are therefore relationships from one person to another. In order to show all of the relationships from the top of the company to the bottom (or the bottom to the top) I have to keep going back to the same table. That is recursion.</p>
<p>Because it&#8217;s easy to think about a company organizational structure I used that example for the rest of the presentation. </p>
<p><em>Note: The Motors database is used in the standard Universe Designer training course and will not be presented in its entirety in the download package for this presentation for copyright reasons. However, I will be providing the standard HR table and all of the modified versions used in this presentation.</em><span id="more-282"></span></p>
<h3>Example of Recursive Data Using Prestige Motors HR</h3>
<p>A picture will help at this point. Here is a screen shot from the Prestige Motors HR universe that I built for this presentation. Notice that there are two tables in the picture, but one is an alias of the other. In other words, I am really using the same table twice.</p>
<p><img src="/tips/recursive_data/part_01_recursion_definition/hr_relationships.png" border="0" width="398" height="371" alt="screen shot of recursive relationship in a BusinessObjects universe" title="Example of a recursive relationship in a BusinessObjects universe" /></p>
<p>The table on the left is the Employees table. I have aliased the table and called it Manager. The two tables are joined using the link from EMPLOYEE.EMP_MGR_ID to Manager.EMP_ID. Since this is really the same table twice, this join defines the relationship from any particular person to their immediate manager. It&#8217;s a recursive relationship from a person to a person.</p>
<p>Notice that in this case I have defined the join as an outer (optional) join. That&#8217;s because the top person in the company does not have a manager, and an inner join would drop that person from the results. I want to ensure that I return every person and their manager&#8230; even if that person does not have a manager. Here is a sample of some of the data to help show why this is important.</p>
<p><img src="/tips/recursive_data/part_01_recursion_definition/pm_data.png" border="0" width="286" height="250" alt="Sample data from HR table" title="Sample data from the Prestige Motors BusinessObjects universe showing recursive data" /></p>
<p>I can review the relationships manually if I want. I can look at the data (shown above) and determine that Pickworth works for Noakes. Davis and Ferrerez also work for Noakes. How am I making that determination? Each of those three folks has a manager ID of 101, and 101 is the employee id for Noakes.</p>
<p>Who does Noakes work for? The EMP_MGR_ID column is blank (null) for Noakes, which implies that he is at the top of the company organization chart.</p>
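<p>In SQL terms, the alias structure works out to something like the sketch below. The table and key columns (EMPLOYEE, EMP_ID, EMP_MGR_ID) come from the universe structure shown above. EMP_LASTNAME is an assumed column name used only for illustration, and this is not the exact SQL that BusinessObjects generates.</p>
<p><code>-- Manager is an alias of the same EMPLOYEE table<br />
select employee.emp_lastname as employee_name,<br />
manager.emp_lastname as manager_name<br />
from employee<br />
left outer join employee manager<br />
on employee.emp_mgr_id = manager.emp_id</code></p>
<p>Because of the outer join, Noakes still comes back as a row with a null manager name. With an inner join that row would be dropped.</p>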
<p>Another way to see where people fall in the organization chart is to look at their level. Here is output from a report that I eventually will want to generate from my recursive data. It is shown in the format of a tree, with each person showing up as a node on the tree.</p>
<p><img src="/tips/recursive_data/part_01_recursion_definition/hr_tree.png" border="0" width="440" height="566" alt="Tree output from HR database table" title="Tree structured output from the Prestige Motors BusinessObjects universe showing recursive data" /></p>
<p>Noakes is at level 1. Davis, Ferrerez, and Pickworth are all at level 2. But the tree does not stop there. I have employees at level 3 and level 4 as well. </p>
<h3>Typical Recursion Questions</h3>
<p>This brings me to the set of questions that I need to be able to answer with my recursive data. I need to know:</p>
<p>Who do I work for?<br />
Who works for me?<br />
Who works at my same level and shares the same manager?<br />
Who is my manager&#8217;s manager? My manager&#8217;s manager&#8217;s manager?<br />
What is the total salary of my direct reports (people who work directly for me)?<br />
What is the total salary of my indirect reports (people who work for people who work for me)?</p>
<p>I am sure there are many more questions but these should serve as a starting point. Some of the questions only require one level of the hierarchy (who works for me, or who do I work for). Those are simple enough to answer, and in fact can be answered with the simple alias structure already shown in this post. But in order to traverse the tree for multiple levels I need a solution that is a bit more robust.</p>
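<p>To make the one-level versus multi-level distinction concrete, here is a hedged sketch of two of those questions. It assumes a hypothetical EMP_SALARY column and a hypothetical :my_emp_id placeholder, and it uses employee 101 (Noakes in the sample data) as the manager. Neither query is taken from the presentation.</p>
<p><code>-- total salary of the direct reports of employee 101: one level, no extra alias needed<br />
select sum(employee.emp_salary)<br />
from employee<br />
where employee.emp_mgr_id = 101</code></p>
<p><code>-- the manager of my manager: every additional level needs another alias of the same table<br />
select manager_2.emp_id<br />
from employee<br />
join employee manager_1 on employee.emp_mgr_id = manager_1.emp_id<br />
join employee manager_2 on manager_1.emp_mgr_id = manager_2.emp_id<br />
where employee.emp_id = :my_emp_id</code></p>
<p>The second query only works for a fixed number of levels. Each extra level means one more alias and one more join, which is awkward when the depth of the tree is not known in advance.</p>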
<h3>Next Time</h3>
<p>In the next post of this series I want to talk about some of the different challenges encountered when working with recursive data. Once I define the challenges I will be in a position to start talking about solutions. As a preview, here are the four types of hierarchies I will be talking about:</p>
<ul>
<li>Clean &#8211; a hierarchy with clean data, consistent node depths, and consistent node paths</li>
<li>Unbalanced &#8211; a hierarchy with inconsistent node depths</li>
<li>Ragged &#8211; a hierarchy with inconsistent node paths</li>
<li>Lateral &#8211; a hierarchy with sideways node paths</li>
</ul>
<p>If it is not clear what some of those mean, don&#8217;t be too concerned; I will be defining each with examples in the next post.</p>
<p>Finally, here is a preview of the various solutions I will talk about:</p>
<ul>
<li>Universe aliases</li>
<li>Flattened structures (columns or snowflake tables)</li>
<li>Ancestor / Descendant model</li>
<li>Depth first tree traversal</li>
</ul>
<p>And a few that I won&#8217;t:</p>
<ul>
<li>Oracle CONNECT BY PRIOR</li>
<li>Stored procedures</li>
</ul>
<p>Part II of this series will talk in more detail about each of the recursive challenges. After I detail the different challenges the next post will talk about the solutions. My plans for the final post for this series are to review the impact of each solution on the native drilling functionality and then to wrap things up.</p>
<p><strong>Related Links</strong></p>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Bill_of_materials">Wikipedia on Inventory BOM</a> in case you are unfamiliar with the concept of inventory data</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/06/16/universe-models-for-recursive-data-part-i-introduction/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Everything About Shortcut Joins</title>
		<link>http://www.dagira.com/2010/05/27/everything-about-shortcut-joins/</link>
		<comments>http://www.dagira.com/2010/05/27/everything-about-shortcut-joins/#comments</comments>
		<pubDate>Thu, 27 May 2010 11:30:44 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Join Techniques]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=273</guid>
		<description><![CDATA[There have been a number of posts recently in the Semantic Layer forum on BOB about shortcut joins. When will they be used? How many can be used? Why won’t this particular shortcut get used? Do I have to add shortcuts to contexts? Lots of questions.
I am going to try to clear up a couple [...]]]></description>
			<content:encoded><![CDATA[<p>There have been a number of posts recently in the Semantic Layer forum on BOB about shortcut joins. When will they be used? How many can be used? Why won’t this particular shortcut get used? Do I have to add shortcuts to contexts? Lots of questions.</p>
<p>I am going to try to clear up a couple of those questions now. First here is a summary of everything I need to know about shortcuts:</p>
<ul>
<li>Shortcut joins do not provide an alternate path.</li>
<li>Shortcut joins do provide a shorter path.</li>
</ul>
<p>By the end of this post I hope that the reader will understand the difference between those two statements. There are two rules for how and when a shortcut will be applied:</p>
<ul>
<li>A shortcut join will only be used if it eliminates tables from the query.</li>
<li>A shortcut join is applied after the SQL has been generated (meaning after a context selection has been made, if required).</li>
</ul>
<p>I will talk about these two items as well. But first, how do I create a shortcut join in my universe? <span id="more-273"></span></p>
<h3>Creating a Shortcut Join</h3>
<p>Creating a shortcut join is quite simple. All I have to do is double-click the particular join and mark the shortcut attribute.</p>
<p><img src="/tips/shortcut_joins/setting_shortcut.jpg" width="512" height="483" border="0" alt="screenshot of setting up a shortcut join" title="Setting up a shortcut join in a Business Objects universe" /></p>
<p>A shortcut join will appear in my universe structure as a dotted rather than a solid line, as shown here.</p>
<p><img src="/tips/shortcut_joins/shortcut_structure.jpg" width="409" height="223" border="0" alt="screenshot of a shortcut join" title="A Shortcut join in a Business Objects universe" /></p>
<p>The process used to create a shortcut join is quite simple. But was it appropriate to convert the join as shown above into a shortcut? Will it be used?</p>
<h3>Shortcut Path</h3>
<p>The sample shown above is from the Prestige Motors database. There is a direct relationship from the Country table to the Region table, and also from the Region table to the Client table. However there is also a direct relationship from the Country table to the Client table, as the COUNTRY_ID column has been denormalized into the Client record. (If you have never seen this database, a client is a customer.) The question at this point becomes, “Is this a shorter path or an alternate path?”</p>
<p>How do I tell the difference?</p>
<p>I believe the answer is simple. If I get the same result from both join paths then it’s a shortcut. If I get a different answer then it’s an alternate path.</p>
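<p>Here is a sketch of that test using the Country, Region, and Client tables. The COUNTRY_ID column on the Client table is described above. The Region join on COUNTRY_ID and the REGION_ID, COUNTRY_NAME, and CLIENT_NAME columns are assumptions made for illustration only.</p>
<p><code>-- longer path: Country to Region to Client<br />
select country.country_name, client.client_name<br />
from country<br />
join region on region.country_id = country.country_id<br />
join client on client.region_id = region.region_id</code></p>
<p><code>-- candidate shortcut: Country straight to Client<br />
select country.country_name, client.client_name<br />
from country<br />
join client on client.country_id = country.country_id</code></p>
<p>If the denormalized COUNTRY_ID on each client row always matches the country of that client&#8217;s region, both queries return the same rows and the second join qualifies as a shortcut.</p>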
<h3>Alternate Path</h3>
<p>In the Prestige Motors database the COUNTRY_ID exists in the Showroom table as well as the Client and Region tables. I could create joins like this:</p>
<p><img src="/tips/shortcut_joins/showroom_join.jpg" width="601" height="367" border="0" alt="screenshot of the showroom join in the motors universe" title="Showroom joins from the Prestige Motors universe" /></p>
<p>In this case I have a join from Country to Client. I also have a join from Country to Showroom. Because of the relationship with the Sales table I now have a loop in my structure. By changing one of the two Country joins to a shortcut I can avoid the loop, like this:</p>
<p><img src="/tips/shortcut_joins/showroom_shortcut.jpg" width="601" height="367" border="0" alt="screenshot of the shortcut join to the showroom table" title="A Shortcut join from Country to Showroom" /></p>
<p>What has happened here? I have eliminated the loop from my structure and solved that problem, right? Perhaps, but it is the wrong solution. In this case, the shortcut is an alternate path rather than a shorter path. I can tell because (as I mentioned earlier) I will not get the same results from the two queries.</p>
<p>Suppose I want to combine Country, Showroom, and Sales. The longer path looks like this:</p>
<p><img src="/tips/shortcut_joins/join_path_1.jpg" width="601" height="367" border="0" alt="screenshot of the longer join path" title="Standard join path from Country to Showroom via the Sales table in the Prestige Motors universe" /></p>
<p>When I execute a query using this set of joins, I will get a list of showrooms that have had sales to customers, and I will get the country where the customer is located. Next I will run a query against this shorter path:</p>
<p><img src="/tips/shortcut_joins/join_path_2.jpg" width="601" height="367" border="0" alt="screenshot of the shorter join path" title="Shortcut join path from Country to Showroom in the Prestige Motors universe" /></p>
<p>When I execute a query using this path, I will get a completely different result set. I will get a list of showrooms, their sales, and the country where they are located. The customer tables never come into play, so the country now describes the showroom rather than the customer. This is my indication that I do not have a proper shortcut join definition.</p>
<p>And interestingly enough, Web Intelligence will never use the shortcut for the query outlined above! It is smart enough to realize that the shortcut is not properly defined as it does not truly present a shorter path. It is an alternate path, and that’s not a valid application of a shortcut.</p>
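<p>Sketched as SQL, the two paths look something like this. The sketch assumes the longer path runs from Country to Client to Sales to Showroom. Only COUNTRY_ID is taken from the discussion above. The other column names (CLIENT_ID, SHOWROOM_ID, COUNTRY_NAME, SHOWROOM_NAME) are hypothetical.</p>
<p><code>-- longer path: the country comes from the customer<br />
select showroom.showroom_name, country.country_name<br />
from country<br />
join client on client.country_id = country.country_id<br />
join sales on sales.client_id = client.client_id<br />
join showroom on showroom.showroom_id = sales.showroom_id</code></p>
<p><code>-- so-called shortcut: the country comes from the showroom<br />
select showroom.showroom_name, country.country_name<br />
from country<br />
join showroom on showroom.country_id = country.country_id<br />
join sales on sales.showroom_id = showroom.showroom_id</code></p>
<p>A showroom that sells to a customer in another country is listed under the customer&#8217;s country by the first query and under its own country by the second. The two joins answer different questions, which is what makes this an alternate path.</p>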
<p>As a brief aside: How should this particular loop be resolved? Obviously a shortcut is not the answer.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/05/27/everything-about-shortcut-joins/feed/</wfw:commentRss>
		<slash:comments>31</slash:comments>
		</item>
		<item>
		<title>Fixing Report Path For Adobe PDF Viewers</title>
		<link>http://www.dagira.com/2010/03/23/fixing-report-path-for-adobe-pdf-viewers/</link>
		<comments>http://www.dagira.com/2010/03/23/fixing-report-path-for-adobe-pdf-viewers/#comments</comments>
		<pubDate>Tue, 23 Mar 2010 16:01:42 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=248</guid>
		<description><![CDATA[We are using the OpenDocument() function to &#8220;drill&#8221; from one document to another. In order to make report maintenance easier I have created some objects in the universe that contain the proper syntax for the URL required to access this function, as well as one that contains the report path. This way even if we [...]]]></description>
			<content:encoded><![CDATA[<p>We are using the <code>OpenDocument()</code> function to &#8220;drill&#8221; from one document to another. In order to make report maintenance easier I have created some objects in the universe that contain the proper syntax for the URL required to access this function, as well as one that contains the report path. This way even if we change our folder names or structure I can change the universe and do not have to update every report on the project. This has worked very well for us.</p>
<p>Until my current project.</p>
<p>On this project the primary distribution channel was PDF sent via email. Our users said that the links were not working. And of course every time I tested by logging in to Infoview the links worked just fine. After further investigation by another team member, it seems that our Report Path (in the format <code>[Folder],[Sub Folder]</code>) was being truncated at the comma. As a result, the <code>OpenDocument()</code> function was looking for the reports in <code>[Folder]</code> and ignoring the full path. That was a bit of a problem. <span id="more-248"></span></p>
<p>As mentioned, our Report Path is stored as an object in the universe. To avoid extra work on each report I had encoded the space in &#8220;Sub Folder&#8221; as Sub%20Folder when I created the object. Without this encoding the URL would not function as required. The other characters such as [ and ] and the , between the folder names were all presented as-is with no encoding. For one project that distributed and viewed their files purely through Infoview this worked great. But the URL was being truncated when sent to PDF. One small difference was that our folder structure for this project used _ instead of a space in the folder names, but it was all working in Infoview, so it must be okay, right?</p>
<p>Wrong. But it wasn&#8217;t the _ that was the problem.</p>
<p>Initially we thought that there was a problem with the PDF generation and investigated that path. However, the solution was ultimately discovered by <a href="http://www.linkedin.com/pub/brian-durning/4/a79/aa3">Brian Durning</a>. It seems that while Infoview was okay with a compound path <code>[Folder],[Sub_Folder]</code> for Adobe the comma had to be encoded in order to work. If not, the comma was taken as part of the data and Adobe stopped looking for additional path information at that point.</p>
<p>I updated the Report Path object from this:</p>
<p><code>'[Folder],[Sub_Folder]'</code></p>
<p>to this:</p>
<p><code>'[Folder]%2C[Sub%5FFolder]'</code></p>
<p>As an aside, these objects do not parse since they are just text strings. %2C is the URL-encoded form of a comma (2C being its hexadecimal character code) and %5F is the same for the _ character. Once this fix was in place all of our Adobe links worked. And if you are curious, our Infoview links continued to work just fine.</p>
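<p>I typed the encoded string by hand, but the same result could be derived from the original path with nested string replacements. This is only a sketch, assuming a database that offers a REPLACE(string, search, replacement) function and allows a select without a from clause. Oracle, for example, would need <code>from dual</code> added, and the function name may differ on other platforms.</p>
<p><code>-- encode the comma first, then the underscore<br />
select replace(replace('[Folder],[Sub_Folder]', ',', '%2C'), '_', '%5F')</code></p>
<p>The inner call turns the comma into %2C and the outer call turns the underscore into %5F, which gives the same string as the updated Report Path object shown above.</p>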
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/03/23/fixing-report-path-for-adobe-pdf-viewers/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>SORT_BY_NO=NO? Very Confusing&#8230;</title>
		<link>http://www.dagira.com/2010/03/04/sort_by_nono-very-confusing/</link>
		<comments>http://www.dagira.com/2010/03/04/sort_by_nono-very-confusing/#comments</comments>
		<pubDate>Thu, 04 Mar 2010 18:33:25 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=245</guid>
		<description><![CDATA[This has to be the parameter with the worst. Name. Ever. But let me start at the beginning.
Some databases require you to use actual column names in an ORDER BY clause. Like this:
select first_name, last_name, phone
from employee
order by last_name, first_name
Other databases let you take a shorter approach and sort by the position of the column [...]]]></description>
			<content:encoded><![CDATA[<p>This has to be the parameter with the worst. Name. Ever. But let me start at the beginning.</p>
<p>Some databases require you to use actual column names in an ORDER BY clause. Like this:</p>
<p><code>select first_name, last_name, phone<br />
from employee<br />
order by last_name, first_name</code></p>
<p>Other databases let you take a shorter approach and sort by the position of the column in the select clause, like this:</p>
<p><code>select first_name, last_name, phone<br />
from employee<br />
order by 2, 1</code></p>
<p>To be honest, I don&#8217;t like the shortcut. I would rather see explicit column names in my order by because that way I know exactly what is being sorted without having to refer back to the select clause. Another advantage is that if the objects in my select ever change, my order by is not affected.</p>
<p>There is a parameter found in the .PRM file for each database named SORT_BY_NO. When you see that name, what do you think it is? Every time I see it I assume that it is used to determine whether the SQL will contain numbers in the ORDER BY clause like <code>order by 2, 1</code> instead of <code>order by last_name, first_name</code>. But that&#8217;s not what it does at all. Instead of doing what I described above, this parameter is used to determine if a query can be sorted by a column that does not appear in the select clause. That makes sense, doesn&#8217;t it? <img src='http://www.dagira.com/wp-includes/images/smilies/icon_rolleyes.gif' alt=':roll:' class='wp-smiley' />  It should be called SORT_BY_IN_SELECT or something. But it&#8217;s not, and here&#8217;s how it works.<span id="more-245"></span></p>
<h3>Sorting By &#8220;Something Else&#8221;</h3>
<p>I have a period calendar table where the period names are P01, P02, and so on. The years are fiscal years 2009, 2010, 2011, and on from there as you would expect. The user expects to pick a period and year combination. However, they want to see it in that order&#8230; period, and then year. As a designer I can easily combine the two values together in the format PPP YYYY with a concatenation operation. But then the LOV displays in alphabetical rather than chronological order. So I see this:</p>
<p>P01 2008<br />
P01 2009<br />
P01 2010<br />
P01 2011<br />
P02 2008<br />
P02 2009<br />
&#8230;</p>
<p>Instead of this:</p>
<p>P01 2008<br />
P02 2008<br />
P03 2008<br />
&#8230;<br />
P01 2009<br />
P02 2009<br />
P03 2009<br />
&#8230;</p>
<p>This is not what the user expects or requires, but is easily solved by editing the LOV query and adding a custom ORDER BY clause like this:</p>
<p><code>select period || ' ' || year<br />
from fiscal_calendar<br />
order by period_start_date</code></p>
<p>Sorting by the period start date would cause the alphabetical list to be sorted chronologically instead. However, doing any sort of manual editing &#8211; even in a simple LOV query &#8211; is something I want to avoid. Any time I have to click the &#8220;do not regenerate SQL&#8221; option it leaves me open for problems later on. I could add the start date to my query and sort it normally. However, I don&#8217;t want to do that as it would clutter the display.</p>
<h3>Setting SORT_BY_NO=NO</h3>
<p>This parameter is found in the .PRM file that belongs to the database engine referenced by a universe. In the old days the format of the .PRM file was the same as that found in a Windows .INI file. Today they use an XML structure instead. By default the SORT_BY_NO parameter is set to YES, so the line in the file looks like this:</p>
<p><code>&lt;Parameter Name="SORT_BY_NO"&gt;YES&lt;/Parameter&gt;</code></p>
<p>Clear as mud, yes? no? <img src='http://www.dagira.com/wp-includes/images/smilies/icon_lol.gif' alt=':lol:' class='wp-smiley' />  What it means is that yes, it is true; I cannot sort by something that does not appear in the select clause.</p>
<p>First I need to determine if my database allows me to sort by a column that does not appear in the select; Oracle and Teradata both do and I imagine others do as well. Changing this parameter won&#8217;t do me any good if the database does not support the technique. Next, I can find this file on my computer where I have Designer installed. The actual location will vary based on the installed path. The file name will be DBNAME.PRM, or in my case <code>teradata.prm</code>. I opened the file with a simple text editor, found the line shown above, and changed the value from YES to NO. It&#8217;s now a double-negative. It says, if I can paraphrase:</p>
<p>&#8220;It is NOT TRUE that I CANNOT sort by something NOT in the select&#8221;</p>
<p>or rather</p>
<p>&#8220;It is TRUE that I CAN sort by something NOT in the select&#8221;</p>
<p>Very clear, I am sure.</p>
<h3>The Results</h3>
<p>Before this change was made the &#8220;manage sorts&#8221; button on the query panel for editing LOV definitions in Designer was never available. After making this change, saving the file, and restarting Designer, I can now click this button.</p>
<p><img src="/tips/sort_by_no/toolbar.jpg" alt="toolbar image" title="Sort by button on query panel toolbar" border="0" width="243" height="49" /></p>
<p>When I click that button I get a list of objects from my universe. These objects do not have to appear in the select clause but can now appear in the sort clause. Problem solved.</p>
<p>But only after I set SORT_BY_NO equal to NO rather than YES in my parameter file.</p>
<p>By finally writing this down as a blog post, I hope that I will remember this the next time I get a new laptop and won&#8217;t have to spend time searching for the parameter setting that allows me to do this. I hope it helps someone else as well, but mainly this one is just for me. It happens that way sometimes. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_cool.gif' alt='8-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/03/04/sort_by_nono-very-confusing/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Want To Crash Teradata? Give It Some LOV&#8230;</title>
		<link>http://www.dagira.com/2010/02/26/want-to-crash-teradata-give-it-some-lov/</link>
		<comments>http://www.dagira.com/2010/02/26/want-to-crash-teradata-give-it-some-lov/#comments</comments>
		<pubDate>Fri, 26 Feb 2010 14:27:46 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=240</guid>
		<description><![CDATA[Five easy steps to crash your Teradata system:

Step 1: Upgrade to Teradata version 13
Step 2: Recognize that with this version a &#8220;distinct&#8221; query no longer returns sorted results
Step 3: On the advice of Teradata, reconfigure your box with the &#8220;regression&#8221; parameter that makes distinct queries behave the way they did in 6.2
Step 4: Send a [...]]]></description>
			<content:encoded><![CDATA[<p>Five easy steps to crash your Teradata system:</p>
<ul>
<li>Step 1: Upgrade to Teradata version 13</li>
<li>Step 2: Recognize that with this version a &#8220;distinct&#8221; query no longer returns sorted results</li>
<li>Step 3: On the advice of Teradata, reconfigure your box with the &#8220;regression&#8221; parameter that makes distinct queries behave the way they did in 6.2</li>
<li>Step 4: Send a Business Objects LOV query to the database that includes a DISTINCT keyword and a where clause with a couple of constant values</li>
<li>Step 5: Watch the system reboot</li>
</ul>
<p>That&#8217;s about what happened to us a few days ago. It wasn&#8217;t pretty. It took a long time to get our production box upgraded (and this after seeing development and Q/A roll through the upgrades with flying colors). Once the upgrade was finally completed, we had batch processing to catch up on. Once that was complete the users got back into the system&#8230; only to see it sporadically reboot.</p>
<p>With a personal computer or laptop, a sporadic reboot is often caused by a loose connection or a faulty piece of hardware. We had not experienced anything like this on our database servers. Ultimately someone figured out that the following query was at fault:</p>
<p><code>select DISTINCT table.column FROM table WHERE table.column in ('A','B')</code></p>
<p>That&#8217;s a fairly innocuous query, isn&#8217;t it? At first someone thought the table was corrupt. Nope, it checks out fine. Next someone suggested that the data in the table was bad. Nope, I can query it just fine. Then we thought maybe the fact that there were some special characters in the where clause was the problem. Nope, they work fine too. Finally it was narrowed all the way down to the combination of three things: a DISTINCT keyword, a where clause, and the regression parameter set on our database. <span id="more-240"></span></p>
<p>Once we had identified the specific cause, there were three factors to consider. First, we needed the where clause on the LOV in order to deliver the proper list of choices to the business, so that could not really change. Second, the Teradata DBA team had set a parameter to make TD13 work like TD6.2 as far as providing a &#8220;distinct&#8221; and sorted LOV result. Could we work with that instead? Third, there was the &#8220;distinct&#8221; itself. If I removed it from the LOV query then Teradata no longer rebooted. However, the results of the LOV were no longer sorted, and that presented a new challenge.</p>
<h3>Universe Parameters To The Rescue</h3>
<p>There are a couple of universe parameters that were used to solve this issue. Side note: it was back in 6.0 (I believe) that the ability to override universe parameter settings within the Designer application first appeared. Prior to that version any parameter changes were applied via a configuration file. Any changes made then affected every universe that used that parameter file (there was a different one for each database). Today I can change individual universes, which is good. So what changes did I make?</p>
<h3>Universe Parameter: DISTINCT</h3>
<p>The first change was to the DISTINCT parameter. By default this parameter value is set to DISTINCT, which seems redundant. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  It essentially becomes DISTINCT=DISTINCT. However, there is another option: GROUPBY. Note that there are no underscores or spaces in that value. With the setting updated to DISTINCT=GROUPBY my LOV queries no longer included the &#8220;distinct&#8221; keyword at the top of the query. Instead of this:</p>
<p><code>select DISTINCT table.column from table</code></p>
<p>I see this:</p>
<p><code>select table.column from table group by 1</code></p>
<p>There is another parameter that controls whether I see <code>group by 1</code> or <code>group by table.column</code>. Which form is generated depends on whether the database supports the positional syntax.</p>
<p>This change solved the first half of the issue. Without a &#8220;distinct&#8221; the LOV queries would no longer cause the server to reboot. This was certainly a positive step. However, making this change had an important side effect: the LOV results were unique because of the GROUP BY clause, but they were no longer sorted.</p>
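<p>A minimal sketch of why the swap is safe for a single-column LOV, using a hypothetical table (<code>lov_demo</code> and its columns are made up for illustration, and this is generic SQL rather than anything captured from Business Objects):</p>
<pre><code>-- Hypothetical LOV source table; all names here are assumptions
create table lov_demo (id integer, region_code varchar(10));
insert into lov_demo values (1, 'B');
insert into lov_demo values (2, 'A');
insert into lov_demo values (3, 'B');
insert into lov_demo values (4, 'C');

-- Shape generated with DISTINCT=DISTINCT
select distinct lov_demo.region_code
from lov_demo
where lov_demo.region_code in ('A','B');

-- Shape generated with DISTINCT=GROUPBY
select lov_demo.region_code
from lov_demo
where lov_demo.region_code in ('A','B')
group by 1;
</code></pre>
<p>Both selects return the values A and B once each, and neither one promises any particular order. The positional <code>group by 1</code> form needs a database that accepts a column position in the GROUP BY clause.</p>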
<h3>Universe Parameter: FORCE_SORTED_LOV</h3>
<p>The second parameter that I changed was FORCE_SORTED_LOV. The default value is <code>No</code> and I changed it to <code>Yes</code> instead. The theory behind this parameter is simple: it would force the LOV results to be sorted. Before I changed the parameter my LOV query looked like this:</p>
<p><code>select table.column from table group by 1</code></p>
<p>After applying this change, my LOV query looked like this:</p>
<p><code>select table.column from table group by 1</code></p>
<p>Hm. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_eek.gif' alt=':shock:' class='wp-smiley' />  Not what I expected. I was looking for an ORDER BY clause at the end of the query and I was not seeing one.</p>
<p>I went ahead and saved the universe and exported it to the Q/A system. We ran a few test queries (including one using the problem LOV definition mentioned above) and everything worked fine. Short LOV results were sorted. Long LOV result sets were also sorted, and they were still &#8220;paged&#8221; like I expected them to be. It still bothered me that I wasn&#8217;t seeing an ORDER BY clause in the LOV query definition, but obviously the sort was happening somewhere. We speculated that perhaps the Web Intelligence server was applying the sort which raised some concerns. LOV performance definitely suffered during the upgrade from 6.5 to XI and I didn&#8217;t want to introduce anything that would cause it to degrade further.</p>
<p>Fortunately one of the Teradata DBAs found the LOV queries in the logs, and despite the fact that no ORDER BY was showing up in the SQL when I viewed it via Designer, it was really there by the time Teradata got the query request.</p>
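<p>The logged SQL is not reproduced here. As a sketch only, assuming the sort is appended as a positional ORDER BY (an assumption, not something copied from the logs), the request that reached Teradata would look roughly like this:</p>
<p><code>select table.column from table group by 1 order by 1</code></p>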
<h3>Wrap Up</h3>
<p>Ultimately the two universe parameters (DISTINCT and FORCE_SORTED_LOV) were used to change how LOV queries were generated, which allowed the Teradata DBA team to revert to a standard version 13 installation (without the regression parameter turned on). I imagine that the Teradata engineering team is busy working on the bug and it will be fixed soon, so if you are using Teradata and looking at a version 13 upgrade at some point later this year you probably won&#8217;t encounter the same issue.</p>
<p>But it&#8217;s nice to know that we can help solve the issue by changing a few parameters on the universe side.</p>
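<p>For reference, the two settings described above, written as plain name and value pairs (a summary only, not the literal syntax of the parameter file or the Designer dialog):</p>
<pre><code>DISTINCT         = GROUPBY   (default: DISTINCT)
FORCE_SORTED_LOV = Yes       (default: No)
</code></pre>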
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/02/26/want-to-crash-teradata-give-it-some-lov/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
	</channel>
</rss>

