<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>Dave's Adventures in Business Intelligence &#187; Universe Design</title>
	<atom:link href="http://www.dagira.com/category/design/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dagira.com</link>
	<description>...you are in a twisty maze of passageways, all different...</description>
	<lastBuildDate>Wed, 28 Jul 2010 13:13:17 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<!-- podcast_generator="podPress/8.8" -->
		<copyright>&#xA9; </copyright>
		<managingEditor>blogmaster@dagira.com ()</managingEditor>
		<webMaster>blogmaster@dagira.com()</webMaster>
		<category></category>
		<ttl>1440</ttl>
		<itunes:keywords></itunes:keywords>
		<itunes:subtitle></itunes:subtitle>
		<itunes:summary>...you are in a twisty maze of passageways, all different...</itunes:summary>
		<itunes:author></itunes:author>
		<itunes:category text="Society &amp; Culture"/>
		<itunes:owner>
			<itunes:name></itunes:name>
			<itunes:email>blogmaster@dagira.com</itunes:email>
		</itunes:owner>
		<itunes:block>No</itunes:block>
		<itunes:explicit>no</itunes:explicit>
		<itunes:image href="http://www.dagira.com/wp-content/plugins/podpress/images/powered_by_podpress_large.jpg" />
		<image>
			<url>http://www.dagira.com/wp-content/plugins/podpress/images/powered_by_podpress.jpg</url>
			<title>Dave's Adventures in Business Intelligence</title>
			<link>http://www.dagira.com</link>
			<width>144</width>
			<height>144</height>
		</image>
		<item>
		<title>Universe Models For Recursive Data Part III: Alias Versus Flattened</title>
		<link>http://www.dagira.com/2010/07/02/universe-models-for-recursive-data-part-iii-alias-versus-flattened/</link>
		<comments>http://www.dagira.com/2010/07/02/universe-models-for-recursive-data-part-iii-alias-versus-flattened/#comments</comments>
		<pubDate>Fri, 02 Jul 2010 11:45:03 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[2009 GBN - Dallas]]></category>
		<category><![CDATA[2010 Mastering ... Melbourne]]></category>
		<category><![CDATA[Recursive Data]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=284</guid>
		<description><![CDATA[This is the third of several posts that will review my presentation “Universe Models For Recursive Data” which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. As with my other presentations there is a PDF file [...]]]></description>
			<content:encoded><![CDATA[<p>This is the third of several posts that will review my presentation “Universe Models For Recursive Data” which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. As with my other presentations there is a PDF file that can be downloaded from my <a href="http://www.dagira.com/conference-presentations/">conference presentations page</a>. The first post introduced the concepts of recursive (as opposed to hierarchical) data and provided a couple of examples. The second post reviewed some of the different design challenges that I have seen in working with recursive data models. In this post I will introduce four different possible solutions and present a scorecard for each, showing how well it solves the issues presented in the prior post in this series. Links to both prior posts are presented at the end of this entry. I have also included Oracle SQL scripts that can be used to create and populate the tables used in this post.</p>
<p><em>This post will cover slides 22 through 30 from the presentation and will describe the first two solutions (one with two variations) outlined in the presentation.</em> <span id="more-284"></span></p>
<h3>Solution Options</h3>
<p>The four different solutions that I included in my presentation were: Universe aliases, Flattened structures (column or snowflake), Ancestor Model, and Depth First Tree Traversal. All of them work fine on a clean recursive hierarchy. Each of them partially works for at least some of the other challenges. Some of them present unique challenges (extra disk space requirements or lack of native drilling functionality) that will also be addressed. I am presenting the solutions in increasing order of complexity. This post will cover aliases and flattened structures (both versions). In the next post I plan to cover the ancestor model, and finally I will cover the depth first tree traversal in its own post. </p>
<h3>Universe Aliases</h3>
<p>This solution is the only one that can be completely self-contained within the universe. No DBA or ETL work is required. There are any number of ways to create an alias. I can:</p>
<ul>
<li>Right-click on a table and select Insert Alias</li>
<li>Select an existing table in my structure, then select Insert + Alias from the menu</li>
<li>Open my table browser and insert an existing table. An alias will automatically be created for me.</li>
<li>Select an existing table in my structure and click the &#8220;Insert Alias&#8221; toolbar button</li>
</ul>
<p>&#8230; and there are other ways to get aliases in my universe, especially if I have loops to resolve. The bottom line is that the process is quite simple.</p>
<p>Here&#8217;s what an alias looks like after it has been created and joined to an existing table in my structure.</p>
<p><img src="/tips/recursive_data/part_03_alias_flat/alias_implementation.png" border="0" width="443" height="324" alt="screen shot of alias implementation in a BusinessObjects universe" title="Alias implementation in a BusinessObjects universe" /></p>
<p>The join can be a bit tricky. In this case, the employee row MGR_ID is joined to the manager row EMP_ID in order to make the relationship work. It might help to look at the raw data again from an earlier post.</p>
<p><img src="/tips/recursive_data/part_01_recursion_definition/pm_data.png" width="286" height="250" border="0" alt="raw data used to demonstrate recursion in a BusinessObjects universe" title="Raw data used to demonstrate recursion in a BusinessObjects universe" /></p>
<p>See how the recursive relationship is going to work after establishing this join? Field works for Ferrerez, and Ferrerez works for Noakes. Who does Noakes work for? His MGR_ID column is empty (NULL) implying that he does not have a manager. He owns the company. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
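<p>To make the join concrete, here is a rough sketch of the kind of SQL this structure leads to, written against the employee table from the supplemental scripts at the end of this post. The alias names emp and mgr are placeholders of mine, and the SQL a universe actually generates will depend on the objects selected.</p>
<pre>-- the same physical table is used twice: once as the employee, once as the manager
select emp.emp_lastname as employee_name
,      mgr.emp_lastname as manager_name
from employee emp
,    employee mgr
where emp.emp_mgr_id = mgr.emp_id;</pre>
<p>Because this is an inner join, a row with an empty manager key (Noakes) has nothing to match and drops out of the result.</p>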
<h3>Pros of Alias Solution</h3>
<p>The primary advantage of this solution is that it is completely self-contained in the universe. No DBA or ETL work is required. That&#8217;s about it.</p>
<h3>Cons of Alias Solution</h3>
<p>There are several cons to this solution. It does not represent lateral relationships at all. I have to use outer joins in order to preserve those rows with missing keys (Noakes in this example). Both of these are important, but the most substantial drawback to this solution is that the depth is determined by the number of aliases that the universe designer creates. In the image shown above there is only one link: from manager to direct employee. How can I determine my indirect reports in one step? With only one alias, I can report on only two levels of my hierarchy: an employee and that employee&#8217;s direct manager. How many can I report with this structure?</p>
<p><img src="/tips/recursive_data/part_03_alias_flat/multi_level_aliases.png" width="530" height="138" border="0" alt="screen shot of multi-level alias implementation in a BusinessObjects universe" title="Multi-level alias implementation in a BusinessObjects universe" /></p>
<p>With that structure I now have two outer joins, but I can report on three levels instead of just two.</p>
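<p>In SQL terms, a three-level chain might look something like the sketch below. It uses Oracle outer join syntax against the employee table from the supplemental scripts, and the alias names are placeholders of mine. Each extra level costs one more alias and one more outer join.</p>
<pre>select lvl1.emp_lastname as level_1_name
,      lvl2.emp_lastname as level_2_name
,      lvl3.emp_lastname as level_3_name
from employee lvl1
,    employee lvl2
,    employee lvl3
where lvl1.emp_id = lvl2.emp_mgr_id(+)
and lvl2.emp_id = lvl3.emp_mgr_id(+);</pre>
<p>Without a restriction on the first alias, every employee shows up as a level 1 row heading their own sub-tree, so a real report would normally filter on the starting point.</p>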
<p>How many alias levels do I create? Generally when I have seen this solution used (or used it myself) we resort to asking how many levels are required and then creating some number above that. If I need five, I will create seven. If I need seven, I will create ten.</p>
<p>That means, of course, if I have created ten levels and all of a sudden we have twelve I have to update my universe. That&#8217;s not a problem (as long as I keep up with things) but it&#8217;s certainly not desirable.</p>
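<p>One way to keep up with things is to ask the data how deep it currently goes. The Oracle hierarchical query below is only a sketch; it assumes the top of the tree is the row with an empty manager key, as in the sample employee table at the end of this post.</p>
<pre>-- deepest level currently present in the hierarchy
select max(level) as max_depth
from employee
start with emp_mgr_id is null
connect by prior emp_id = emp_mgr_id;</pre>
<p>If the result gets close to the number of alias levels in the universe, it is time to add more.</p>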
<h3>Alias Scorecard</h3>
<p>Here&#8217;s the scorecard for the alias solution for each of the four scenarios I outlined earlier.</p>
<p><img src="/tips/recursive_data/scorecard_alias.png" width="600" height="297" border="0" alt="alias scorecard for handling recursive data" title="Alias scorecard for handling recursive data challenges in a BusinessObjects universe" /></p>
<p>Aliases are the easiest solution to implement but they don&#8217;t score well. Let&#8217;s move on to the next solution.</p>
<h3>Flattened Structure &#8211; Single Table Columns</h3>
<p>The next solution involves running either a SQL script or some form of ETL. I need to take the recursive table relationship and flatten it out much like I did with aliases, but this time in the database itself. The net result is that I will take data going down in rows:</p>
<p><img src="/tips/recursive_data/part_03_alias_flat/table_rows.png" width="432" height="146" border="0" alt="data in tables is presented as rows" title="Data in relational tables consists of rows" /></p>
<p>and pivot it into columns in a table.</p>
<p><img src="/tips/recursive_data/part_03_alias_flat/table_columns.png" width="456" height="144" border="0" alt="data for a hierarchy can be pivoted into columns" title="Data in a hierarchy can be pivoted into columns" /></p>
<p>The net result is all of my recursion is done during the script process and I end up with one table that contains everything (or every person in my case) stored at their specific level in the hierarchy. It easily allows me to drill because it creates a very natural hierarchy.</p>
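<p>For example, against the emp_flat table from the supplemental scripts below, moving down a level needs no joins at all. A minimal sketch (the manager name is simply one taken from the sample data):</p>
<pre>-- everyone stored at level 3 under one level 2 manager
select distinct emp_lvl_3
from emp_flat
where emp_lvl_2 = 'Pickworth'
and emp_lvl_3 is not null;</pre>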
<h3>Pros of Flattened Table Solution</h3>
<p>It handles unbalanced hierarchies much better than aliases because missing lower nodes are simply NULL in the table. That&#8217;s fine. This solution can also handle ragged hierarchies with a proper &#8220;plug node&#8221; strategy. If I have a lower level value (Divisional Director) that reports directly to the president (top level) then level 2 (Vice President) will be empty. I need to fill something in so I can drill properly. More important, that plug node has to tell me what the path is or else I cannot drill up properly. Suppose I had a director named Smith who reported directly to Noakes. The first column in my table would include Noakes. The third column would include Smith. The second column (the missing value due to the raggedness of my data) would contain &#8220;Smith VP Not Assigned&#8221; or something like that.</p>
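<p>A plug node fill might be sketched like this. The table name emp_flat_by_type is hypothetical: it stands for a flattened table whose columns are assigned by node type (so a director always lands in the third column) and are wide enough to hold the plug text. That is not how the simple emp_flat script at the end of this post populates its columns.</p>
<pre>-- fill a missing level 2 value with a plug that still identifies the path
-- (emp_flat_by_type is a hypothetical table, see the assumptions above)
update emp_flat_by_type
set emp_lvl_2 = emp_lvl_3 || ' VP Not Assigned'
where emp_lvl_2 is null
and emp_lvl_3 is not null;</pre>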
<p>Flattened tables cannot handle lateral hierarchies at all because I can&#8217;t store two values in a single column.</p>
<h3>Cons of Flattened Table Solution</h3>
<p>As already mentioned, this solution cannot handle lateral hierarchies at all. My column names should also reflect the position (node type) in the hierarchy. That&#8217;s not a problem unless my hierarchy levels change, in which case I might want to update my structures.</p>
<p>But by far the most critical issue with this solution is the fact that it requires DBA or ETL work if my levels ever change. Much like aliases, when I have seen this solution implemented I generally see extra columns at the end of the table just to allow for future expansion.</p>
<h3>Flattened Table Scorecard</h3>
<p>Here is my scorecard for the Flattened Table solution.</p>
<p><img src="/tips/recursive_data/scorecard_flat_columns.png" width="600" height="297" border="0" alt="flattened columns scorecard for handling recursive data" title="Flattened columns scorecard for handling recursive data challenges in a BusinessObjects universe" /></p>
<h3>Flattened Structure &#8211; Snowflake Tables</h3>
<p>One thing that I noticed about the data for the flattened structure is that I repeat a lot of values. For example, Noakes is the &#8220;level 1 mgr&#8221; for every person in the company. It might seem to be more efficient to use a structure like this:</p>
<p><img src="/tips/recursive_data/part_03_alias_flat/snowflake_structure.png" width="529" height="67" border="0" alt="screen shot of snowflake structure in a BusinessObjects universe" title="Snowflake structure for handling recursive data in a BusinessObjects universe" /></p>
<p>This would reduce my overall storage requirements because I would end up with a single row for the highest level table.</p>
<p>However, it also reintroduces the need for outer joins, which the initial flattened structure avoided. </p>
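<p>A query across the chain might be sketched as follows, using the emp_level_01 through emp_level_04 tables from the supplemental scripts below (Oracle outer join syntax, alias names are mine). The outer joins keep a branch in the result even when it stops before level 4.</p>
<pre>select l1.emp_lvl_1
,      l2.emp_lvl_2
,      l3.emp_lvl_3
,      l4.emp_lvl_4
from emp_level_01 l1
,    emp_level_02 l2
,    emp_level_03 l3
,    emp_level_04 l4
where l1.emp_id = l2.emp_mgr_id(+)
and l2.emp_id = l3.emp_mgr_id(+)
and l3.emp_id = l4.emp_mgr_id(+);</pre>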
<h3>Pros of Flattened Snowflake Solution</h3>
<p>Because the tables get smaller as I get further up the tree (ultimately to a single-row table in my simple example) my overall storage requirement should be smaller as well. If I only need the top one or two levels, my queries should be very efficient. Finally, I think it would be easier to maintain as well. If a new level appears, I add a new table to my chain with the proper restrictions on the ETL for proper table population. </p>
<h3>Cons of Flattened Snowflake Solution</h3>
<p>Each of the solutions defined so far suffers from some form of this issue: I have to define a table (or column) for every possible level of my hierarchy. If I do not know what the total number of levels will be, I can try to anticipate and create extra tables to support future expansion. But that is not the best solution. Because these tables are maintained in the database, I have to talk to my DBA or ETL team when changes are required. Because the tables are joined I have to consider whether to use outer join to preserve depth on unbalanced hierarchies. And finally, the &#8220;plug node&#8221; strategy I outlined earlier becomes a &#8220;plug row&#8221; strategy in this case, and that&#8217;s substantially more complicated.</p>
<h3>Flattened Snowflake Scorecard</h3>
<p>Here is the scorecard for the flattened snowflake solution. In my opinion, it&#8217;s a slightly worse solution than the flattened table solution simply because of the join issue and the plug row concern.</p>
<p><img src="/tips/recursive_data/scorecard_flat_snowflake.png" width="600" height="297" border="0" alt="snowflake scorecard for handling recursive data" title="Snowflake scorecard for handling recursive data challenges in a BusinessObjects universe" /></p>
<h3>Next Time</h3>
<p>The solutions covered in this post are the least complex and therefore offer the least flexibility. They are easy to set up; in the case of aliases the entire solution can be built within the universe designer application. All of the other solutions require some sort of database scripting. In the next post I will talk about the ancestor model and how we used it at a manufacturing client. It has some definite advantages, and it handles just about all of the different challenges I have outlined. I don&#8217;t have to worry about plug nodes, and it handles both ragged and unbalanced hierarchies quite well. However it has an impact on disk usage and it can&#8217;t be drilled using the native functionality provided by BusinessObjects. Do the pros outweigh the cons? Come back soon and see for yourself. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_cool.gif' alt='8-)' class='wp-smiley' /> </p>
<p><strong>Related Links</strong></p>
<ul>
<li><a href="http://www.dagira.com/2010/06/16/universe-models-for-recursive-data-part-i-introduction/">Universe Models for Recursive Data Part I: Introduction</a></li>
<li><a href="http://www.dagira.com/2010/06/25/universe-models-for-recursive-data-part-ii-design-challenges/">Universe Models for Recursive Data Part II: Design Challenges</a></li>
</ul>
<p><strong>Supplemental Material</strong><br />
Scripts to create and populate the basic HR table used for this presentation.</p>
<ul>
<li>Create table
<pre>create table employee
(emp_id number(5) not null
,emp_lastname varchar(20)
,emp_firstname varchar(15)
,emp_dob date
,emp_address varchar(40)
,emp_area_code varchar(7)
,emp_town varchar(15)
,emp_phone varchar(18)
,showroom_id number(4)
,emp_start date
,emp_mgr_id number(5)
,emp_sex varchar(1)
,job_id number(4));

alter table employee add constraint emp_pk primary key (emp_id);
create index emp_showroom on employee(showroom_id);
create index emp_mgr on employee(emp_mgr_id);
</pre>
</li>
<li>Populate table
<pre>
insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (101, 'Noakes', 'Nicholas', '12-MAR-48', '2356, Melrose Street', '30190', 'San Jose', '12-00-00-01', '01-JAN-91', NULL, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (102, 'Ferrerez', 'Ferdinand', '10-FEB-64', '25 Arcadia Avenue', '75897', 'Los Angeles', '22-55-56-32', '30-MAR-96', 101, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (103, 'Field', 'Felicity', '15-DEC-60', '12 Brasilia Street', '12014', 'Santa Barbara', '14-46-54-22', '26-MAR-95', 102, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (104, 'Fraser', 'Frank', '13-MAR-67', '45 Seaside Avenue', '75016', 'Los Angeles', '22-55-18-33', '13-DEC-91', 101, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (105, 'Snow', 'Sara', '03-OCT-65', 'Square Woodstock', '18000', 'San Jose', '14-34-34-30', '01-MAY-93', 101, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (106, 'Speed', 'Sonya', '03-DEC-70', '5, The Vale', '22000', 'San Jose', '14-32-39-43', '04-JUL-96', 105, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (107, 'Spencer', 'Steve', '01-NOV-64', 'Square Osaka', '33010', 'Los Angeles', '22-24-25-89', '16-APR-91', 105, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (108, 'Helen', 'Harrison', '01-AUG-66', 'Via Firenze', '38200', 'Los Angeles', '22-34-31-11', '13-MAY-94', 101, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (109, 'Thomas', 'Tom', '01-DEC-68', '11 Over Way', '24000', 'San Jose', '22-45-67-45', '20-DEC-95', 101, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (110, 'Thatcher', 'Terry', '03-OCT-50', 'Stars Parkway', '21000', 'San Jose', '12-11-11-09', '06-DEC-92', 109, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (111, 'Davis', 'Diana', '12-AUG-64', 'Rue Opera Sauvage', '92100', 'Los Angeles', '14-54-11-10', '22-SEP-92', 101, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (201, 'Pickworth', 'Paul', '12-FEB-51', '23 Las palmas road', '00316', 'New York', '12-24-26-44', '12-JAN-93', 101, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (202, 'Forest', 'Florence', '10-OCT-32', 'Rue des Lombards', '75100', 'New York', '22-54-11-10', '23-DEC-94', 201, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (203, 'Brown', 'Bella', '12-APR-59', 'Hollywood Blv', '36020', 'New York', '22-36-25-50', '03-FEB-92', 202, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (204, 'Porter', 'Pete', '15-NOV-57', 'Avd Torre De Embarra', '34100', 'New York', '14-44-11-66', '13-APR-92', 201, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (205, 'Irving', 'Ira', '12-FEB-64', '44 Beach avenue', '13000', 'New York', '12-56-55-20', '18-JUN-95', 204, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (206, 'Bailey', 'Ben', '12-JUN-57', '4 Palisades Drive', '75090', 'Long Island', '12-33-51-29', '01-DEC-90', 204, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (207, 'Duckworth', 'Dave', '09-SEP-66', 'Rue du grand temps', '75018', 'New York', '12-85-01-61', '04-NOV-93', 201, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (208, 'Ince', 'Ian', '10-AUG-53', 'Sunset Blvd', '31061', 'New York', '22-52-22-00', '04-DEC-95', 207, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (209, 'Hilary', 'Hibbs', '01-FEB-60', 'Sand Hill Road', '92800', 'New York', '12-54-11-10', '08-JUN-95', 202, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (301, 'Dagmar', 'Davinda', '12-APR-58', '12, The Crescent', 'SL1 1HG', 'Slough', '01628-764234', '24-JUN-95', 101, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (302, 'Presley', 'Percy', '30-OCT-62', '1 Jubilee Close', 'SL5 23F', 'Maidenhead', '01628-834582', '15-JUL-95', 301, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (303, 'Perry', 'Philippa', '24-FEB-71', '23 Rice Hill', 'SL3 12S', 'Maidenhead', '01628-567231', '28-SEP-96', 302, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (304, 'Hubert', 'Henri', '13-DEC-69', '5 Grand Lane', 'SL3 12S', 'Maidenhead', '01628-243535', '17-APR-96', 302, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (305, 'Adamson', 'Anita', '12-OCT-69', '24 Loose Lane', 'SL4 23D', 'Cookham', '01628-782364', '15-FEB-96', 301, 'F');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (306, 'Beaver', 'Bertie', '12-MAR-72', '223 Grange Hill', 'SL2 67E', 'Windsor', '01628-187632', '13-JAN-96', 305, 'M');

insert into employee (emp_id, emp_lastname, emp_firstname, emp_dob, emp_address, emp_area_code, emp_town, emp_phone, emp_start, emp_mgr_id, emp_sex)
values (307, 'Motson', 'Mervin', '22-DEC-74', '67 Blows Down', 'SL5 45G', 'Cookham', '01628-198371', '17-JUN-96', 305, 'M');
</pre>
</li>
</ul>
<p>Scripts to create and populate the flattened version of the HR table, Oracle syntax</p>
<ul>
<li>Create flattened table
<pre>create table emp_flat
(emp_lvl_1 varchar2(20)
,emp_lvl_2 varchar2(20)
,emp_lvl_3 varchar2(20)
,emp_lvl_4 varchar2(20)
);</pre>
</li>
<li>Populate flattened table.<br />
Only four levels are supported.<br />
Starting point (Noakes) is hard-coded.
<pre>insert into emp_flat (emp_lvl_1, emp_lvl_2, emp_lvl_3, emp_lvl_4)
select a.emp_lastname
,      b.emp_lastname
,      c.emp_lastname
,      d.emp_lastname
from employee a
,    employee b
,    employee c
,    employee d
where a.emp_id = b.emp_mgr_id(+)
and b.emp_id = c.emp_mgr_id(+)
and c.emp_id = d.emp_mgr_id(+)
and a.emp_id = 101;</pre>
</li>
<li>Create Snowflake Tables
<pre>create table emp_level_01
(emp_id number(5)
,emp_lvl_1 varchar2(20));

create table emp_level_02
(emp_id number(5)
,emp_mgr_id number(5)
,emp_lvl_2 varchar2(20));

create table emp_level_03
(emp_id number(5)
,emp_mgr_id number(5)
,emp_lvl_3 varchar2(20));

create table emp_level_04
(emp_id number(5)
,emp_mgr_id number(5)
,emp_lvl_4 varchar2(20));</pre>
</li>
<li>Populate snowflake tables<br />
Only four levels are built, each starting from the prior table.<br />
Starting point (Noakes) is hard-coded.
<pre>insert into emp_level_01 (emp_id, emp_lvl_1)
select emp_id, emp_lastname
from employee
where emp_id = 101;

insert into emp_level_02 (emp_id, emp_mgr_id, emp_lvl_2)
select e.emp_id, e.emp_mgr_id, e.emp_lastname
from employee e, emp_level_01 e1
where e.emp_mgr_id = e1.emp_id;

insert into emp_level_03 (emp_id, emp_mgr_id, emp_lvl_3)
select e.emp_id, e.emp_mgr_id, e.emp_lastname
from employee e, emp_level_02 e2
where e.emp_mgr_id = e2.emp_id;

insert into emp_level_04 (emp_id, emp_mgr_id, emp_lvl_4)
select e.emp_id, e.emp_mgr_id, e.emp_lastname
from employee e, emp_level_03 e3
where e.emp_mgr_id = e3.emp_id;
</pre>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/07/02/universe-models-for-recursive-data-part-iii-alias-versus-flattened/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Universe Models For Recursive Data Part II: Design Challenges</title>
		<link>http://www.dagira.com/2010/06/25/universe-models-for-recursive-data-part-ii-design-challenges/</link>
		<comments>http://www.dagira.com/2010/06/25/universe-models-for-recursive-data-part-ii-design-challenges/#comments</comments>
		<pubDate>Sat, 26 Jun 2010 00:38:35 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[2009 GBN - Dallas]]></category>
		<category><![CDATA[2010 Mastering ... Melbourne]]></category>
		<category><![CDATA[Recursive Data]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=283</guid>
		<description><![CDATA[This is the second of several posts that will review my presentation “Universe Models For Recursive Data” which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. As with my other presentations there is a PDF file [...]]]></description>
			<content:encoded><![CDATA[<p>This is the second of several posts that will review my presentation “Universe Models For Recursive Data” which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. As with my other presentations there is a PDF file that can be downloaded from my <a href="http://www.dagira.com/conference-presentations/">conference presentations page</a>. The first post introduced the concepts of recursive (as opposed to hierarchical) data and provided a couple of examples. In this post I will review some of the different design challenges that I have seen in working with recursive data. </p>
<p>I decided to identify and cover four different examples of recursive data configurations. These included Clean, Unbalanced, Ragged, and Lateral. As I mentioned in the first post, I am going to use some basic human resources (HR) data for my examples. For this post, in order to show samples of each of the four challenges, I am going to represent my recursive data using a tree. The branches of the tree show the relationships between people. The nodes of the tree contain the information about each person. The data might include their name, hire date, and position (title) within the company. In order to properly interact with my recursive data I have to be able to work with both types of information: relationships and node data. If you are not sure what I mean, please continue reading; this will make more sense later on.</p>
<p><em>This post will cover slides 14 through 21 from the presentation and will describe each of the different recursive challenges that I identified.</em> <span id="more-283"></span></p>
<h3>Clean Hierarchy</h3>
<p>In my first example everything is very clean. Each branch of the tree has the same depth. Each branch follows the same path. There are no real challenges encountered in this hierarchy, pictured below.</p>
<p><img src="/tips/recursive_data/part_02_design_challenges/tree_clean.png" width="537" height="320" border="0" alt="image of clean recursive hierarchy" title="Clean recursive hierarchy" /></p>
<p>Imagine that the top of the tree is the company president. The second level (the &#8220;B&#8221; nodes) represents vice presidents, and the third level (&#8220;C&#8221; nodes) represents divisional directors. When a hierarchy definition is very rigorous this is the type of tree I expect. For a very simple example let me suggest a product hierarchy instead of an HR chart for the moment. A product hierarchy for a food company might include a Brand Owner, the Brand, the Size, and finally the Flavor. The brand owner could be Beverages-R-Us, the brand could be Super Sports Drinks, the size is two liter bottle, and finally the flavor is Orange. Every product in the system is guaranteed to have all four of these attributes assigned, and they will all be in that exact order. </p>
<p>On the other hand, a human resources hierarchy is rarely as clean. Let me move on to some more interesting examples.</p>
<h3>Unbalanced Hierarchy</h3>
<p>An unbalanced hierarchy is one where the nodes are at inconsistent depths. Please review the tree shown below. </p>
<p><img src="/tips/recursive_data/part_02_design_challenges/tree_unbalanced.png" width="455" height="320" border="0" alt="image of an unbalanced recursive hierarchy" title="Unbalanced recursive hierarchy with nodes at inconsistent depths" /></p>
<p>In the example shown above, there is one node (B1 in this case) that does not have any children while the rest of the nodes at that level (B2) do. If the A node is the company president, and the B nodes are vice presidents, it is entirely possible to have a position (perhaps &#8220;VP of Special Projects&#8221;) that does not have any additional people that report up to him or her. In that case the tree stops at the VP level and does not go down to the Divisional Director position.</p>
<p>Why is this a challenge? As will be seen later, one of the possible solutions to a recursive data scenario is to pivot (flatten) the data into different columns. What happens to the missing nodes in this case?</p>
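<p>If the recursive data lives in an Oracle table, one quick way to see whether a hierarchy is unbalanced is to list the leaf nodes along with their depth. This is only a sketch: it assumes a hypothetical employee table with emp_id, emp_mgr_id, and emp_lastname columns, where the top of the tree is the row with no manager.</p>
<pre>-- leaf nodes (people with nobody reporting to them) and how deep each one sits
select emp_lastname
,      level as depth
from employee
where connect_by_isleaf = 1
start with emp_mgr_id is null
connect by prior emp_id = emp_mgr_id
order by level;</pre>
<p>Leaves that come back at different depths are the branches that stop early.</p>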
<h3>Ragged Hierarchy</h3>
<p>In the last example I suggested that there could be a VP of the company that does not have any direct employees. In the case of a Ragged hierarchy it&#8217;s slightly different. In this case I might see a Divisional Director who is reporting straight up to the company president without going through a VP.</p>
<p><img src="/tips/recursive_data/part_02_design_challenges/tree_ragged.png" width="455" height="320" border="0" alt="image of a ragged recursive hierarchy" title="Ragged recursive hierarchy with nodes of inconsistent paths" /></p>
<p>Note that in the image above I am showing both an unbalanced node (B1) and a ragged node (C2). Let me focus on C2 for a moment. As I already mentioned, there is a relationship from that director position straight up to the president. It does not go through a vice president position. Why is this a challenge? Remember that earlier I mentioned there are two parts that I need to account for: the relationship and the position or node type. In this case the relationship only goes one step, but descends two levels (from president to director). I need to be able to represent both parts properly in whatever data model I come up with.</p>
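<p>One way to picture the two parts is to store them separately and compare them. The Python sketch below is only an illustration: the C2 link straight to A comes from the diagram, while the rest of the tree, the <code>TYPE_LEVEL</code> table, and the helper names are assumptions.</p>

```python
# (parent, node type) for each node; C2 reports straight to A, as in the diagram.
NODES = {
    "A": (None, "President"),
    "B1": ("A", "Vice President"),
    "B2": ("A", "Vice President"),
    "C1": ("B2", "Director"),
    "C2": ("A", "Director"),
}
# Level implied by the node type alone (an assumed mapping).
TYPE_LEVEL = {"President": 1, "Vice President": 2, "Director": 3}


def relationship_depth(node):
    """Count positions from the root by following parent links."""
    depth = 0
    while node is not None:
        depth += 1
        node = NODES[node][0]
    return depth


def skipped_levels(node):
    """Levels implied by the node type that the reporting chain never visits."""
    return TYPE_LEVEL[NODES[node][1]] - relationship_depth(node)


RAGGED = [name for name in sorted(NODES) if skipped_levels(name) != 0]
print(RAGGED)
```

<p>For C2 the relationship reaches the president in one step, but the node type says the position sits two levels down. A data model that keeps only one of those two facts loses the other.</p>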
<h3>Lateral Hierarchy</h3>
<p>If you have spent any time reviewing company organization charts you may have seen this type of relationship before: I am calling it a lateral (sideways) relationship.</p>
<p><img src="/tips/recursive_data/part_02_design_challenges/tree_lateral.png" width="537" height="320" border="0" alt="image of a recursive hierarchy with lateral relationships" title="Recursive hierarchy with lateral relationships" /></p>
<p>It&#8217;s not uncommon to see a lateral relationship from one director to another director (C2 reporting to C1 in this example). This is one of the biggest challenges to most of the design ideas I will be sharing in my next post, because there are two things (people) occupying the same space (node type) on the tree.</p>
<h3>Merge / Diverge</h3>
<p>As I mentioned toward the beginning of the post, some scenarios are inherently cleaner than others because the relationships all have to exist. Unfortunately, it is quite likely to see a combination of issues. I have even seen challenges where a hierarchy contains a merge / diverge relationship such as this:</p>
<p><img src="/tips/recursive_data/part_02_design_challenges/tree_merge_diverge.png" width="455" height="443" border="0" alt="image of a recursive hierarchy with merge diverge relationships" title="Recursive hierarchy with relationships that merge and then diverge" /></p>
<p>SAP and other ERP vendors generally allow this sort of hierarchy to be built in order to provide the maximum flexibility to the client company. I have never tried to implement this in BusinessObjects because it simply does not work. There is no clear drill path. Suppose I drill from node B2 to C3, and then from C3 to D2. Now when I drill up, which path do I take? I can drill from D2 up to C3, and then from C3 I can drill up to either node A1 or B2. It&#8217;s ambiguous, and therefore our project team decided that we would not attempt to handle this at all. We instituted a business rule (an exception) that would kick out any hierarchy that included this sort of path.</p>
<p><em>This particular example was dropped from the presentation in the interest of time but I wanted to mention it here.</em></p>
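<p>A rule that kicks out merge / diverge hierarchies comes down to finding any node with more than one parent. The Python sketch below shows one way such a check could look. It is not the rule our project actually implemented, and the edge list beyond the A1, B2, C3, and D2 links is invented.</p>

```python
from collections import defaultdict

# (parent, child) pairs. The A1/B2/C3/D2 links come from the drill example above;
# the remaining pairs are made up to round out the tree.
EDGES = [
    ("A1", "B1"), ("A1", "B2"), ("A1", "C3"),
    ("B2", "C3"), ("C3", "D1"), ("C3", "D2"),
]


def find_merge_nodes(edges):
    """Return each child that has more than one parent, with its parents."""
    parents = defaultdict(set)
    for parent, child in edges:
        parents[child].add(parent)
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}


REJECTED = find_merge_nodes(EDGES)
print(REJECTED)
```

<p>Any hierarchy for which this returns a non-empty result has at least one ambiguous drill-up path and would be rejected.</p>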
<h3>Combinations</h3>
<p>Even without the merge / diverge issue, there are still plenty of challenges. For our project, a typical tree was both ragged and unbalanced. That meant that the solutions we discussed had to be able to handle both. We also had a number of lateral relationships that we needed to address. Our users wanted to be able to enter the tree by node type and drill by level. They wanted to see the entire tree presented as part of a prompt. And they wanted to be able to multi-select from those prompts&#8230; for any node at any level.</p>
<h3>Next Time</h3>
<p>Which solutions work the best? Do any solutions work for all of these different scenarios? My next post in this series will review each of the four solutions I outlined in my presentation and present a scorecard for each.</p>
<p><strong>Related Links</strong></p>
<ul>
<li><a href="http://www.dagira.com/2010/06/16/universe-models-for-recursive-data-part-i-introduction/">Universe Models for Recursive Data Part I: Introduction</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/06/25/universe-models-for-recursive-data-part-ii-design-challenges/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Universe Models For Recursive Data Part I: Introduction</title>
		<link>http://www.dagira.com/2010/06/16/universe-models-for-recursive-data-part-i-introduction/</link>
		<comments>http://www.dagira.com/2010/06/16/universe-models-for-recursive-data-part-i-introduction/#comments</comments>
		<pubDate>Wed, 16 Jun 2010 18:38:36 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[2009 GBN - Dallas]]></category>
		<category><![CDATA[2010 Mastering ... Melbourne]]></category>
		<category><![CDATA[Recursive Data]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=282</guid>
		<description><![CDATA[This is the first of several posts that will review my presentation &#8220;Universe Models For Recursive Data&#8221; which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. After presenting it three times it seemed like an appropriate [...]]]></description>
			<content:encoded><![CDATA[<p>This is the first of several posts that will review my presentation &#8220;Universe Models For Recursive Data&#8221; which was originally presented at the 2009 GBN conference, then at the North Texas / Oklahoma ASUG chapter meeting, and finally at the Mastering BusinessObjects conference in Melbourne. After presenting it three times it seemed like an appropriate time to (finally) get started writing up the blog posts. As with my other presentations there is a PDF file that can be downloaded from my <a href="http://www.dagira.com/conference-presentations/">conference presentations page</a>.</p>
<p><em>This post will cover slides 6 through 13 as a basic introduction of recursive data and challenges presented to universe designers.</em></p>
<h3>Defining Recursive Data</h3>
<p>Sometimes there is confusion about the distinction between hierarchical and recursive data. Hierarchical data does not present a big challenge for BusinessObjects. It can be something related to time (Year, Quarter, Month, Day), geography (Country, Region, State, City), or something more specific like an accounting structure (Business Unit, Account, Sub-Account). What makes this hierarchical structure work easily is that each element is stored in a different place. It could be in a different column in the same table (flattened) or even in different tables (snowflake). As long as I can drill from one column to another in the hierarchy everything works fine.</p>
<p>Self-referencing or recursive data may initially look like a hierarchy. The key difference is that all of the elements are stored in the same place. There are keys that relate one row in a table back to a different row in the same table. That&#8217;s how recursive data is different from hierarchical data.</p>
<p>Why is recursion a problem for BusinessObjects? The language used &#8220;behind the curtain&#8221; is SQL, and SQL does not natively support recursion. Some database vendors offer extensions (for example the CONNECT BY PRIOR structure in Oracle) but these are not used by BusinessObjects.</p>
<p>How common is recursive data? It is certainly not unusual. Consider any of the following:</p>
<ul>
<li>Company organizational structure<br />
Object levels: President &#8211; Vice President &#8211; Director<br />
Object type: Person</li>
<li>Inventory BOM (Bill of Materials)<br />
Object levels: Product &#8211; Assembly &#8211; Sub-Assembly &#8211; Component<br />
Object type: Inventory item</li>
<li>Project Management<br />
Object levels: Project &#8211; Task &#8211; Sub-Task<br />
Object type: Project entry</li>
<li>Multi-Level Marketing (MLM)<br />
Object levels: Founder &#8211; Recruit &#8211; Recruit Level 2<br />
Object type: Person</li>
</ul>
<p>In each of the above examples the object (or node) type is the same at every level. For example, a company organization chart is made up of people. Some people are at different levels, and there are therefore relationships from one person to another. In order to show all of the relationships from the top of the company to the bottom (or the bottom to the top) I have to keep going back to the same table. That is recursion.</p>
<p>Because it&#8217;s easy to think about a company organizational structure I used that example for the rest of the presentation. </p>
<p><em>Note: The Motors database is used in the standard Universe Designer training course and will not be presented in its entirety in the download package for this presentation for copyright reasons. However, I will be providing the standard HR table and all of the modified versions used in this presentation.</em><span id="more-282"></span></p>
<h3>Example of Recursive Data Using Prestige Motors HR</h3>
<p>A picture will help at this point. Here is a screen shot from the Prestige Motors HR universe that I built for this presentation. Notice that there are two tables in the picture, but one is an alias of the other. In other words, I am really using the same table twice.</p>
<p><img src="/tips/recursive_data/part_01_recursion_definition/hr_relationships.png" border="0" width="398" height="371" alt="screen shot of recursive relationship in a BusinessObjects universe" title="Example of a recursive relationship in a BusinessObjects universe" /></p>
<p>The table on the left is the Employees table. I have aliased the table and called it Manager. The two tables are joined using the link from EMPLOYEE.EMP_MGR_ID to Manager.EMP_ID. Since this is really the same table twice, this join defines the relationship from any particular person to their immediate manager. It&#8217;s a recursive relationship from a person to a person.</p>
<p>Notice that in this case I have defined the join as an outer (optional) join. That&#8217;s because the top person in the company does not have a manager, and the relationship would fail in that case. I want to ensure that I return every person and their manager&#8230; even if that person does not have a manager. Here is a sample of some of the data to help show why this is important.</p>
<p><img src="/tips/recursive_data/part_01_recursion_definition/pm_data.png" border="0" width="286" height="250" alt="Sample data from HR table" title="Sample data from the Prestige Motors BusinessObjects universe showing recursive data" /></p>
<p>I can review the relationships manually if I want. I can look at the data (shown above) and determine that Pickworth works for Noakes. Davis and Ferrerez also work for Noakes. How am I making that determination? Each of those three folks has a manager ID of 101, and 101 is the employee id for Noakes.</p>
<p>Who does Noakes work for? The EMP_MGR_ID column is blank (null) for Noakes, which implies that he is at the top of the company organization chart.</p>
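<p>The same relationship can be tried outside of the universe. The sketch below uses Python with an in-memory SQLite database rather than the real Motors database. The <code>EMP_ID</code> and <code>EMP_MGR_ID</code> columns and the Manager alias come from the join described above; the <code>EMP_LASTNAME</code> column and the employee ids other than 101 are assumptions.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (EMP_ID INTEGER, EMP_LASTNAME TEXT, EMP_MGR_ID INTEGER)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        (101, "Noakes", None),
        (102, "Pickworth", 101),  # ids 102-104 are invented for this sketch
        (103, "Davis", 101),
        (104, "Ferrerez", 101),
    ],
)

# The same table appears twice: once as itself and once under the alias Manager.
SELF_JOIN = """
    SELECT EMPLOYEE.EMP_LASTNAME, Manager.EMP_LASTNAME
    FROM EMPLOYEE
    {join} EMPLOYEE AS Manager ON EMPLOYEE.EMP_MGR_ID = Manager.EMP_ID
    ORDER BY EMPLOYEE.EMP_ID
"""

outer_rows = conn.execute(SELF_JOIN.format(join="LEFT OUTER JOIN")).fetchall()
inner_rows = conn.execute(SELF_JOIN.format(join="INNER JOIN")).fetchall()

print(outer_rows)  # four rows; Noakes is paired with None
print(inner_rows)  # three rows; Noakes has dropped out
```

<p>With the inner join the person at the top of the chart disappears from the result, which is the failure the outer join is there to prevent.</p>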
<p>Another way to see where people fall in the organization chart is to look at their level. Here is output from a report that I eventually will want to generate from my recursive data. It is shown in the format of a tree, with each person showing up as a node on the tree.</p>
<p><img src="/tips/recursive_data/part_01_recursion_definition/hr_tree.png" border="0" width="440" height="566" alt="Tree output from HR database table" title="Tree structured output from the Prestige Motors BusinessObjects universe showing recursive data" /></p>
<p>Noakes is at level 1. Davis, Ferrerez, and Pickworth are all at level 2. But the tree does not stop there. I have employees at level 3 and level 4 as well. </p>
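<p>The level of each person can be derived from the manager links alone. Here is a minimal Python sketch. The first four entries match the sample data; the two lower-level names are placeholders, and the <code>level_of</code> helper is hypothetical rather than part of the universe.</p>

```python
# Employee -> manager. The first four rows match the sample data; the last two
# names are placeholders for employees at level 3 and level 4.
MANAGER_OF = {
    "Noakes": None,
    "Davis": "Noakes",
    "Ferrerez": "Noakes",
    "Pickworth": "Noakes",
    "Level3 Person": "Davis",
    "Level4 Person": "Level3 Person",
}


def level_of(name):
    """Level 1 is the top of the chart; each step up the chain adds one."""
    level = 1
    while MANAGER_OF[name] is not None:
        name = MANAGER_OF[name]
        level += 1
    return level


LEVELS = {name: level_of(name) for name in MANAGER_OF}
for name, level in sorted(LEVELS.items(), key=lambda item: item[1]):
    print("  " * (level - 1) + name)
```

<p>The loop inside <code>level_of</code> is the part that plain SQL struggles with: the number of lookups depends on the data, not on the query.</p>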
<h3>Typical Recursion Questions</h3>
<p>This brings me to the set of questions that I need to be able to answer with my recursive data. I need to know:</p>
<p>Who do I work for?<br />
Who works for me?<br />
Who works at my same level and shares the same manager?<br />
Who is my manager&#8217;s manager? My manager&#8217;s manager&#8217;s manager?<br />
What is the total salary of my direct reports (people who work directly for me)?<br />
What is the total salary of my indirect reports (people who work for people who work for me)?</p>
<p>I am sure there are many more questions but these should serve as a starting point. Some of the questions only require one level of the hierarchy (who works for me, or who do I work for). Those are simple enough to answer, and in fact can be answered with the simple alias structure already shown in this post. But in order to traverse the tree for multiple levels I need a solution that is a bit more robust.</p>
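<p>The difference between the one-level and multi-level questions shows up clearly in a small Python sketch. The salaries and the two lower-level names are invented, and the helper functions are hypothetical. Note that this sketch treats everyone below the direct reports as an indirect report, which is slightly broader than the two-level wording of the question.</p>

```python
# name -> (manager, salary). Salaries and the two lower-level names are invented.
STAFF = {
    "Noakes": (None, 100),
    "Davis": ("Noakes", 80),
    "Ferrerez": ("Noakes", 70),
    "Pickworth": ("Noakes", 60),
    "Analyst A": ("Davis", 40),
    "Analyst B": ("Analyst A", 30),
}


def direct_reports(manager):
    """One level only: a single pass over the manager column answers this."""
    return sorted(n for n, (m, _) in STAFF.items() if m == manager)


def all_reports(manager):
    """Every level: repeat the one-level lookup until nobody is left."""
    found = []
    for report in direct_reports(manager):
        found.append(report)
        found.extend(all_reports(report))
    return found


def total_salary(names):
    """Add up the salary of each named person."""
    return sum(STAFF[name][1] for name in names)


DIRECT = direct_reports("Noakes")
INDIRECT = [n for n in all_reports("Noakes") if n not in DIRECT]
print(total_salary(DIRECT), total_salary(INDIRECT))
```

<p>The first function needs only one lookup against the table. The second calls itself an unknown number of times, and that is the part a single alias cannot express.</p>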
<h3>Next Time</h3>
<p>In the next post of this series I want to talk about some of the different challenges encountered when working with recursive data. Once I define the challenges I will be in a position to start talking about solutions. As a preview, here are the four types of hierarchies I will be talking about:</p>
<ul>
<li>Clean &#8211; a hierarchy with clean data, consistent node depths, and consistent node paths</li>
<li>Unbalanced &#8211; a hierarchy with inconsistent node depths</li>
<li>Ragged &#8211; a hierarchy with inconsistent node paths</li>
<li>Lateral &#8211; a hierarchy with sideways node paths</li>
</ul>
<p>If it is not clear what some of those mean, don&#8217;t be too concerned; I will be defining each with examples in the next post.</p>
<p>Finally, here is a preview of the various solutions I will talk about:</p>
<ul>
<li>Universe aliases</li>
<li>Flattened structures (columns or snowflake tables)</li>
<li>Ancestor / Descendant model</li>
<li>Depth first tree traversal</li>
</ul>
<p>And a few that I won&#8217;t:</p>
<ul>
<li>Oracle CONNECT BY PRIOR</li>
<li>Stored procedures</li>
</ul>
<p>Part II of this series will talk in more detail about each of the recursive challenges. After I detail the different challenges the next post will talk about the solutions. My plans for the final post for this series are to review the impact of each solution on the native drilling functionality and then to wrap things up.</p>
<p><strong>Related Links</strong></p>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Bill_of_materials">Wikipedia on Inventory BOM</a> in case you are unfamiliar with the concept of inventory data</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/06/16/universe-models-for-recursive-data-part-i-introduction/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Everything About Shortcut Joins</title>
		<link>http://www.dagira.com/2010/05/27/everything-about-shortcut-joins/</link>
		<comments>http://www.dagira.com/2010/05/27/everything-about-shortcut-joins/#comments</comments>
		<pubDate>Thu, 27 May 2010 11:30:44 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Join Techniques]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=273</guid>
		<description><![CDATA[There have been a number of posts recently in the Semantic Layer forum on BOB about shortcut joins. When will they be used? How many can be used? Why won’t this particular shortcut get used? Do I have to add shortcuts to contexts? Lots of questions.
I am going to try to clear up a couple [...]]]></description>
			<content:encoded><![CDATA[<p>There have been a number of posts recently in the Semantic Layer forum on BOB about shortcut joins. When will they be used? How many can be used? Why won’t this particular shortcut get used? Do I have to add shortcuts to contexts? Lots of questions.</p>
<p>I am going to try to clear up a couple of those questions now. First here is a summary of everything I need to know about shortcuts:</p>
<ul>
<li>Shortcut joins do not provide an alternate path.</li>
<li>Shortcut joins do provide a shorter path.</li>
</ul>
<p>By the end of this post I hope that the reader will understand the difference between those two statements. There are two rules for how and when a shortcut will be applied:</p>
<ul>
<li>A shortcut join will only be used if it eliminates tables from the query.</li>
<li>A shortcut join is applied after the SQL has been generated (meaning after a context selection has been made, if required).</li>
</ul>
<p>I will talk about these two items as well. But first, how do I create a shortcut join in my universe? <span id="more-273"></span></p>
<h3>Creating a Shortcut Join</h3>
<p>Creating a shortcut join is quite simple. All I have to do is double-click the particular join and mark the shortcut attribute.</p>
<p><img src="/tips/shortcut_joins/setting_shortcut.jpg" width="512" height="483" border="0" alt="screenshot of setting up a shortcut join" title="Setting up a shortcut join in a Business Objects universe" /></p>
<p>A shortcut join will appear in my universe structure as a dotted rather than a solid line, as shown here.</p>
<p><img src="/tips/shortcut_joins/shortcut_structure.jpg" width="409" height="223" border="0" alt="screenshot of a shortcut join" title="A Shortcut join in a Business Objects universe" /></p>
<p>The process used to create a shortcut join is quite simple. But was it appropriate to convert the join as shown above into a shortcut? Will it be used?</p>
<h3>Shortcut Path</h3>
<p>The sample shown above is from the Prestige Motors database. There is a direct relationship from the Country table to the Region table, and also from the Region table to the Client table. However there is also a direct relationship from the Country table to the Client table, as the COUNTRY_ID column has been denormalized into the Client record. (If you have never seen this database, a client is a customer.) The question at this point becomes, “Is this a shorter path or an alternate path?”</p>
<p>How do I tell the difference?</p>
<p>I believe the answer is simple. If I get the same result from both join paths then it’s a shortcut. If I get a different answer then it’s an alternate path.</p>
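<p>That test can be run directly against the database. The sketch below uses Python with an in-memory SQLite database as a stand-in for Prestige Motors. Only the table names and the <code>COUNTRY_ID</code> column come from the description above; every other column name and all of the data are assumptions.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COUNTRY (COUNTRY_ID INTEGER, COUNTRY_NAME TEXT);
    CREATE TABLE REGION (REGION_ID INTEGER, COUNTRY_ID INTEGER);
    CREATE TABLE CLIENT (CLIENT_ID INTEGER, CLIENT_NAME TEXT,
                         REGION_ID INTEGER, COUNTRY_ID INTEGER);
    INSERT INTO COUNTRY VALUES (1, 'UK'), (2, 'USA');
    INSERT INTO REGION VALUES (10, 1), (20, 2);
    -- COUNTRY_ID on CLIENT is a denormalized copy of the region's country.
    INSERT INTO CLIENT VALUES (100, 'Smith', 10, 1), (200, 'Jones', 20, 2);
""")

# Country to Client by way of Region.
LONG_PATH = """
    SELECT COUNTRY.COUNTRY_NAME, CLIENT.CLIENT_NAME
    FROM COUNTRY
    JOIN REGION ON REGION.COUNTRY_ID = COUNTRY.COUNTRY_ID
    JOIN CLIENT ON CLIENT.REGION_ID = REGION.REGION_ID
    ORDER BY CLIENT.CLIENT_ID
"""
# Country straight to Client, skipping Region.
SHORT_PATH = """
    SELECT COUNTRY.COUNTRY_NAME, CLIENT.CLIENT_NAME
    FROM COUNTRY
    JOIN CLIENT ON CLIENT.COUNTRY_ID = COUNTRY.COUNTRY_ID
    ORDER BY CLIENT.CLIENT_ID
"""

long_rows = conn.execute(LONG_PATH).fetchall()
short_rows = conn.execute(SHORT_PATH).fetchall()
print(long_rows == short_rows)
```

<p>As long as the denormalized column is kept consistent with the Region table, both paths return the same rows, and the direct join qualifies as a shortcut.</p>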
<h3>Alternate Path</h3>
<p>In the Prestige Motors database the COUNTRY_ID exists in the Showroom table as well as the Client and Region tables. I could create joins like this:</p>
<p><img src="/tips/shortcut_joins/showroom_join.jpg" width="601" height="367" border="0" alt="screenshot of the showroom join in the motors universe" title="Showroom joins from the Prestige Motors universe" /></p>
<p>In this case I have a join from Country to Client. I also have a join from Country to Showroom. Because of the relationship with the Sales table I now have a loop in my structure. By changing one of the two Country joins to a shortcut I can avoid the loop, like this:</p>
<p><img src="/tips/shortcut_joins/showroom_shortcut.jpg" width="601" height="367" border="0" alt="screenshot of the shortcut join to the showroom table" title="A Shortcut join from Country to Showroom" /></p>
<p>What has happened here? I have eliminated the loop from my structure and solved that problem, right? Perhaps, but it is the wrong solution. In this case, the shortcut is an alternate path rather than a shorter path. I can tell because (as I mentioned earlier) I will not get the same results from the two queries.</p>
<p>Suppose I want to combine Country, Showroom, and Sales. The longer path looks like this:</p>
<p><img src="/tips/shortcut_joins/join_path_1.jpg" width="601" height="367" border="0" alt="screenshot of the longer join path" title="Standard join path from Country to Showroom via the Sales table in the Prestige Motors universe" /></p>
<p>When I execute a query using this set of joins, I will get a list of showrooms that have had sales to customers, and I will get the country where the customer is located. Next I will run a query against this shorter path:</p>
<p><img src="/tips/shortcut_joins/join_path_2.jpg" width="601" height="367" border="0" alt="screenshot of the shorter join path" title="Shortcut join path from Country to Showroom in the Prestige Motors universe" /></p>
<p>When I execute a query using this path, I will get a completely different result set. I will get a list of showrooms, their sales, and the country where they are located. The customer tables never come into play, so I get a completely different result set. This is my indication that I do not have a proper shortcut join definition.</p>
<p>And interestingly enough, Web Intelligence will never use the shortcut for the query outlined above! It is smart enough to realize that the shortcut is not properly defined as it does not truly present a shorter path. It is an alternate path, and that’s not a valid application of a shortcut.</p>
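<p>The same kind of test shows why the Showroom join fails as a shortcut. This Python and SQLite sketch is again a stand-in: the table layouts, the <code>SALES</code> column names, and the single sale from a UK showroom to a client in the USA are all invented to make the difference visible.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COUNTRY (COUNTRY_ID INTEGER, COUNTRY_NAME TEXT);
    CREATE TABLE CLIENT (CLIENT_ID INTEGER, COUNTRY_ID INTEGER);
    CREATE TABLE SHOWROOM (SHOWROOM_ID INTEGER, SHOWROOM_NAME TEXT,
                           COUNTRY_ID INTEGER);
    CREATE TABLE SALES (SALE_ID INTEGER, CLIENT_ID INTEGER, SHOWROOM_ID INTEGER);
    INSERT INTO COUNTRY VALUES (1, 'UK'), (2, 'USA');
    INSERT INTO CLIENT VALUES (100, 2);            -- a client located in the USA
    INSERT INTO SHOWROOM VALUES (5, 'London', 1);  -- a showroom located in the UK
    INSERT INTO SALES VALUES (900, 100, 5);        -- that showroom sells to that client
""")

# Longer path: the country is reached through the client.
VIA_CLIENT = """
    SELECT SHOWROOM.SHOWROOM_NAME, COUNTRY.COUNTRY_NAME
    FROM SHOWROOM
    JOIN SALES ON SALES.SHOWROOM_ID = SHOWROOM.SHOWROOM_ID
    JOIN CLIENT ON CLIENT.CLIENT_ID = SALES.CLIENT_ID
    JOIN COUNTRY ON COUNTRY.COUNTRY_ID = CLIENT.COUNTRY_ID
"""
# Shorter path: the country is reached straight from the showroom.
VIA_SHOWROOM = """
    SELECT SHOWROOM.SHOWROOM_NAME, COUNTRY.COUNTRY_NAME
    FROM SHOWROOM
    JOIN SALES ON SALES.SHOWROOM_ID = SHOWROOM.SHOWROOM_ID
    JOIN COUNTRY ON COUNTRY.COUNTRY_ID = SHOWROOM.COUNTRY_ID
"""

via_client_rows = conn.execute(VIA_CLIENT).fetchall()
via_showroom_rows = conn.execute(VIA_SHOWROOM).fetchall()
print(via_client_rows, via_showroom_rows)
```

<p>The two queries answer different questions (where the customer is versus where the showroom is), so the rows differ as soon as a showroom sells across a border.</p>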
<p>As a brief aside: How should this particular loop be resolved? Obviously a shortcut is not the answer.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/05/27/everything-about-shortcut-joins/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Fixing Report Path For Adobe PDF Viewers</title>
		<link>http://www.dagira.com/2010/03/23/fixing-report-path-for-adobe-pdf-viewers/</link>
		<comments>http://www.dagira.com/2010/03/23/fixing-report-path-for-adobe-pdf-viewers/#comments</comments>
		<pubDate>Tue, 23 Mar 2010 16:01:42 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=248</guid>
		<description><![CDATA[We are using the OpenDocument() function to &#8220;drill&#8221; from one document to another. In order to make report maintenance easier I have created some objects in the universe that contain the proper syntax for the URL required to access this function, as well as one that contains the report path. This way even if we [...]]]></description>
			<content:encoded><![CDATA[<p>We are using the <code>OpenDocument()</code> function to &#8220;drill&#8221; from one document to another. In order to make report maintenance easier I have created some objects in the universe that contain the proper syntax for the URL required to access this function, as well as one that contains the report path. This way even if we change our folder names or structure I can change the universe and do not have to update every report on the project. This has worked very well for us.</p>
<p>Until my current project.</p>
<p>On this project the primary distribution channel was PDF sent via email. Our users said that the links were not working. And of course every time I tested by logging in to Infoview the links worked just fine. After further investigation by another team member, it seems that our Report Path (in the format <code>[Folder],[Sub Folder]</code>) was being truncated at the comma. As a result, the <code>OpenDocument()</code> function was looking for the reports in <code>[Folder]</code> and ignoring the full path. That was a bit of a problem. <span id="more-248"></span></p>
<p>As mentioned, our Report Path is stored as an object in the universe. To avoid extra work on each report I had encoded the space in &#8220;Sub Folder&#8221; as Sub%20Folder when I created the object. Without this encoding the URL would not function as required. The other characters such as [ and ] and the , between the folder names were all presented as-is with no encoding. For one project that distributed and viewed their files purely through Infoview this worked great. But the URL was being truncated when sent to PDF. One small difference was that our folder structure for this project used _ instead of a space in the folder names, but it was all working in Infoview, so it must be okay, right?</p>
<p>Wrong. But it wasn&#8217;t the _ that was the problem.</p>
<p>Initially we thought that there was a problem with the PDF generation and investigated that path. However, the solution was ultimately discovered by <a href="http://www.linkedin.com/pub/brian-durning/4/a79/aa3">Brian Durning</a>. It seems that while Infoview was okay with a compound path <code>[Folder],[Sub_Folder]</code> for Adobe the comma had to be encoded in order to work. If not, the comma was taken as part of the data and Adobe stopped looking for additional path information at that point.</p>
<p>I updated the Report Path object from this:</p>
<p><code>'[Folder],[Sub_Folder]'</code></p>
<p>to this:</p>
<p><code>'[Folder]%2C[Sub%5FFolder]'</code></p>
<p>As an aside, these objects do not parse since they are just text strings. %2C is the hexadecimal code for a comma and %5F is the code for the _ character. Once this fix was in place all of our Adobe links worked. And if you are curious, our Infoview links continued to work just fine.</p>
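<p>For anyone who wants to check the codes, here is a small Python sketch that builds the encoded path and decodes it again. The <code>encode_report_path</code> and <code>percent_code</code> helpers are hypothetical, and the list of characters to encode is simply the set mentioned in this post.</p>

```python
from urllib.parse import unquote

# Characters to percent-encode in the path; chosen to match the fix described above.
ENCODE = {",": "%2C", "_": "%5F", " ": "%20"}


def encode_report_path(path):
    """Replace each listed character with its percent-encoded form."""
    return "".join(ENCODE.get(ch, ch) for ch in path)


def percent_code(ch):
    """Build the code from the character's ASCII value in hexadecimal."""
    return "%{:02X}".format(ord(ch))


ENCODED = encode_report_path("[Folder],[Sub_Folder]")
print(ENCODED)
print(unquote(ENCODED))
```

<p>The sketch uses an explicit table because Python&#8217;s own <code>urllib.parse.quote</code> leaves the underscore alone; it only encodes characters that are not considered safe in a URL.</p>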
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/03/23/fixing-report-path-for-adobe-pdf-viewers/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>SORT_BY_NO=NO? Very Confusing&#8230;</title>
		<link>http://www.dagira.com/2010/03/04/sort_by_nono-very-confusing/</link>
		<comments>http://www.dagira.com/2010/03/04/sort_by_nono-very-confusing/#comments</comments>
		<pubDate>Thu, 04 Mar 2010 18:33:25 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=245</guid>
		<description><![CDATA[This has to be the parameter with the worst. Name. Ever. But let me start at the beginning.
Some databases require you to use actual column names in an ORDER BY clause. Like this:
select first_name, last_name, phone
from employee
order by last_name, first_name
Other databases let you take a shorter approach and sort by the position of the column [...]]]></description>
			<content:encoded><![CDATA[<p>This has to be the parameter with the worst. Name. Ever. But let me start at the beginning.</p>
<p>Some databases require you to use actual column names in an ORDER BY clause. Like this:</p>
<p><code>select first_name, last_name, phone<br />
from employee<br />
order by last_name, first_name</code></p>
<p>Other databases let you take a shorter approach and sort by the position of the column in the select clause, like this:</p>
<p><code>select first_name, last_name, phone<br />
from employee<br />
order by 2, 1</code></p>
<p>To be honest, I don&#8217;t like the shortcut. I would rather see explicit column names in my order by because that way I know exactly what is being sorted without having to refer back to the select clause. Another advantage is that if the objects in my select ever change, my order by is not affected.</p>
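<p>Both points are easy to demonstrate. The sketch below runs the two queries through Python against an in-memory SQLite database, which accepts either form. The employee rows are invented.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (first_name TEXT, last_name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ann", "Young", "555-0101"), ("Bob", "Adams", "555-0102"),
     ("Al", "Young", "555-0103")],
)

by_name = conn.execute(
    "select first_name, last_name, phone from employee "
    "order by last_name, first_name"
).fetchall()
by_position = conn.execute(
    "select first_name, last_name, phone from employee order by 2, 1"
).fetchall()

# Reordering the select list silently changes what "2, 1" refers to.
reordered = conn.execute(
    "select phone, first_name, last_name from employee order by 2, 1"
).fetchall()

print(by_name == by_position)
print([row[1] for row in reordered])
```

<p>The first two queries return identical rows. The third keeps the same <code>order by 2, 1</code> but now sorts by first name and then phone, because the select list changed underneath it.</p>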
<p>There is a parameter found in the .PRM file for each database named SORT_BY_NO. When you see that name, what do you think it is? Every time I see it I assume that it is used to determine whether the SQL will contain numbers in the ORDER BY clause like <code>order by 2, 1</code> instead of <code>order by last_name, first_name</code>. But that&#8217;s not what it does at all. Instead of doing what I described above, this parameter is used to determine if a query can be sorted by a column that does not appear in the select clause. That makes sense, doesn&#8217;t it? <img src='http://www.dagira.com/wp-includes/images/smilies/icon_rolleyes.gif' alt=':roll:' class='wp-smiley' />  It should be called SORT_BY_IN_SELECT or something. But it&#8217;s not, and here&#8217;s how it works.<span id="more-245"></span></p>
<h3>Sorting By &#8220;Something Else&#8221;</h3>
<p>I have a period calendar table where the period names are P01, P02, and so on. The years are fiscal years 2009, 2010, 2011, and on from there as you would expect. The user expects to pick a period and year combination. However, they want to see it in that order&#8230; period, and then year. As a designer I can easily combine the two values together in the format PPP YYYY with a concatenation operation. But then the LOV displays in alphabetical rather than chronological order. So I see this:</p>
<p>P01 2008<br />
P01 2009<br />
P01 2010<br />
P01 2011<br />
P02 2008<br />
P02 2009<br />
&#8230;</p>
<p>Instead of this:</p>
<p>P01 2008<br />
P02 2008<br />
P03 2008<br />
&#8230;<br />
P01 2009<br />
P02 2009<br />
P03 2009<br />
&#8230;</p>
<p>This is not what the user expects or requires, but is easily solved by editing the LOV query and adding a custom ORDER BY clause like this:</p>
<p><code>select period || ' ' || year<br />
from fiscal_calendar<br />
order by period_start_date</code></p>
<p>Sorting by the period start date would cause the alphabetical list to be sorted chronologically instead. However, doing any sort of manual editing &#8211; even in a simple LOV query &#8211; is something I want to avoid. Any time I have to click the &#8220;do not regenerate SQL&#8221; option it leaves me open for problems later on. I could add the start date to my query and sort it normally. However, I don&#8217;t want to do that as it would clutter the display.</p>
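<p>Here is the effect of that ORDER BY in a form that can be run anywhere. The sketch uses Python with an in-memory SQLite database, which happens to allow sorting by a column that is not in the select. The table and column names follow the query above; the start dates are invented.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fiscal_calendar (period TEXT, year INTEGER, period_start_date TEXT)"
)
conn.executemany(
    "INSERT INTO fiscal_calendar VALUES (?, ?, ?)",
    [
        ("P01", 2009, "2009-01-01"),
        ("P02", 2008, "2008-02-01"),
        ("P01", 2008, "2008-01-01"),
        ("P02", 2009, "2009-02-01"),
    ],
)

# Sorting by the displayed text gives the alphabetical list.
alphabetical = [row[0] for row in conn.execute(
    "select period || ' ' || year from fiscal_calendar order by 1"
)]
# Sorting by a column that is not selected gives the chronological list.
chronological = [row[0] for row in conn.execute(
    "select period || ' ' || year from fiscal_calendar order by period_start_date"
)]

print(alphabetical)
print(chronological)
```

<p>The displayed value is the same in both lists. Only the sort key differs, and in the second query that key never appears in the output.</p>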
<h3>Setting SORT_BY_NO=NO</h3>
<p>This parameter is found in the .PRM file that belongs to the database engine referenced by a universe. In the old days the format of the .PRM file was the same as that found in a Windows .INI file. Today they use an XML structure instead. By default the SORT_BY_NO parameter is set to YES, so the line in the file looks like this:</p>
<p><code>&lt;Parameter Name="SORT_BY_NO"&gt;YES&lt;/Parameter&gt;</code></p>
<p>Clear as mud, yes? no? <img src='http://www.dagira.com/wp-includes/images/smilies/icon_lol.gif' alt=':lol:' class='wp-smiley' />  What it means is that yes, it is true; I cannot sort by something that does not appear in the select clause.</p>
<p>First I need to determine if my database allows me to sort by a column that does not appear in the select; Oracle and Teradata both do and I imagine others do as well. Changing this parameter won&#8217;t do me any good if the database does not support the technique. Next, I can find this file on my computer where I have Designer installed. The actual location will vary based on the installed path. The file name will be DBNAME.PRM, or in my case <code>teradata.prm</code>. I opened the file with a simple text editor, found the line shown above, and changed the value from YES to NO. It&#8217;s now a double-negative. It says, if I can paraphrase:</p>
<p>&#8220;It is NOT TRUE that I CANNOT sort by something NOT in the select&#8221;</p>
<p>or rather</p>
<p>&#8220;It is TRUE that I CAN sort by something NOT in the select&#8221;</p>
<p>Very clear, I am sure.</p>
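<p>I made the change by hand in a text editor, but the same edit could be scripted. The Python sketch below is only an illustration: it builds a tiny stand-in document in memory instead of opening a real .PRM file. Only the Parameter element and its Name attribute come from the line shown earlier; the <code>Wrapper</code> root element and the <code>set_parameter</code> helper are invented.</p>

```python
import xml.etree.ElementTree as ET

# Build a cut-down stand-in for a .PRM file. Only the Parameter element matches
# the real file; the Wrapper root is invented so there is something to search.
root = ET.Element("Wrapper")
param = ET.SubElement(root, "Parameter", {"Name": "SORT_BY_NO"})
param.text = "YES"


def set_parameter(tree_root, name, value):
    """Set the text of every Parameter element whose Name attribute matches."""
    changed = 0
    for element in tree_root.iter("Parameter"):
        if element.get("Name") == name:
            element.text = value
            changed += 1
    return changed


CHANGED = set_parameter(root, "SORT_BY_NO", "NO")
RESULT = ET.tostring(root, encoding="unicode")
print(CHANGED, RESULT)
```

<p>Against a real file the in-memory document would be replaced by <code>ET.parse()</code> on the file path, followed by a call to the tree&#8217;s <code>write()</code> method. I have not tested that against an actual .PRM file, so treat it as a starting point.</p>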
<h3>The Results</h3>
<p>Before this change was made the &#8220;manage sorts&#8221; button on the query panel for editing LOV definitions in Designer was never available. After making this change, saving the file, and restarting Designer, I can now click this button.</p>
<p><img src="/tips/sort_by_no/toolbar.jpg" alt="toolbar image" title="Sort by button on query panel toolbar" border="0" width="243" height="49" /></p>
<p>When I click that button I get a list of objects from my universe. These objects do not have to appear in the select clause but can now appear in the sort clause. Problem solved.</p>
<p>But only after I set SORT_BY_NO equal to NO rather than YES in my parameter file.</p>
<p>By finally writing this down as a blog post, I hope that I will remember this the next time I get a new laptop and won&#8217;t have to spend time searching for the parameter setting that allows me to do this. I hope it helps someone else as well, but mainly this one is just for me. It happens that way sometimes. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_cool.gif' alt='8-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/03/04/sort_by_nono-very-confusing/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Want To Crash Teradata? Give It Some LOV&#8230;</title>
		<link>http://www.dagira.com/2010/02/26/want-to-crash-teradata-give-it-some-lov/</link>
		<comments>http://www.dagira.com/2010/02/26/want-to-crash-teradata-give-it-some-lov/#comments</comments>
		<pubDate>Fri, 26 Feb 2010 14:27:46 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=240</guid>
		<description><![CDATA[Five easy steps to crash your Teradata system:

Step 1: Upgrade to Teradata version 13
Step 2: Recognize that with this version a &#8220;distinct&#8221; query no longer returns sorted results
Step 3: On the advice of Teradata, reconfigure your box with the &#8220;regression&#8221; parameter that makes distinct queries behave the way they did in 6.2
Step 4: Send a [...]]]></description>
			<content:encoded><![CDATA[<p>Five easy steps to crash your Teradata system:</p>
<ul>
<li>Step 1: Upgrade to Teradata version 13</li>
<li>Step 2: Recognize that with this version a &#8220;distinct&#8221; query no longer returns sorted results</li>
<li>Step 3: On the advice of Teradata, reconfigure your box with the &#8220;regression&#8221; parameter that makes distinct queries behave the way they did in 6.2</li>
<li>Step 4: Send a Business Objects LOV query to the database that includes a DISTINCT keyword and a where clause with a couple of constant values</li>
<li>Step 5: Watch the system reboot</li>
</ul>
<p>That&#8217;s about what happened to us a few days ago. It wasn&#8217;t pretty. It took a long time to get our production box upgraded (and this after seeing development and Q/A roll through the upgrades with flying colors). Once the upgrade was finally completed, we had catch-up work as far as batch processing to do. Once that was complete the users got back into the system&#8230; only to see it sporadically reboot.</p>
<p>With a personal computer or laptop, a sporadic reboot is often a loose connection or faulty piece of hardware. We had not experienced anything like this on our database servers. Ultimately someone figured out that the following query was at fault:</p>
<p><code>select DISTINCT table.column FROM table WHERE table.column in ('A','B')</code></p>
<p>That&#8217;s a fairly innocuous query, isn&#8217;t it? At first someone thought the table was corrupt. Nope, it checks out fine. Next someone suggested that the data in the table was bad. Nope, I can query it just fine. Then we thought maybe the fact that there were some special characters in the where clause was the problem. Nope, they work fine too. Finally it was narrowed all the way down to the fact that we had a DISTINCT clause with the where clause and the regression parameter set on our database. <span id="more-240"></span></p>
<p>Once we had identified the specific cause, there were three factors to consider. First, we needed the where clause on the LOV in order to deliver the proper list of choices to the business, so that could not really change. Second, the Teradata DBA team had set a parameter to make TD13 work like TD6.2 as far as providing a &#8220;distinct&#8221; and sorted LOV result. Could we work with that instead? Third, if I removed the &#8220;distinct&#8221; from the LOV query then Teradata no longer rebooted. However, the results of the LOV were no longer sorted, and that presented a new challenge.</p>
<h3>Universe Parameters To The Rescue</h3>
<p>There are a couple of universe parameters that were used to solve this issue. Side note: it was back in 6.0 (I believe) that the ability to override universe parameter settings within the Designer application first appeared. Prior to that version any parameter changes were applied via a configuration file. Any changes made then affected every universe that used that parameter file (there was a different one for each database). Today I can change individual universes, which is good. So what changes did I make?</p>
<h3>Universe Parameter: DISTINCT</h3>
<p>The first step was to change the setting for the DISTINCT parameter. By default this parameter value is set to DISTINCT, which seems redundant. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  It essentially becomes DISTINCT=DISTINCT. However there is another option, that being GROUPBY. Note that there are no underscores or spaces in that phrase. With the setting updated to DISTINCT=GROUPBY my LOV queries no longer included the &#8220;distinct&#8221; keyword at the top of the query. Instead of this:</p>
<p><code>select DISTINCT table.column from table</code></p>
<p>I see this:</p>
<p><code>select table.column from table group by 1</code></p>
<p>There is another parameter that controls whether I see <code>group by 1</code> or <code>group by table.column</code> and that depends on whether a database supports that syntax or not.</p>
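<p>Applied to the problem query from earlier, the rewrite would look something like this. This is only a sketch using the same placeholder names, and it assumes the where clause passes through unchanged:</p>
<pre><code>-- before: DISTINCT=DISTINCT
select DISTINCT table.column from table where table.column in ('A','B')

-- after: DISTINCT=GROUPBY
select table.column from table where table.column in ('A','B') group by 1</code></pre>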
<p>This change solved the first half of the issue. Without a &#8220;distinct&#8221; the LOV queries would no longer cause the server to reboot. This was certainly a positive step. However, making this change had an important side effect: the LOV results were unique because of the GROUP BY clause, but they were no longer sorted.</p>
<h3>Universe Parameter: FORCE_SORTED_LOV</h3>
<p>The second parameter that I changed was FORCE_SORTED_LOV. The default value is <code>No</code> and I changed it to <code>Yes</code> instead. The theory behind this parameter is simple: it would force the LOV results to be sorted. Before I changed the parameter my LOV query looked like this:</p>
<p><code>select table.column from table group by 1</code></p>
<p>After applying this change, my LOV query looked like this:</p>
<p><code>select table.column from table group by 1</code></p>
<p>Hm. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_eek.gif' alt=':shock:' class='wp-smiley' />  Not what I expected. I was looking for an ORDER BY clause at the end of the query and I was not seeing one.</p>
<p>I went ahead and saved the universe and exported it to the Q/A system. We ran a few test queries (including one using the problem LOV definition mentioned above) and everything worked fine. Short LOV results were sorted. Long LOV result sets were also sorted, and they were still &#8220;paged&#8221; like I expected them to be. It still bothered me that I wasn&#8217;t seeing an ORDER BY clause in the LOV query definition, but obviously the sort was happening somewhere. We speculated that perhaps the Web Intelligence server was applying the sort which raised some concerns. LOV performance definitely suffered during the upgrade from 6.5 to XI and I didn&#8217;t want to introduce anything that would cause it to degrade further.</p>
<p>Fortunately one of the Teradata DBA team found the LOV queries in the logs, and despite the fact that no ORDER BY was showing up in the SQL when I viewed it via Designer, it was really there by the time Teradata got the query request.</p>
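<p>The exact text from the logs is not reproduced here, so the following is only an assumed shape rather than a quote. The request that reached Teradata presumably had a sort appended to what Designer displayed:</p>
<pre><code>-- what Designer displayed
select table.column from table group by 1

-- roughly what the database received (the form of the ORDER BY is an assumption)
select table.column from table group by 1 order by 1</code></pre>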
<h3>Wrap Up</h3>
<p>Ultimately the two universe parameters (DISTINCT and FORCE_SORTED_LOV) were used to change how LOV queries were generated, which allowed the Teradata DBA team to revert back to a standard version 13 installation (without the regression parameter turned on). I imagine that the Teradata engineering team is busy working on the bug and it will be fixed soon, so if you are using Teradata and looking at a version 13 upgrade at some point later this year you probably won&#8217;t encounter the same issue.</p>
<p>But it&#8217;s nice to know that we can help solve the issue by changing a few parameters on the universe side.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2010/02/26/want-to-crash-teradata-give-it-some-lov/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Foodmart 2000 Universe Review &#8211; Part I: Introduction</title>
		<link>http://www.dagira.com/2009/12/23/foodmart-2000-universe-review-part-i-introduction/</link>
		<comments>http://www.dagira.com/2009/12/23/foodmart-2000-universe-review-part-i-introduction/#comments</comments>
		<pubDate>Wed, 23 Dec 2009 18:54:23 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[Foodmart]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=207</guid>
		<description><![CDATA[Earlier this year I attended SAP TechEd 2009. Many of their sessions were lecture only, but they also provided a number of two or four-hour hands-on sessions. I selected one specific session in order to learn about improvements in the process used to build universes against SAP data sources like BEx queries. But of course [...]]]></description>
			<content:encoded><![CDATA[<p>Earlier this year I attended SAP TechEd 2009. Many of their sessions were lecture only, but they also provided a number of two or four-hour hands-on sessions. I selected one specific session in order to learn about improvements in the process used to build universes against SAP data sources like BEx queries. But of course I could not leave it at that. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  I got to the session a bit early and started poking around on the laptop to see if I could get some hints as to what we were going to cover. While poking around I found a universe named &#8220;Foodmart&#8221; so I opened it. It was&#8230; interesting.<span id="more-207"></span></p>
<p>Regular blog readers might remember I had posted something about a <a href="http://www.dagira.com/2009/08/01/news-post-introducing-the-dagira-group/">&#8220;universe review service&#8221;</a> that I was thinking of offering. I have not posted anything further about that service, mainly because I <a href="http://www.dagira.com/2009/09/08/news-post-september-2009/">got a real job instead</a>. But since this universe is a new one (at least it was new to me, I had not seen it used in training sessions or demonstrations before) and since it has a number of interesting mistakes, I decided to go ahead and talk about that process. I will use this universe as my subject and along the way talk about various things that I make sure to cover during a universe review. I got permission to take a copy of the universe from the session laptop and placed it on the memory stick that they had provided with all of the session presentations. (That was handy.) Unfortunately the connection was pointing to a Microsoft SQL Server database so I could hardly grab a copy of that.</p>
<p>Later on I asked around to see if anyone knew what the Foodmart universe was used for. One of the responses I got back included a link to another blog where they had a Microsoft Access database that matched the structure from the universe perfectly. After downloading (and testing) that file I now have a copy of the universe and the matching database. I can work with that. </p>
<p>After further investigation it seems that the Foodmart database is yet another demonstration database, much like the Northwind or Summit Sporting Goods or (dare I say it) Island Resorts Marketing. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />  There should not be any license issues with distributing or using this database.</p>
<p>Since I expect that this universe will be new to most folks I decided to use it for this series of posts detailing the process I go through to review a universe. I am going to start the more detailed blog posts next year but figured that I would go ahead and post the universe and the associated database for folks to be able to download now. Download links are at the end of the post.</p>
<h3>And Now, The Rules</h3>
<p>I really hope that this series of posts will be useful. In order to provide the most effective use of these materials, I have a few ground rules.</p>
<ul>
<li><strong>No spoilers in comments please</strong><br />
I expect that many blog readers who download the universe and start looking for mistakes are going to find some. I found many in less than five minutes, and I expect no less from some of you. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Please don&#8217;t spoil the process by posting comments about additional mistakes that I have not covered yet. I have a specific order that I want to cover things in and spoilers will, well, they&#8217;ll spoil the process.</li>
<li><strong>Please keep questions on topic</strong><br />
This is a standard request that I make for all blog posts but I wanted to repeat it here.</li>
<li><strong>Don&#8217;t ask if you can send me your universe</strong><br />
Unfortunately I cannot review your universe for you for free (or for a fee) right now. I simply do not have the time to perform this service at the moment.</li>
</ul>
<p>That&#8217;s it. I hope this will prove to be a valuable series of posts for both new and experienced designers, and these few requests will help that happen.</p>
<p>You can follow all of the posts related to the Foodmart universe review using the <a href="http://www.dagira.com/category/design/foodmart/">&#8220;foodmart&#8221;</a> category tag.</p>
<h3>Download Links</h3>
<p>Here are the links to download the database and universe. This universe was created in non-secured mode which means anyone should be able to open it. You do not even have to have a CMS running in order to use this universe; you should be able to switch to <strong>Standalone (no CMS)</strong> mode on the Designer login screen and still be able to use this resource. The database is in Microsoft Access format and is over 20MB when unzipped.</p>
<ul>
<li><a href="/tips/foodmart_download/foodmart_2000.zip">Microsoft Access Database (zipped &#8211; 8.37MB)</a></li>
<li><a href="/tips/foodmart_download/foodmart.zip">Business Objects Foodmart Universe (zipped &#8211; 37KB)</a></li>
</ul>
<p>I am not going to cover how to set up the database and connection. I don&#8217;t want to sound rude but if you don&#8217;t already know how to do these basic steps then the discussions that follow this one are going to be way over your head. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>This will be my final blog post for 2009. Happy holidays, and I will see you next year. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_cool.gif' alt='8-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2009/12/23/foodmart-2000-universe-review-part-i-introduction/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Time Sliced Measures Part III: Making Measures</title>
		<link>http://www.dagira.com/2009/12/17/time-sliced-measures-part-iii-making-measures/</link>
		<comments>http://www.dagira.com/2009/12/17/time-sliced-measures-part-iii-making-measures/#comments</comments>
		<pubDate>Fri, 18 Dec 2009 03:52:29 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[2008 GBN - Dallas]]></category>
		<category><![CDATA[Universe Contexts]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=187</guid>
		<description><![CDATA[In the first post in this series I defined what time-sliced measures are and why they can be useful in a universe. In the second post I described a special calendar table that was designed and built to support the requirements for this solution. I also showed how the join logic worked in conjunction with [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a href="http://www.dagira.com/2009/08/08/time-sliced-measures-part-i-defining-the-problem/">first post in this series</a> I defined what time-sliced measures are and why they can be useful in a universe. In the <a href="http://www.dagira.com/2009/08/28/time-sliced-measures-part-ii-time-slice-calendar-table/">second post</a> I described a special calendar table that was designed and built to support the requirements for this solution. I also showed how the join logic worked in conjunction with the table design. This post completes the implementation. I am finally going to work on the measure objects that a user will see. </p>
<p>In any universe design project I strive for the following goals:</p>
<ul>
<li><strong>Deliver the correct result</strong><br />
In my opinion, this is always the number one goal in any universe design.</li>
<li><strong>User friendly</strong><br />
This is quite important but secondary to correctness.</li>
<li><strong>Easy to maintain</strong><br />
Universe maintenance is always allowed to suffer in order to provide the first two attributes on this list, but it is a worthwhile goal to strive for nonetheless.</li>
</ul>
<p>In this post I will show how all three of these goals are ultimately met by this implementation. When I am done I will have a completed universe. <em>This post will cover slides 26 through 30 from my 2008 GBN Conference presentation. There is a link to download the file at the end of this post.</em> <span id="more-187"></span></p>
<h3>A Brief Recap</h3>
<p>The overall goals of this project were listed in the prior posts in this series. The most important ones for this post are:</p>
<ul>
<li>Each report is expected to have multiple time-sliced measures</li>
<li>The process of splitting each time-slice time period into its own SQL statement should be completely transparent</li>
</ul>
<p>Current Year and Prior Year are different time slice attributes. Every report is expected to have month-to-date and year-to-date measures that cover both the current and the prior year. That means I expect to see four different measures used in each document. The month and year date ranges are obviously different, as are the current and prior year ranges. In the last post on this topic I described a special time-slice calendar table and set up different aliases for each required time-slice. I set up contexts to keep queries from hitting more than one time-slice at the same time.</p>
<p>That&#8217;s how we got here. Where to next? It&#8217;s time to build some measures.</p>
<h3>Bits and Pieces</h3>
<p>I subscribe to the thought that anything that makes the universe designer&#8217;s job easier while making the user&#8217;s life harder is the wrong approach. However, if I can do something to make my life easier without impacting the user, then whatever I come up with is fair game. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  That&#8217;s how I arrived at the strategy that I am going to outline next.</p>
<p>Normally a &#8220;bit&#8221; is either a one or a zero. Many of the bit objects used in this solution will be just that. However, I am also going to create some special measure bits that make use of the @Select() function. As a general rule I am not in favor of using this function, but in this case I will and I will justify why in a bit. (heh, pun <img src='http://www.dagira.com/wp-includes/images/smilies/icon_razz.gif' alt=':-P' class='wp-smiley' />  )</p>
<p>I am going to focus on creating time-sliced objects for the revenue object. In order to do this, I am going to create a measure bit called &#8220;Revenue&#8221; that contains only the table / columns required to calculate this value. I am <strong>not going to include an aggregation function for this measure</strong> which violates another one of my universe design rules. Again, I will justify why. The other bit objects will be used to select (or mark) which context is used for each time slice.</p>
<h3>Using Measure Bits</h3>
<p>In my time-sliced solution everything is made up of combinations. Imagine a cartesian product between measures and time periods, something like this:</p>
<p><img src="/tips/time_slice_part_iii/cartesian_measures.jpg" /></p>
<p>Each time-slice on the left is combined with each measure on the right. Rather than create six different references to the revenue formula shown here&#8230;</p>
<p><code>INVOICE_LINE.DAYS * INVOICE_LINE.NB_GUESTS * SERVICE.PRICE</code></p>
<p>&#8230;I am going to create just one. Notice that the code shown above does not have an aggregate function? This is by design&#8230; I am going to use the @Select() function to build my visible measure objects. Here&#8217;s the initial part of the object definition for the CY MTD Revenue object.</p>
<p><code>sum(@Select(Measure Bits\Revenue))</code></p>
<p>By leaving the aggregate function out of the source measure I can also create a minimum, maximum, or other aggregate version of the sales revenue. This is because the code referenced by the @Select() doesn&#8217;t have an aggregate&#8230; and why is this important? Because I <strong>can&#8217;t nest aggregate functions</strong>. If my source measure included the sum() function then that&#8217;s all I can do with it. By keeping it generic (no aggregate) I can reuse it with any aggregate function as needed. So that&#8217;s why I am breaking the rule that says every measure must have an aggregate function.</p>
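<p>Here is a sketch of that reuse. The first line is the object definition shown above; the min and max versions are hypothetical objects that could be built the same way:</p>
<pre><code>sum(@Select(Measure Bits\Revenue))
min(@Select(Measure Bits\Revenue))
max(@Select(Measure Bits\Revenue))

-- the first one resolves to a single, valid aggregate:
sum(INVOICE_LINE.DAYS * INVOICE_LINE.NB_GUESTS * SERVICE.PRICE)

-- had the measure bit already contained sum(), the max version
-- would resolve to a nested aggregate, which is not allowed:
max(sum(INVOICE_LINE.DAYS * INVOICE_LINE.NB_GUESTS * SERVICE.PRICE))</code></pre>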
<p>Ultimately the measure bits are going to be hidden from view so they will never be viewable / usable by report writers. They are only used as building blocks in the Designer application.</p>
<h3>Time Slice Bits</h3>
<p>Now I need to build some time-sliced bits. These are really simple. Here&#8217;s what the object definition looks like for the CY MTD bit object.</p>
<p><img src="/tips/time_slice_part_iii/time_bit_definition.jpg" /></p>
<p>The select clause contains the number 1 and that&#8217;s it. Notice what the help text says? In big CAPITAL letters it says &#8220;DON&#8217;T FORGET THE TABLE!!!&#8221; I wonder what that means? <img src='http://www.dagira.com/wp-includes/images/smilies/icon_lol.gif' alt=':lol:' class='wp-smiley' /> </p>
<p>Here&#8217;s what it means. The 1 is just a placeholder to put something in the Select clause. The Where clause is empty. The From clause comes from the table list, and here&#8217;s what that looks like.</p>
<p><img src="/tips/time_slice_part_iii/table_reference.jpg" /></p>
<p>Stick with me just a little bit longer and this will all make sense. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Remember that in the earlier posts I set up aliases for my time-slice calendar table. Each alias became part of its own context. A context is a series of joins that make up a particular path through a universe. What I am working towards is a set of time-sliced objects that a user can drag onto their query panel without having to worry about where they come from or how they&#8217;re put together. Here&#8217;s what my schema looks like as a reminder. All of the aliases are on the right side.</p>
<p><img src="/tips/time_slice_part_iii/schema.jpg" /></p>
<p>This structure gives me a way to automatically split out each time-sliced measure into its own path so that a user can combine CY MTD and PY MTD and any of the other options onto one query and run it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2009/12/17/time-sliced-measures-part-iii-making-measures/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Calculation Options</title>
		<link>http://www.dagira.com/2009/10/28/calculation-options/</link>
		<comments>http://www.dagira.com/2009/10/28/calculation-options/#comments</comments>
		<pubDate>Wed, 28 Oct 2009 11:00:21 +0000</pubDate>
		<dc:creator>Dave Rathbun</dc:creator>
				<category><![CDATA[2009 GBN - Dallas]]></category>
		<category><![CDATA[Report Techniques]]></category>
		<category><![CDATA[Universe Design]]></category>

		<guid isPermaLink="false">http://www.dagira.com/?p=216</guid>
		<description><![CDATA[When working with the reporting suite from Business Objects there are many different calculation engines. A report developer can create custom formulas or variables in Desktop Intelligence, Web Intelligence, and of course Crystal. A universe designer can build custom objects using database functions in the universe. An ETL architect can design special query transformations. So [...]]]></description>
			<content:encoded><![CDATA[<p>When working with the reporting suite from Business Objects there are many different calculation engines. A report developer can create custom formulas or variables in Desktop Intelligence, Web Intelligence, and of course Crystal. A universe designer can build custom objects using database functions in the universe. An ETL architect can design special query transformations. So where do you do the work?</p>
<p><em>This post covers slides 6 through 9 from my 2009 GBN presentation titled &#8220;Return of the Variables&#8221; which can be downloaded from my <a href="http://www.dagira.com/conference-presentations/">conference page</a>.</em> <span id="more-216"></span></p>
<h3>Push it Back, Push it Back, Way Back!</h3>
<p>I can&#8217;t help but hear some football cheerleaders calling out inspirational words to their team as I write that heading. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_lol.gif' alt=':lol:' class='wp-smiley' />  But the reality is that the concept is quite appropriate for this discussion. There are quite a few advantages to putting calculation logic into your ETL (extract / transform / load) tool. For example&#8230;</p>
<ul>
<li><strong>Build on core data</strong><br />
ETL tools or other scripts working on core data have an advantage&#8230; they&#8217;re working on the core data. (That&#8217;s almost a recursive definition, which is another presentation. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />  ) What I mean by that is there isn&#8217;t anything getting in the way between the process and the data. My scripts can do whatever they want because everything they need should be available. If it&#8217;s not immediately available I can build it in a temporary table and move on from there. ETL tools are very powerful in this regard.</li>
<li><strong>Procedural languages / scripts</strong><br />
Even in a simple ETL environment where no formal tool is being used I can still write procedural scripts. I can write procedural SQL (PL/SQL) or even C scripts. The only limits are my imagination and the grammar of the selected tool.</li>
<li><strong>Consistency across all access paths</strong><br />
This is a real key advantage of pushing calculations back to the ETL layer. Since this process is generally responsible for filling your data warehouse tables, and the calculations are done during that load, it means that any tool used to access that database inherits the results. It means I can use any query tool and still gain the benefits of the calculation results.</li>
<li><strong>Calculate and store once, retrieve many times</strong><br />
This is my primary differentiator between putting a calculation in the universe versus doing it in the ETL. I was recently given a block of code and asked to create a predefined condition in the universe to handle the logic. The problem was it was extremely complex and included a number of case statements and outer join requirements&#8230; and it was going to impact the fact table. I pushed back and requested that the logic be placed into the ETL. The end result was I got a simple Boolean flag (zero or one value) on the fact table that was much easier to use. An added benefit? If the logic required to populate that field ever changed it could easily be done in the ETL.</li>
</ul>
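<p>To make that last point concrete, here is a minimal sketch of the idea. The table, columns, and rule are all hypothetical stand-ins, since the real logic was far more complex:</p>
<pre><code>-- in the ETL, evaluated once per load (hypothetical rule and names)
update SALES_FACT
set PROMO_ELIGIBLE_FLAG =
    case when ORDER_TYPE = 'RETAIL' and DISCOUNT_PCT = 0 then 1 else 0 end;

-- in the universe, the predefined condition shrinks to this
SALES_FACT.PROMO_ELIGIBLE_FLAG = 1</code></pre>
<p>Every query tool that reads the fact table sees the same flag, and a change to the rule touches only the ETL step.</p>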
<p>Unfortunately there are other issues to consider before putting a calculation into the ETL.</p>
<ul>
<li><strong>Change Management</strong><br />
Not all of the aspects of doing calculations in the ETL are good. For example, most companies seem to have far stricter controls and change management processes around their ETL than they do for reports. Getting a change pushed through the ETL team can have a far wider impact and therefore can take longer and require more justification. Sometimes it is easier to keep a calculation closer to the report to avoid development delays.</li>
<li><strong>Complexity</strong><br />
ETL scripts can already be quite complex. Adding new calculations might slow the scripts down and cause them to run beyond their available load window.</li>
<li><strong>Impact Analysis</strong><br />
One of the primary advantages of including calculations in the ETL is that the results are shared by everyone. One of the disadvantages of including calculations in the ETL is that a change in that area will impact everyone as well. That means more teams to talk to in order to get a consensus that the change is appropriate and approval to start the process.</li>
</ul>
<p>In my opinion, despite the potential for slower turn-around time on development requests and the need for greater impact analysis, complex calculations should be pushed back to the ETL if at all possible. This is especially true if the calculations impact join logic, affect security profiles, or need to be performed on the fact table where they will impact nearly every single query that is executed.</p>
<h3>Using Universe Objects</h3>
<p>The next possible location for calculations is the universe. There are quite a few advantages to putting something into the universe as opposed to having it done in the report. Such as&#8230;</p>
<ul>
<li><strong>Build once &#8211; use many times</strong><br />
Universe objects are designed to be reusable. Once an object is built it can be used / reused in any number of reports. This is one of the primary reasons we even build universes (that and providing the abstraction layer so business users don&#8217;t need to know technical terms.)</li>
<li><strong>Use full range of database functions</strong><br />
There is a wide range of functions available inside each of the different report engines, but a report cannot reach the functions of the database itself. In the universe I can use almost any function available from my database to build an object. And if there isn&#8217;t a function available to do what I want, I can build one. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </li>
<li><strong>Ensure consistency from report to report</strong><br />
For quite a long time Business Objects used the slogan &#8220;Single version of the truth&#8221; when talking about their products. This was based on the concept of reusable objects that I already covered above. Since every business user will be using the same object for Sales Revenue, they should all get the same result when they ask the same question.</li>
<li><strong>Updates automatically propagate</strong><br />
Reports built on a universe automatically pick up changes once the updated universe is published. This means if I discover a bug in one of my objects and fix it, every report that uses that object will get the updated code the next time the report is refreshed. This was a big advantage of the &#8220;classic&#8221; Business Objects tools over Crystal Reports for many years. Any tool that requires a developer to manually write their own SQL code is subject to the same limitation.</li>
</ul>
<p>That&#8217;s a nice list of advantages for universe objects. What about the disadvantages?</p>
<ul>
<li><strong>Limited to information from a single universe</strong><br />
Universe calculations cannot combine objects from two different universes. In fact it&#8217;s worse than that; you can&#8217;t combine objects from two different contexts in the same universe! <img src='http://www.dagira.com/wp-includes/images/smilies/icon_eek.gif' alt=':shock:' class='wp-smiley' />  This can be a very limiting factor depending on how complex your universe models are.</li>
<li><strong>Maintenance required by universe designer</strong><br />
Getting universe work done should be less traumatic than changing the ETL, but it can still be a roadblock depending on the availability of your developer. Notice I said &#8220;developer&#8221; (singular), as it is very difficult to manage a process where more than one person works on the same universe at the same time.</li>
<li><strong>Some aggregation issues can be tricky</strong><br />
Percentages, ratios, and averages are <a href="http://www.dagira.com/2009/05/15/why-cant-i-average-in-my-universe/">all difficult to do in a universe</a>. A newer feature called &#8220;smart&#8221; or <a href="http://www.dagira.com/2008/11/10/designer-xi-3-new-feature-database-delegated-measures/">&#8220;database delegated&#8221; measures</a>, introduced in XI 3.0, helps somewhat, but it&#8217;s still an issue to be considered.</li>
<li><strong>Some functionality might be missing from the database</strong><br />
It&#8217;s less likely today than when I wrote the very first Variables presentation back in 1997, but it might be that the functionality you want or need just isn&#8217;t available in your database. Rather than writing a custom database function, it might be easier simply to create report calculations instead.</li>
</ul>
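<p>To make the aggregation issue concrete, here is a minimal Python sketch (not universe SQL, and not how Designer works internally). The order rows and the <code>margin</code> helper are made up for illustration. It shows why a ratio calculated row by row cannot simply be averaged later.</p>

```python
# Hypothetical order rows: (revenue, cost). The numbers are made up.
orders = [
    (100.0, 90.0),
    (900.0, 450.0),
]

def margin(revenue, cost):
    """Return margin as a fraction of revenue."""
    return (revenue - cost) / revenue

# Averaging the row-level ratios weights every row equally.
row_margins = [margin(revenue, cost) for revenue, cost in orders]
average_of_ratios = sum(row_margins) / len(row_margins)

# Aggregating the components first, then dividing once, weights by revenue.
total_revenue = sum(revenue for revenue, _cost in orders)
total_cost = sum(cost for _revenue, cost in orders)
ratio_of_sums = margin(total_revenue, total_cost)

print(f"average of ratios: {average_of_ratios:.2f}")
print(f"ratio of sums:     {ratio_of_sums:.2f}")
```

<p>With these numbers the average of the row-level ratios is 0.30, while the ratio of the totals is 0.46. Only the second figure describes the margin on the combined revenue, which is why a ratio has to be calculated after the aggregation rather than before it.</p>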
<h3>Report Structure Items</h3>
<p>There are three basic options available in the report engines provided by Web Intelligence (and Desktop Intelligence as well, if you&#8217;re using that tool): constants, formulas, and variables. I will talk more about these three choices in the next post. For now I would like to consider the pros and cons of report-level calculations in general.</p>
<ul>
<li><strong>Available on all platforms</strong><br />
As mentioned, both Desktop Intelligence and Web Intelligence offer this feature. Crystal goes even further and provides a language that includes scripting features. A report writer can pick the appropriate tool for the job without worrying about losing a calculation engine.</li>
<li><strong>Independent of SQL restrictions</strong><br />
The grammar for local calculations comes from the report engine, not the SQL database. That means if you are using a database with limited functions (Sybase IQ comes to mind) you can still accomplish complex tasks in the report by using the available report functions instead. The first time I worked with Sybase IQ was years ago, and it didn&#8217;t even offer a way to retrieve the current system date from the host.</li>
<li><strong>Calculations based on document data</strong><br />
When you refresh a query you download a microcube into your report. The calculations are then done on that smaller summarized dataset rather than on the entire rowset processed by the query. Doing the calculations locally can therefore be a performance benefit.</li>
</ul>
<p>That all sounds great, but there are disadvantages as well.</p>
<ul>
<li><strong>Stored in a single document</strong><br />
In the ETL and Universe sections I talked about being able to reuse calculation results. I can do that with report variables too, but only in different reports (tabs) within the same document. If I want to use a complex formula in a new document I have to copy and paste it or recreate it from scratch.</li>
<li><strong>Require some level of technical expertise</strong><br />
Most report writers can build simple calculations, especially operations found on the toolbar buttons like sum and count. However, getting complex calculations correct can be a struggle even for seasoned developers. It took a long time for the concept of calculation context to &#8220;click&#8221; for me, and even today it can sometimes be a challenge.</li>
<li><strong>Volume of data could impact performance</strong><br />
Earlier I said that calculations at the report level might improve performance, and now I am saying that the volume could impact performance. Is this a contradiction? No, not really. What I said earlier was that if the report calculations are done on data already summarized by the database engine they should be fast. If the data is not summarized, meaning if the microcube present in the report has a large number of rows, then having additional calculations can certainly slow things down.</li>
</ul>
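<p>The performance point can be sketched in a few lines of Python. This is only an illustration with made-up rows. The names <code>detail_rows</code>, <code>summarize</code>, and <code>share_of_total</code> are hypothetical, and the dictionary is just a stand-in for a microcube, not how the report engine actually stores one.</p>

```python
from collections import defaultdict

# Hypothetical detail rows as the database sees them: (region, product, revenue).
detail_rows = [
    ("East", "Widget", 100.0),
    ("East", "Widget", 150.0),
    ("East", "Gadget", 50.0),
    ("West", "Widget", 200.0),
    ("West", "Gadget", 300.0),
    ("West", "Gadget", 200.0),
]

def summarize(rows):
    """Mimic a GROUP BY on region: one summarized row per region."""
    totals = defaultdict(float)
    for region, _product, revenue in rows:
        totals[region] += revenue
    return dict(totals)

def share_of_total(totals):
    """A report-style calculation that only touches the summarized rows."""
    grand_total = sum(totals.values())
    return {region: value / grand_total for region, value in totals.items()}

# Stand-in for the microcube: the summarized result the query returns.
microcube = summarize(detail_rows)
shares = share_of_total(microcube)

print(f"{len(detail_rows)} detail rows, {len(microcube)} summarized rows")
print(shares)
```

<p>Here the local calculation runs over two summarized rows instead of six detail rows. If the query returned the detail rows unsummarized, the same calculation would have to walk every one of them, and that is where the volume starts to hurt.</p>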
<h3>Summary</h3>
<p>Wow, that was a lot of text. It&#8217;s easier to say all of this stuff than to write it down. Can you all do me a favor and just come to the conference next time so I don&#8217;t have to write so much? <img src='http://www.dagira.com/wp-includes/images/smilies/icon_lol.gif' alt=':lol:' class='wp-smiley' /> </p>
<p>In all seriousness, what I tried to do with these few slides in my presentation was to outline some of the general thoughts that go into deciding where to put a calculation. Is there a clear and obvious choice as to which to use? No, not really. I apologize if that&#8217;s what you were hoping for at this point. Each of these three areas (ETL, universe, report) has clear advantages. In general I prefer to push complex calculations as far back as possible. But if needed we have quite a few options available. </p>
<p>The next post in this series will focus on the different types of report calculations (specifically formulas and constants) and discuss which of those is better. In this case (unlike this post) there is a very clear choice. You can probably guess what it is&#8230; after all the presentation title was <em>Return of the Variables</em>. <img src='http://www.dagira.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.dagira.com/2009/10/28/calculation-options/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>
