May 27 2010

Everything About Shortcut Joins

Categories: Join Techniques,Universe Design Dave Rathbun @ 6:30 am

Of course the proper solution here is to create an alias of the country table. Aliases are used when a single table has multiple purposes, and in this case the country table is being used both as a showroom location as well as the customer residence.

The truth is that shortcuts are never the answer when trying to solve a loop in a universe. That’s not their purpose.

The Purpose of Shortcut Joins

I started this post with a couple of simple statements.

  • Shortcut joins do not provide an alternate path.
  • Shortcut joins do provide a shorter path.

I have shown how to check for the first case. Next I will define what I mean by a shorter path. Here is a picture showing a valid shortcut from the Prestige Motors universe.

screenshot of a valid shortcut join path

In this case there is a relationship from Country to Region and from Region to Client. However there is also a direct relationship from Country to Client as shown by the dotted join. Is this valid?

When I run a query using Country, Region, and Customer objects, here is the SQL that is generated.

SELECT 
	Country_Region.COUNTRY_NAME,
	REGION.REGION_NAME,
	CLIENT.CLIENT_LASTNAME
FROM
	COUNTRY  Country_Region,
	REGION,
	CLIENT
WHERE
	( REGION.REGION_ID=CLIENT.REGION_ID  )
  AND	( Country_Region.COUNTRY_ID=REGION.COUNTRY_ID  ) 

The regular join path is being used to pull everything together. What if I don’t care about the region? Here is the SQL for that query.

SELECT
	Country_Region.COUNTRY_NAME,
	CLIENT.CLIENT_LASTNAME
FROM
	COUNTRY  Country_Region,
	CLIENT
WHERE
	( Country_Region.COUNTRY_ID=CLIENT.COUNTRY_ID  )

How many tables appear in the FROM clause? Only two. Since there is a shorter path from Country to Client I can drop the Region table from the query and take the shortcut. It’s time to ask the same question I asked earlier: Will I get the same results from both of these queries? The answer is, in this case, yes. The customer to country relationship doesn’t change just because I have dropped the region information from the query. So this is a valid application of a shortcut join.

Shortcut Process

Shortcut joins are probably more frequently found in OLTP/ODS schemas due to the propagation of keys. Shortcut joins are not as frequently found in a standard star schema because joins are typically only found between dimension and fact tables. Shortcuts can be used very effectively in snowflake schemas, but I have to tweak a universe parameter to make everything work as expected. It all goes back to the “shorter path” process.

I said earlier that the shortcut shown below would not be used in the scenario I outlined. Why is that?

screenshot of the shortcut join to the showroom table

A shortcut is only used if it will shorten the path. In the query I presented earlier, I was going to select the Country, the Showroom, and the Sales objects. The standard path would run from Country to Region to Client to Sales and then end at Showroom. That is the initial SQL that would be generated in this case. The last step in the query generation process is to see if any shortcut joins can be used. In the picture shown above, the shortcut goes directly from Country to Showroom. It can only be used if it truly eliminates all of the other tables. Since I am required to use the Sales table for this query, I cannot use the shortcut. In other words, shortcuts are an “all or nothing” choice. I can’t pick to eliminate some of the joins but not all of them. Since I need the Showroom to Sales join to answer the question, the shortcut is not valid.

In fact, the only time the shortcut would ever be used is if I use only the Country and Showroom tables. I can use the shortcut then because all of the other tables are eliminated.

I still get the wrong answer, but it’s faster. 🙂

Shortcut Joins in a Snowflake Schema

Years ago I used to say that we only got to use one shortcut per query. (Not one shortcut per universe, just one per query.) This was true because the query pathfinder algorithm was not sophisticated enough to evaluate more than one potential shortcut path. Today I have options. There is a universe parameter named SHORTCUT_BEHAVIOR that I can use to help the query generator understand my universe structure.

The default value for this parameter is Successive. This means that shortcut joins will be evaluated one at a time, and only used if they truly eliminate tables as described earlier. This causes problems in a snowflake schema.

Here is a version of eFashion that I have modified so that it uses a snowflake structure.

screenshot of the snowflake schema from a modified eFashion universe

Notice that the outlet table is joined to the fact directly. There are two snowflake tables STATE_LOOKUP and REGION_LOOKUP that join to the outlet table directly, and the same two tables join indirectly (via shortcuts) to the fact table. My expectation is that if I use objects from the state and fact tables I should see the shortcut being used, and it is as shown here:

SELECT 
	STATE_LOOKUP.STATE,
	sum(SHOP_FACTS.Amount_sold)
FROM
	STATE_LOOKUP,
	SHOP_FACTS
WHERE
	( STATE_LOOKUP.STATE_ID=SHOP_FACTS.STATE_ID  )
GROUP BY
	STATE_LOOKUP.STATE

If I use the region and fact tables I get another shortcut:

SELECT
	REGION_LOOKUP.REGION_NAME,
	sum(SHOP_FACTS.Amount_sold)
FROM
	REGION_LOOKUP,
	SHOP_FACTS
WHERE
	( REGION_LOOKUP.REGION_ID=SHOP_FACTS.REGION_ID  )
GROUP BY
	REGION_LOOKUP.REGION_NAME

What happens when I combine state, region, and fact? I still expect to see the shortcut joins being used, but instead I see this:

SELECT 
	REGION_LOOKUP.REGION_NAME,
	STATE_LOOKUP.STATE,
	sum(SHOP_FACTS.Amount_sold)
FROM
	REGION_LOOKUP,
	STATE_LOOKUP,
	SHOP_FACTS,
	OUTLET_LOOKUP
WHERE
	( OUTLET_LOOKUP.SHOP_CODE=SHOP_FACTS.SHOP_CODE  )
  AND	( STATE_LOOKUP.STATE_ID=OUTLET_LOOKUP.STATE_ID  )
  AND	( REGION_LOOKUP.REGION_ID=OUTLET_LOOKUP.REGION_ID  )
GROUP BY
	REGION_LOOKUP.REGION_NAME, 
	STATE_LOOKUP.STATE

What is going on?

48 Responses to “Everything About Shortcut Joins”

  1. Comment by Andreas

    Great post, Dave, keep the good work up, please!

  2. Comment by Bill Ritchotte

    That was a very informative post. Thanks

  3. Comment by Sharan

    Hi Dave, thanks for this wonderful post, very useful.

  4. Comment by Dave Rathbun

    Thanks, folks, this one took a while to pull together, so I’m glad y’all appreciate it.

  5. Comment by Tanisha

    Thanks Dave!!
    Its pretty informative.

  6. Comment by kay

    Dave, there is never a post from you we don’t appreciate

  7. Comment by Ashley Burton

    Great post, I must say that I’ve always found it a bit sketchy working out if a shortcut join would actually be used though from your post I think I have been using them correctly so far. In this case, would the join you’ve demonstrated be a good way to help get a cascading prompt for Showroom by country without having to have a second context?

  8. Comment by Sriraman

    Dave,
    Great post and excellent work. After reading this post one can clearly understand the difference between normal and shortcut join.
    You blog makes me interested in BO career.
    Thanks

  9. Comment by Preeti Jha

    Great post..keep up the good work!!

  10. Comment by Nniravv

    Great Post, DAVE !!
    Thanks for sharing..
    Can you please cover Contexts and Aliases also with such details?

    Thanks
    Nirav

  11. Comment by Dave Rathbun

    Nniraw, there are a number of posts already available on my blog that cover contexts and aliases. The most summarized version is Context Versus Alias Overview which was part of a presentation from 2008.

  12. Comment by Gary Cronin

    Great post Dave. Very informative.

  13. Comment by Gaurav Singh

    Great post Dave.
    But If in your query you only need something from Country, Showroom and Sales then shortcut join does give you the information you need. If you want to include client then its will take the longer path. Doesn’t it also depends how you define your query?
    Also is it a good idea to join dimension tables with shortcut join all the time?

    Thanks in advance.

  14. Comment by Prasanth

    Hi Dave,

    Perfect post …Wish to be your Student …

    Regards
    Prashant

  15. Comment by Vrushank Patel

    Thanks !! Thanks !! for such a wonderful post

    Regards,
    Vrushank

  16. Comment by Rig

    Hi Dave,

    I deeply appreciate the amount of painstaking efforts that you put to make each of your posts so thoroughly detailed and yet the easiest to undestand.

    Many Many Thanks!!
    Rig 🙂

  17. Comment by Arya Majumdar

    Thanks Dave.

  18. Comment by Prashant Sahay

    Fantastic Work Dave!!.. I am delighted the way you explain the concepts… Hats off to you… Please keep continuing… Do explain the concepts of incompatibilitis… when it comes into picture… what to keep in mind when objects are incompatible..etc

    Thanks a ton!!

    Prashant

  19. Comment by Tom

    Dave. Could you direct me where to find a copy of the Prestige Motors database and maybe the universe you base your example??
    Thank you.

  20. Comment by Dave Rathbun

    Prestige Motors is (or at least was) used for the Universe Designer training course. The course materials come with the database so that you can work on the problems at home later.

  21. Comment by Tom

    believe it or not I am taking the eLearning version of that course now (BOU315), but I can’t get access to the materials. I did sent an email to their help desk and the person answering said that she did not know where to get this DB or universe file. Would you know of anyone at SAP BI that I could ask for the materials.

  22. Comment by Dave Rathbun

    I expect it would need to be someone in the education or training department. I don’t have any experience with the eLearning versions of these courses.

  23. Comment by Andreas (Xeradox)

    Try contacting: Training@SAP.Com directly

  24. Comment by Tom

    Training@SAP.Com did not work.

    I called their support number and they said that if I was to get their instructor based class then I would get the DB materials.
    I don’t get what is the big deal about this freaking DB. Especially when the course (BOU315) cost $1300. The other courses I got from them work with the eFashion DB that was not a problem to find.

    Anyway thanks for your help.

  25. Comment by Karthik NS

    Great Post. I am an Business Objects administrator and I dont understand anything about Universe Development. By going through your post I am getting to understand the concepts of Universe Design.

  26. Comment by edpypf

    Great Post, Dave! Thank you!!

  27. Comment by Matt

    It seems odd to have Successive as the default for this setting. I have a similar situation to the example shown and this must be common.
    I also found that the choice of which join is the shortcut can also change the behaviour. It seems that the standard join path needs to follow the hierarchy through successive 1-N joins. In the example the standard joins could have been set from the fact table to region, state and outlet with shortcuts from outlet to region and state. This leads BO to choose the ‘wrong’ joins and include the fact table when you just need a list of outlets.

  28. Comment by Dave Rathbun

    Matt, your scenario should never occur. My general rule is that a universe should work perfectly before any shortcuts are added. That means the join path you described should never occur because the universe doesn’t workout the join from outlet to region.

    The reason for the setting is at least in theory to avoid cartesian product joins. However I tend to agree that the choice is unfortunate.

  29. Comment by Matt

    Thanks for highlighting the cartesian problem. I found that the Global setting was giving unwanted (and unexpected) cartesians so I have gone back to the Successive setting and introduced a RegionState entity and then joined that to the Outlet and ShopFacts table. The join to the ShopFacts becomes the shortcut join and everything works as required.

  30. Comment by John

    Hi Dave,

    Would it be possible to share a copy of the Prestige Motors database as well please?

    Thanks,
    John

  31. Comment by Dave Rathbun

    I cannot share the Motors database as it’s used (or formerly was used) for the Designer training course. If you attended the course (or taught the course as I did) then you got a copy of the database.

    I just happened to use it to demonstrate this particular concept.

  32. Comment by sandy

    Very informative and knowledgeable post. Thanks Dave keep up the good work.

  33. Comment by Aaron Reese

    What concerns me more is that you have the fields available to create a shortcut on Client. By denormalising this field you are now relying on application logic or db constraints to ensure that Client.CountryID matches Region.CountryID where Client.RegionID = Region.RegionID. If this referential integrity is broken, then the shortcut will give you the wrong (or at least different because who is to say it is wrong..) results to the ‘proper’ join How would the universe determine whether the shortcut changes the actual results set rather than just eliminating intermediate tables. I suppose the Fault – in the most generous meaning of the word – would lie with the universe designer for either not clarifying the reasons and constraints on the denormalisation or not providing aliases for the country table (1 for use with client and 1 for use with showrooms).

    As an aside, using the original joins, it would be possible to identify clients who have purchased vehicles from outside their country, an acitivity which is quite common in Europe as different tax rates apply.

  34. Comment by swetha

    very informative post.
    my doubt being:

    what is the use of short cut joins?
    will there be any performance improvement if short cut joins used?
    how to say the query/webi to use the short cut join?
    when do we need to use short cut join?

    regards,
    swetha

  35. Comment by Dave Rathbun

    I think most of those questions are already answered in the post, are they not? Certainly the first one is. The improvement comes from the fact that fewer tables are included in the query so the database does less I/O. You as a user don’t force Webi to use a shortcut join, it makes that determination automatically. And you never need to use a shortcut join. You should always be able to get the answer going the longer route, a shortcut is just for convenience and performance.

  36. Comment by wasif

    Hey Dave I am currently working on this universe and was wondering if there is another book for webi based on the same database so I can have better understanding of the report since my universe is based on the motors logic. If there is plz refer .

  37. Comment by Dave Rathbun

    The Motors database comes with (or used to) the standard SAP training material. It’s not likely that there will be any 3rd party books that use this universe since they don’t have the rights to republish it.

  38. Comment by Nisha Prasad

    Hi Dave,

    This post defines the perfect ways to the concept of shortcut joins . It amazingly defined and expalined to be remembered always.

  39. Comment by Ganesh

    Very elaborate information. Showing a good example where a user can understand very easily. Simply U R Gr8.

  40. Comment by Vinod

    Hi Dave,

    I tried to implement the shortcut join in Island Resorts universe between the Sales and Invoice_Line tables but to no avail. Query generator is not recognizing the (only)shortcut join and several tables are being included in the query, even though i have taken only two objects Year and Number of guests. I had also removed the additional tables included in the object properties.
    Please resolve my issue.

  41. Comment by Dave Rathbun

    The question is, why are you trying to use a shortcut between Sales and Invoice Line? That join is not a candidate for a shortcut.

  42. Comment by vinod

    Hi Dave,
    To get the Number of guests based on Year dimension.

    Year Number of guests
    FY2004 10,947
    FY2005 11,364
    FY2006 12,432

    The query generated for the above result is:

    SELECT
    ‘FY’+Format(Sales.invoice_date,’YYYY’),
    sum(Invoice_Line.nb_guests)
    FROM
    Sales,
    Invoice_Line,
    Service_Line,
    Service,
    Reservation_Line,
    Reservations,
    Customer
    WHERE
    ( Customer.cust_id=Sales.cust_id )
    AND ( Invoice_Line.service_id=Service.service_id )
    AND ( Service.sl_id=Service_Line.sl_id )
    AND ( Customer.cust_id=Reservations.cust_id )
    AND ( Reservation_Line.res_id=Reservations.res_id )
    AND ( Reservation_Line.service_id=Service.service_id )
    AND ( Service_Line.service_line = ‘Accommodation’ )
    GROUP BY
    ‘FY’+Format(Sales.invoice_date,’YYYY’)

    so my question is that, can’t we remove the intermediary tables from the above query using shortcut join.

  43. Comment by Dave Rathbun

    Your query is using both sides of the loop (invoices and reservations). There is something wrong with your universe.

  44. Comment by Kevin

    what is impact if I use shortcut joins every where in universe instead of normal join?

  45. Comment by Dave Rathbun

    Simple answer: don’t do it. 🙂 It’s certainly not an appropriate design technique, and I have no idea how the SQL would be generated.

  46. Comment by Mumtaz

    Hi Dave… Very useful post.
    I understand that every join must exist in at least one context. Do we also put short cut joins in contexts or leave them as neutral join (I am considering IDT in stead of Designer)?

  47. Comment by Dave Rathbun

    I have not thoroughly tested this, but my general practice from UDT was to leave shortcut joins out of a context. Shortcut selection should not result in a different context as it could then generate an infinite loop…

  48. Comment by Tulasi

    Gracefully explained!!