Sunday 19 June 2016

Retail Category performance problems and a new way of thinking

Hi all. It's been a while since I blogged and like all of us, our precious time means that, on rare occasions we are compelled to share in the hope of exchanging knowledge. Hopefully the solution and investigation below will help many a fellow AX developer and architect in more ways than one.

Cutting a long story short, we discovered an issue in editing Retail Hierarchies (EcoResCategory), which interacts with RetailCategoryContainmentLookup (RCCL). I wanted to publish the solution for many reasons, detailed below, but more so, I strongly felt compelled to explain the need for designing elegant solutions and minimizing the impact on the system
  1. First reason is to share the solution, exiting any user form should take no more than a couple of seconds, no matter what the change, or users get concerned.
  2. Solving any problem should involve thinking about the wider issues, not just the one at hand, this is the second principle of innovation and all code is about creating something innovative. The routine RetailCategoryContainmentLookup::synchronizeAllRetailCategories  deletes all records. Rebuilding a table of a 100k entries is unnecessarily long and introduces a secondary issue where the table is empty for what seems line an eon, so in designing any code routine ensure you are working with the smallest possible changes.
  3. When handling multiple records, start thinking about how to work in small sets of records. Unsurprisingly, rebuilding the missing 1000 or 10000 entries can be slow if you process each record one at a time. But also the first code fix we saw was attempting a set operation by a mass cross joins and group, instead of handling the problem as a set of smaller steps.
  4. Ultimately there is many ways to solve this by using the most appropriate technology, whether SQL or X++ to ensure you meet just the right requirements.

For those wondering, the aforementioned routine in AX 2012 R3 CU 10 used to take a few minutes even for a 1500 lines in EcoResCategory, not exactly what you expect from closing a form. Microsoft has published a solution KB2992401 which reduces this to 2 minutes but still deletes all entries. Our customer has 90k records and even though the KB reduces this from 45 minutes to 10 minutes the X++ solution I am about to talk about reduces this to 3 seconds (a full rebuild in 18 seconds).

The optimal solution for our requirements is at the bottom :) read on if you want to understand my thinking and have time to understand more than just enclosed X++ code. More than happy to guide you if you contact me. Please also note the usual disclaimer in using any of the snippets below as this is not a deployable solution nor intended to be so.

First some background


Our current client has a rather complex retail hierarchy (EcoResCategory) and this is unique to each branch of their business. So in fact, we will end up with a table of over 90,000 entries. But our problems started a lot earlier than that with only 6,000 entries. Simply put when editing a hierarchy within the EcoResCategory form and closing this form we end up with very long delay (minutes). This causes the destruction and recreation of the RetailCategoryContainmentLookup (RCCL) table, which also introduces our second problem in that whilst this is being rebuilt we get errors in routines which we constructed to lookup values using this table. We create a view to flatten the RetailCategoryContainmentLookup (RCCL) for 3rd level lookups which fail during this time.

A little bit of background, the EcoResCategory is in fact a hierarchical tree, for example
  • Store 1 (parent 0)
    • links to Business Unit 1 (Parent Store 1)
      •     which in turn links to Department 1.1 (Parent Business Unit 1)
      •     and also links to Department 1.2 (Parent Business Unit 1)
    • A second Business Unit 2 (Parent Store 1)
      • links to a new Department 2.1 (Parent Business Unit 2)
The 6 records are then expanded into the RetailCategoryContainmentLookup to show all the parent child pairs as follows (I will tell you in a little while how I discovered this)

  • Store 1 - Store 1
  • Store 1 - Business unit 1
  • Business unit 1 - Business unit 1
  • Store 1 - Department 1.1
  • Business Unit 1 - Department 1.1 
  • Department 1.1 - Department 1.1 
  • Store 1 - Department 1.2
  • Business Unit 1 - Department 1.2
  • Department 1.2 - Department 1.2 
  • Store 1 - Business unit 2
  • Business unit 2 - Business unit 2
  • Store 1 - Department 2.1
  • Business Unit 2 - Department 2.1
  • Department 2.1 - Department 2 .1

Begin with understanding and the test scenario first


At first glance RCCL looked like it had tree pairs of records, but no easy way to confirm this. The code is complex and looked like a puzzle. In ensuring we had the correct solution I deduced that each EcoResCategory record needs as many RCCL records as its level, it was a mathematical equation, like in my school days (who says maths does not come in handy). So a leaf at level 3 needs exactly 3 records. This is both an easy test as well as a invalidation method for dirty records.

After confirming that the total RCCL records we need, is the same as the sum(level) from EcoResCategory, it was now abundantly clear that the RCCL table is a recursive tree. Although this is slightly lengthy to solve in X++, SQL has an easy way to achieve this. And out of the blue I had both a test method and the first method for creating RCCL records

with RetailCategoryCTE (category, lvl1, containedcategory, lvl2) as
(select e.recid, e.level_, e.recid, e.Level_ from ECORESCATEGORY e
join ECORESCATEGORYHIERARCHYROLE h
on h.CATEGORYHIERARCHY = e.CATEGORYHIERARCHY
where e.ISACTIVE = 1 and h.NAMEDCATEGORYHIERARCHYROLE in (4,20,22)
union all
select a.category, a.lvl1, b.recid, b.Level_ from RetailCategoryCTE a
join ecorescategory b on a.containedcategory = b.PARENTCATEGORY
and ISACTIVE = 1 and b.PARENTCATEGORY > 0)
select * from RetailCategoryCTE c
left outer join RETAILCATEGORYCONTAINMENTLOOKUP l
on c.category = l.CATEGORY and c.containedcategory = l.CONTAINEDCATEGORY
where l.CATEGORY is null
union
select * from RetailCategoryCTE c
right outer join RETAILCATEGORYCONTAINMENTLOOKUP l
on c.category = l.CATEGORY and c.containedcategory = l.CONTAINEDCATEGORY
where c.CATEGORY is null

In determining a way to test this I now had a solution in SQL which at low record counts (9000 or less) is instant, as well as I gained deeper understanding of the structure of the records.

Solution 1: using SQL


As a solution, we can simply destroy the whole table and reconstruct it, by creating a Pre Event handler to the RCCL method. We then create a set of stub records within AX X++ code and then send the harder problem to SQL.

Constructing the stub inside AX, by simply looping for all the possible levels, so first find the maximum number of levels in EcoResCategory for the appropriate hierarchyRoles (see original method for this)
select firstonly level from ecoRedCategory order by level desc 
 Secondly create the record stubs in X++ which creates RecIds and the necessary pairs
for (levelLoop = 1; levelLoop <= maxlevel; levelLoop += 1)
{
    insert_recordset containmentLookup (category, containedcategory, isdirty)
    select levelLoop, recid, SetNo
    from category where category.level >= levelLoop;
}
You can then use an executeUpdate query which looks very much like the above test query and updates the category field.

This took a couple of seconds to process but unnecessarily deletes all records first.

Please also note that there is a tertiary set of tables involved which needs to be accounted for RetailSpecialCategoryMember and RetailGroupMemberLine (see notes later)

Solution 2: Start by deleting only necessary records

First of all the records in RCCL table are not deleted but set to isdirty. So lets apply certain rules to dirty any necessary records, as follows
  • First dirty any records in RCCL which do not have the equivalent RecID in EcoResCategory
  • Secondly mark any records where the pair no longer exists in EcoResCategory
update_recordSet RCCLNewsetting isDirty = setYes
exists join ERC where (RCCLNew.ContainedCategory == ERC.RecId) && ERC.ParentCategory != 0
notexists join RCCL
where RCCL.ContainedCategory == ECR.RecId
&& RCCL.Category == ERC.ParentCategory
&& RCCL.IsDirty == setNo && RCCL.Processed == setYes;
 
  •  Repeat the same update above for the RCCLnew.Category field. Lesson learned here, combining the above statement to join for both (RCCLnew.Category || RCCLNew.ContainedCategory) against ERC.RecId gives you significantly slower results. So always check your execution with full data and use trace parser to better understand your code.
  •  Mark all records where the total number of recs for RCCL.ContainedCategory is not the same as the Level field in EcoResCategory. I used a View based on a Query which groups the RCCL records by ContainedCategory and provides the total number of records
Finally delete all records where the RCCL.ContainedCategory is dirty for any records at all

delete_from RCCLNew
exists join RCCL
where RCCLNew.ContainedCategory == RCCL.ContainedCategory
&& (RCCL.IsDirty == setYes || RCCL.Processed == setNo);

You may have spotted already a new field in RCCL called processed. This is being used to process only the necessary records.

Solution 2. Finish by creating the necessary records

Now we cleaned up by only deleting the dirty records and any records associated with these records, we begin by reconstructing all necessary missing records starting from the bottom up.

Unsurprisingly we first add into the RCCL table all records, for the appropriate category roles, which do not currently exist in RCCL. Setting the isDirty to No and of course the Processed to No. The Category and ContainmentCategory are set to the EcoResCategory Recid
insert_recordset containmentNew
(category, containedcategory, isdirty, processed)
select
recid, recid, setNo, setNo
from category
exists join hierarchyRole
where category.CategoryHierarchy == hierarchyRole.CategoryHierarchy
&& (hierarchyRole.NamedCategoryHierarchyRole == EcoResCategoryNamedHierarchyRole::Retail
|| hierarchyRole.NamedCategoryHierarchyRole == EcoResCategoryNamedHierarchyRole::RetailSpecialGroup
|| hierarchyRole.NamedCategoryHierarchyRole == EcoResCategoryNamedHierarchyRole::RetailChannelNavigation)
notexists join containmentLookup
where containmentLookup.Category == category.RecId
&& containmentLookup.ContainedCategory == category.RecId;

So now we keep looping until we have nothing else to process and first we add the parent records to the records currently not processed, which explain the new indicator

insert_recordset containmentNew
(containedcategory, isdirty, processed, category)
select
containedcategory, setYes, setNo
from containmentLookup
where containmentLookup.processed == setNo
join ParentCategory from category
//If the category field has a parent then this needs to be created
where category.recid == containmentLookup.category
&& category.ParentCategory != 0;
We can also now handle the inserting of records for RetailSpecialCategoryMember and RetailGroupMemberLine

Mark all records which are not dirty and not processed as processed. And finally mark all dirty records as not dirty, This will basically leave you with records which were not processed and not dirty for the next round of processing.

Repeat these steps until you have no more records.

Solution 2 final conclusions.

Is it in fact possible to look to simplify the problem by using a combination of SQL CTE returning a full table and then using this to compare and contrast, but in all my working it seems as fast to handle the situation in X++ with comments and selects which most X++ developers understand, unless we are dealing with much larger sets of records.

Two key techniques helped in deterring the best solution, first being able to test the solution using the SQL CTE query and secondly trace parser. It became easily visible that in fact the RCCL table had an index which only contained the isDirty flag. This caused over  3 second delay in my code and I removed it as its a completely pointless non specific, non unique and in fact poor index as its non-sequential. Records will switch between isDirty and cause unnecessary page breaks in SQL so this is a clear target for performance elimination :)

Conclusion is that by breaking the problem down to small enough chunks and using simply set operations is much more efficient and faster (<1 second) than trying to solve the problem in one big step (2 minutes) or in  many individual steps (20 minutes).
 
For most folks taking the KB from Microsoft may suffice, after all use the path of least resistance.

If you need something more timely then feel free to borrow from the above and contact me in your endeavors.

Keep enjoying the learning process
Angelos


No comments:

Post a Comment