Database Queries / Data Central

Dataset design

Be precise and consistent with column naming across the entire solution to simplify the creation of drilldowns for designers (automatic column filtering).

Create wide datasets:

  • To improve usability in the layers above (Editor and Runtime)
    If a dataset lacks a specific column, add it as long as it does not affect performance dramatically.

  • To prevent the creation of numerous similar datasets
    Duplicating similar datasets increases the risk of data inconsistency between them.

The only limitation of wide datasets is performance: the wider the dataset, the poorer the performance.

Manual Query Performance

To implement grouping on a manual query object, we need to wrap the query in a queryable and groupable unit.
In SQL Server the best option is a CTE (Common Table Expression), which works much like a view, except that it is not a persistent object in the database but an ad-hoc object that exists only for the single query it is part of.
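A simplified sketch of the idea (table and column names are hypothetical, and the actual wrapping generated by DataCentral may differ):

    ;WITH ManualQuery AS
    (
        -- the manual query as written in the datatable
        SELECT o.CustomerId,
               o.OrderDate,
               o.Amount
        FROM   dbo.Orders AS o
    )
    -- the CTE can now be queried and grouped like a view
    SELECT   CustomerId,
             SUM(Amount) AS TotalAmount
    FROM     ManualQuery
    GROUP BY CustomerId;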

Just like views and table-valued functions, CTEs have their limitations when it comes to which queries they support.
For example, they do not support declarative statements (such as DECLARE) inside the query.

When your query contains commands that are not supported by the CTE rules, a fallback feature wraps your query inside a temporary procedure instead.
This means that your query will always execute successfully, but the fallback has a rather dramatic impact on performance!

1 - Prevent temp-procedure creation

In cases where execution of your manual query datatable falls back to temporary procedure creation, we strongly recommend either rewriting your manual query to be CTE-valid or creating a permanent stored procedure in the database instead.
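As a hypothetical illustration (made-up table and columns), a query that declares a variable is not CTE-valid and would trigger the fallback, while the same logic with the expression inlined is:

    -- Falls back to a temporary procedure: DECLARE is not allowed inside a CTE
    DECLARE @FromDate date = DATEADD(YEAR, -1, GETDATE());
    SELECT o.OrderId, o.OrderDate, o.Amount
    FROM   dbo.Orders AS o
    WHERE  o.OrderDate >= @FromDate;

    -- CTE-valid rewrite: the expression is inlined, so no declaration is needed
    SELECT o.OrderId, o.OrderDate, o.Amount
    FROM   dbo.Orders AS o
    WHERE  o.OrderDate >= DATEADD(YEAR, -1, GETDATE());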

You can identify the type of execution by doing a datatable preview in DataCentral and clicking the double-arrow button to view its query.

2 - Join tables by primary keys or indexed columns

When it comes to query performance, it is considered best practice to always join tables on primary keys or indexed columns.
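For example, with hypothetical Customers and Orders tables, join on the key columns rather than on an unindexed descriptive column:

    SELECT c.CustomerName,
           o.OrderId,
           o.Amount
    FROM   dbo.Customers AS c
    INNER JOIN dbo.Orders AS o
            ON o.CustomerId = c.CustomerId;  -- primary key / foreign key, typically indexed
    -- avoid joining on unindexed columns such as names or descriptions, which forces scans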

3 - Filter datasets as early as possible

When a dataset consists of several large database tables joined together, it is considered best practice to filter the data as early as possible to improve performance and memory consumption on the database server.
By moving filter conditions from the WHERE clause into the join conditions, you limit the amount of data that has to be loaded into memory for the join.
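A sketch with hypothetical tables; the date filter is moved from the WHERE clause into the join condition so that fewer Orders rows take part in the join:

    -- Filter applied late, in the WHERE clause
    SELECT c.CustomerName, SUM(o.Amount) AS TotalAmount
    FROM   dbo.Customers AS c
    INNER JOIN dbo.Orders AS o
            ON o.CustomerId = c.CustomerId
    WHERE  o.OrderDate >= '2024-01-01'
    GROUP BY c.CustomerName;

    -- Filter applied early, as part of the join condition
    SELECT c.CustomerName, SUM(o.Amount) AS TotalAmount
    FROM   dbo.Customers AS c
    INNER JOIN dbo.Orders AS o
            ON o.CustomerId = c.CustomerId
           AND o.OrderDate >= '2024-01-01'
    GROUP BY c.CustomerName;

The hard-coded date is used here only for illustration; in practice this is exactly the kind of filter you would implement as an inline parameter, as described below.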

For that exact reason we strongly recommend implementing crucial parameters inline (more on this in the section below).
By moving these parameters inline, we limit the amount of data that has to be loaded into memory in both join and grouping operations, which has a significant impact on performance.

The bottom line here is:

=param(); is your performance friend!

Check out the next article on how to implement parameters inline!