Monday, August 15, 2016

ROLAP Polling Query - use a change tracking table

Setting up the polling query for our real-time ROLAP cube against an EXASOL database, we used an easy, but very bad practice approach.  We set our polling query directly against a fact table. Nice and easy for testing and proof of concept. Production? don't consider using it.

Polling a count against a small or medium size fact table might won't create much of a problem. When the dataset is huge it might take more resources.  So, when the data set is really large, and our goal is very near real time, why not poll a very small table.

Data Change Table

Let's call it TrackDataChange.  A minimalist version will have just a few columns:  the source system that loaded the data, the load date and a primary key ID. The only difficult part is you have to configure your ETL process(es) to insert a single row to your new TrackDataChange table. The new record simply marks that the ETL process has completed.


With the TrackDataChange table in place, and your ETL processes configured to insert a row after inserting new fact table table, your polling query will look like this:

        SELECT COUNT(*) FROM DWTEST.TRACKDATACHANGE;

You'll probably want more than just the basic columns. The key point is to run SSAS's polling query against an independent table dedicated to tracking fact table changes.

Pros

the polling query avoids data tables used for data analysis
the TrackDataChange table is narrow and holds few records
the TrackDataChange table is quick to query
the TrackDataChange table is quick to insert into

Cons

the ETL process is marginally more complex

No comments: