Saturday, 9 February 2013

What is Extended Star Schema?

In the star schema design, fact table sits in the middle and is connected to other surrounding objects (dimension lookup tables) like a star. Each dimension is represented as a single table. The primary key in each dimension table is related to a foreign key in the fact table.
Ok. Now what is Extended Star Schema?
Let us take a sample transaction:
*Customer * *Product * *Color*** *Sales * *Fact *
C1 P1 Black S1 30

In the above figure, you can notice a cluster of tables. Let me explain what is actually happening with these tables.
We mentioned Sales, Customer, Product and Color as the master data.SID tables are the tables which store the Master Data in BW. So as soon as system encounters Sales person S1, Customer C1, Product P1 and Color Black which are not present in the system. BW System creates SID’s for them. In the above figure you can see SID’s created highlighted in red box.
In the above figure you can see how the Dimension Id’s (DID’s) are created. For every combination of SID’s we get a unique DID.
We have 2 cases here:
1)      Keeping the Sales and Customer as separate dimensions, we can see how the Dimension ID’s are created.
2)      In the other case, where Product and Color are kept under the same dimension as the relationship between the product and Color is one-one. We can see that Dimension ID is created for the unique combination of both the SID’s.
We can see now how the Fact table is constructed with the help of Dimension and Surrogate ID tables.
In this way the facts are analyzed with the help of dimensions.
Now let us have a look on what happens if we have another transaction entry.
*Customer * *Product * *Color*** *Sales * *Fact *
C1 P1 Red S1 40
In the above transaction, we find all the entries are same except color i.e. “RED”.
Now let us see how BW captures the data into the database.
Now you can see that new color “RED” has created a new entry in SID table as 2 and therefore created a new entry in dimension table “2”.
And thus the fact table points to those values and transaction is stored
This is how an Extended Star Schema works.
Now let us focus on some interesting points of Extended Star Schema:
1)      By keeping the master data outside of Dimension tables, we are able to share the master data across the system there by reducing the need of redundancy of master data.
2)      We can have a maximum of 16 Dimension tables around Fact table.
But with advent of HANA, do you think this design is relevant? Do we really need to maintain data? So is this most popular design is not going to be used anymore?
With HANA we implement in-memory technique. Which means the system stores all the data “IN- MEMORY” like RAM storing all the data? When asks any question (reporting) to the system, then the system would take help of the views concept in HANA and gather the require data for answering the question.
Let us witness the biggest change of how the world moves towards HANA. I hope this wiki helped you in understanding the basic concept of Extended Star Schema.

Star Schema   


InfoCubes are made up of a number of InfoObjects. All InfoObjects (characteristics and key figures) are available independent of the InfoCube. Characteristics refer to master data with their attributes and text descriptions.
An InfoCube consists of several InfoObjects and is structured according to the star schema. This means there is a (large) fact table that contains the key figures for the InfoCube, as well as several (smaller) dimension tables which surround it. The characteristics of the InfoCube are stored in these dimensions.
An InfoCube fact table only contains key figures, in contrast to a DataStore object, whose data part can also contain characteristics. The characteristics of an InfoCube are stored in its dimensions.
The dimensions and the fact table are linked to one another using abstract identification numbers (dimension IDs) which are contained in the key part of the particular database table. As a result, the key figures of the InfoCube relate to the characteristics of the dimension. The characteristics determine the granularity (the degree of detail) at which the key figures are stored in the InfoCube.
Characteristics that logically belong together (for example, district and area belong to the regional dimension) are grouped together in a dimension. By adhering to this design criterion, dimensions are to a large extent independent of each other, and dimension tables remain small with regards to data volume. This is beneficial in terms of performance. This InfoCube structure is optimized for data analysis.
The fact table and dimension tables are both relational database tables.
Characteristics refer to the master data with their attributes and text descriptions. All InfoObjects (characteristics with their master data as well as key figures) are available for all InfoCubes, unlike dimensions, which represent the specific organizational form of characteristics in one InfoCube.


You can create aggregates to access data quickly. Here, the InfoCube data is stored redundantly and in an aggregated form.
You can either use an InfoCube directly as an InfoProvider for analysis and reporting, or use it with other InfoProviders as the basis of a MultiProvider or InfoSet.
Newer Post
This is the last post.


Post a Comment