Page 1 :
4.6 INFORMATION PACKAGE DIAGRAMS (IPD), , In order to record information requirements for a data warehouse project and, present the various insights, thoughts and opinions expressed during the, process of collecting requirements in terms of business dimensions and facts,, IPDs are used. The IPD, are very useful for taking the development of data, warehouse to the subsequent stages., , 4.6.1 What Information does an IPD Contain?, , The primary concern of the requirements definition phase of IPD is to compile, information packages for all the subjects that are relevant for the data, warehouse. The IPD enables the data warehouse project team to define the, subject areas, key business metrics, the way in which the users will aggregate, or roll up data, the data granularity, and the size of the data warehouse size., , As we have already discussed that the traditional methods of gathering, requirements for operational systems are not applicable for a data warehouse, project, we need a new technique for determining requirements for a data, warehouse system that is based on the metrics and the business dimensions, along which the facts are analysed. This new technique is known as IPD. For, every specific subject, we form an IPD., , To sum up, the subject goes on the top most position of the IPD, the metrics, forms the next section, the business dimensions will be the row headings and, in each row you will include the hierarchies and categories for the business, dimensions.
Page 2 :
QE] Data Warehousing, , , , , , 4.6.2 Example, , Let us first consider a simple example of our student monitoring system and, form its IPD’as shown in Fig. 4.3. Once the concept is clear, we will take two, more examples that will discuss the features of IPD in more detail., , This IPD is very simple. Here we are analysing two facts—student's aggregate marks and his attendance. These facts are measured along different dimensions. The first obvious dimension is time that is used to monitor daily / weekly,, monthly /yeatly, and attendance of the student in the entire semester. A student |, can perform much better in the same subject if it is taught by a different professor. Professor dimension is therefore the next important dimension that we, have. Here we analyse every professor's name, his professional qualification,, experience, grade, and number of subjects that he is teaching. :, , , , , , , , , , , , , , , , Subject: Student's performance i, ‘Facts: Attendance, aggregate ;, Time dimension : Day Week ‘Month Semester La eae, Professor dimension Name | Professional:} Experience | Grade No. of, eee é qualification 5 subjects., “Subject dimension | Name | Semester Theory/, : practical, Student dimension Name | Grad/PG Family Division in Grade, : : income class-Xil, Course dimension Name Duration “Type University, , , , , , , , , , , , , , , , , , Figure 4.3 IPD for a student monitoring system, , It is not surprising that a student may perform poorly in one subject and, outstandingly in the another, our next dimension is the subject dimension., Here we will analyse the student's attendance and aggregate based on the, name of the subject, whether it is theory or practical and the semester in which, the subject is taken (some students perform better in semester in which there, are less activities)., , The most obvious dimension is the student dimension. This dimension, contains all the details of every individual student. The course dimension, incorporates the details of the course that the student is pursuing. :, , Figure 4.4 shows the information package diagram for analysing sales for, a retail chain. The subject is sales. The measured facts that are. of interest for, analyses are shown in the bottom section of the information package diagram., In this case, the metrics to be analysed are actual sales, forecast sales, price,, and discount. j
Page 3 :
Gathering the Business Requirements, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: Sales, , Facts: Actual sales, forecast sales, price, discount, , Time dimension Day Week Month Quarter Year, , Product dimension Name Brand Category | Colour Price, , Customer dimension Name Age Income Gender Marital, status, , Store dimension Name City State Country | Operational, from year, , Payment method Payment type | Interest rate, , dimension, , , , , , , , , , , , , , , , Figure 4.4 IPD for sales analysis, , The business dimensions along which these facts/metrics are to be, analysed are shown as row headings. In our example, these dimensions are, time, product, customer, store, and payment method. Each business dimension contains a hierarchy or levels. For example, the time dimension has the, hierarchy going from the year down to the level of a single day. The other, intermediary levels in the time dimension could be year, quarter, month, and, week. These levels or hierarchical components are shown as attributes within, their particular column headings in the IPD., , Similarly, the IPD of the hotel business can be formed as given in Fig. 4.5., Here the dimensions are time, hotel, the type of room, and finally the facilities, that are provided to the customers. The facts are number of occupied rooms,, number of vacant rooms, number of unavailable rooms, number of occupants,, and revenue generated., , , , , , , , , , , , , , , , Subject: Hotel occupancy, Facts: Occupied rooms, vacant rooms, unavailable rooms, occupants, revenue, Time dimension Day Week Month Quarter | Year Holiday flag, Hotel dimension Branch | Branch | Region City Construction | Renovation, name | code year year., Room dimension Room | Room | No. of Bed size | Max No.of | Price, type size beds occupants, Facilities dimension Gym Side Swimming} Cultural | Out door SPA, view pool activities | games, , , , , , , , , , , , , , , , , , Figure 4.5 IPD of the hotel business, , One thing to note is that, it is not necessary that all the IPDs of the hotel, business will be the same, It will vary depending upon the requirements of the, users—what they want to measure and along what dimensions they want to, measure.
Page 4 :
THQ bata Warehousing, , 4.6.3 ‘Reason for Forming the IPD, , Once the facts, dimensions and the hierarchies/categories within, Be dimensions are finalized, we will see how dimension tables and fact tables be, B | formed from'a given IPD. To build the logical design of the data Wareho ae, ‘ ~ We will form two types of tables— the fact table and the dimension table. ©, For a given IPD, there will be n number of dimension tables where 71 jg th, ‘umber of dimensions. The name of the dimension table will come from i, Name of the dimension itself, that is, the row heading of the IPD wil] beco,, the name of the dimension table. The hierarchies /categories shown in the Ph s, will become the attributes of the dimension table. Figure 4.6 shows th, dimension tables that are formed from the retail chain IPD., Note that apart from other attributes of the dimensions, there is one more, attribute— the Key. It represents the primary key of every individual record that, will be stored in the dimension table. Its value will be used to uniquely identif,, one entity from the other. Most preferably, the value of the key is a system gen., erated sequence number. We will learn more about this in the next chapter., The metrics or facts from the IPD will form the fact table. Figure 4.6 shows, the fact table that is formed. The fact table gets its name from the subject fo;, analysis. In this case, it is sales. Each fact item or measurement goes into the |, fact table as an attribute for sales. Note that the fact table contains the fact plus, the primary keys of all the dimension tables., , , , , , , , , , , , , , , , Time dimension, table, , ee llant-tae ile) a}, table, , , , , , Product dimension, table, , © Product key, , (Time key “Customer key, , , , ' Year » Name ‘Name, ). Quarter Brand Age, Month = _ Category “Income, , Week. =, Date, , * Colour. ~ Gender., Marital status, , , , , , , , Store dimension ayment method, , t table, te)] Asion tab z, , Time key, | Customer key. + Product key., Store key |, Payment method key., Actual sales, Forecast sales, Price, Discount, , , , , , , , , , , , , , , , , , , , Figure 4.6 Fact tables and dimension tables formed from the, given IPD