Page 1 :
2.5 METADATA, , Before going into the details of the metadata component, we will first, understand why metadata is important in the data warehouse environment., Imagine that the data warehouse project team has completed the project and, everything was done as per to the schedule and within the estimated budget., The results have been tested and the data warehouse is ready for deployment., After deployment, when the first user sits at the machine to execute his, queries. We will have a number of questions in his mind like—, , = Are there any predefined queries?, , What are the various elements of data in the warehouse?, , * Does it contain the data that I need?, , * How can I browse the data to find out what data is available?, , From which source systems has the data in the data warehouse been, collected?, , * How old is the data in the warehouse?, « Is there any summary data?, To what level has the summarization/aggregation of data been done?, , Is there any pre-calculated data already available in the data warehouse?, » When was the data last refreshed?, , It is the metadata in the data warehouse which answers all the questions., Let us take an example of an entity called customer, in the data warehouse. If, , the user wants to know about this entity, he will search for this information in, the metadata repository.
Page 2 :
EEA Data Warehousing ar. a ee, , Definition A client is a person or an organization that, purchases goods or services from your company., , Remarks Customer entity includes regular, current and, past customers., , Source systems Orders placed, Maintenance contracts,, Online sales., , , , , , Create date, , Last update date, Update cycle, , Last full refresh, , Full refresh cycle, Data quality reviewed, Last de-duplication, Planned archival:, Responsible user, , , , , , 26 April 2905, , 16 November 2006, Weekly, , § June 2005, , Every Quarter, , 25 September 2006, 19 Septernfier 2006, 16 January 2007, John Mathew, , , , Figure 2.17 A sample metadata, , A metadata does not give you just the des, gives you other details explaining the synta, elements. Metadata describes all the pertinent aspe, warehouse fully and precisely to help the, warehouse. A typical metadata contains i, , = Structure of data from the programmer's perspective, , » Structure of data from the end-user’s perspective, = Source systems that feed the data warehouse, , * Transformation process that was applied before the dat, , systems could pass into the data warehouse, , = Data model, , » History of data extraction process, , cription of the entity but also, x and semantics of the data, cts of the data in the data, users and the developers of the data, nformation about the following:, , a from the source, , A sample look of how the details are stored about an entity is given in the, , Fig. 2.17., , 2.5.1 Role of Metadata, , Metadata in the data warehouse is similar to the data dictionary in a database, management system. The metadata component stores data about the data in, the data warehouse. The metadata is often used for building, maintaining,, managing, and using the data warehouse. It is the key to providing users and, developers with a road map to the information in the Warehouse. Thus, it, forms an essential ingredient in the transformation of raw data into knowledge. The three main functions that metadata performs in a data warehouse, , environment are that it, = Connects the different parts of the data warehouse thereby acting as a, glue that connects all the parts. :, * Provides information about the contents of the data and its underlving, structure to the data warehouse administrator and other users, , " Enables the end-users to search for the desired data in their own bus, terms.
Page 3 :
Data Warehouse: Defining Features (ZX, , Metadata can be looked at as tongs that handle the raw data. Without a, proper metadata in place, data stored in the warehouse will be meaningless, since the users will not be able to know- what, where, why and how the data, exists within the organization. And in the absence of this crucial knowledge, they will never be able to query this data., , / Table 2.4 illustrates that a data warehouse in the absence of metadata is, like simply filling the cabinet with a number of papers with no folder or label., No person will be able to find any valuable piece of information from those, pages. Similarly is the case with a data warehouse., , Table 2.4 Role of metadata in a data warehouse, , , , Metadata plays a key role in converting raw data into knowledge as it helps to provide, valuable description about the data so that it can be understood and converted into, meaningful information., , Data, , , , , , 12 233 870 This string of digits could either be sales of a product or population of a city., , 06/07/08 This date could either be Date of Birth or date when the sales transaction, took place. This date could either mean 6 July 2008 or 7 June 2008., , , , Metadata contains data about data, , , , 12 233 870 Sale of T-shirts in the western region, 07/06/08 Refers to the date of sales transaction stored in the format—dd/mm//yy, , , , , , , , , , 2.5.2 Classification of Metadata, , Metadata can be classified in three main groups (Refer Fig. 2.18):, , , , , , Metadata, Operational Extraction and End-user, metadata transformational metadata, metadata, e Source of the data « Extraction frequency of data ° Acts as a navigational, Data structure, field e Extraction methods map for the user, lengths, and data type e Rules for data extraction e Helps the users to find, of the source data e Transformation techniques information using their, applied to data own terms, , , , , , , , Figure 2.18 Types of metadata, , Operational metadata Data for the data warehouse comes from several operational systems of the organization which contain different data structures,, that have varying field lengths and data types. In selecting data from the, source systems, you may either have to split certain records or may have to
Page 4 :
ne rn, ; Data Warehousing pia ee, 5 with multiple :, i t files and deal Ple codin,, combine parts of records from differen oer want to be able to tie ont, , i ths. The end-users may , ', Oe ad ore ee The operational metadata solves this p: Urp ogg, , by containing all of this information about the operational data sources,, Ms tional metadata As. the name suggests, the extrac, atio’ a contains data about the data extraction, , i ion techniques that w,, , e systems and the various transformation lat Were, aed te ce before storing it in the data ie mes ee and, transformational metadata serves 4 technical purp see bes oon and, maintenance of the warehouse. The primary ee lin the nes is to, map every individual data element from its source Sy; r ware., , house. This calls for identification of the data element by its source field name,, , destination field name, transformation rules applied, business rules for Usage, , i i index., and derivation and its format, key, si2&, and inde:, , End-user metadata The end-user metadata acts as a navigational map of the, , data warehouse by enabling the end-users to find information from the data, , warehouse using their own business terminologies. eae metadata, translates a cryptic name code of a data element into a meaningful description, of the data element so that end-users can understand and use that data. For, example, it is duty of the metadata to clarify that the data element ‘CName’, , ‘ ”, represents “Customer Name for the business., , Extraction and transform, tion and transformational metadat, , 2.5.3 Importance of Metadata, , All the processes that are carried out during the building and administering of, the data warehouse generate parts of the data warehouse metadata., Metadata generated by one application is used by some other application, In, the data warehouse environment, metadata occupies a key position. It is only, through metadata that the communication among various applications and, processes is made possible. It will not be wrong to say that the metadata acts, like a nerve centre in the data warehouse., , For Using the Data Warehouse, , In a data warehouse system, the users themselves retrieve the information by, creating ad hoc queries and running them against the data warehouse. They, format their own reports. So before creating their own reports and queries, the, users need to know about the data in the data warehouse. For this they need, the metadata., , The users need sophisticated methods for browsing and examining the contents order to derive maximum output from the data warehouse. Users should, know the meaning of the data elements so that due to their ignorance about the, exact meanings, they may not draw wrong conclusions from their analysis.
Page 5 :
Data Warehouse: Defining Features Ea, , For Building the Data Warehouse, A data warehouse d., transformation techni, different tools. But, t, structures and conte, understanding of the, mappings and the t, available. Therefore,, components of the, source-to-target map, , Even. the datab, metadata for creatin:, loading of the data, this case, will pro, warehouse databas, , €veloper may be an expert in data ery oe, ques and may know very well how to wor _, © apply his expertise he first needs to know the da, nts of the data warehouse. Only after having a firm, data, the data warehouse developer can determine the, tansformations that have to be done with the data, to build the data extraction and data transformation, data warehouse, metadata about the source systems,, pings, and data transformation rules is needed., , ase administrator of the data warehouse needs the, g the physical design of the database, for doing the initial, and for performing the incremental loads. The metadata in, Vide information like—the logical structure of the data, e, the data refresh, and the load cycles., , For Administering the Data Warehouse, , The underlying complexities and enormous sizes of the data warehouses make, , it impossible to administer the data warehouse without substantial metadata., , One cannot administer the data warehouse without addressing some barely, minimum set of questions which are listed below:, , * How to include new data sources?, * How to drop some external data sources? C MITtM, , * How to handle data changes? LIBRARY, " How to audit the applications?, , How to add new summary tables in the data warehouse?, How to control run away queries?, , * How to expand storage? ye g 4, , * When to schedule platform upgrades?, , How to add new information delivery tools for the users?, How to continue ongoing training?, , How to maintain and enhance user support functions?, How to monitor and improve query performance?, , * When to schedule backups?, , * How to perform disaster recovery?, , * How to maintain security?, , * How to monitor load balancing?, , , , 2.5.4 Metadata Management, , The metadata forms a vital element in a data warehouse environment, but the, seamless integration of all the parts of metadata is a formidable task. There