Sunday, August 26, 2007

Reporting

Reporting

The next important tool in the Data Warehousing is the Reporting tool.The Reporting tool makes our life much easier as it enable us to create reports within few minutes The Reporting tool is one of the significant tool in the Data Warehouse industry.Based on the requirement in the business application,the reporting tool is being designed.The main criteria on which it is depending on are as follows :
  • Number of reports :If the number of reports to be designed and created are enormous, the Reporting tool should be chosen based on its utility.
  • Desired Report Distribution mode: The accessing of the reports would be better if its accessed in a variety of channels rather than accessing it via browser.This implies viewing the report in excel,.csv ,pdf ,html,etc.
  • Adhoc report creation: The ability of the reporting tool lies in the creation of the ad-hoc reports.This requires the strong metadata layer to be built so that the application can be built on .

Tuesday, August 21, 2007

OLAP Tools


OLAP Online Analytical Processing

The foremost element of OLAP is OLAP server, residing between the client and a database management systems(DBMS).There are two types of OLAP server based on the functionalities.They are as follows:

  1. ROLAP : An acronym of Relational Online Analytical Processing,which utilises the relational database source
  2. MOLAP:An acronym of Multidimensional Online Analytical Processing, which utilises the multidimensional database source
Functionalities:
  • Power to render parallelism furnished by RDBMS and hardware : This enhances the tool's performance and more apparently assists in loading the data quickly
  • Performance:In addition to rendering parallelism,the tool should have the ability in loading and reading the data from the cube
  • Customization efforts:In today's world the rate at which OLAP tools are used as reporting tool are increasing and for certain ROLAP implementations, the same OLAP tools are being utilized as the reporting tool.Hence the ease of customization must be handled efficiently
  • Security Features: Since OLAP tools are being utilized by ' n ' number of users, the security is a BIG concern.The data must be authenticated for every login user.Usually SSO (Single Sign On) policy is adopted for security concerns.
  • Metadata Support:The support of the Metadata in an OLAP tool is essential,because OLAP tools combines the data into the cube and also serves as the front- end tool.


Popular Tools

  • Business Objects
  • Cognos
  • Hyperion
  • Microsoft Analysis Services
  • MicroStrategy

Wednesday, August 8, 2007

ETL Tools

ETL an acronym of Extraction Transformation and Loading is an important process in the Data Warehousing.The determination of ETL tool depends on three things namley:

  • Complexity of the data transformation
  • Data Cleansing needs:This depends on the type of the data being stored in the Data Warehouse,if the data requires more cleansing, the ETL tool should be selected accordingly
  • Data Volume:If the data volume is huge the tool should be chosen based on it.

ETL Functionalities :

  • Functional capability: The prenominal aspect of the ETL tool includes both the "transformation" and the "cleansing".Generally the ETL tool has strong capabilities in both the process.Merely if the data is filthy,it requires enormous amount of cleansing and hence the ETL tool chosen should have the more cleansing potentiality.So based on the requisite, the tool can be determined.
  • Power to interpret directly from data source:Every organisation utilises different set of data sources।Hence the ETL tool selected should have the ability to connect with the data source
  • Metadata Support: The ETL tool plays a vital role in Metadata because it maps the source data to the destination.

Popular Tools for ETL are as follows :

  • IBM WebSphere Information Integration (Ascential DataStage)
  • Ab Initio
  • Informatica

Various Datawarehousing Tools

The various tools that are used for Data Warehousing are as follows:
  • Database,Hardware
  • ETL (Extraction,Transformation and Load)
  • OLAP
  • Reporting
  • Metadata


Database/Hardware :There are several factors that needs to be considered.
They are as follows:



  • Scalability: The Data Warehouse system is a centralised system which is used to large amount of data.The data storage is quite enormous।The system should be selected in such a manner that it can handle large amount of data efficiently
  • Parallel Processing Support: Today's powerful computers utilize multiple CPU's where in each of the process can perform apart of the task,all at the same time
  • RDBMS/Hardware Combination: The RDBMS physically resides n the hardware platform,so the choice of the RDBMS should be based on the efficencyof the hardware to handle enormous data

RDBMS: Its a Relational Database Management System which stores the data in the form of related tables and they are very powerful because its requires few assumptions the way in which the data is related and they are extracted from the database


Few of the popular Relational Databases and OS platforms are listed below:

Relational Databases
  • Oracle
  • Microsoft Sql Server
  • IBM Db2
  • Teradata
  • Sybase
  • MySql

OS Platforms
  • Linux
  • FreeBSD
  • Microsoft

Tuesday, August 7, 2007

Types of Datawarehouse

A Data warehouse, a relational/ multidimensional database , in which the environment often consists of an ETL solution, an OLAP engine ,client analysis tool and other applications that can manage the process of data gathering and delivering to business users.

Basically Data Warehouse, a dedicated database, a central repository containing the data from the various operational sources in an organization.It includes detailed,stable,non-volatile and consistent data which can be analyzed in the time variant.


Types of
Datawarehouse:

  • Enterprise Data Warehouse: Provides a central database for decision support throughout the enterprise.
  • ODS(Operational Data Store) : Having a broad enterprise wide scope,but unlike real enterprise datawarehouse, data is refreshed in real time and used for routine business activities
  • Data Mart: A subset of data warehouse,supporting a particular region,business unit or business function.

Wednesday, August 1, 2007

What is Datawarehouse?

A Data Warehouse is a repository of integrated information, available for queries and analysis.Data and information are extracted from heterogeneous sources.It is merely a collection of data marts representing historical data from various operations in the company.

Data mart :

Its a segment of data warehouse that can provide data for reporting and analysis on a section, unit, department, (e,g). sales,payroll,production.


Why Data warehousing?

Benefits:
  • Datawarehouses are designed to perform aggregate queries running on large amounts of data .
  • The structure of data warehouse is easier for end users to navigate and understand.
  • Queries that would be complex in very normalized database could be easier to build and maintain in data warehouses decreasing their workload on transactions.
  • Efficient way to manage demand for lots of information from lots of users.
  • Efficient way to manage and report on data that is from a variety of sources, non uniform and scattered throughout a company.
  • Provides the capability to analyze large amounts of historical data that can provide an organisation with competitive advantage

Components of Data Warehouse:

Data Warehouse system consists of three components namely,
  • A database :Pulls the selective data from the existing operational databases such as Accounting, Forecasting, HumanResources, Marketing, Engineering & Sales
  • A query tool : This tool enable the users to access to the real time Data Warehouse database for query generation, creating a report, analysing the reports and also for exporting the data to numerious formats
  • Database interfaces : It allows for updating the database daily, weekly or on a monthly basis.In other words it acts as a simple bridge for transfer of data between operations database to the data warehouse