微軟數(shù)據(jù)倉庫介紹
,Slide Title,Body Text,Second level,Third level,Fourth level,Fifth level,Module,1,Introduction to,D,ata Warehousing,Module Overview,數(shù)據(jù)倉庫概述,考慮數(shù)據(jù)倉庫解決方案,Lesson 1,:,數(shù)據(jù)倉庫概述,商業(yè)難題,什么是數(shù)據(jù)倉庫?,數(shù)據(jù)倉庫架構(gòu),數(shù)據(jù)倉庫解決方案組件,數(shù)據(jù)倉庫項(xiàng)目,數(shù)據(jù)倉庫項(xiàng)目角色,SQL,Server,作為數(shù)據(jù)倉庫平臺(tái),業(yè)務(wù)難題,關(guān)鍵業(yè)務(wù)數(shù)據(jù)分布在多個(gè)業(yè)務(wù)系統(tǒng),找到業(yè)務(wù)決策的信息是耗時(shí)的和容易出錯(cuò)的,基本的業(yè)務(wù)問題很難回答,?,What Is,a,Data Warehouse?,一,個(gè)集中存放用于報(bào)表和數(shù)據(jù)的信息容器,通常,一個(gè)數(shù)據(jù)倉庫,:,包含大量的歷史數(shù)據(jù),優(yōu)化了數(shù)據(jù)查詢,(,而,不是插入和更新,),定期加載新的業(yè)務(wù)數(shù)據(jù),為企業(yè)商務(wù)智能解決方案提供依據(jù),Data Warehouse,Architectures,Centralized Data Warehouse,Departmental Data Mart,Hub and,Spoke,Components of a Data,Warehousing,Solution,從業(yè)務(wù)系統(tǒng)和其他數(shù)據(jù)源抽取數(shù)據(jù)加載,數(shù)據(jù)通常最終加載到數(shù)據(jù)倉庫,數(shù)據(jù)清洗和重復(fù)數(shù)據(jù)的刪除,確保數(shù)據(jù)倉庫中數(shù)據(jù)的質(zhì)量,MDM,提供確切的業(yè)務(wù)數(shù)據(jù)實(shí)體,Data Warehouse,Staging Database,ETL Load Process,ETL Staging Process,Master,D,ata Management,10,1,10,00110,Data Cleansing,Reporting and Analysis,Data Sources,Data,Warehousing Projects,首先確定數(shù)據(jù)倉庫需要解決的業(yè)務(wù)問題,確定回答這些問題所需的數(shù)據(jù),識(shí)別所需數(shù)據(jù)的數(shù)據(jù)源,評(píng)估關(guān)鍵業(yè)務(wù)目標(biāo)價(jià)值可行性,從現(xiàn)在的數(shù)據(jù)回答每個(gè)問題,對(duì)大量數(shù)據(jù)的項(xiàng)目,使用增量更新比較有效,:,把項(xiàng)目分解為多個(gè)子項(xiàng)目,每,個(gè)子項(xiàng)目處理一個(gè)特定的主題,Data,Warehousing Project,Roles,Project manager,Solution architect,Data modeler,Database,administrator,Infrastructure,specialist,ETL,developer,Business users/analyst,Testers,Data stewards,SQL Server As a Data Warehousing Platform,SQL Server,Analysis Services,SQL Server Database Engine,Microsoft SQL Server Integration Services,SQL Server Master,D,ata Services,10,1,10,00110,SQL Server Data Quality Services,Microsoft SQL Azure,and the Windows Azure,M,arketplace,Microsoft SharePoint Server,Microsoft PowerPivot Technologies,Microsoft Excel,Data Mining Add-In,PowerPivot Add-In,MDS Add-In,Power View,SQL Server,Reporting Services,Reports,KPIs,and Dashboards,Interactive data visualizations,Interactive data analysis,Data Warehousing,Business Intelligence,Lesson,2,:Considerations for a,Data Warehouse Solution,Data Warehouse Database and Storage,Data Sources,Extract,Transform,and Load Processes,Data Quality,and Master,Data,Management,Data Warehouse Database,and Storage,考慮數(shù)據(jù)倉庫包括,:,Database schema,Logical:typically denormalized,for optimal read performance,Physical:,often partitioned for performance and management,Hardware,Query processing and memory,Storage,Network,High,availability,and,disaster recovery,Hardware redundancy,Backup strategy,Security,Server access,Data permissions,Data Sources,數(shù)據(jù)源連接類型,憑證和權(quán)限,數(shù)據(jù)格式,數(shù)據(jù)采集窗口,Extract,Transform,and Load,Processes,臨時(shí)表:,存放,臨時(shí)數(shù)據(jù),所需的轉(zhuǎn)換,:,提取數(shù)據(jù)時(shí)所需的數(shù)據(jù)轉(zhuǎn)換和清洗,增量,ETL:,數(shù)據(jù)的變化加載,Data Quality,and Master Data Management,Data quality:,Cleansing data:,Validating data values,Ensuring data consistency,Identifying missing values,Deduplicating data,Master data management:,Ensuring consistent business entity definitions across multiple systems,Applying business rules to ensure data validity,10,1,10,00110,Module Review and Takeaways,Why might you consider including a staging area in your ETL solution?,What options might you consider for performing data transformations in an ETL solution?,Why would you assign the data steward role to a business user rather than a database technology specialist?,