The Most Complete 2019 Reading List for Data Warehousing, BI and Data Science

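A couple of days ago, while collecting data warehousing books online, I found that someone had put together a very detailed reading list. It may well be the most complete data warehousing, BI and data science reading list of 2019. I didn't want to keep it to myself, so I am reposting it here for everyone to study together.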

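Because the original article is hosted on wordpress.com, not everyone may be able to access it, for reasons everyone understands. So I have reposted it word for word, including the section on the introductory data warehousing book that the author wrote himself.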

Author: Vincent
Original: dwbi1.wordpress.com/dat

Disappointed with the Google search results for “data warehousing books”, I have tried to put all the data warehousing books that I know into this page. It is understandable that Google’s search results don’t include ETL or Dimensional Modeling titles, for example. The same goes for Amazon; see Note 1 below. Even a data warehouse book as important as Inmon’s DW 2.0 was missed because the title doesn’t contain the word “Warehouse”.

For data modelling my all-time favorite is Kimball’s toolkit (#1 in the list). Inmon’s, Imhoff’s and Devlin’s classics (#3, #4 and #5 in the list) have broadened my horizon on the basic principles of DW design. For ODS design it’s #17, and the newest model is in #6. If you are building a DW on the SQL Server platform, Mundy’s Toolkit (#2) is a treasure. On Oracle, it’s Hobbs (#54) and on Teradata it’s Coffing’s series (#58 to #63). #7 to #11 explain Kimball’s theory in more detail. Some of them are about dimensional modelling (Adamson’s #8 is excellent), some are about ETL (Kimball’s #7 is a jewel). For methodology/project management, #11 is the classic, #27 is a proven treasure and #83 covers the iterative approach.

  1. The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling by Ralph Kimball and Margy Ross
  2. Microsoft Data Warehouse Toolkit: With SQL Server 2005 and the Microsoft Business Intelligence Toolset by Joy Mundy, Warren Thornthwaite, and Ralph Kimball
  3. Building the Data Warehouse by W. H. Inmon
  4. Mastering Data Warehouse Design: Relational and Dimensional Techniques by Claudia Imhoff, Nicholas Galemmo, and Jonathan G. Geiger
  5. Data Warehouse: From Architecture to Implementation by Barry Devlin
  6. DW 2.0: The Architecture for the Next Generation of Data Warehousing by William H. Inmon
  7. The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data by Ralph Kimball and Joe Caserta
  8. The Star Schema Handbook: The Complete Reference to Dimensional Data Warehouse Design by Christopher Adamson
  9. The Data Webhouse Toolkit: Building the Web-enabled Data Warehouse by Ralph Kimball and Richard Merz
  10. Data Warehouse Design Solutions by Christopher Adamson and Michael Venerable
  11. The Data Warehouse Lifecycle Toolkit by Ralph Kimball, Margy Ross, Warren Thornthwaite, and Joy Mundy
  12. Building a Data Warehouse: with Examples on SQL Server by Vincent Rainardi
  13. Oracle Data Warehousing and Business Intelligence Solutions: With Business Intelligence Solutions by Robert Stackowiak, Joseph Rayman, and Rick Greenwald
  14. Impossible Data Warehouse Situations: Solutions from the Experts (Information Technology) by Sid Adelman, Joyce Bischoff, Jill Dyché, and Douglas Hackney
  15. Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance by Christopher Adamson
  16. Data Warehouse Performance by W. H. Inmon, Ken Rudin, Christopher K. Buss, and Ryan Sousa
  17. Building the Operational Data Store by W. H. Inmon, Claudia Imhoff, and Greg Battas
  18. Rapid Data Warehouse Design: User-Focused Techniques for Designing Dimensional Data Warehouses by Lawrence Corr
  19. Data Warehouse Design: Modern Principles and Methodologies by Matteo Golfarelli and Stefano Rizzi
  20. Advanced Data Warehouse Design: From Conventional to Spatial and Temporal Applications (Data-centric Systems and Applications) by Elzbieta Malinowski and Esteban Zimányi
  21. Designing a Data Warehouse – Supporting Customer Relationship Management by Chris Todman
  22. Data Warehouses and OLAP: Concepts, Architectures and Solutions by Robert Wrembel and Christian Koncilia
  23. Implementing a Data Warehouse: A Methodology That Worked by Bruce Russel Ullrey
  24. Data Warehousing for Dummies by Thomas C. Hammergren
  25. Improving Data Warehouse and Business Information Quality : Methods for Reducing Costs and Increasing Profits by Larry P English
  26. Data Warehouse 100 Success Secrets – 100 Most Asked Questions on Data Warehouse Design, Projects, Business Intelligence, Architecture, Software and Models by Richard Martin
  27. Data Warehouse Project Management by Sid Adelman and Larissa T. Moss
  28. Data Warehouse Management Handbook by Kachur
  29. Data Warehouse: Extract, Transform, Load, Metadata, Data Integration, Data Mining, Data Warehouse Appliance, Database Management System, Decision Support System by Frederic P. Miller, Agnes F. Vandome, and John McBrewster
  30. Oracle Data Warehouse Tuning for 10g by Gavin JT Powell
  31. Using the Data Warehouse by W. H. Inmon and Richard D. Hackathorn
  32. Entity-attribute-value model: Data model, Data warehouse, Denormalization, Attribute-value system, Linked Data, Resource Description Framework, Semantic Web, Inner-platform effect by Frederic P. Miller, Agnes F. Vandome, and John McBrewster
  33. Index Structures for Data Warehouses: v. 1859 (Lecture Notes in Computer Science) by Marcus Jürgens
  34. Tivoli Data Warehouse Version 1.3: Planning And Implementation by IBM Redbooks and Vasfi Gucer
  35. Data Warehouse Implementations: Critical Implementation Factors Study by Joe Ganczarski
  36. The Enterprise Data Warehouse: Planning, Building and Implementation v. 1 by Eric Sperley and Hewlett-Packard
  37. Data Warehousing in the Real World: A Step-by-step Guide for Building Decision Support Data Warehouses by S. Anahory and D. Murray
  38. Filtering the Web to Feed Data Warehouses by Witold Abramowicz, Pawel J. Kalczynski, and Krzysztof Wecel
  39. Data Warehouse: Practical Advice from the Experts by Joyce Bischoff and Ted Alexander
  40. Leveraging DB2 Data Warehouse Edition for Business Intelligence by IBM Redbooks
  41. Fundamentals of Data Warehouses by Matthias Jarke, Maurizio Lenzerini, Yannis Vassiliou, and P. Vassiliadis
  42. Web-enabled Data Warehouse by William A. Giovinazzo
  43. Decision Support and Data Warehouse Systems by Efrem G Mallach
  44. Planning and Designing the Data Warehouse (The Data Warehousing Institute series) by Ramon Barquin and Herb Edelstein
  45. Data Warehouse Design by William A. Giovinazzo
  46. Building, Using and Managing the Data Warehouse (Data Warehousing Institute) by Ramon Barquin and Herb Edelstein
  47. Building a Data Warehouse for Decision Support by Vidette Poe and Laura L. Reeves
  48. Parallel Systems in the Data Warehouse (Data Warehousing Institute) by Steve Morse and David Isaac
  49. Decision Support in the Data Warehouse (The Data Warehousing Institute series) by Hugh J. Watson and Paul Gray
  50. Building a Better Data Warehouse by Don Meyer and Casey E. Cannon
  51. The Data Model Resource Book: A Library of Logical Data and Data Warehouse Models by Len Silverston, W. H. Inmon, and Kent Graziano
  52. Managing the Data Warehouse: Practical Techniques for Monitoring Operations and Performances Administering Data and Tools by W. H. Inmon, J. D. Welch, and Katherine L. Glassey
  53. The Intranet Data Warehouse: Tools and Techniques for Building Intranet-enabled Data Warehouse by Richard Tanler
  54. Oracle 10g Data Warehousing by Lilian Hobbs PhD, Susan Hillson MS in CIS Boston University, Shilpa Lawande, and Pete Smith
  55. Oracle9iR2 Data Warehousing by Lilian Hobbs, Susan Hillson MS in CIS Boston University, and Shilpa Lawande
  56. Oracle8i Data Warehousing by Lilian Hobbs PhD and Susan Hillson MS in CIS Boston University
  57. Oracle8i Data Warehousing by Michael J. Corey, Michael Abbey, Ben Taub, and Ian Abramson
  58. Tera-Tom on Teradata Basics by Tom Coffing and Gareth Walter
  59. Tera-Tom on Teradata Physical Implementation by W. Coffing and Mark Ferguson
  60. Tera-Tom on Teradata SQL by Tom Coffing and Robert Hines
  61. Tera-Tom on Teradata Database Administrator by Tom Coffing and Steve Wilmes
  62. Tera-Tom on Teradata Designer by Tom Coffing and Todd Wilson
  63. Tera-Tom on Teradata Application Development by Tom Coffing and Scott Smith
  64. Tera-Tom on Teradata E-Business by Randy Volters and Tom Coffing
  65. Teradata SQL Unleash the Power V2R6 by Thomas L. Coffing and Michael Larkins
  66. Teradata Utilities – Breaking the Barriers by Tom Coffing, Morgan Jones, Mike Larkins, Steve Wilmes, Randy Volters
  67. Netezza SQL – Harness the Power by Mike Larkins and Tom Coffing
  68. Netezza Underground: The unauthorized tales of derring-do and adventures in resilient data warehousing solutions by David Birmingham
  69. Teradata Users Guide: The Ultimate Companion by Tom Coffing, Leona Coffing, Chris Coffing, and Robert Hines
  70. Teradata SQL Quick Reference Guide – Simplicity By Design by Tom Coffing, Todd Carroll, Robert Hines, and Mike Larkins
  71. Secrets of Best Data Warehouses in the World by Rob Armstrong, Tom Coffing, and Rolf Hanusa
  72. Common Warehouse Metamodel: An Introduction to the Standard for Data Warehouse Integration (Omg) by John Poole, Dan Chang, Douglas Tolbert, and David Mellor
  73. 50 Tb Data Warehouse Benchmark on IBM System Z by IBM Redbooks
  74. E-Business Intelligence Front-End Tool Access to Os/390 Data Warehouse by IBM Redbooks
  75. Rdb/vms: Developing a Data Warehouse by William H. Inmon and Chuck Kelley
  76. Data Warehouses: More Than Just Mining by Barbara J. Bashein and M. Lynne Markus
  77. Corporate Information with Sap(R)-Eis: Building a Data Warehouse and Mis-Application (Efficient business-computing) by Bernd-Ulrich Kaiser
  78. Dimensional Data Warehousing with MySQL: A Tutorial by Djoni Darmawikarta
  79. Data Warehousing Fundamentals: A Comprehensive Guide for IT Professionals by Paulraj Ponniah
  80. Data Warehousing, Data Mining, and OLAP (Data Warehousing/Data Management) by Alex Berson and Stephen J. Smith
  81. Data Warehousing: Architecture and Implementation by Mark W. Humphries, Michael W. Hawkins, and Michelle C. Dy
  82. Data Warehousing 101: Concepts and Implementation by Arshad Khan
  83. Agile Data Warehousing: Delivering World-Class Business Intelligence Systems Using Scrum and XP by Ralph Hughes
  84. e-Data: Turning Data Into Information With Data Warehousing by Jill Dyché
  85. Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL by Roland Bouman and Jos van Dongen
  86. A Manager’s Guide to Data Warehousing by Laura Reeves
  87. Data Warehousing with SAP Bw7 Bi in SAP Netweaver 2004s: Architecture, Concepts, and Implementation by Christian Mehrwald and Sabine Morlock
  88. Data Warehousing: Using the Wal-Mart Model (The Morgan Kaufmann Series in Data Management Systems) by Paul Westerman
  89. Oracle DBA Guide to Data Warehousing and Star Schemas by Bert Scalzo
  90. Building and Maintaining a Data Warehouse by Fon Silvers
  91. Evolving Application Domains of Data Warehousing and Mining: Trends and Solutions by Pedro Nuno San-Banto Furtado
  92. Data Warehousing And Business Intelligence For e-Commerce (The Morgan Kaufmann Series in Data Management Systems) by Alan R. Simon and Steven L. Shaffer
  93. Data Warehousing with Informix: Best Practices by Angela Sanchez
  94. Data Warehousing: Concepts, Technologies, Implementations, and Management by Harry Singh
  95. Data Warehousing in Action by Sean Kelly
  96. High Performance Oracle Data Warehousing: All You Need to Master Professional Database Development Using Oracle by Donald K. Burleson
  97. Implementing Enterprise Data Warehousing: A Guide for Executives by Alan Schlukbier
  98. Complex Data Warehousing and Knowledge Discovery for Advanced Retrieval Development: Innovative Methods and Applications (Advances in Data Warehousing and Mining (Adwm) Book Series) by Tho Manh Nguyen
  99. New Trends in Data Warehousing and Data Analysis (Annals of Information Systems) by Stanislaw Kozielski and Robert Wrembel
  100. Data Warehousing with Service-oriented Architecture: Designing and Implementing Prototype Models For an Integration of Near-Real-Time Data Warehousing Architecture with Service-oriented Architecture by Ronnie Abrahiem
  101. Encyclopedia of Data Warehousing and Mining, Second Edition by John Wang
  102. IBM Data Warehousing: With IBM Business Intelligence Tools by Michael L. Gonzales
  103. Clickstream Data Warehousing by Mark Sweiger, Mark R. Madsen, Jimmy Langston, and Howard Lombard
  104. Intelligent Data Warehousing: From Data Preparation to Data Mining by Zhengxin Chen
  105. Data Stores, Data Warehousing, and the Zachman Framework: Managing Enterprise Knowledge (Mcgraw-Hill Series on Data Warehousing and Data Management) by William H. Inmon, John A. Zachman, and Jonathan G. Geiger
  106. Progressive Methods in Data Warehousing and Business Intelligence: Concepts and Competitive Analytics (Advances in Data Warehousing and Mining) by David Taniar
  107. Modern Data Warehousing, Mining, and Visualization: Core Concepts by George M. Marakas
  108. AS/400 Data Warehousing: The Complete Guide to Implementation by Brian W. Kelly
  109. Data Warehousing and Data Mining for Telecommunications (Artech House Computer Science Library) by Rob Mattison
  110. Data Warehousing : Design, Development and Best Practices by Soumendra Mohanty
  111. Exploration Warehousing: Turning Business Information into Business Opportunity by William H. Inmon, R. H. Terdeman, and Claudia Imhoff
  112. The Data Model Resource Book: A Library of Logical Data and Data Warehouse Designs by Len Silverston, William H. Inmon, and Kent Graziano
  113. Data Warehousing in the Real World (A Practical Guide for Building Decision Support Systems) by Dennis Murray and Sam Anahory
  114. Parallel Processing Techniques for Data Warehousing and Mining: Application and Challenges by Satchidananda Dehuri
  115. Essential Oracle8i Data Warehousing: Designing, Building, and Managing Oracle Data Warehouses by Gary Dodge and Tim Gorman
  116. The Essential Guide to Data Warehousing by Lou Agosta
  117. Data Warehousing OLAP and Data Mining by S. Nagabhushana
  118. Building the Customer-Centric Enterprise: Data Warehousing Techniques for Supporting Customer Relationship Management by Claudia Imhoff, Lisa Loftis, and Jonathan G. Geiger
  119. Data Warehousing: The Ultimate Guide to Building Corporate Business Intelligence (HOTT Guide) by SCN Education B.V.
  120. Data Warehousing and Knowledge Discovery: 9th International Conference, DaWaK 2007, Regensburg, Germany, September 3-7, 2007, Proceedings (Lecture Notes … Applications, incl. Internet/Web, and HCI) by Il Yeol Song, Johann Eder, and Tho Manh Nguyen
  121. Clinical Data Mining and Warehousing, An Issue of Clinics in Laboratory Medicine (The Clinics: Internal Medicine) by James Harrison Jr. MD PhD
  122. Using data warehousing to deliver integrated management information: Case studies of customer data integration using sales and marketing data marts by Shana Ponelis
  123. Data Warehousing and Knowledge Discovery: 6th International Conference, DaWaK 2004, Zaragoza, Spain, September 1-3, 2004, Proceedings (Lecture Notes in Computer Science) by Yahiko Kambayashi, Mukesh Mohania, and Wolfram Wöß
  124. Strategic Data Warehousing: Achieving Alignment with Business by Neera Bhansali
  125. Strategic Data Warehousing Principles Using SAS Software by Peter R. Welbrock
  126. Data Warehousing: The Route to Mass Communication by Sean Kelly
  127. Data Warehousing for E-Business by R. H. Terdeman, Joyce Norris-Montanari, Dan Meers, and William H. Inmon
  128. Data Warehousing and Knowledge Discovery: 10th International Conference, DaWak 2008 Turin, Italy, September 1-5, 2008, Proceedings (Lecture Notes in Computer … Applications, incl. Internet/Web, and HCI) by Il-Yeol Song, Johann Eder, and Tho Manh Nguyen
  129. Data Warehousing and Data Mining Techniques for Cyber Security (Advances in Information Security) by Anoop Singhal
  130. Data Warehousing and Decision Support: The State of the Art, Volume 1 by Pam Roth. There is also a Volume 2.
  131. Advances in Database Technologies: ER ’98 Workshops on Data Warehousing and Data Mining, Mobile Data Access, and Collaborative Work Support and Spatio-Temporal … (Lecture Notes in Computer Science) by Yahiko Kambayashi, Dik Lun Lee, Ee-Peng Lim, and Mukesh Kumar Mohania
  132. Data Warehousing and Web Engineering by Shirley A. Becker
  133. ERP and Data Warehousing in Organizations: Issues and Challenges by Gerald G. Grant
  134. Data Warehousing and Knowledge Discovery: 8th International Conference, DaWaK 2006, Krakow, Poland, September 4-8, 2006, Proceedings (Lecture Notes in … Applications, incl. Internet/Web, and HCI) by A Min Tjoa and Juan Trujillo
  135. Data Warehousing Advice for Managers by Patricia L. Ferdinandi
  136. Data Warehousing and the Management Accountant (CIMA Research) by Ian Cobb
  137. Data Warehousing and Knowledge Discovery: 4th International Conference, DaWaK 2002, Aix-en-Provence, France, September 4-6, 2002. Proceedings (Lecture Notes in Computer Science) by Yahiko Kambayashi, Werner Winiwarter, and Masatoshi Arikawa
  138. Oracle Data Warehousing Unleashed by Michael Schrader, John Dakin, Kieron Hardy, and Matthew Townsend
  139. Journal of Healthcare Information Management, E-Healthcare Data Warehousing Journal of Healthcare Information Management, No. 2: Journal of Healthcare … Health Care Information Mgmt) by Julie Foreman
  140. Worldwide Data Warehousing Tools 2004 Vendor Shares by Dan Vesset
  141. Constructing Data Warehouses with Metadata-driven Generic Operators by Dr Bin Jiang
  142. Testing the Data Warehouse Practicum by Doug Vucevic and Wayne Yaddow

Notes:

  1. You might think that a search for “data warehouse” on Amazon would also include “data warehousing”. That was what I thought, but sadly no. I don’t expect Amazon’s search to be smart enough to work out that the terms “ETL” or “Dimensional Model” have a lot to do with data warehousing either, hence my motive for creating this list. The same goes for the terms “ODS” and “data mart”.
  2. A data warehouse book as important as Inmon’s DW 2.0 was missed because the title doesn’t contain “Warehous*”. Sad. And Data Warehousing 101: Concepts and Implementation by Arshad Khan was missed when we searched for “Data Warehouse” on Amazon.
  3. I don’t limit myself to SQL Server. As you can see, I also include Oracle books. We can learn a lot about data warehousing from other platforms, particularly about ETL. In fact I learnt a lot from a book called “Oracle 8i Data Warehousing” (Corey et al., not Hobbs & Hillson). Informix, DB2, MySQL, AS/400 and SAS are all in there now.
  4. I don’t include a data modelling book in the list if it’s a general one. I only include it if it’s about dimensional modelling.
  5. I don’t include bundles, i.e. several books packaged and sold as one. An example is Kimball’s Toolkit bundle. The reason is that I have included the components individually.
  6. I don’t include a data mining book if it covers only data mining. But if it covers data warehousing as well, then I include it. See Alex Berson’s, for example. Ditto for MDM, BI, OLAP, DQ and Text Analytics. I do include Decision Support though (well, of course).
  7. Can you believe there are 123 books on data warehousing! That’s a lot of books for one area of study/work. And that excludes the things I mentioned above.
  8. If there are many editions of a book (like Inmon’s classic) I only include the latest one. The first edition is sometimes an absolute treasure, like Kimball’s 1996, but there you go. When it’s a rewrite for a different version of the software, I include them all. For example: Oracle 8i, 9i and 10g Data Warehousing.
  9. I do include conference proceedings and lecture notes, even though some people say they are not ‘real books’. I don’t care about the physical form (thin, thick, non-paper, etc.), as long as the content is data warehousing.
  10. Apologies, there are many DW books in German which I don’t include here, primarily because this is an English blog and I can’t write in German. Perhaps somebody else could make a list of these German DW books (there are really a lot of them; check on Amazon).
  11. I know there is a data warehousing book on MySQL. I know it exists because I know the author, Djoni Darmawikarta, who is also from Indonesia like me but lives in Canada now. So I’ll find it and put it here too.
  12. I own Barry Devlin’s warehousing book. It is very old and the binding is almost off, but the content is illuminating, primarily because it was written free from Inmon’s and Kimball’s influence, hence it defined its own principles of design. I’ll add it here.
  13. Intelligent Solution composed a comprehensive list of data warehousing articles, from 1993 to 2006.
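
The search problem described in Notes 1 and 2 can be illustrated with a small sketch. It compares a plain substring search for “data warehouse” with a match on the stem “Warehous*” plus a few related terms taken from the notes (ETL, ODS, data mart, dimensional model). The function names and the term list are assumptions made for this illustration; this is not how Amazon or Google actually search.

```python
import re

# Stem match: "warehous" followed by any word characters, so it covers
# warehouse, warehouses and warehousing (the "Warehous*" of Note 2).
STEM = re.compile(r"\bwarehous\w*", re.IGNORECASE)

# Related terms from Note 1. The word boundaries keep "ODS" from
# matching inside a word such as "Methods". This list is hypothetical.
RELATED = re.compile(
    r"\b(?:ETL|ODS|data marts?|dimensional model\w*)\b", re.IGNORECASE
)


def naive_match(title: str, query: str = "data warehouse") -> bool:
    """Plain substring search, the behaviour the notes complain about."""
    return query.lower() in title.lower()


def is_dw_title(title: str) -> bool:
    """True if the title contains the stem or one of the related terms."""
    return bool(STEM.search(title) or RELATED.search(title))


if __name__ == "__main__":
    for title in [
        "Data Warehousing 101: Concepts and Implementation",
        "The Data Warehouse ETL Toolkit",
        "Using Sales and Marketing Data Marts",
    ]:
        print(title, "|", naive_match(title), "|", is_dw_title(title))
```

The stem match picks up “Data Warehousing 101”, which the plain substring search misses. A title that uses none of these words would still be missed, which is why a hand-made list remains useful.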

My Book

I am sometimes asked by people who want to learn data warehousing to recommend a book. Some of them are database administrators or data architects (on various platforms) and some are developers (application developers and database developers). They know how to write SQL. They know how to create tables. They know how to query data. They are looking for a basic data warehousing book which is practical and aimed at beginners. A book that new starters can use to build their first data warehouse, and the BI on top of it. A book that covers all the essential topics such as methodology, architecture, data modelling, ETL, data quality, reports, cubes and BI. A book that contains examples and illustrations from real projects which are easy to understand. For this reason I wrote a data warehousing book: Building a Data Warehouse: with Examples on SQL Server (#12).

It has 17 chapters:

  • Chapter 1 is about what a data warehouse is
  • Chapter 2 is about data warehouse architecture
  • Chapter 3 is about methodology / project management
  • Chapter 4 is about gathering requirements
  • Chapter 5 is about designing the data model, both dimensional and normalised
  • Chapter 6 is about the system architecture/servers and configuring the databases
  • Chapter 7 is about ETL (extracting data from source systems)
  • Chapter 8 is also about ETL (loading data into the warehouse)
  • Chapter 9 is about data quality
  • Chapter 10 is about metadata
  • Chapter 11 is about reports
  • Chapter 12 is about OLAP cubes
  • Chapter 13 is about BI (Business Intelligence)
  • Chapter 14 is about using a data warehouse for CRM
  • Chapter 15 is about unstructured data and data warehousing search
  • Chapter 16 is about testing
  • Chapter 17 is about operation and administration

It contains all the essential topics in data warehousing. For the book to be usable for building the reader’s first data warehouse, and the BI on top of it, I needed to give a case study, one that contains examples spanning all those chapters, from designing the architecture to building the cubes and reports. For this purpose I had to choose a platform, and I chose SQL Server. Not only does it have an excellent database engine, it also comes with ETL, reports, OLAP cubes and a data mining tool built in. SQL Server 2005/2008 is a complete end-to-end data warehousing solution. So in chapter 6 I use the SQL Server database server to create the databases. In chapters 7 and 8 I use SSIS for data extraction and data loading (ETL). In chapter 10 I use a SQL Server database for metadata. In chapter 11 I use SSRS for reports. In chapter 12 I use SSAS for OLAP cubes. And in chapter 13 I use SSAS for data mining. I hope this book serves its purpose as a basic data warehousing book which is practical and aimed at beginners.
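
As a rough illustration of the flow outlined above (create the databases, load data through ETL, then report on it), here is a minimal sketch. It is not taken from the book and does not use SQL Server, SSIS, SSRS or SSAS; it assumes Python's built-in `sqlite3` module as a stand-in, and all table names, column names and data rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create a tiny dimensional model: two dimensions and one fact table.
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT UNIQUE
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year INTEGER,
    month INTEGER
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    quantity INTEGER,
    amount REAL
);
""")

# "Extract": rows as they might arrive from a source system (made-up data).
source_rows = [
    ("Widget", 2019, 1, 3, 30.0),
    ("Widget", 2019, 2, 1, 10.0),
    ("Gadget", 2019, 1, 2, 50.0),
]


def product_key(name):
    """Return the surrogate key for a product, inserting it if new."""
    row = conn.execute(
        "SELECT product_key FROM dim_product WHERE product_name = ?", (name,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO dim_product (product_name) VALUES (?)", (name,)
    )
    return cur.lastrowid


def date_key(year, month):
    """Return a YYYYMM key, adding the date row if it is missing."""
    key = year * 100 + month
    conn.execute(
        "INSERT OR IGNORE INTO dim_date VALUES (?, ?, ?)", (key, year, month)
    )
    return key


# "Load": resolve the dimension keys, then insert the fact rows.
for name, year, month, qty, amount in source_rows:
    conn.execute(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        (product_key(name), date_key(year, month), qty, amount),
    )

# "Report": total quantity and amount per product.
report = conn.execute("""
SELECT p.product_name, SUM(f.quantity), SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY p.product_name
ORDER BY p.product_name
""").fetchall()

for row in report:
    print(row)
```

The same shape (dimension tables keyed by surrogate keys, a fact table that references them, and aggregate queries on top) carries over to other platforms; only the tooling around it changes.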

