In a recent blog, I introduced the Data Warehousing Quadrant, a problem description for a data platform that is used for analytic purposes. Such a platform is usually called a data warehouse (DW), but labels such as data mart, big data platform, data hub etc. are also used. In this blog, I will map some of the SAP products into that quadrant, which will hopefully yield a more consistent picture of the SAP strategy.
To recap: the DW quadrant has two dimensions. One indicates the challenges regarding data volume, performance, query and loading throughput and the like. The other one shows the complexity of the modeling on top of the data layer(s). A good proxy for the complexity is the number of tables, views, data sources, load processes, transformations etc. Big numbers indicate many dependencies between all those objects and, thus, high effort when things get changed, removed or added. But it is not only the effort: there is also a higher risk of accidentally changing, for example, the semantics of a KPI. Figure 1 shows the space outlined by the two dimensions. The space is then divided into four subcategories: the data marts, the very large data warehouses (VLDWs), the enterprise data warehouses (EDWs) and the big data warehouses (BDWs).
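To make the two dimensions a bit more tangible, here is a minimal sketch of how a system could be placed into one of the four subquadrants. The two thresholds, the `DataPlatform` class and the `classify` function are purely hypothetical and only serve as an illustration; the quadrant itself does not define hard cut-offs, and data volume is just one of several possible proxies for the first dimension.

```python
from dataclasses import dataclass

# Hypothetical cut-offs, chosen only for illustration.
# The DW quadrant does not prescribe any concrete numbers.
VOLUME_THRESHOLD_TB = 10.0
COMPLEXITY_THRESHOLD_OBJECTS = 1000


@dataclass
class DataPlatform:
    name: str
    data_volume_tb: float  # proxy for the volume / performance dimension
    modeled_objects: int   # tables, views, data sources, load processes, ...


def classify(platform: DataPlatform) -> str:
    """Return the subquadrant of the DW quadrant for the given platform."""
    high_volume = platform.data_volume_tb >= VOLUME_THRESHOLD_TB
    high_complexity = platform.modeled_objects >= COMPLEXITY_THRESHOLD_OBJECTS
    if high_volume and high_complexity:
        return "BDW"
    if high_volume:
        return "VLDW"
    if high_complexity:
        return "EDW"
    return "data mart"


if __name__ == "__main__":
    for p in (
        DataPlatform("sales mart", 0.5, 40),
        DataPlatform("sensor store", 200.0, 60),
        DataPlatform("corporate DW", 3.0, 5000),
    ):
        print(p.name, "->", classify(p))
```

In practice the boundaries are fluid, so the result is a rule of thumb rather than a strict label.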
Now, there are several SAP products that are relevant to the problem space outlined by the DW quadrant. Some observers (customers, analysts, partners, colleagues) would like SAP to provide a single answer or a single product for that problem space. Fundamentally, that answer is HANA. However, HANA is a modern RDBMS; a DW requires tooling on top. So, something more is required than just HANA. Figure 2 assigns SAP products / bundles to the respective subquadrants. The idea behind that is to be a “flexible rule of thumb” rather than a hard assignment. For example, BW/4HANA can play a role in more than just the EDW subquadrant. We will discuss this below. However, it becomes clear where the sweet spots, or focus areas, of the respective products are.
From a technical and architectural perspective, there are many relationships between those SAP products. For example, operational analytics in S/4 heavily leverages the BW embedded inside S/4. Another example is BW/4HANA’s ability to combine with any SQL object, such as SQL-accessible tables, views, procedures / scripts. This allows smooth transitions or extensions of an existing system in one or the other direction of the quadrant. Figure 3 indicates such transition and extension options:
Data Mart → VLDW: This is probably the most straightforward path as HANA has all the capabilities for scale-up and scale-out to move along the performance dimension. All products listed in the data mart subquadrant can be extended using SQL based modeling.
Data Mart → EDW: S/4 uses BW’s analytic engine to report on CDS objects. Similarly, BW/4HANA can consume CDS views either via the query or, in many cases, also for extraction purposes. Native HANA data marts combine with BW/4HANA in the same way as the HANA SQL DW does (see VLDW ⇆ EDW below).
VLDW ⇆ EDW: Here again, I refer you to the blog describing how BW/4HANA can combine with native SQL. This allows BW/4HANA to be complemented with native SQL modeling and vice versa!
VLDW or EDW → BDW: Modern data warehouses incorporate unstructured and semi-structured data that gets preprocessed in distributed file or NoSQL systems that are connected to a traditional (structured), RDBMS based data warehouse. The HANA platform and BW/4HANA will address such scenarios. Watch out for announcements around SAPPHIRE NOW 😀
The possibility to evolve an existing system, located somewhere in the space of the DW quadrant, to address new and/or additional scenarios, i.e. to move along one or both dimensions, is an extremely important and valuable asset. Data warehouses do not remain static; they are permanently evolving. This means that investments are secure and so is the ROI.