Today, BW/4HANA 2.0 has been released. It is a major update to version 1.0, which was shipped two years ago. With 1.0 SP8, the BW/4HANA Cockpit emerged as the new entry point to the product, effectively replacing the classic BW Admin Workbench. It is based on the Fiori Launchpad paradigm and technology. As such, it works in any browser, on PCs and touchscreen devices. For example, a BW/4HANA admin can now easily check from their smartphone whether process chain runs have succeeded and data has been released to end users.
In the 3-min video below, you can see the new version of the BW/4HANA Cockpit that is shipped with 2.0. Jascha shows a glimpse of what’s improved and what’s new. My three personal highlights are:
- the tiles are not only buttons but also surface useful data about the underlying functionality, e.g. the number of InfoProviders; the Cockpit is a launchpad and a dashboard at the same time,
- the addition of the Data Privacy Workbench, e.g. to propagate GDPR-relevant changes that have been processed in an S/4HANA source system to the BW/4HANA instance,
- the notifications feature, which immediately flags alerts to which I have subscribed.
PS: The BW/4HANA 2.0 documentation can be found here.
Here is a short video (2:47 min) showing the new BW/4HANA Cockpit as shipped with 1.0 SP8 on 26 Mar 2018. It serves as the new entry point for the BW/4HANA admin, meaning that you can access monitors and related dashboards, the scheduler, modeling editors etc. It is based on the Fiori Launchpad paradigm and technology. Consequently, it can easily be personalised to your specific needs, e.g. by adding a tile showing the logs of “your” process chains. It works on traditional laptops, smartphones or – as in the video – on a giant touchscreen. It can be considered a replacement for the Admin Workbench (RSA1).
We will provide more detailed videos on the capabilities of the BW/4HANA cockpit soon. More on BW/4HANA 1.0 SP8 can be found in this slide deck.
From Sep 24 to 29, the 5th edition of the Heidelberg Laureate Forum (HLF) took place, mainly on the premises of the University of Heidelberg. It brings together laureates in mathematics and computer science with young researchers. The program roughly looks like this: during the mornings, the laureates give presentations on their field, their experience, their opinion on where things are heading, interesting research questions etc. During breaks, workshops, poster sessions, social and other events, there is plenty of room to pick up one or the other aspect from those presentations and have a fruitful discussion with some of the leading brains in the field. One of my favourite moments was a coincidental lunchtime chat with Sir Michael Atiyah on a variety of topics; it is that variety that makes the HLF so fascinating.
Material from the HLF is available on the internet:
- Recordings of the presentations, including last year’s, can be found on the HLF YouTube channel.
- Blogs on the HLF 2017 can be found here.
- Tweets on the HLF 2017 can be found under the #HLF17 hashtag and from the HLF’s official Twitter account @HLForum.
- Finally, there is a photo gallery on Flickr.
I could not attend all the presentations this year. But from those that I could attend, I recommend the following 3; this is an admittedly subjective selection, as all presentations had impressive content:
- Jeffrey A. Dean: “Deep Learning and the Grand Engineering Challenges”. Jeff gave a huge set of insights and examples on the potential of deep learning.
- John E. Hopcroft: “Deep Learning Research”. If you wonder about the relationship between maths and computer science then watch the first 5 minutes of John’s presentation. Later, he showed some fascinating potential of neural networks in image processing.
- Aaron Ciechanover: “The Personalized Medicine Revolution: Are We Going to Cure all Diseases and at What Price?”. Aaron presented this year’s Lindau lecture. He talked about exiting the era where the treatment of many diseases is “one size fits all” and entering a new era of “personalized medicine”, where the treatment is tailored to the patient’s molecular/mutational profile.
Many companies currently complement their existing relational data warehouses with big data components, such as Spark, HDFS, Kafka, S3, … This leads to a new form of data warehouse (DW) that we call big data warehouse (BDW). This blog elaborates how BW/4HANA and the SAP Data Hub (DH) are a perfect match for building a BDW.
The idea of a BDW is prevailing in many companies and industries. This blog describes a BDW built at Netflix, this one a BDW at Sears. Many more can be found on the web. All those examples show how big data storage and processing environments complement traditional relational data warehouses by providing
- an easy way to process semi- and unstructured data, such as photos, videos, sound and text,
- inexpensive storage for fine-granular data, e.g. from sensors and logs.
Figure 1 shows a generic setup of a BDW. Usually, there are 2 to 3 storage layers involved; sometimes, the first two are collapsed into one:
- an ingestion layer: inexpensive storage to collect data from many sources, e.g. thousands of sensors; Amazon’s S3 is frequently used here,
- a processing and refinement layer for distributed processing of large and/or many files,
- a relational DW: this layer serves to provide semantically rich and well-structured data to business users who use analytic client tools for interactive analyses.
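To make the three layers more concrete, here is a minimal sketch in plain Python – purely illustrative, not SAP code. The layer boundaries mirror figure 1: raw events land as-is in the ingestion layer, the refinement layer parses and aggregates them, and the relational layer exposes well-structured rows. All names and the sensor payload are made up for the example.

```python
# Illustrative sketch of the three BDW layers from figure 1.
# Not SAP code; all names and payloads are hypothetical.
import json
from statistics import mean

# 1. Ingestion layer: raw, fine-granular sensor events stored as-is
#    (think: JSON files landing in an object store such as S3).
raw_events = [
    json.dumps({"sensor": "s1", "temp_c": 21.5}),
    json.dumps({"sensor": "s1", "temp_c": 22.1}),
    json.dumps({"sensor": "s2", "temp_c": 19.8}),
]

# 2. Processing/refinement layer: parse, cleanse and aggregate per sensor.
def refine(events):
    parsed = [json.loads(e) for e in events]
    by_sensor = {}
    for rec in parsed:
        by_sensor.setdefault(rec["sensor"], []).append(rec["temp_c"])
    return {s: round(mean(vals), 2) for s, vals in by_sensor.items()}

# 3. Relational DW layer: semantically rich, well-structured rows
#    ready for analytic client tools.
def to_rows(aggregates):
    return [{"sensor_id": s, "avg_temp_c": t} for s, t in sorted(aggregates.items())]

rows = to_rows(refine(raw_events))
print(rows)
```

In a real BDW each arrow between the layers is, of course, a distributed system (e.g. Spark jobs writing into a relational DW) rather than a function call, but the division of responsibilities is the same.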
Many SAP customers are on the same trajectory as described in the Netflix and Sears examples. All of them have run a relational DW for many years and are now evolving and complementing it with big data components. BW and BW-on-HANA are capable of playing the role of the relational DW in such an environment through various connection options. However, BW/4HANA’s ambition is to go beyond this and be well integrated with SAP’s Data Hub. The latter manages the ingestion and processing layers on the left of figure 1. This is outlined in figure 2, which shows the pattern of figure 1 implemented with SAP software components.
Now, what does this tight integration between BW/4HANA and SAP’s Data Hub mean? What are the specifics? This is shown in figure 3 and comprises the following features:
- Workflows between the 2 environments can be triggered mutually: a Data Hub data pipeline can be part of a BW/4HANA process chain and vice versa.
- Data movement between BW/4HANA and the Data Hub – or technically: between HANA and VORA – is highly optimized for performance, e.g. data types are aligned to reduce the overhead of type casting.
- The repositories of BW/4HANA and the Data Hub will be integrated and interoperate to enable common transports, lineage and impact analysis.
- In the area of data tiering, VORA is leveraged for archiving (cold store) of BW/4HANA data with high data throughput and fast read access.
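The data tiering idea in the last bullet can be illustrated with a tiny routing sketch – again purely illustrative, not SAP code: partitions older than a (hypothetical) retention threshold are routed to the inexpensive cold store, recent ones stay in the hot store.

```python
# Illustrative sketch of temperature-based data tiering.
# Not SAP code; the 90-day threshold is a made-up example value.
from datetime import date

HOT_DAYS = 90  # hypothetical retention threshold for the hot store

def tier_for(partition_date, today):
    """Decide whether a partition belongs in the hot or the cold store."""
    return "hot" if (today - partition_date).days <= HOT_DAYS else "cold"

today = date(2018, 6, 1)
partitions = [date(2018, 5, 20), date(2017, 1, 15)]
print([tier_for(p, today) for p in partitions])  # ['hot', 'cold']
```

In BW/4HANA this decision is of course made by the data tiering configuration rather than by hand-written code; the sketch only shows the routing principle.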
What already exists today and what is planned to be shipped at what time is described in the roadmap shown in figure 4. Click the picture to enlarge.
In times of digitalization and the Internet of Things, traditional, relational data warehouses are complemented with tooling, engines and infrastructure from the big data area. This leads to “big data warehouses”, sometimes also labeled “modern data warehouses”. BW/4HANA and the SAP Data Hub are a perfect match in that respect.
HANA promises to cater for both OLTP and OLAP workloads. This makes it possible to provide operational analytics within an S/4HANA system. The SAP-focused reader might wonder why on earth you would still want a BW/4HANA system in your landscape. This blog looks at 3 anonymised customer examples that reveal why having a data warehouse – such as BW/4HANA – is even more pressing in times of digitalisation than ever before. A data warehouse is thereby considered the place that brings data and its underlying semantics from a variety of sources together in one place: either physically, virtually or mixed; either using an RDBMS, a big data environment or a combination thereof; either deployed on premise or in the cloud.
Example 1: Consumer Goods Customer
The first example comes from a leading consumer goods company. Figures 1a and 1b show details from 2 of their slides and list the sources of data that feed into their data warehouse. As expected, there are a number of traditional SAP systems, such as ERP (S/4), CRM and APO, but also – as has become common in the days of digitalisation – data from sensors, logs, and digitalised sales and marketing. Bringing that data together semantically – for example, to understand the impact of digital marketing on financial results – becomes mandatory. You need a system equipped with the tooling and mechanisms (modeling, security, transformation, connectivity, lifecycle, monitoring, governance in general) that allow this semantic consolidation. This is exactly what a data warehouse does. BW/4HANA provides this infrastructure, while S/4HANA focuses on certain business processes.
Example 2: Fashion Customer
The second example is from a fashion customer that sells its products predominantly via brick-and-mortar stores but increasingly online. The latter triggers the need to look into more and more online behavioural data, such as clickstream or social media information, in order to answer questions such as which products a customer has shown interest in, or how the brand is perceived. Fig. 2 lists the data sources that this company is analysing. One aspiration is to predict demand better by better understanding a customer’s interest indicators from clickstreams and social media. That in turn can feed demand, supply and other planning in, e.g., a BW/4HANA system.
Example 3: Oil and Gas Customer
The third example is from an oil and gas customer. Fig. 3 shows the data sources that they connect to their data warehouse. There is obviously a mix of SAP and non-SAP sources: for instance, data on seismic measurements, oil rig sensor information, drill status (both for predictive maintenance), oil well status etc. Again, there are a number of scenarios or analytic questions that require combining such data with data from an SAP system. To that end, a data warehouse approach is required. Simply copying such data into the HANA system underlying an S/4HANA instance would fall short in many ways: you would still end up creating a data warehouse on HANA that coincidentally sits on the same HANA as the S/4HANA instance.
These 3 real-world examples show that modern analytics requires data from an even larger variety of data sources than ever before. Big data, IoT, digitalisation etc. are trends that have added to that variety. Integrating data from those sources is more than just copying it into, or exposing it logically in, one location. The need for a data warehouse remains: it is the place that brings the data together (physically or logically) and semantically integrates it through transformation, harmonisation, synchronisation etc. This is complemented by operational analytics inside a single operational system, such as S/4HANA, which focuses on and analyses the data in that system in isolation.
Hasso’s SAPPHIRE NOW 2017 Keynote Comments
Hasso commented in his SAPPHIRE NOW 2017 keynote (see here, at 0:37 to 0:39) that “he fought against data warehouses in the 1990s”. However, he also states that “there is still an application for data warehouses”. He then elaborates that not all analytics has to sit in a data warehouse.
This is exactly the distinction and the point argued in this blog, namely: there is operational analytics (directly inside an operational system like S/4HANA and not necessarily in a data warehouse) and there is cross-system analytics (which needs something like a data warehouse). The latter is a problem that is not addressed by S/4HANA but that exists in the real world – see the customer examples above – and that is addressed by BW/4HANA.
In a recent blog, I have introduced the Data Warehousing Quadrant, a problem description for a data platform that is used for analytic purposes. The latter is called a data warehouse (DW) but labels, such as data mart, big data platform, data hub etc., are also used. In this blog, I will map some of the SAP products into that quadrant which will hopefully yield a more consistent picture of the SAP strategy.
To recap: the DW quadrant has two dimensions. One indicates the challenges regarding data volume, performance, query and loading throughput and the like. The other shows the complexity of the modeling on top of the data layer(s). A good proxy for that complexity is the number of tables, views, data sources, load processes, transformations etc. Big numbers indicate many dependencies between all those objects and, thus, high effort when things get changed, removed or added. But it is not only the effort: there is also a higher risk of accidentally changing, for example, the semantics of a KPI. Figure 1 shows the space outlined by the two dimensions, divided into four subcategories: the data marts, the very large data warehouses (VLDWs), the enterprise data warehouses (EDWs) and the big data warehouses (BDWs).
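Why the object count is a good proxy for complexity can be made tangible with a toy dependency graph – an illustrative sketch with made-up object names, not an actual BW repository model: changing one object potentially affects everything downstream of it, so the more objects there are, the larger the blast radius of any change.

```python
# Toy dependency graph between DW objects (hypothetical names).
# Each entry maps an object to the objects that consume it.
from collections import deque

deps = {
    "src_table":     ["transform_1"],
    "transform_1":   ["dso_sales"],
    "dso_sales":     ["cube_sales", "view_kpi"],
    "cube_sales":    ["query_revenue"],
    "view_kpi":      ["query_revenue"],
    "query_revenue": [],
}

def impacted(obj):
    """All downstream objects affected when `obj` is changed (BFS)."""
    seen, queue = set(), deque(deps.get(obj, []))
    while queue:
        nxt = queue.popleft()
        if nxt not in seen:
            seen.add(nxt)
            queue.extend(deps.get(nxt, []))
    return seen

# Changing the source table touches every downstream object:
print(sorted(impacted("src_table")))
```

In a repository with thousands of such objects, this transitive impact set is exactly what makes changes expensive and risky – which is the intuition behind using the object count as the complexity axis of the quadrant.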
Now, there are several SAP products that are relevant to the problem space outlined by the DW quadrant. Some observers (customers, analysts, partners, colleagues) would like SAP to provide a single answer or a single product for that problem space. Fundamentally, that answer is HANA. However, HANA is a modern RDBMS; a DW requires tooling on top. So, more is required than just HANA. Figure 2 assigns SAP products / bundles to the respective subquadrants. The assignment is meant as a flexible rule of thumb rather than a hard classification. For example, BW/4HANA can play a role in more than just the EDW subquadrant; we will discuss this below. Nonetheless, it becomes clear where the sweet spot or focus area of each product lies.
From a technical and architectural perspective, there are many relationships between those SAP products. For example, operational analytics in S/4 heavily leverages the BW embedded inside S/4. Another example is BW/4HANA’s ability to combine with any SQL object, such as SQL-accessible tables, views, procedures / scripts. This allows smooth transitions or extensions of an existing system into one or the other direction of the quadrant. Figure 3 indicates such transition and extension options:
Data Mart → VLDW: This is probably the most straightforward path as HANA has all the capabilities for scale-up and scale-out to move along the performance dimension. All products listed in the data mart subquadrant can be extended using SQL based modeling.
Data Mart → EDW: S/4 uses BW’s analytic engine to report on CDS objects. Similarly, BW/4HANA can consume CDS views, either via the query or, in many cases, also for extraction purposes. Native HANA data marts combine with BW/4HANA similarly to the HANA SQL DW (see the VLDW ⇆ EDW case below).
VLDW ⇆ EDW: Here again, I refer you to the blog describing how BW/4HANA can combine with native SQL. This allows BW/4HANA to be complemented with native SQL modeling and vice versa!
VLDW or EDW → BDW: Modern data warehouses incorporate unstructured and semi-structured data that gets preprocessed in distributed file or NoSQL systems that are connected to a traditional (structured), RDBMS based data warehouse. The HANA platform and BW/4HANA will address such scenarios. Watch out for announcements around SAPPHIRE NOW 😀
The possibility to evolve an existing system – located somewhere in the space of the DW quadrant – to address new and/or additional scenarios, i.e. to move along one or both dimensions, is an extremely important and valuable asset. Data warehouses do not remain static; they are permanently evolving. This means that investments are secure, and so is the ROI.