EN – Understanding SQL Server Analysis Services

EN – Understanding SQL Server Analysis Services

1. Introduction:

The objective of this paper is to provide a better understanding on the SSAS and necessity towards that. This article also highlights the SSAS capabilities.

2. Revisiting Analysis services

Before getting into to Analysis services a quick recap on why OLAP is essential.

2.1. Online analytical processing (OLAP):

OLAP is an abbreviation for Online Analytical Processing. OLAP accomplishes multidimensional analysis of business data and provides better capability for complex calculations, trend analysis, and refined data modeling. It typically helps business decision makers analyze their data to expose trends that might not be discovered when viewing data in normal standard reports.

In the software market all leading providers such as Oracle, Micro strategy, SAP, IBM have released their OLAP servers. This article deals about the Microsoft OLAP and its subsequent releases.

Microsoft SQL Server Analysis Services popularly known as SSAS has online analytical processing (OLAP) and data mining functionality for business intelligence applications.

SSAS supports OLAP by allowing to design, create, and manage multidimensional structures that contain data aggregated from other data sources, such as relational databases. For data mining applications, Analysis Services allow to design, create, and visualize data mining models that are constructed from other data sources by using a wide variety of industry-standard data mining algorithms.

2.2 Multidimensional Data

Data becomes a progressively happening topic in the business world as the amount of data that an organization logs, stores, analyzes, and utilizes continues to grow. Storing and accessing relevant data is imperative for reporting company performance and planning for growth and development in the future. However, outside of having an IT or programming background, comprehending the technology options for data storage can be a challenge. The options are few, but their functionalities vary, and depending on the various Business Intelligence (BI) requirements that need to be met, the product may require a certain data storage solution.

Predominately there are 3 types of data

  • One Dimensional data
  • Two Dimensional data
  • Multidimensional data

2.2.1 One Dimensional data:

The below figure decipts a one dimensional data, consider it as an axis and its position starts with 0, 1, 2, 3… The value available in position 3 is CA and it is straight forward and easy to represent.

Não foi fornecido texto alternativo para esta imagem

2.2.2 Two dimensional Data:

The figure depicts a two dimensional data with X axis and Y axis. For example the position (1, 1) would return [VA, 700] and position (3,-2) would return [NJ, 900]. We need two coordinates for two dimensional data.

No alt text provided for this image

Representing the two dimensional in our relational systems perspective will as mentioned below. For instance if we need know the customer from United Kingdom we can get it from the table.

No alt text provided for this image

2.2.3 Multidimensional Data:

The multidimensional figure have 3 dimensions: For instance (1) Time in quarters, (2) Sales amount in USD, and (3) regions. We can use these three dimensions to define a space of the sales of a specific product over a specific period of time, measured in quarters.

No alt text provided for this image

The primary reason for building an Analysis Services multidimensional model is to achieve fast performance of ad hoc queries against business data. A multidimensional model is composed of cubes and dimensions that can be annotated and extended to support complex query constructions. BI developers create cubes to support fast response times, and to provide a single data source for business reporting. Given the growing importance of business intelligence across all levels of an organization, having a single source of analytical data ensures that discrepancies are kept to a minimum, if not eliminated entirely.

Another pictorial example for accessing data in multi-dimensional

No alt text provided for this image

In a nutshell OLAP is necessary for as it helps to generate Business Reports without lot of time to load. At the same time creating Business Reports over Transactional systems (OLTP- Relational DBMS) data is very bad idea: Ad-hoc reporting over OLAP systems brings more clarity.

3. Multidimensional & Tabular Model

Microsoft has been a market front-runner in OLAP technology for many years. Microsoft’s technology is very established and scalable. However, in the last few years, new players have entered the BI market with the assurance to finally deliver “BI for the crowds”. These new players have provided BI tools with extremely interactive and graphical user interfaces built on in-memory architectures and associative paradigms in terms of selecting data in the analysis. They address business users. These tools have become very popular and have started to become important competitors for Microsoft especially in the midsized business segment.

The Multidimensional model is nothing but the traditional OLAP Cube which we discussed in the last figure.  This model exists for more than ten years. OLAP technology organizes summary data into multidimensional structures. Aggregations are stored in the multidimensional structure in cells at coordinates specified by the dimensions.

This does not mean that Microsoft is abandoning disk-based OLAP technology. Microsoft communicated at the end of 2011 that it will keep on investing in OLAP technology to maintain its position as market leader. Therefore, in Analysis Services 2012, we find two predominate different products:

No alt text provided for this image

These two models have the similar goal to provide a semantic layer on top of the Data Warehouse with high performance capabilities that allows the end-users to dig into the data. The fact that these two models are part of Analysis Services could give the impression that both models are similar and that you can easily switch from one model to another. The reality is that they are two different products with two totally different design experiences and underlying data-architectures. They can co-exist on the same machine but they both have their own instances.

The Tabular model (In-Memory Cube). Tabular models are in-memory databases in Analysis Services. Using state-of-the-art compression algorithms and multi-threaded query processing, the Xvelocity engine delivers fast access to tabular model objects and data through reporting client applications such as Microsoft Excel and Microsoft Power View.

4. Advantages for SSAS cube

Given a reporting tool and having the option to link to tables in a data warehouse or an SSAS cube, what would be the benefits if it’s linked to the SSAS cube?

  • Speed: Aggregating (Summarizing) the data for performance: During cube processing SSAS will pre-calculate and physically stores aggregations of facts (Amounts, Quantities, Money). These aggregations, for example Turnover by Year and Region, are used when a business user queries the cube for this type of information. Therefore the query response time can be very short. When the query is fired SSAS does not have to calculate the outcome from the underlying details (like T-SQL has to do), but can take the values directly from the stored aggregations. Besides that SSAS stores query-results in a cache. So the next time the same type of query is fired, it will try to get it from the cache. Speed is especially important for a dashboard that an executive is using to slice-and-dice as any mouse click that takes more than a few milliseconds to return data will draw complaints
  • Multidimensional analysis – slice, dice, drilldown: This very much depends on the tool or front end that is layered over the data, but the idea is that you can very quickly navigate around the data, finding trends, spotting patterns, ‘drilling down’, ‘slicing and dicing’ – all key to the concept of cubes. Allowing the user to intuitively ‘wander’ around the data, not even realizing that they performing analysis
  • Can store Hierarchies
  • Can usual handle more concurrent users than what the data warehouse can handle
  • No need to join the fact and dimension tables, as this will be done in the cube
  • Built-in advanced time-calculations – i.e. 12-month rolling average: It’s very easy to implement advanced time calculations like 12-month rolling average, year-to-date and references to parallel periods in previous years. This is typically the stuff decision-makers in the organization want to have. Imagine how many T-SQL queries are required for calculating rolling averages for each of the previous 12 months (2016 May -2017 April) => 12. Using the cube as a data source => Only 1
  • Easily use Excel to view data via Pivot Tables
  • Security: You can use the security setting to give end-users access to only those parts (slices) of the cube relevant to them
  • Automatically handles Slowly Changing Dimensions (SCD)
  • Built-in support for KPI’s
  • Ability to automatically link and display the records that make up an aggregation (“Show Details”)
  • Support for drill through actions such as generating an SSRS report or linking to a URL based on the value selected
  • Reinforces best practice by requiring you to build a good data model (star schema) that’s created for business reporting and analysis. So you are creating an abstraction/semantic layer between the data warehouse and the end user to make it easier for them to query data
  • Can use tools Performance Point and Power View
  • Can use Analysis Services Data Mining
João Barros