Association Rule Mining is also known as Market Basket Analysis, mainly because it is used to find the items that customers buy together while shopping. The most popular Association Rule Mining example is the story of a supermarket chain in the US. It is said that the chain found that customers who buy beer also tend to buy nappies for their kids. After this finding, management decided to move the beer pallet close to the nappy pallet.
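To make the beer and nappies story concrete, here is a minimal sketch of the two measures usually reported for an association rule, support and confidence. The baskets are made-up illustrative data and the helper names `support` and `confidence` are our own; this is not how SQL Server computes its rules internally.

```python
# Hypothetical toy baskets; not real supermarket data.
baskets = [
    {"beer", "nappies", "crisps"},
    {"beer", "nappies"},
    {"beer", "bread"},
    {"nappies", "milk"},
    {"bread", "milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Share of baskets with the antecedent that also hold the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


# Rule under test: {beer} -> {nappies}
rule_support = support({"beer", "nappies"}, baskets)
rule_confidence = confidence({"beer"}, {"nappies"}, baskets)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

In this toy data, two of the five baskets contain both items, and two of the three beer baskets also contain nappies, so the rule holds for about two thirds of beer buyers.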

Author:Sagor Kekus
Language:English (Spanish)
Published (Last):10 January 2006

For that, I have put the below screenshot from YouTube. These measures are analyzed using the Product Category and Time dimensions. If we look further at the Time dimension, we will see that year and month are its main attributes. Following is the Solution Explorer for the sample project created in the previous screen. Let us create a data source for the sample project.

We will be using the AdventureWorksDW sample database in this article. The next step is to create a data source view, in which we select the required fact and dimension tables. There is an easy way to add all the above tables with the fewest clicks. First, select the FactInternetSales fact table and move it to the right-hand side, then click the Add Related Tables button.

This also adds another fact table, FactInternetSalesReason, which should be removed as it is not required in this example. After these few clicks, you will end up with the above screen. The following screen shows the star schema for the chosen data source view. Since foreign key constraints are implemented in these tables, the relationships are created automatically.

If the foreign key constraints are not implemented, you need to create the relationships manually. Right-click the Cube node and select New Cube…. This will take you through the cube creation wizard.

First, you need to choose the measure columns. Measures are the core element of the dimensional model. Measures are data values that can be aggregated, for example by sum, average, or minimum. Let us see how to choose a measure from the following screenshot. As we know, FactInternetSales is the measure group table. If you are not sure, which should not be the case, click the Suggest button.
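The idea of a measure being aggregated along a dimension can be sketched in a few lines of plain Python. The rows, the column layout, and the `aggregate` helper below are hypothetical and only illustrate the concept; they are not taken from AdventureWorksDW.

```python
from collections import defaultdict

# Hypothetical fact rows: (product category, year, sales amount).
fact_rows = [
    ("Bikes", 2012, 1200.0),
    ("Bikes", 2013, 800.0),
    ("Accessories", 2012, 50.0),
    ("Accessories", 2013, 150.0),
    ("Accessories", 2013, 100.0),
]


def aggregate(rows, key_index, measure_index=2):
    """Group the measure by one dimension column and aggregate it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[measure_index])
    return {
        key: {
            "sum": sum(values),
            "avg": sum(values) / len(values),
            "min": min(values),
        }
        for key, values in groups.items()
    }


by_category = aggregate(fact_rows, key_index=0)  # analyze by product category
by_year = aggregate(fact_rows, key_index=1)      # analyze by year
print(by_category)
print(by_year)
```

The same measure (sales amount) is summarized in different ways depending on which dimension is chosen, which is what the cube lets users do interactively.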

The Suggest button will suggest the measure groups for you. The next step is to select the measure columns from the following screen. It is important to choose only the required measure columns. If unnecessary columns are selected, cube processing will take longer. In the above example, we have eliminated the Revision Number column, which is not a business measure column. A dimension is a collection of reference information by which measures can be analyzed in detail.

From the following screen, you can choose the required dimensions and modify them as shown below. Though the cube configuration is complete, every dimension is empty, so it is important to add attributes to the dimensions. It is essential to add only the required attributes. Otherwise, cube processing will take longer and the cube will be larger. Larger cubes are also slower to access.

Apart from the attributes, hierarchies can be created so that users can analyze data much more effectively. This means all the measures and dimensions are stored in the cube after processing. Since all the data is stored in the cube, data access is very fast as no further processing is required at query time. After the cube is processed, it is ready to be accessed.
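Why stored aggregates make access fast can be illustrated with a small sketch. It assumes a two-level Year > Month hierarchy and made-up rows, and the `process_cube` function is our own name; the actual storage format used by Analysis Services is not shown here.

```python
from collections import defaultdict

# Hypothetical rows: (year, month, sales amount).
rows = [
    (2012, 1, 100.0),
    (2012, 1, 50.0),
    (2012, 2, 70.0),
    (2013, 1, 30.0),
]


def process_cube(rows):
    """Precompute totals at every level of a Year > Month hierarchy."""
    cube = defaultdict(float)
    for year, month, amount in rows:
        cube[()] += amount               # grand total
        cube[(year,)] += amount          # year level
        cube[(year, month)] += amount    # month level within a year
    return dict(cube)


cube = process_cube(rows)

# After processing, each query is a lookup rather than a scan of the rows.
print(cube[()])           # all sales
print(cube[(2012,)])      # drill down to a year
print(cube[(2012, 1)])    # drill down to a month
```

The cost of scanning the fact rows is paid once during processing, and every additional attribute or hierarchy level adds more stored totals, which is one way to see why only the required attributes should be added.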

There are multiple ways to access the processed cubes. The following screenshot shows how to access the cube using Visual Studio itself. Here, it is simply a matter of dragging and dropping the columns, and you will see the necessary data quickly. Since pivot tables are widely used by business users, they can leverage Excel features on top of the cubes. In Excel pivot tables, you can use columns as well as rows to select dimensions.

This means you can perform ad-hoc analysis much more easily. In addition to simple analysis, users can use the created hierarchies as shown in the below screenshot. A special query language, MDX, can also be used to retrieve data from cubes. Those options need to be discussed in separate articles. Even without those advanced options, the OLAP cube is an important option that is available to end users.


Dinesh Asanka

Microsoft Clustering is an unsupervised learning technique. In supervised learning, there is a target variable that is already labeled. In unsupervised learning, there is no such previously set variable. Clustering is used to find natural groupings in a data set that are not otherwise apparent.
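As a rough illustration of finding groups without any labeled variable, here is a minimal k-means sketch in plain Python. K-means is used here only as one common clustering method; the customer data, the starting centroids, and the `kmeans` helper are all assumptions for the example, not the Microsoft Clustering implementation.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-center."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            # Squared Euclidean distance from this point to every centroid.
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers: (age, yearly income in thousands).
points = [(25, 30), (27, 32), (26, 28), (55, 90), (58, 95), (60, 88)]

# Starting centroids are fixed so the run is repeatable.
centroids, clusters = kmeans(points, centroids=[(25, 30), (60, 88)])
print(centroids)
print(clusters)
```

No column says which group a customer belongs to; the two groups emerge only from how close the points are to each other.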


OLAP Cubes in SQL Server

Microsoft Linear Regression is a forecasting technique. In this type of technique, the dependent variable is predicted from multiple independent variables. For example, if you want to predict house prices, you need to know the number of rooms, the area of the house, and other features of the house. As in the previous examples, we will be using the vTargetMail view in the AdventureWorksDW sample database. As we did for the other data mining techniques, we first need to create a data source and a data source view. We choose Microsoft Linear Regression as the data mining technique, as shown in the below screenshot. Internally, this technique uses the Microsoft Decision Trees algorithm.
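The house price example can be sketched with an ordinary least-squares fit. For brevity, this sketch uses a single independent variable (area) instead of several, and the numbers and the `fit_line` helper are made up for illustration; it is not the algorithm SQL Server runs.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: house area in square metres, price in thousands.
areas = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(areas, prices)

# Predict the price of a house that is not in the training data.
predicted = slope * 90 + intercept
print(slope, intercept, predicted)
```

With more independent variables, such as the number of rooms, the same idea applies, but the fit has to solve for one coefficient per variable.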


Association Rule Mining in SQL Server

