Monday, June 20, 2022

Introduction to Data Analysis using Microsoft Excel

 


Welcome to introduction to data analysis. Using Microsoft Excel in this project we will learn how to perform basic data analysis in Microsoft Excel on sales data from a sample company. With this project, we will learn about the use of sorting and filtering tools as well as the use of functions and pivot tables to obtain specific information from the data provided.

Excel is an invaluable tool for organizing things like annual budgets, developing work projects with teams of several people and often times I work with data sets that I want to perform an analysis on to obtain very specific information in a quick and easy manner and there's a lot you can do in the realm of data analysis but there are a number of very useful and applicable tools for all types of data analysis that we will learn about in this project.

For the data analysis will be performing. We will work with a prepared spreadsheet that contains sales data from a sample company that sells office chairs to other companies and organizations.

As part of this project, you will find the file titled sales data analysis which we will open directly in Excel.

PART 1. UPLOADING A DOCUMENT USING THE FREE MS OFFICE 365

Let's begin for this project. If you have MS Excel in your desktop or laptop then you ready to begin. For those who do not have, use the free online version of Excel from Microsoft Office 365 by using your free Microsoft office online page. If you already have a Microsoft account you can use your account to log in.

If you don't have an account, simply click on, sign up for free.




It will first ask you to write your email address and choose a password for your new account.


And this email address could be your normal email.

It doesn't have to be a Microsoft specific email.

A code will be sent to your email to verify the email address, create your account and log in. After logging in to your Microsoft office account, you'll be directed to your home page and you will now have access to recent documents that you might have been working on.


You can also upload documents from your device and then here on the menu bar on the left side of the screen you have links to the different office programs such as Microsoft word, Excel, Power Point. And you can also go to your One Drive which is your cloud where you can store the different documents that you're working on. Since we're going to be using a prepared spreadsheet with the sales data that will be analyzing, will first have to upload the file to our cloud. We could do this directly with the sleek where it says upload an open or you could go to your one drive and you'll have access to your cloud with all the documents.


So we can upload a new document. We could click. Got it and then upload a new document here where it says upload file and then we will be able to access the file by double clicking in project files. And you should see this document called sales data analysis and click open.

And now that you've opened the file you could see the data.


We will be working with the data tracks the sale of office chairs during the first half of 2020 by the different sales representatives.

So for each sale or order we have the date and the month the sales representative with their region. We also have a customer I. D. And then for the chairs sold in that order we have the model in the color as well as an item code. The number of units that were sold the price per unit and the total income for that sale.

 While the data doesn't look too messy. We're going to start by specifying that the data forms part of the table and this will make it a lot easier to sort data and use filters later on as well as make the information a little easier to read to highlight the entire data from the table.

We could do it manually highlighting from Cell A4 all the way down to cell L 84 or there's another little trick which is we can click on any cell of the table. I'll just click on cell A4 and then type Control A and then Excel will automatically highlight all the cells that are part of the table.


And then to convert this data into a table will click on Insert Table and then it will confirm that the data for your table is the range from A4 to L84 and then you want to specify that your table has headers so that it knows that the first row of your table is for the headers and click OK great.

Now that the data has been converted into a table, you'll see that the format of the cells has changed a little bit, every row of the table is bandaged which makes it a little easier to read.

And then a great thing is that when we scroll down, you'll notice that the titles of the columns have changed to the headers of our table which is great.

So that we don't have to guess what the column means when we scroll down.

In this task, we learn how to set up the Microsoft office 365 account access, use One Drive and upload a file to the drive as well as open the file. You also learn how to convert the data into a table.

 

PART 2. PERFORM DATA ANALYSIS USING SORTING AND FILTERING

 

In this task you will start analyzing the data using sorting and filtering tools. Let's begin now that you've converted your data into a table.

The first thing you might want to do is rearrange the data based on some of the categories of the table.

The data is currently arranged based on the date the order was submitted.

However, you could maybe want to have the data range by sales representative for example so that you can look up information about the sales that each salesperson did having converted the data into a table. Makes the step really easy to the right of each header. You'll see a drop-down arrow that will allow us to sort the information and filter the data based on these headers.

So for example, to sort the data by sales representative we would click on the dropdown arrow to the right of the sales representative header and then we would select sort A to Z.



To sort it in alphabetical order.

So after we click it you'll see that the data has been rearranged by alphabetical order of the sales

representative starting with Amy Brown followed by David Garcia and so on.

And here you can also see that the additional information about each sale was also rearranged along with it and when having a data set like this one, it's really useful to have a first column that simply lists the original order in which the information was listed.

So that we can always go back and re arrange the data as it was originally presented.

So to do this we would sort the data in ascending order of the first column that is named number or numb.



And so we would click the dropdown arrow for this column and then we would select sort from smallest to largest. You can see that exhale recognizes that this is a numerical column. So in this case we're sorting from the smallest value to the largest value.

Another useful tool similar to sorting is filtering and this allows us to only focus on specific data while filtering out data that we don't need to see at a given moment.

Let's say for example your sales manager for the north region of the company and wish to only look at sales that are made by sales representatives that are part of the North division.

In this case it would be useful to apply a filter on the region column by having turned the data into a table.


The filters for each header have already been created and we can automatically click on the dropdown arrow and then select filter and here we can see that we can select which division or which region we want to show. So we could click first on select all to remove all of them. And then click on North and then apply that filter. And here you can see that we're only showing these sales that were done by the representatives that form part of the North division And having the filter played. We can actually obtain some quick information.

For example, if we wanted to calculate the total revenue from the sales made by the North Division, we can highlight the cells from the total column. So from L5 all the way down to the last one. And here at the bottom right corner we could get some information about this data.


We have the average we have the count which tells us how many sales were made. And then we have the total sum of P179065, Which is the information that we wanted.

Now if we wanted to access the entire data set again we would need to remove the filter we currently have on the region category.



You will see when the filter is being applied to a category because the icon changes from a drop down arrow to a funnel, which is the icon for filtering. So to remove the filter we would click on that icon again, go to filter and now we would want to select all to show the sales from the north south and west regions and click.

Ok, let's practice calculating what the total revenue was for the sales made by David Garcia and also calculating how many total chairs he sold.



You'll want to use the filter tool and then reference the quick information provided at the bottom right corner of the screen.

You will have first filtered the data to only show the sales made by David Garcia doing this in the sales representative column.


Now to calculate the total revenue, you can highlight the cells from the total column and we could get the total down here at the sum which is $71,040, than to calculate the total number of chairs that he sold.

You would highlight the number column, You highlight those cells and then you would look at the sun which is 321 chairs in this case.


Great job.

PART 3. PERFORM DATA MINING USING THE IF FUNCTION

The idea behind data mining is that you take the data you already have and you create new or additional data from it. There are different functions and analysis that you can perform as data mining. But one common function that is used is the IF function.

 When using the if function, what we're doing is asking a question about the data and then we get to responses depending on whether the answer would be true or false for the question.

Before we get into an example let's remove first the filter, we had applied from the previous task so that we show all the sales.

So, we'll go to the sales representative column and then click on filter and select all to show all the sales again.


So let's see how we would use the if function with an example which will be easier to understand.

Let's say that when an order has 20 chairs or more, the client receives a 5% wholesale discount.

So we first want to figure out which sales had 22 years or more to see if we need to apply the discount to the sale or not.

So in our table we're going to add a new column which we will call it discount and we could do this to the right of the total column.


So in column M we will write as a header discount and after pressing enter Excel automatically recognizes that we're adding an additional column to our table.

We can make this column a little wider.



And so in this column we're going to have an N for NO if the sale doesn't have the wholesale discount because there were fewer than 20 chairs or a Y for YES for sales that had more than 20 chairs and we'll have the wholesale discount and we're going to do this with the IF function.

As with every function we start with an equal sign and then we'll type IF and here you'll see the list of different if functions that Excel has.



We're just going to use the regular if function and here a little trick is that we could click on or type tab

and Excel will open up the parentheses for our function and will list what we need to include inside The parentheses.

So we have the logical test which is the question that we're answering which is if the sale had more

than 20 chairs or 20 chairs or more then what value we want to have as a response if this was true, which in our case is a Y for yes and the value if the condition is false, which in this case is an N for no.

So let's let's write our function. What we're testing is if cell J5 Is equal more than or equal to 20.

So we'll click on cell J5 then well we're asking if it's greater than or equal to 20 then we want a Y.

As a response for the true condition. And since we're asking for a text response we need to use

quotation. So Y Within quotations and then comma. And if this is false and then we would close the parentheses and we'll press enter and we'll see what we get.



So you could see that the that after writing the equation it's auto filled the remaining cells of the table.



And let's check if this if the equation worked. In the first case we have an end because the number of chairs sold in that first sale was 15 which is less than 20.

But then in the second one we have a Y because there were 22 Chairs sold which is greater than or equal to 20.

So it did work great well to make it easier to see we can maybe align the text in the middle of the cell so let's highlight the whole M Column and then we can align our text in the center so that we could see the end and the Y.

And now with this we know which sells would have the wholesale discount in which sales don't and we could even filter the data to show only the sales with the discount if we wanted to.

Now let's continue with this example to practice this a little more instead of listing the sales that have

the discount and which ones don't.

Let's say that we actually want to calculate the final price for the sale, applying the discount only to the sales that had 20 or more chairs. We could also do this with the IF function we're going to create a new column on column end so I'm going to drag this little to the right and we're going to name this new column, Final Price.




Use the IF function we're going to start like we did last time. So equals if Press tab to open the parentheses and in our case the logical function is the same as we previously had,




where we're asking if J5 Is greater than or equal to 20 But now the response is a little more complex because if this is true, we want to apply the discount to the total price that we calculated and for sales with 20 or more chairs, the final price is going to be 95% of this initial total that we calculated which would be applying that 5% discount.

So 100 -5 gives us 95. So to do that, ,make 95% as a decimal 0.95 and we're going to multiply that so we'll use an asterisk and then multiply it by that total that we had previously calculated.

So our response if True it would be 0.95 times L5. But if it's false we don't want to apply the discount so Well right comma and then for false we would just have sell L5 because we don't want to have a discount apply to it and then we'll close the parentheses and press enter and it should auto populate the remaining cells of the column.



And we could now check this.



So for all of the sales that didn't have a discount, the final price should be the same as the total.

So we could see that for our first sale but for the second sale which did have The discount, the final price, you could see that it's a little lower than our initial total, meaning that that 5% discount has been applied.

Now we could format this column by highlighting the total column and stating that we have Currency in this one.

And now we can see that it's changed to Philippine Peso.



And since they're pretty big numbers we can remove the zeros after the decimal by clicking on the decrease decimal icon two times.

And now we have the correct format for our new column with the final price.

Great job.

 PART 4. PERFORM REFERENCES BETWEEN TABLES AND VLOOK UP

 

In this task you will learn how to relate tables located in different worksheets and searching for information using the VLOOK up function.

You will see on the spreadsheet that there is a second worksheet named CUSTOMER INFO which relates the customer I. D. Number that we have in the first table with the company name for the customer and the client representative.




Let's say that in the first table you would like to add an additional column that includes the company name. In addition to the client ID, using the VLOOK UP function, we can actually look at this information which would relate the Customer I. D. listed for each sale to the company name.

So let's see how we would do this.

We're going to first add a new column to insert that company name and we'll position it to the left of the model column so that it's right after the customer I. D.

So we're going to right click on column G and select insert columns and we'll name this new column Company Name.


Now let's learn about the VLOOK UP function and how we can use it. Our goal is to insert the company name using the client ID that is stated for each sale.

So for example since the client ID for the first sale is 132, if we go to the second worksheet that should list the company name of.

Thank you and we're going to do that with the VLOOK UP function. Doing it manually for each cell would take forever and that's why it's useful to have a function that can look up information from another worksheet or even from the same worksheet and pull that information to this table.

So we'll start by writing the function VLOOK UP at cell G5 and as we do with all functions we start with the equal sign VLOOK UP and then as we did with the IF function we could press tab to open up the parentheses.


And now let's see what we want to include in our function. So the first thing that we need to include is the look up value which in this case is the Customer ID Number from cell F5.

So you'll click at five and that's why what we're trying to look up then we have to select the table array.

So it's the cells where we're going to check if we can find this customer I. D. Number and that is when we're connecting it to the second worksheet.



So now we're connecting information from two different tables located in two different worksheets.

But you just, all you have to do is go to the second worksheet and we're going to highlight the cells from A4 to C12.


So we're highlighting these cells with the content of the 2nd table and up here we can keep track of what we're including in our function. You can see that when we highlighted these cells, it references that these cells belong to the customer info worksheet tab.



So we don't have to do anything special when referencing to different worksheets and automatically Detects that were in the customer info worksheet. Now for this range, we are going to want to anchor these cells so that the range doesn't change when we apply this formula to the other cells for the column.

We could do this by adding a dollar sign in front of the A and the 4 and then in front of the C and the 12.

But there's also a shortcut to do this which is using F4 on your keyboard. So by typing F4 which I'll do right now you can see that it has added the dollar sign in front of the letters and the numbers for our reference cells.



So, after specifying the table array, you then need to specify the column index member from the reference table of the value you want to return.

Our reference table is the one we have just highlighted and it has three columns where column one is the customer ID, column two is the company name and Column three is the name of the client representative.

So in our case we would like to because we want the function to return the company name. So in our formula after the table already will have comma 2. Lastly you're going to have a true or false statement depending on whether you want an approximate or an exact match. The vast majority of the times you use V LOOK UP your state falls to look for an exact match.

So here after the two you'll have a column false and we'll close our parentheses.




And now after closing our parentheses we could check our formula again. So we have F5 then the cell reference with the dollar signs. Then the column index number of two for the company name , comma false for looking for the fact that we're looking for an exact value. Now we could press enter and you'll see that it took us back to our first worksheet and it has inserted the company names for or that match with the different customer IDs for our table.



Great. Well now let's practice this again. But this time adding a new column with the names of the client representatives. The process will be the same but the column that the VLOOK UP function returns will be different.

Use the V LOOKUP function to look up the names of the client representatives. You will need to add a new column position to the right of the company named column you just created. Well in this case you would have F5 for the customer I.D. from the table and the sales data worksheet. You have the cell reference the same as before.

Now the only thing that changes is that the column index number is number three.

Because it's the third column that we want to get as a response the client representative name and then

the condition of false.

Now we could press enter and it will take us back to our table.



 So for this example you should have added a new column named it company name and you would have written the V LOOK UP function in cell H5 and you could see that it has auto populated this new column with the client representative names.



Great job.

 PART 5. PERFORM DATA ANALYSIS USING PIVOT TABLE

 

In this task you will learn about another strategy for data analysis which is using pivot tables. Pivot tables allow you to summarize and analyze the data by seeing comparisons in your data.

To create the pivot table, first highlight the entire sales table by doing it manually or by clicking on any cell on the table and then typing Control A and it'll automatically highlight all the data from the table.


Then to add the pivot table, you will go to Insert in the ribbon and then select Pivot table. It will ask you to define the table or range which you have already done by highlighting the cells. Then it will ask you to choose where you want the pivot table Report to be placed.

I always recommend selecting new worksheet so its placement doesn't affect the other data that already exists in the current worksheet.

Now we'll click Ok. and you will see that a new worksheet appears to the left of the sales data worksheet.

Now that we have the new worksheet, let's create the first pivot table to the left. You'll see the empty pivot table and to the right of the screen you'll see a section called pivot table fields where you will define what information you would like to plot on your pivot table to create a pivot table.



You will define the categories as either filters, columns, rows or values in the values category.

You will only have numbers. So in our case this could be the units column. For the number of chairs sold or the final price column.

For example if a category is specified as a column, it will appear in the pivot table as the columns of the table and the same applies to rose than for filters.

This is less commonly used but you can also have a category appear as a filter with a drop-down menu to select from. It's probably easier to understand with an example.

Imagine your boss asks you to calculate how much each sales representative made in sales throughout the first six months of the year in the value section.

You would want to include the final price category since it shows the total profit from each sale after the wholesale discount.

And this one makes sense because that final price category has numbers in them.

Right? So we would drag that two values to include it in our pivot table.



Then you will also want to include the cells representative category.

But where should this one go? Well, the good news is that it doesn't really matter. You can either put it as rows or columns. The information displayed will be exactly the same.

But the arrangement of the pivot table will be a little different and we could test this. So if we take the sales representative category we drag it to columns, we could see that for each sales representative we get the sum of the total income from their sales.



But we can move that the sales representative two rows instead of columns and we get the same information.



Just the arrangement is now a little different pivot tables are a great way to obtain specific information very quickly.

And you can also do multiple comparisons at once to add on to this pivot table.

Let's say that we now want to calculate how much each sales representative made in sales every month.

We would still keep the final price category in the values and then you could put the sales representative in columns for example and then the month category in rows.

And now let's see what the pivot table looks like.



Well we have the different sales representatives as columns and then we have the different months from January to June as rows and here we get the total income from sales for each sales representative for each month. As a little challenge, Try using the pivot table to calculate how many chairs of each model were sold each month.

If you ever have a pivot table already made and you want to start one from scratch, you can do so by de selecting the categories that you previously had to start with a blank pivot table.

So in this case we wanted to analyze or see how many model or how many chairs of each model were sold in each month. So in this case the values that we want to plot are not the final price but the number of chairs sold. So that will go in values.

Then you could add the model as columns and the month as rows.

Or you could have these flipped and you would get the same information.



It's just the table would look a little different.

So now our pivot table is showing for each model of chair how many what was the number of chairs sold?

So now we're not looking at price or income but the number of chairs sold in each month of that year.

Congratulations.

You have successfully completed the Introduction to data analysis using Microsoft Excel.