Exploratory Data Analysis in SQL
Main Takeaways at the Bottom
Getting an Idea of what is included in the data frame.
0
0
536365
1
1
536365
2
2
536365
3
3
536365
4
4
536365
5
5
536365
6
6
536365
7
7
536366
8
8
536366
9
9
536367
0
536365
85123A
1
536365
71053
2
536365
84406B
3
536365
84029G
4
536365
84029E
5
536365
22752
6
536365
21730
7
536366
22633
8
536366
22632
9
536367
84879
Customer Analysis
What are the countries to which our customers reside?
0
United Kingdom
1
France
2
Australia
3
Netherlands
4
Germany
5
Norway
6
EIRE
7
Switzerland
8
Spain
9
Poland
How many transactions in each country?
0
United Kingdom
23494
1
Germany
603
2
France
461
3
Spain
105
4
Sweden
46
5
EIRE
360
6
Netherlands
101
7
Portugal
71
8
Switzerland
74
9
Channel Islands
33
How many customers are there per country?
0
United Kingdom
3951
1
France
88
2
Australia
9
3
Netherlands
9
4
Germany
95
5
Norway
10
6
EIRE
4
7
Switzerland
22
8
Spain
31
9
Poland
6
How many customers do we have in total?
0
4373
Top 10 Customers with the most orders
0
17850.0
1693
1
13047.0
1355
2
12583.0
5009
3
13748.0
439
4
15100.0
58
5
15291.0
2074
6
14688.0
3222
7
17809.0
2015
8
15311.0
37720
9
14527.0
2049
What is the top 10 products for customer 17850?
0
85123A
WHITE HANGING HEART T-LIGHT HOLDER
1
71053
WHITE METAL LANTERN
2
84029G
KNITTED UNION FLAG HOT WATER BOTTLE
3
84406B
CREAM CUPID HEARTS COAT HANGER
4
22633
HAND WARMER UNION JACK
5
22632
HAND WARMER RED POLKA DOT
6
84029E
RED WOOLLY HOTTIE WHITE HEART.
7
21730
GLASS STAR FROSTED T-LIGHT HOLDER
8
37370
RETRO COFFEE MUGS ASSORTED
9
21068
VINTAGE BILLBOARD LOVE/HATE MUG
Gross Sales to Each Customer (Top 10)
0
14646.0
279489.02
1
18102.0
256438.49
2
17450.0
187482.17
3
14911.0
132572.62
4
12415.0
123725.45
5
14156.0
113384.14
6
17511.0
88125.38
7
16684.0
65892.08
8
13694.0
62653.1
9
15311.0
59419.34
Product Analysis
Average Order Price? Average Unit Price? Average Order Quantity?
Gross Sales. Gross Sales by SKU
0
9747747.93
0
22423
REGENCY CAKESTAND 3 TIER
1
47566
PARTY BUNTING
2
85123A
WHITE HANGING HEART T-LIGHT HOLDER
3
85099B
JUMBO BAG RED RETROSPOT
4
23084
RABBIT NIGHT LIGHT
5
POST
POSTAGE
6
22086
PAPER CHAIN KIT 50'S CHRISTMAS
7
84879
ASSORTED COLOUR BIRD ORNAMENT
8
79321
CHILLI LIGHTS
9
23298
SPOTTY BUNTING
What are the top 10 selling products? In UK?
0
22423
REGENCY CAKESTAND 3 TIER
1
47566
PARTY BUNTING
2
85123A
WHITE HANGING HEART T-LIGHT HOLDER
3
85099B
JUMBO BAG RED RETROSPOT
4
22086
PAPER CHAIN KIT 50'S CHRISTMAS
5
84879
ASSORTED COLOUR BIRD ORNAMENT
6
79321
CHILLI LIGHTS
7
22502
PICNIC BASKET WICKER 60 PIECES
8
21137
BLACK RECORD COVER FRAME
9
23084
RABBIT NIGHT LIGHT
Average Price Per SKU?
0
85123A
3.1
1
71053
4.78
2
84406B
4.25
3
84029G
5.18
4
84029E
5.07
5
22752
9.77
6
21730
5.32
7
22633
2.19
8
22632
2.2
9
84879
1.72
Average Order Price
0
17.99
Average Unit Price
0
4.61
Average Order Quantity
0
9.55
0
AMAZONFEE
AMAZON FEE
1
22502
PICNIC BASKET WICKER 60 PIECES
2
CRUK
CRUK Commission
3
M
Manual
4
DOT
DOTCOM POSTAGE
5
BANK CHARGES
Bank Charges
6
22828
REGENCY MIRROR WITH SHUTTERS
7
22827
RUSTIC SEVENTEEN DRAWER SIDEBOARD
8
22655
VINTAGE RED KITCHEN CABINET
9
22656
VINTAGE BLUE KITCHEN CABINET
The block above includes some records that I do not want to factor in, therefore I will exclude them in the next block to get a true sense of the highest average priced items.
0
22828
REGENCY MIRROR WITH SHUTTERS
1
22827
RUSTIC SEVENTEEN DRAWER SIDEBOARD
2
22655
VINTAGE RED KITCHEN CABINET
3
22656
VINTAGE BLUE KITCHEN CABINET
4
22823
CHEST NATURAL WOOD 20 DRAWERS
5
22826
LOVE SEAT ANTIQUE WHITE METAL
6
22016
Dotcomgiftshop Gift Voucher 100.00
7
D
Discount
8
21769
VINTAGE POST OFFICE CABINET
9
22929
SCHOOL DESK AND CHAIR
Time Series Analysis
Firstly we need to clean the data to be in the correct format and we need to extract information from the InvoiceDate column.
0
12/1/2010 8:26
Used python to add the day, month, year columns to our data frame. Will use SQL below to analyze.
0
0
536365
1
1
536365
2
2
536365
3
3
536365
4
4
536365
0
0
536365
1
1
536365
2
2
536365
3
3
536365
4
4
536365
5
5
536365
6
6
536365
7
7
536366
8
8
536366
9
9
536367
0
2010
1
2011
Time EDA
Sales per Year
0
2010
2025
1
2011
23875
0
12
Gross Revenue per Year
0
2011
8998790.91
Gross Revenue per Month over the two years.
0
11
1461756.25
1
10
1070704.67
2
9
1019687.62
3
5
723333.51
4
6
691123.12
5
3
683267.08
6
8
682680.51
7
7
681300.11
8
1
560000.26
9
2
498062.65
Main Takeaways
- We sell to 38 countries - UK has the highest count of customers at 3,951 out of 4,373 total customers. - While Customer 17850 has ordered the most product by quantity, customer 14646 has contributed to the most amount of revenue at $279,489.02 out of a total of $9,747,747.93 in the 13 months observed. - Our average order price is $17.99, the average unit price is $4.61, and the average quantity ordered is 9.55 units. - For 2011, the best month by revenue was November at $1,461,756.25.