0  ID_0040R73  2010-05-14
1  ID_0046BNK  2010-11-29
2  ID_005QMC3  2010-03-21
3  ID_0079OHW  2010-08-21
4  ID_00BRP63  2010-08-29

0  ID_009D84L  2010-04-24
1  ID_01DO2EQ  2010-01-01
2  ID_01QM0NU  2010-10-23
3  ID_024NJLZ  2010-10-14
4  ID_02BYET3  2010-09-16

0  ID_009D84L  0
1  ID_01DO2EQ  0
2  ID_01QM0NU  0
3  ID_024NJLZ  0
4  ID_02BYET3  0
The shape of the train set is: (12079, 14)
The shape of the test set is: (5177, 13)
The shape of the combined dataframe is: (17256, 14)
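The three shapes above are consistent with stacking the train and test rows into a single frame: 12079 + 5177 = 17256, and the combined frame keeps 14 columns because the test rows simply have no `target` value. A minimal sketch of that step, assuming `pd.concat` was used; the toy frames and values below are hypothetical and not taken from the notebook:

```python
import pandas as pd

# Toy stand-ins for the real train/test frames (hypothetical values).
train = pd.DataFrame({"ID": ["a", "b", "c"], "Age": [30, 41, 25], "target": [0, 1, 0]})
test = pd.DataFrame({"ID": ["d", "e"], "Age": [52, 37]})  # no target column

# Stack the rows; a column missing from one frame is filled with NaN.
combined = pd.concat([train, test], ignore_index=True)

print(combined.shape)                    # (5, 3)
print(combined["target"].notna().sum())  # 3
print(combined["target"].dtype)          # float64, since NaN forces a float column
```

This would also explain why `target` shows up as `float64` with only 12079 non-null entries in the summary of the combined frame.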
17251  ID_ZYXX5AF  2010-07-18
17252  ID_ZYYOZ5L  2010-12-04
17253  ID_ZZ1GTKD  2010-09-24
17254  ID_ZZDXQSI  2010-07-17
17255  ID_ZZYTLV1  2010-07-17
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17256 entries, 0 to 17255
Data columns (total 14 columns):
 #   Column                  Non-Null Count  Dtype
---  ------                  --------------  -----
 0   ID                      17256 non-null  object
 1   Policy Start Date       17256 non-null  object
 2   Policy End Date         17256 non-null  object
 3   Gender                  16741 non-null  object
 4   Age                     17256 non-null  int64
 5   First Transaction Date  17256 non-null  object
 6   No_Pol                  17256 non-null  int64
 7   Car_Category            11880 non-null  object
 8   Subject_Car_Colour      7289 non-null   object
 9   Subject_Car_Make        13719 non-null  object
 10  LGA_Name                7998 non-null   object
 11  State                   7980 non-null   object
 12  ProductName             17256 non-null  object
 13  target                  12079 non-null  float64
dtypes: float64(1), int64(2), object(11)
memory usage: 1.8+ MB
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17256 entries, 0 to 17255
Data columns (total 14 columns):
 #   Column                  Non-Null Count  Dtype
---  ------                  --------------  -----
 0   ID                      17256 non-null  object
 1   Policy Start Date       17256 non-null  datetime64[ns]
 2   Policy End Date         17256 non-null  datetime64[ns]
 3   Gender                  16741 non-null  category
 4   Age                     17256 non-null  int64
 5   First Transaction Date  17256 non-null  datetime64[ns]
 6   No_Pol                  17256 non-null  int64
 7   Car_Category            11880 non-null  category
 8   Subject_Car_Colour      7289 non-null   category
 9   Subject_Car_Make        13719 non-null  category
 10  LGA_Name                7998 non-null   category
 11  State                   7980 non-null   category
 12  ProductName             17256 non-null  category
 13  target                  12079 non-null  float64
dtypes: category(7), datetime64[ns](3), float64(1), int64(2), object(1)
memory usage: 1.1+ MB
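The second summary differs from the first only in dtypes: the three date columns are now `datetime64[ns]`, seven former `object` columns are `category`, and the reported memory drops from 1.8+ MB to 1.1+ MB. A minimal sketch of one way to perform such a conversion, assuming `pd.to_datetime` and `astype("category")`; the toy frame is hypothetical and the notebook's actual code is not shown in the output:

```python
import pandas as pd

# Hypothetical three-row frame reusing two of the column names from the summary.
df = pd.DataFrame({
    "ID": ["ID_1", "ID_2", "ID_3"],
    "Policy Start Date": ["2010-05-14", "2010-11-29", "2010-03-21"],
    "Gender": ["Male", None, "Female"],
})

# Parse date strings into a datetime column.
df["Policy Start Date"] = pd.to_datetime(df["Policy Start Date"])

# Store repeated labels as a categorical; missing values stay missing.
df["Gender"] = df["Gender"].astype("category")

print(df.dtypes)
print(df["Gender"].cat.categories.tolist())
```

Missing values are not turned into a category of their own, which matches the reprs in this log where `NaN` appears among the unique values but not in the `Categories` list.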
Gender
['Male', 'Female', 'Other', NaN]
Categories (3, object): ['Male', 'Female', 'Other']
Car_Category
['Saloon', 'JEEP', NaN, 'Motorcycle', 'Truck', ..., 'Wagon', 'Shape Of Vehicle Chasis', 'Sedan', 'Station 4 Wheel', 'Tipper Truck']
Length: 17
Categories (16, object): ['Saloon', 'JEEP', 'Motorcycle', 'Truck', ..., 'Shape Of Vehicle Chasis', 'Sedan', 'Station 4 Wheel', 'Tipper Truck']
Subject_Car_Colour
['Black', 'Grey', 'Red', NaN, 'As Attached', ..., 'Yellow & White', 'Beige Mitalic', 'Light Gray', 'Blue Sky', 'Red Maroon']
Length: 47
Categories (46, object): ['Black', 'Grey', 'Red', 'As Attached', ..., 'Beige Mitalic', 'Light Gray', 'Blue Sky', 'Red Maroon']
Subject_Car_Make
['TOYOTA', NaN, 'REXTON', 'Lexus', 'Hyundai', ..., 'BRILLIANCE', 'Buik', 'COMMANDER', 'Bajaj', 'Datsun']
Length: 76
Categories (75, object): ['TOYOTA', 'REXTON', 'Lexus', 'Hyundai', ..., 'Buik', 'COMMANDER', 'Bajaj', 'Datsun']
LGA_Name
[NaN, 'Lagos', 'Ikeja', 'Badagry', 'Eti-Osa', ..., 'Hong', 'Ifako-Agege', 'Benue', 'Okpokwu', 'Ngor-Okpala']
Length: 271
Categories (270, object): ['Lagos', 'Ikeja', 'Badagry', 'Eti-Osa', ..., 'Ifako-Agege', 'Benue', 'Okpokwu', 'Ngor-Okpala']
State
[NaN, 'Lagos', 'Benue', 'Eti-Osa', 'Delta', ..., 'ENUGU-SOUTH', 'Ijebu-North', 'Asari-Toru', 'Idemili-south', 'Ngor-Okpala']
Length: 114
Categories (113, object): ['Lagos', 'Benue', 'Eti-Osa', 'Delta', ..., 'Ijebu-North', 'Asari-Toru', 'Idemili-south', 'Ngor-Okpala']
ProductName
['Car Classic', 'CarSafe', 'Muuve', 'CVTP', 'Car Plus', 'Motor Cycle', 'Customized Motor', 'CarFlex', 'Car Vintage']
Categories (9, object): ['Car Classic', 'CarSafe', 'Muuve', 'CVTP', ..., 'Motor Cycle', 'Customized Motor', 'CarFlex', 'Car Vintage']
0  ID_0040R73  2010-05-14 00:00:00
1  ID_0046BNK  2010-11-29 00:00:00
2  ID_005QMC3  2010-03-21 00:00:00
3  ID_0079OHW  2010-08-21 00:00:00
4  ID_00BRP63  2010-08-29 00:00:00

0  ID_0040R73  2010-05-14 00:00:00
1  ID_0046BNK  2010-11-29 00:00:00
2  ID_005QMC3  2010-03-21 00:00:00
3  ID_0079OHW  2010-08-21 00:00:00
4  ID_00BRP63  2010-08-29 00:00:00
(12079, 543) (12079,)
(20716, 543) (20716,)
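The row count grows from 12079 to 20716 while the 543 feature columns stay unchanged. Since 20716 = 2 × 10358 and 10358 + 1721 = 12079, this is consistent with the minority class being oversampled up to the size of the majority class, although the output alone does not say which resampler was used. A minimal sketch of plain random oversampling on toy labels, using only the standard library; `oversample_to_parity` is a hypothetical helper and not the notebook's method:

```python
import random
from collections import Counter

def oversample_to_parity(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target_size = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement to make up the shortfall (zero for the largest class).
        extra = [rng.choice(pool) for _ in range(target_size - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels

# Toy data: 8 rows of class 0 and 2 rows of class 1 (hypothetical).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2
X_res, y_res = oversample_to_parity(rows, labels)
print(len(X_res), Counter(y_res))
```

Only the row count changes; the number of features per row is untouched, as in the two shapes printed above.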
F1 score on the X_test is: 0.9462229932187353
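For reference, the F1 score is the harmonic mean of precision and recall, which for a binary problem reduces to 2·TP / (2·TP + FP + FN). A small standard-library sketch on made-up labels; the data and the helper name `f1_score_binary` are hypothetical, and the log does not show how the score above was actually computed:

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Binary F1 computed as 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # With no positives in either list the score is undefined; return 0.0 by convention.
    return 2 * tp / denom if denom else 0.0

# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
print(f1_score_binary(y_true, y_pred))  # 0.75
```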
0  ID_009D84L  1
1  ID_01DO2EQ  1
2  ID_01QM0NU  0
3  ID_024NJLZ  0
4  ID_02BYET3  1