Discussion
If we compare multinomial Naive Bayes (MnB) and Bernoulli Naive Bayes (BnB) when run on hard ham, we see some differences. BnB classified all hard-ham mails correctly, but it misclassified many spam mails as hard ham. MnB had better accuracy overall: although it misclassified some of the hard-ham mails, it did a better job of classifying the spam ones.
If we look at spam versus easy ham in task 2, MnB outperformed BnB by an even larger margin, and it achieved the highest accuracy of all the models.
To summarize, MnB works better for classifying the emails in this case.
The reason behind this may be that the two models operate in different ways. One difference between the two is that MnB takes the frequency of each word into account, while BnB only checks whether the word appears at all. This matters because word frequency usually does differ between spam emails and regular emails. For example, while the word 'rich' is likely to appear in both ham and spam emails, it will likely appear more times in a spam email to try and lure the reader. This may also be why MnB performed worse on hard ham: hard-ham mails may be characterized by a higher frequency of these 'suspicious' words and are therefore easily misclassified as spam by MnB. To summarize, since frequency is important in this task, MnB is the better model to use.
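The count-versus-presence difference can be made concrete with a small sketch. The toy corpus below is hypothetical (it is not the assignment's data set), and the two scoring functions are simplified class-conditional log-likelihoods with Laplace smoothing rather than a full classifier: the multinomial score adds a term for every occurrence of a word, while the Bernoulli score only looks at which vocabulary words are present or absent. An email that repeats 'rich' three times can therefore lean toward spam under the multinomial model while the Bernoulli model, which sees 'rich' only once, does not.

```python
from collections import Counter
import math

# Hypothetical toy corpus, tokenized: spam repeats 'rich', ham mentions it once.
spam_docs = [["rich", "rich", "rich", "win", "money"],
             ["win", "money", "rich", "rich"]]
ham_docs = [["meeting", "rich", "report"],
            ["report", "meeting", "schedule"]]

vocab = sorted({w for d in spam_docs + ham_docs for w in d})

def multinomial_loglik(doc, docs):
    # Multinomial model: P(w|c) is estimated from total counts of w in the
    # class, so every repeated occurrence in `doc` contributes another term.
    counts = Counter(w for d in docs for w in d)
    total = sum(counts.values())
    return sum(math.log((counts[w] + 1) / (total + len(vocab))) for w in doc)

def bernoulli_loglik(doc, docs):
    # Bernoulli model: P(w present|c) is the fraction of class documents
    # containing w; the score sums over the vocabulary, present or absent,
    # and ignores how many times a word occurs in `doc`.
    present = set(doc)
    doc_freq = Counter(w for d in docs for w in set(d))
    n = len(docs)
    loglik = 0.0
    for w in vocab:
        p = (doc_freq[w] + 1) / (n + 2)
        loglik += math.log(p) if w in present else math.log(1 - p)
    return loglik

# A test email that repeats 'rich': the multinomial margin credits each
# occurrence, the Bernoulli margin only registers that 'rich' is present.
email = ["rich", "rich", "rich", "meeting"]
mnb_margin = multinomial_loglik(email, spam_docs) - multinomial_loglik(email, ham_docs)
bnb_margin = bernoulli_loglik(email, spam_docs) - bernoulli_loglik(email, ham_docs)
print("MnB spam-minus-ham margin:", mnb_margin)
print("BnB spam-minus-ham margin:", bnb_margin)
```

On this made-up corpus the multinomial margin comes out positive (spam) while the Bernoulli margin is negative (ham), which mirrors the point above: repetition of a suspicious word is evidence only the multinomial model can use.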