Text Classification in NLP using Cross Validation and BERT

Nandan Grover
14 min read · Feb 15, 2023

Introduction

Text categorization is a common task in natural language processing (NLP). Different classifiers may perform better or worse depending on the data they are given (e.g., Uysal and Gunal, 2014). However, for some datasets a correlation between the (vectorised) texts and their classes would be expected, yet this assumption does not hold and the classifiers perform poorly. The main reason is that there are many classes and many different texts. We’ll look at a variety of preprocessing strategies as…
