Configure the best Source Data Coverage percentage

The visualized results show how the 90% training samples are distributed: which categories they are mapped to, and how many samples fall into each. You can adjust this setting to suit your own data. Take the previous screenshot of the OOB data as an example: the categories in rows 6 to 14 contain few tickets. We can treat them as minor categories and set “Source data coverage” to 61%. This covers the major categories while reducing noise.
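
The reasoning above can be sketched as a small calculation: rank the categories by ticket count, then read off the cumulative share of tickets covered by the top few. This is only an illustration of the arithmetic, not how the product computes the value. The function names and the ticket counts are hypothetical and are not taken from the screenshot.

```python
def cumulative_coverage(category_counts):
    """Return (category, count, cumulative %) rows, largest category first."""
    total = sum(category_counts.values())
    ranked = sorted(category_counts.items(), key=lambda kv: kv[1], reverse=True)
    rows = []
    running = 0
    for name, count in ranked:
        running += count
        rows.append((name, count, 100.0 * running / total))
    return rows


def coverage_for_top_k(category_counts, k):
    """Percentage of tickets covered by the k largest categories."""
    return cumulative_coverage(category_counts)[k - 1][2]


# Hypothetical ticket counts per category (made up for illustration).
example_counts = {
    "billing": 400,
    "login": 300,
    "shipping": 200,
    "minor_a": 60,
    "minor_b": 40,
}

for name, count, cumulative in cumulative_coverage(example_counts):
    print(f"{name:10s} {count:5d} {cumulative:6.1f}%")

# Coverage needed to keep the three largest categories and drop the minor ones.
print(coverage_for_top_k(example_counts, 3))
```

With your own counts, the cumulative percentage at the last major category is a reasonable starting point for the “Source data coverage” setting.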