How to use the Confusion Matrix

Training document management

Training documents are the source from which Smart Analytics learns. The better you choose the training documents, the higher the categorization accuracy you will achieve. Good training document management will eventually improve the working efficiency of your IT help desk.

In the Category Tuning form, the Category Trainings tab displays all of the documents/tickets that the system uses to train the current category.

If the categorization accuracy is not acceptable, you can do the following:

  1. Check the Confusion Matrix table to identify the test values highlighted in red, and perform the following steps for each corresponding expected category.
  2. Click the corresponding expected category to drill down to the category tuning page.
  3. Review all current training documents. If you find any unreasonable training documents, delete them by clicking the Delete Training Document button to improve the training quality. If the category does not have enough training documents, add more by clicking the Add Training Document button to tune this category.
  4. Click Apply Changes to make your training changes take effect.
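The idea behind step 1 can be sketched in a few lines of Python. This is only an illustration, not how Smart Analytics builds its table: the ticket data, the function names, and the `min_errors` threshold that stands in for "highlighted in red" are all assumptions. The layout follows the steps in this document, with predicted categories in rows and expected categories in columns.

```python
from collections import Counter


def build_confusion_matrix(results):
    """Count tickets per (predicted, expected) cell.

    `results` is an iterable of (expected, predicted) category pairs.
    """
    return Counter((predicted, expected) for expected, predicted in results)


def expected_categories_to_review(matrix, min_errors=2):
    """Return expected categories with a misclassification cell of at
    least `min_errors` tickets (a hypothetical stand-in for a red value)."""
    flagged = set()
    for (predicted, expected), count in matrix.items():
        if predicted != expected and count >= min_errors:
            flagged.add(expected)
    return sorted(flagged)


# Hypothetical test results: (expected category, predicted category)
results = [
    ("Network", "Network"),
    ("Network", "Hardware"),
    ("Network", "Hardware"),
    ("Hardware", "Hardware"),
    ("Email", "Email"),
    ("Email", "Network"),
]

matrix = build_confusion_matrix(results)
print(expected_categories_to_review(matrix))
```

In this made-up data, only the expected category with two or more tickets in a single wrong cell is flagged, so it would be the first candidate for reviewing training documents.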

Testing data management

The Confusion Matrix tool keeps all of your test tickets, which enables you to review the test accuracy results and then troubleshoot the causes of bad test results.

The following is one possible scenario:

  1. Find one red value in the Confusion Matrix.
  2. Click the expected category in the same column where the red value is located to drill down to the Category Tuning page for the expected category.
  3. Click the Test Samples tab to review all of the test documents for the current category and identify any unreasonable test tickets. In practice, IT agents may have categorized some tickets by mistake, and these tickets are not appropriate for testing the current category.
  4. Select these unreasonable tickets and then click Delete Test Sample to delete them.

Another way to identify those bad test tickets is as follows:

  1. Find one red value in the Confusion Matrix.
  2. Click the predicted category in the same row where the red value is located to drill down to the “Smart Ticket Test Result” page, where you can see a whole list of test tickets that failed the test.
  3. Review the detailed information of these test tickets.
  4. If a test ticket demonstrates some characteristics that are unknown to the system, click the Add to Tuning button to add this ticket to the training sample pool.
  5. If you find that a test ticket was put in this category by mistake, click the Remove Test Sample button to delete it to improve the test accuracy.
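To see why removing a miscategorized test ticket changes the reported accuracy, consider the following minimal sketch. The ticket records, field names, and the `category_accuracy` helper are hypothetical and are not part of the product; the sketch only assumes that accuracy for a category is the share of its test tickets that were predicted correctly.

```python
def category_accuracy(samples, category):
    """Fraction of test tickets with the given expected category
    whose predicted category matches. Returns None if there are none."""
    relevant = [s for s in samples if s["expected"] == category]
    if not relevant:
        return None
    correct = sum(1 for s in relevant if s["predicted"] == s["expected"])
    return correct / len(relevant)


# Hypothetical test samples for the "Email" category
samples = [
    {"id": "T1", "expected": "Email", "predicted": "Email"},
    {"id": "T2", "expected": "Email", "predicted": "Email"},
    {"id": "T3", "expected": "Email", "predicted": "Email"},
    # Assume an agent filed this network ticket under "Email" by mistake
    {"id": "T4", "expected": "Email", "predicted": "Network"},
]

before = category_accuracy(samples, "Email")

# Remove the ticket that a reviewer judged to be miscategorized
mislabeled_ids = {"T4"}
cleaned = [s for s in samples if s["id"] not in mislabeled_ids]
after = category_accuracy(cleaned, "Email")

print(before, after)
```

Removing a test sample is only justified when the ticket really was filed in the wrong category. If the ticket is valid but the system failed on it, adding it to the training sample pool is the more appropriate action.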

Term management

Sometimes you may know the typical terms for one specific category. In this case, you can add these terms directly to the system and bind them to this category.

When you review the terms for a specific category in the Category Terms tab, you may also find terms that you know should not be related to this category. In this case, change the weight of each such term to 0 to remove it from the categorization process.
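The effect of a zero weight can be illustrated with a simple weighted term count. This is a sketch under stated assumptions, not the scoring method Smart Analytics actually uses: the `category_score` function, the terms, and the weights are all invented for the example.

```python
def category_score(ticket_text, term_weights):
    """Sum weight * occurrence count over the category's terms.

    A hypothetical scoring rule used only to show what a zero weight does.
    """
    words = ticket_text.lower().split()
    return sum(weight * words.count(term) for term, weight in term_weights.items())


# Hypothetical terms bound to a "Printer" category
printer_terms = {"printer": 2.0, "toner": 1.5, "password": 1.0}

ticket = "Cannot reset password after printer login"

score_before = category_score(ticket, printer_terms)

# "password" should not be related to this category: set its weight to 0
printer_terms["password"] = 0
score_after = category_score(ticket, printer_terms)

print(score_before, score_after)
```

With the weight set to 0, the term stays in the list but no longer contributes to the score, which matches the idea of removing it from the categorization process without deleting it.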

Category similarity analysis

The Confusion Matrix tool can also reveal test errors that occur because some categories are so similar that the system can hardly distinguish them.

Normally, red blocks in the Confusion Matrix table mean that a category has many test errors. You can then click the two categories involved in a red block (the expected category and the predicted category) to drill down to the pages where you can find the terms that Smart Analytics extracted for them. Most likely, most of those terms are the same, which means that the training samples you provided to Smart Analytics for the two categories have very similar descriptions. Consider whether it is appropriate to keep them as two separate categories.
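One rough way to quantify how much two categories' terms overlap is the Jaccard index, which divides the number of shared terms by the number of distinct terms across both categories. The sketch below assumes you have copied the term lists from the two category pages by hand; the category names, the terms, and the `term_overlap` function are hypothetical, and the product itself is not known to report this measure.

```python
def term_overlap(terms_a, terms_b):
    """Jaccard index of two term lists: shared terms / all distinct terms."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical terms extracted for two categories that are often confused
vpn_terms = ["vpn", "connection", "timeout", "remote", "login"]
remote_access_terms = ["vpn", "connection", "remote", "login", "token"]

overlap = term_overlap(vpn_terms, remote_access_terms)
print(round(overlap, 2))
```

A value close to 1 suggests that the two categories are described in nearly the same words, which supports merging them. A low value suggests that the errors have another cause, such as poor training or test samples.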