k-means is not the right technology when you have labels, except when the labels have numeric values that can be made commensurate with the numeric coordinates. For example, if you can say that a difference of 1 in the label is 10.28 times as important as a difference of 1 in column 3, then you might be able to use k-means by adding the suitably scaled numeric value of the label as an additional coordinate. But this is not the usual case.
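To make the "label as an additional coordinate" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy data, the helper names (`augment_with_label`, `farthest_point_init`, `kmeans`), and the choice to apply the 10.28 weighting by multiplying the label by that factor. Because k-means works with squared Euclidean distance, a factor of s on a coordinate contributes s squared to the squared distance, so you may want a different scale depending on what "10.28 times as important" means for your data.

```python
def augment_with_label(rows, labels, label_scale):
    """Append each numeric label, multiplied by label_scale, as an extra coordinate."""
    return [list(row) + [label_scale * label] for row, label in zip(rows, labels)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point farthest
    from the centroids chosen so far (an illustrative choice, not the only one)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids

def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; returns (centroids, assignments)."""
    centroids = farthest_point_init(points, k)
    assignments = [-1] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments

# Hypothetical toy data: three numeric columns plus one numeric label per row.
rows = [[1.0, 2.0, 0.1], [1.1, 1.9, 0.2], [1.0, 2.1, 0.1], [0.9, 2.0, 0.2]]
labels = [0, 0, 5, 5]

points = augment_with_label(rows, labels, label_scale=10.28)
centroids, assignments = kmeans(points, k=2)
print(assignments)  # [0, 0, 1, 1]
```

In this toy data the three original columns are nearly identical across rows, so the scaled label dominates the distance and the clusters follow the label. With real data the result depends heavily on the scale you choose, which is why the weighting has to be something you can justify.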
When you have matrices of numbers and a label associated with each matrix, Deep Learning or (Shallow) Neural Network techniques are more appropriate. If the matrices are all the same size, each matrix and its label can be treated the same way as if the matrix of data were an "image".
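As a rough sketch of what treating a matrix as an "image" means in practice, the plain Python below checks that every matrix has the same size and wraps each value in a one-element list, giving a single-channel batch of shape (count, height, width, 1). The function name `matrices_to_image_batch`, the toy data, and the channels-last layout are assumptions for illustration. Which layout a given neural network library expects varies, so check its documentation.

```python
def matrices_to_image_batch(matrices):
    """Return (batch, shape), where batch holds each matrix as a
    single-channel 'image' and shape is (count, height, width, 1)."""
    height = len(matrices[0])
    width = len(matrices[0][0])
    for m in matrices:
        if len(m) != height or any(len(row) != width for row in m):
            raise ValueError("all matrices must have the same size")
    # Wrap every value in a one-element list to add the channel axis.
    batch = [[[[value] for value in row] for row in m] for m in matrices]
    return batch, (len(matrices), height, width, 1)

# Hypothetical toy data: two 2x2 matrices, one label per matrix.
matrices = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
labels = ["a", "b"]

batch, shape = matrices_to_image_batch(matrices)
print(shape)  # (2, 2, 2, 1)
```

The batch and its labels can then be handed to whatever image classification network you choose, in the same way a set of labeled grayscale images would be.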