Best approach to CV data set