This paper addresses the problem of temporal segmentation that arises in practical applications of action and gesture recognition. To separate individual gestures within a gesture sequence, a novel method utilizing depth information, oriented gradients, and supervised learning techniques is proposed. The temporal segmentation task is modeled as a two-class problem: histograms of oriented gradients (HOG) of gesture-boundary and non-boundary sample frames are incorporated in the feature table as positive and negative training vectors, respectively. Classification is carried out using both Euclidean-distance-based and Support Vector Machine (SVM) classifiers. A clustering algorithm is then employed to locate the temporal boundaries of the gestures. Extensive experimentation shows that the proposed method provides a high degree of accuracy in temporal gesture segmentation in comparison to a number of recent methods.
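The two-class formulation and the final boundary-location step can be illustrated with a minimal sketch. It assumes that per-frame feature vectors (for example, HOG descriptors of depth frames) have already been computed, and it stands in a nearest-centroid rule for the Euclidean-distance classifier and a simple gap-based grouping for the clustering step. The function names, the `max_gap` parameter, and the toy 2-D features are all hypothetical and chosen for illustration only; the abstract does not specify these details.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_frames(features, pos_train, neg_train):
    """Label a frame True (boundary) when it lies nearer to the positive centroid.

    Assumption: a nearest-centroid rule stands in for the
    Euclidean-distance-based classifier.
    """
    pos_c = centroid(pos_train)
    neg_c = centroid(neg_train)
    return [euclidean(f, pos_c) < euclidean(f, neg_c) for f in features]


def cluster_boundaries(labels, max_gap=2):
    """Group boundary-labelled frame indices and return one index per group.

    Assumption: frames at most `max_gap` indices apart belong to the same
    boundary; the middle frame of each group is reported.
    """
    groups = []
    for i, is_boundary in enumerate(labels):
        if not is_boundary:
            continue
        if groups and i - groups[-1][-1] <= max_gap:
            groups[-1].append(i)
        else:
            groups.append([i])
    return [g[len(g) // 2] for g in groups]


# Toy data: 2-D vectors stand in for real HOG descriptors.
pos_train = [[1.0, 0.0], [0.9, 0.1]]   # boundary (positive) samples
neg_train = [[0.0, 1.0], [0.1, 0.9]]   # non-boundary (negative) samples
sequence = [
    [0.0, 1.0], [0.1, 0.9], [0.9, 0.1], [1.0, 0.0], [0.8, 0.2],
    [0.1, 0.9], [0.0, 1.0], [0.2, 0.8], [0.9, 0.0], [0.1, 1.0],
]

labels = classify_frames(sequence, pos_train, neg_train)
boundaries = cluster_boundaries(labels)
print(boundaries)
```

In this sketch, isolated or adjacent positive detections are merged into a single boundary estimate, which is one simple way to turn frame-level decisions into temporal boundaries.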