Abstract: Gesture typing is a mode of input where the user draws a continuous path
on a keyboard to interact with their device. With proper data augmentation, a
small dataset can be expanded with synthetic samples and used to train a
supervised model to recognize gestures. We propose a conditional GAN
architecture to generate realistic keyboard gestures after training on a dataset of real
user swipes. The model takes a Gaussian noise vector and the straight-line path
prototype for the target word and outputs a new gesture, using the straight-line path as
semantic map at multiple layers of the network. We show that this method can
generate multiple distinct gestures for a given word without requiring reference
gestures, even for words not seen during training. Through experiments
on a dataset of unseen words, we evaluate our model against the minimum-jerk
model and show that ours generates more realistic gestures.
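As context for the conditioning input, the sketch below (an illustrative assumption, not the paper's implementation) shows one way to compute a word's straight-line path prototype: a polyline through the word's key centers on an approximate QWERTY grid, resampled to a fixed number of points by arc length. The key coordinates, row offsets, and point count are all assumed for illustration.

```python
import math

# Approximate QWERTY key centers on a unit grid: (column + row offset, row).
# Coordinates are illustrative, not from the paper.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
OFFSETS = [0.0, 0.25, 0.75]
KEYS = {
    ch: (col + OFFSETS[row], float(row))
    for row, letters in enumerate(ROWS)
    for col, ch in enumerate(letters)
}

def straight_line_prototype(word, n_points=32):
    """Polyline through the word's key centers, resampled to n_points
    spaced evenly along its arc length."""
    pts = [KEYS[ch] for ch in word.lower()]
    if len(pts) == 1:
        return [pts[0]] * n_points
    # Per-segment and cumulative arc lengths of the polyline.
    seg = [math.dist(a, b) for a, b in zip(pts, pts[1:])]
    total = sum(seg) or 1.0
    cum = [0.0]
    for s in seg:
        cum.append(cum[-1] + s)
    out = []
    for i in range(n_points):
        t = total * i / (n_points - 1)
        # Find the segment containing arc position t and interpolate.
        j = max(k for k in range(len(cum)) if cum[k] <= t)
        j = min(j, len(seg) - 1)
        u = 0.0 if seg[j] == 0 else (t - cum[j]) / seg[j]
        (x0, y0), (x1, y1) = pts[j], pts[j + 1]
        out.append((x0 + u * (x1 - x0), y0 + u * (y1 - y0)))
    return out
```

In a conditional-GAN setting of the kind described above, a prototype like this would be paired with a noise vector as generator input, so that sampling different noise vectors yields different gestures for the same word.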