The technology disclosed introduces a concept of training a neural network to create an embedding space. The neural network is trained by: providing a set of K+2 training documents, each training document being represented by a training vector x, the set including a target document represented by a vector x^t, a favored document represented by a vector x^s, and K>1 unfavored documents represented by vectors x_i^u, each of the vectors including input vector elements; passing the vector representing each document of the set through the neural network to derive output vectors y^t, y^s and y_i^u, each output vector including output vector elements, the neural network including adjustable parameters which dictate an amount of influence imposed on each input vector element to derive each output vector element; and adjusting the parameters of the neural network to reduce a loss, which is an average over all of the output vectors y_i^u of [D(y^t, y^s) − D(y^t, y_i^u)].
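The training procedure described above can be illustrated with a minimal sketch. Several details here are assumptions made only for illustration and are not stated in the text: D is taken to be Euclidean distance, the neural network is reduced to a single linear layer with weight matrix `W`, the parameters are adjusted by one gradient-descent step using a finite-difference gradient, and all names (`forward`, `distance`, `loss`, `train_step`, the learning rate `lr`) are hypothetical.

```python
import math
import random


def forward(W, x):
    """Assumed network: one linear layer. W[j][i] sets how much input
    element i influences output element j."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def distance(a, b):
    """Assumed D: Euclidean distance between two output vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def loss(W, x_t, x_s, x_u):
    """Average over the K unfavored documents of D(y_t, y_s) - D(y_t, y_u_i)."""
    y_t = forward(W, x_t)
    y_s = forward(W, x_s)
    d_ts = distance(y_t, y_s)
    return sum(d_ts - distance(y_t, forward(W, x)) for x in x_u) / len(x_u)


def numeric_grad(W, x_t, x_s, x_u, eps=1e-6):
    """Central finite-difference gradient of the loss with respect to W
    (an illustrative stand-in for backpropagation)."""
    grad = [[0.0] * len(row) for row in W]
    for j, row in enumerate(W):
        for i in range(len(row)):
            old = row[i]
            row[i] = old + eps
            up = loss(W, x_t, x_s, x_u)
            row[i] = old - eps
            down = loss(W, x_t, x_s, x_u)
            row[i] = old  # restore the original parameter
            grad[j][i] = (up - down) / (2 * eps)
    return grad


def train_step(W, x_t, x_s, x_u, lr=1e-3):
    """Return new parameters after one gradient-descent step on the loss."""
    grad = numeric_grad(W, x_t, x_s, x_u)
    return [[w - lr * g for w, g in zip(row, g_row)]
            for row, g_row in zip(W, grad)]


# A set of K+2 training vectors: one target, one favored, K unfavored.
rng = random.Random(0)
dim_in, dim_out, K = 4, 2, 3


def rand_vec(n):
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


W = [rand_vec(dim_in) for _ in range(dim_out)]
x_t, x_s = rand_vec(dim_in), rand_vec(dim_in)
x_u = [rand_vec(dim_in) for _ in range(K)]

loss_before = loss(W, x_t, x_s, x_u)
W = train_step(W, x_t, x_s, x_u)
loss_after = loss(W, x_t, x_s, x_u)
```

Reducing this loss pulls the favored document toward the target in the output space while pushing the unfavored documents away from it, which is the property the embedding space is meant to have.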