Patent attributes
A protein searcher includes a pre-trained CNN, a feature extractor, a database and a KNN searcher. The pre-trained CNN, trained on a previously classified amino acid database, receives an unidentified amino acid sequence. The feature extractor extracts a feature vector of the unidentified amino acid sequence as a query feature vector. The database stores feature vectors of trained amino acid sequences and of at least one untrained amino acid sequence and stores associated classes of the trained amino acid sequences and associated tags of the at least one untrained amino acid sequence. The KNN searcher finds K feature vectors of the database which are close to the query feature vector and outputs the associated class or tag of each of the K feature vectors.