Generative Enzymes Model

Adapting the CbAS algorithm to generate plastic-degrading enzymes using structural information.

Animation of the CbAS algorithm by Yinqi Huang:

A VAE is trained on existing enzyme sequences to parameterize a prior distribution. Feedforward neural networks are trained to predict an enzyme’s catalytic activity and thermostability from its sequence. At generation time, 1000 latent vectors are sampled from a Gaussian distribution over the VAE’s latent space and decoded into novel protein sequences. Each decoded sequence is scored by the feedforward networks to predict its function. The samples are then reweighted, with more weight going to those that have high predicted function and stay close to the prior. The VAE is retrained on the reweighted samples for the next iteration.
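The reweighting step described above can be illustrated with a small sketch. This is not the code from this repository: the function name `cbas_style_weights`, the 0.9 quantile threshold, the Gaussian model of predictor uncertainty, and the random stand-in data are all assumptions made for illustration. Each sample's weight is taken to be the ratio of its density under the prior to its density under the current generative model, multiplied by the probability that its predicted function exceeds a threshold.

```python
import math
import numpy as np

def cbas_style_weights(log_p_prior, log_q_current, pred_mean, pred_std, quantile=0.9):
    """Toy reweighting: (prior density / current density) * P(score >= gamma).

    All arguments are hypothetical inputs: per-sample log-densities under the
    prior and the current generative model, plus the predictor's mean and
    standard deviation for each sample.
    """
    log_p_prior = np.asarray(log_p_prior, dtype=float)
    log_q_current = np.asarray(log_q_current, dtype=float)
    pred_mean = np.asarray(pred_mean, dtype=float)
    pred_std = np.asarray(pred_std, dtype=float)

    # Threshold gamma: an assumed quantile of the predicted scores.
    gamma = np.quantile(pred_mean, quantile)

    # P(score >= gamma) under a Gaussian with the predicted mean and std.
    z = (gamma - pred_mean) / (pred_std * math.sqrt(2.0))
    p_success = 0.5 * np.vectorize(math.erfc)(z)

    # Work in log space, then shift by the max for numerical stability.
    log_w = (log_p_prior - log_q_current) + np.log(np.clip(p_success, 1e-300, None))
    log_w -= log_w.max()
    w = np.exp(log_w)
    return w / w.sum()  # normalized so the weights sum to 1

# Stand-in data for 1000 decoded samples (random numbers, not real enzymes).
rng = np.random.default_rng(0)
n_samples = 1000
log_p = rng.normal(-50.0, 5.0, n_samples)   # log-density under the prior
log_q = log_p.copy()                        # first iteration: model equals prior
pred_mean = rng.normal(0.0, 1.0, n_samples) # predicted function scores
pred_std = np.ones(n_samples)               # assumed predictor uncertainty

weights = cbas_style_weights(log_p, log_q, pred_mean, pred_std)
```

In a later iteration, `log_q` would come from the retrained model rather than being a copy of `log_p`, so samples that have drifted far from the prior receive a smaller density ratio and hence less weight.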

Writeup

Code