Optimizing architecture and learning strategy for End-to-End Memory Networks

In this project, I explore how optimization strategies, learning strategies, and changes to model architecture affect a memory network’s performance. This work was a precursor to my later work on EfficientBert.

This work has been published on NVIDIA’s developer blog.

Resources
