Deep learning has emerged as a competitive mesh-free approach for solving partial differential equations (PDEs). The idea is to represent the solution of a PDE by a neural network, taking advantage of the rich expressiveness of neural network representations. In this talk, we will explore the applicability of this powerful framework to kinetic equations, with an emphasis on handling multiple scales and obtaining long-time stability. The latter is especially useful for uncertainty quantification and inverse problems, where repeated applications of the forward model are needed.