Show HN: Graphsignal – Machine learning profiler for training and inference https://ift.tt/bPIieUC
Hi HN, I'm the founder of Graphsignal ( https://graphsignal.com ). Graphsignal is a machine learning profiler. We created it to make ML profiling simple and usable. It provides performance summaries, ML operation and kernel-level statistics, and the detailed resource usage information needed to make training and inference faster and more efficient.

Profilers help fix performance issues, improve user experience and reduce computation costs. Such improvements benefit machine learning profoundly: model training jobs that run for hours or days could be made much shorter, and inference latency could be reduced, resulting in significantly lower costs and improved user experience.

I realized the benefits in one of my previous projects, where the model had to be trained regularly and used for inference on huge amounts of data. Having spent the last decade developing profiling and monitoring tools, it seemed logical for me to use a profiler for the task. But since the training and inference were running remotely, I had a hard time using existing ML profilers.

TensorFlow and PyTorch provide built-in ML profilers, which use NVIDIA's profiling interface (CUPTI) under the hood for GPU profiling. One way to use those profilers is via a locally installed TensorBoard or by logging the profiles. In turn, Graphsignal Profiler ( https://ift.tt/k4IRe9o ) uses the built-in profilers as well as other tools to enable automatic profiling in any environment, including notebooks, training pipelines, periodic batch jobs, model serving and so on, without installing additional servers or software. It also allows teams to share and collaborate online. Basically, the profiles, along with environment and usage information, are automatically recorded and sent to Graphsignal, where they are available for analysis.
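To make the "record a profile plus environment information" idea concrete, here is a minimal sketch using only Python's standard library. It is not Graphsignal's API or implementation: `profile_step`, `fake_training_step` and the layout of the report are hypothetical names chosen for illustration, and `cProfile` stands in for the framework profilers mentioned above. A hosted profiler would presumably upload such a payload; the sketch only prints it.

```python
import cProfile
import io
import json
import platform
import pstats
import sys
import time


def profile_step(step_fn, *args, **kwargs):
    """Run step_fn under cProfile and return (result, report).

    Hypothetical helper for illustration; not part of any real profiler API.
    """
    profiler = cProfile.Profile()
    started = time.perf_counter()
    result = profiler.runcall(step_fn, *args, **kwargs)
    elapsed = time.perf_counter() - started

    # Render the functions with the highest cumulative time as plain text.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)

    report = {
        # Environment information recorded alongside the profile.
        "environment": {
            "python": platform.python_version(),
            "platform": platform.platform(),
            "argv": sys.argv,
        },
        "elapsed_seconds": elapsed,
        "top_functions": buffer.getvalue(),
    }
    return result, report


def fake_training_step(n):
    # Stand-in for a real training or inference step.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_step(fake_training_step, 100_000)
    # Assumption: a remote service would receive this JSON; here it is printed.
    print(json.dumps(report, indent=2))
```

Keeping the report a plain JSON-serializable dictionary is a deliberate choice in this sketch: it can be logged, written to disk, or sent over HTTP from a notebook or a batch job without any extra server running next to the workload.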
Trying it out is easy: 1) sign up for a free account; 2) add the profiler to your ML code and run it; 3) see and analyze the profiles at graphsignal.com. Everything is described in the Quick Start Guide https://ift.tt/6CIMmHJ . I'm very excited to show it to you here and will appreciate any thoughts, comments and feedback! https://ift.tt/WVrED9A March 10, 2022 at 08:18AM