Job Description | Role and Responsibilities
Develop and refine models in standard AI frameworks such as PyTorch and TensorFlow. Work closely with the AI compiler and hardware accelerator teams to add support for compiler features, including optimization algorithms and code generation, that fully utilize the hardware for maximum efficiency. Stay current with the latest trends in ML models and compiler technologies to build innovative solutions in our products.
- Develop and maintain highly efficient low-level parallel compute kernels for NN operators to support generative AI model architectures (LLM, CV, etc.)
- Integrate optimized kernels into machine learning frameworks for training and inference workloads
- Optimize neural network (NN) workloads in standard ML frameworks such as PyTorch
- Conduct performance profiling, identify hotspots, and resolve performance issues
- Develop and validate test cases for stability and performance measurement of kernels