
Infra Architect - AI GPU

Mumbai
Bangalore
Delhi
Noida
Gurgaon
Pune
Full-Time
Executive: 10 to 30 years
Posted on Oct 29 2024

About the Job

Skills

Architecture Design
GPU
Artificial Intelligence (AI)
Machine Learning
Data Centre
Certified Kubernetes Administrator

Job Summary:


We are seeking an experienced AI/GPU Infrastructure Architect with a total of 15 years of experience to design and optimize GPU-based infrastructure for machine learning and artificial intelligence applications. The ideal candidate will have a deep understanding of AI workloads, GPU architecture, and infrastructure design, along with hands-on experience deploying scalable and efficient solutions.


Key Responsibilities:


  • Architecture Design: Develop and implement architecture for AI and GPU infrastructure, ensuring scalability, reliability, and performance.
  • Infrastructure Optimization: Optimize GPU resource allocation and management to enhance performance for AI workloads.
  • Collaboration: Work closely with data scientists, software engineers, and IT teams to understand requirements and translate them into architectural solutions.
  • Performance Monitoring: Set up monitoring and benchmarking tools to assess system performance and make recommendations for improvement.
  • Research and Development: Stay updated with the latest advancements in AI and GPU technologies and assess their applicability to our infrastructure.
  • Documentation: Create and maintain architectural documentation, including design specifications, best practices, and deployment guides.
  • Security and Compliance: Ensure that infrastructure designs meet security standards and compliance requirements.
  • Training and Support: Provide guidance and training to teams on infrastructure usage and best practices.



Required Skills:


  • Proven experience as an infrastructure architect in Data Centre infrastructure (DC rack planning, compute, storage, network), specifically with AI and GPU technologies.
  • Strong understanding of GPU architecture and parallel processing concepts.
  • Experience designing innovative, high-performance storage solutions for AI workloads.
  • Knowledge of designing high-performance network solutions for AI workloads using technologies such as InfiniBand, RoCE, and GPUDirect.
  • Experience with distributed computing and microservices architecture.
  • Proficiency in cloud platforms (e.g., AWS, Azure, Google Cloud) and containerization technologies (e.g., Docker, Kubernetes).
  • Familiarity with AI frameworks (e.g., TensorFlow, PyTorch) and ML libraries.
  • Experience with system performance tuning and optimization techniques.
  • Knowledge of security best practices related to AI and infrastructure management.
  • Excellent problem-solving skills and ability to work in a collaborative environment.
  • Strong communication skills, both verbal and written.


Qualifications:

  • Bachelor's degree in Engineering or MCA.
  • Certification in relevant technologies (e.g., Azure Certified Solutions Architect, NVIDIA certifications, Certified Kubernetes Administrator).



About the company

We are the force behind the meteoric rise of India's leading telecom operator, Jio, with 400 million+ customers. In addition, we have powered an exhaustive list of digital apps and services that have delivered functionality, usability, engagement, scale, and loyalty. We provide solutions for customers (B2C) and enterprises (B2B). We have an end-to-end 5G solution consisting of 5G Radio, a com ...

Industry

Media & Telecommunication...

Company Size

51-200 Employees

Headquarter

Navi Mumbai, Maharashtra
