Golang Developer with DevOps/LLM Experience - Remote / Telecommute

Remote Full-time
Job Description:

Required Skills:
• Proficiency in Golang for building scalable and performant backend services.
• Deep experience building services in modern cloud environments on distributed systems (e.g., containerization (Kubernetes, Docker), infrastructure as code, CI/CD pipelines, APIs, authentication and authorization, data storage, deployment, logging, monitoring, and alerting).
• Experience working with Large Language Models (LLMs), particularly hosting them to run inference.
• Strong verbal and written communication skills. The candidate's job will involve communicating with local and remote colleagues about technical subjects and writing detailed documentation.
• Experience building or using benchmarking tools to evaluate LLM inference across various model, engine, and GPU combinations.
• Familiarity with LLM performance metrics such as prefill throughput, decode throughput, time per output token (TPOT), and time to first token (TTFT).
• Experience with one or more inference engines, e.g., vLLM, SGLang, and Modular Max.
• Familiarity with one or more distributed inference serving frameworks, e.g., llm-d, NVIDIA Dynamo, and Ray Serve.
• Experience with client and NVIDIA GPUs, using software like CUDA, ROCm, AITER, NCCL, Client, etc.
• Knowledge of distributed inference optimization techniques such as tensor/data parallelism, KV cache optimizations, and smart routing.

Responsibilities:
• Develop and maintain an inference platform for serving large language models, optimized for the various GPU platforms they will run on.
• Work on complex AI and cloud engineering projects through the entire product development lifecycle (PDLC): ideation, product definition, experimentation, prototyping, development, testing, release, and operations.
• Build tooling and observability to monitor system health, and build auto-tuning capabilities.
• Build benchmarking frameworks that test model serving performance to guide system and infrastructure tuning efforts.
• Build native cross-platform inference support across NVIDIA and client GPUs for a variety of model architectures.
• Contribute to open-source inference engines to make them perform better on the DigitalOcean cloud.
