Posted 27 Sept 2024, 8:00 am

AI Simulation Architect at Tenstorrent

We are seeking a skilled and experienced Large-scale High-Performance Computing (HPC) and AI Simulation Architect to join our team. As an HPC Architect, you will lead the development of large-scale simulation environments for cutting-edge architectures in high-performance computing systems, enabling efficient and scalable computation for AI, scientific research, and data-intensive applications. You will work closely with cross-functional teams, including hardware engineers, software developers, and domain experts, to deliver optimized and efficient simulation environments that meet the demanding requirements of HPC workloads.

This role is remote, based in the United States.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.

 

Responsibilities:

  • Design simulation models/environments for large-scale AI/HPC systems consisting of tens of thousands of computational nodes, scale-out/scale-up switches/interconnects, and heterogeneous caching/memory systems.
  • Define simulation abstraction layers to manage different levels of simulation hierarchies, from abstract analytical roofline models to detailed cycle-accurate models, balancing simulation speed and accuracy.
  • Conduct performance analysis and benchmarking, writing performance models to identify bottlenecks, optimize system parameters, and guide architectural enhancements.
  • Simulate, design, and lead the development of high-performance computing architectures that deliver exceptional computational performance, scalability, and energy efficiency.
  • Collaborate with hardware engineers to design and optimize computational components, including processors, accelerators, interconnects, and memory subsystems.
  • Work closely with software developers to define and implement software development frameworks, libraries, and tools that maximize performance and productivity on the target HPC architecture.
  • Define and recommend system-level requirements, including processing power, memory capacity, I/O bandwidth, and storage capabilities, ensuring compliance with industry standards and customer expectations.
  • Evaluate and select appropriate technologies, including processors, accelerators, and network fabrics, based on application requirements, performance & power characteristics, and cost considerations.

 

Experience & Qualifications:

  • 15+ years of experience
  • Experience coding performance models in C++
  • Bachelor's or Master's degree in Computer Engineering, Electrical Engineering, or a related field. A Ph.D. is a plus.
  • Strong expertise in high-performance computing architecture design, including processors, accelerators, interconnects, and memory subsystems.
  • Experience developing new architectures using large-scale performance simulation environments, for example gem5 or SST.
  • Experience analyzing workload behavior on large systems using open-source or custom software tools.
  • Proven experience in designing and optimizing HPC architectures for scientific, research, or data-intensive applications.
  • Proficiency in parallel programming models and frameworks, such as OpenMP, MPI, CUDA, or OpenCL, and their application to HPC workloads.
  • Solid understanding of performance analysis and optimization techniques for parallel computing, including profiling, tracing, and performance counters.
  • Familiarity with industry-standard interconnects and network fabrics, such as InfiniBand, Ethernet, or Omni-Path, and their impact on HPC system performance.
  • Knowledge of memory subsystems and memory hierarchy designs, including cache coherence protocols, memory models, and NUMA architectures.
  • Experience with HPC software stack components, such as compilers, runtime systems, job schedulers, and scientific libraries.
  • Strong programming skills in languages commonly used in HPC, such as C, C++, Fortran, or Python.
  • Excellent problem-solving abilities and the ability to analyze and address complex performance and scalability challenges.
  • Strong communication and collaboration skills to work effectively with cross-functional teams and domain experts.

 

Compensation for all engineers at Tenstorrent ranges from $100k to $500k, including base and variable compensation targets. Experience, skills, education, background, and location all impact the actual offer made.

Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

Due to U.S. Export Control laws and regulations, Tenstorrent is required to ensure compliance with licensing regulations when transferring technology to nationals of certain countries that are subject to licensing conditions set by the U.S. government.

Our engineering positions and certain engineering support positions require access to information, systems, or technologies that are subject to U.S. Export Control laws and regulations. Please note that citizenship/permanent residency, asylee, and refugee information and/or documentation will be required and considered as Tenstorrent moves through the employment process.

If a U.S. export license is required, employment will not begin until a license with acceptable conditions is granted by the U.S. government. If a U.S. export license with acceptable conditions is not granted by the U.S. government, then the offer of employment will be rescinded.




The offering company is responsible for the content on this page / the job offer.
Source: Remote Ok