Taiwan AI System Alliance

Initiative

Market opportunities

  1. Merrill Lynch estimates that the global DNN training system market will grow from $650M in 2017 to $7B in 2020. Gartner estimates that 60% of DNN training systems are private and that the annual growth rate is 30%, so the total market for private DNN training systems is expected to reach $12B in 2024.

  2. Private DNN training appliance = x86 server + GPUs + DNN training software stack.
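The $12B figure can be reproduced with a short calculation. The sketch below assumes that the 60% private share applies to the $7B estimate for 2020 and that the 30% growth rate compounds annually for the four years to 2024. The source does not state its derivation, so this is one plausible reading, and the function name is illustrative only.

```python
def project_private_market(total_market_busd, private_share, annual_growth, years):
    """Compound the private share of a market forward by `years` years.

    All inputs are the estimates quoted above; annual compounding from the
    2020 base is an assumption, not something the source spells out.
    """
    base = total_market_busd * private_share
    return base * (1 + annual_growth) ** years


# $7B total market in 2020, 60% private, 30% annual growth
private_2020 = project_private_market(7.0, 0.60, 0.30, years=0)
private_2024 = project_private_market(7.0, 0.60, 0.30, years=4)

print(f"2020 private market: ${private_2020:.1f}B")  # about $4.2B
print(f"2024 private market: ${private_2024:.1f}B")  # about $12.0B
```

Under these assumptions, the private segment starts at about $4.2B in 2020 and reaches about $12B by 2024, which matches the estimate quoted above.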

Open AI Training System

  1. An open hardware architecture supporting a meshed PCIe network and multiple types of GPUs

  2. A DNN training computation compiler that features efficient scheduling and mapping, and advanced training algorithms

  3. A DNN Integrated Design Environment (IDE) that helps users reduce unproductive training rounds

  4. A DNN model optimization tool that minimizes the space and time requirements of already-trained DNN models

Field Trial & Business Case

  1. System verification in an AI datacenter

  2. Business cases to serve as references

New Venture

  1. An innovative startup acting as an open-source AI system software provider.

Objective

Membership

*1 Members are eligible to participate in TASA workshops (at least 6 workshops per year).

*2 Members are eligible to provide one DNN training appliance for installation and testing of the DNN training software.

*3 Members are eligible for system verification at the ITRI DNN Farm and, if verified, for consulting services on operation in an AI datacenter.

*4 Members are eligible for joint promotion with TASA at Computex, Cloud Computing Day Tokyo, and Big Data Expo. Expenses are NOT included.

Appendix

OATS Architecture

Processor Type:

  1. Nvidia’s Tesla P100 and V100 (P100 PCIe: 12GB, 4.7 TFLOPS of FP64)

  2. Nvidia’s GeForce GTX 1080 Ti (11GB, 11.3 TFLOPS of FP32)

  3. AMD’s Radeon Instinct MI25 (12.3 TFLOPS of FP32)

  4. AMD’s Radeon RX Vega 64 (12.6 TFLOPS of FP32)

  5. Intel multi-node Xeon

  6. FPGA

Number of “GPU”s: 16+

System Interconnect: Meshed PCIe network supporting a disaggregated rack architecture.

Intelligent thermal load management

Graphics driver API:

  1. CUDA

  2. OpenCL

DNN training framework:

  1. TensorFlow

  2. Caffe/Caffe2/NVCaffe
