Appendix III

Large Model Compute Rankings and GPU Capacity Utilization

This appendix presents a comprehensive ranking of 182 notable AI models, combining data from Epoch AI’s “Notable AI Models” database (Epoch AI 2024) with organizational compute capacity estimates from Appendix I. For each model, we track:

  • Model name and developing organization(s)
  • Training compute requirements (FLOPs)
  • Lab/Cloud provider responsible for training
  • Parent organization’s 2024 estimated peak annual FLOP capacity
  • Three metrics of organizational impact:
    • Share of organization’s publicly known models: Training FLOPs divided by total known training FLOPs for that organization
    • Share of peak annual FLOP budget: Training FLOPs divided by parent organization’s 2024 estimated peak annual FLOP capacity
    • Share of peak annual FLOP budget with 100x sweep: Same as above, but assuming development and testing required 100x the compute of the final training run
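The three metrics above can be sketched as a short computation. This is a minimal illustration, not code from the source: the function name is ours, and the ~1.095e26 total of publicly known Google DeepMind training FLOPs is back-derived from the 45.65% figure reported for Gemini 1.0 Ultra in Table 7.1.

```python
def impact_metrics(train_flops, org_known_flops, peak_annual_flops, sweep_factor=100):
    """Return the three organizational-impact metrics, in percent:
    share of publicly known models, share of peak annual budget,
    and share of peak annual budget under a 100x development sweep."""
    share_public = 100 * train_flops / org_known_flops
    share_peak = 100 * train_flops / peak_annual_flops
    return share_public, share_peak, sweep_factor * share_peak

# Gemini 1.0 Ultra (Table 7.1): 5.00e25 training FLOPs against a
# 3.87e28 peak annual budget; ~1.095e26 total known org FLOPs is an
# assumption implied by the published 45.65% share.
pub, peak, sweep = impact_metrics(5.00e25, 1.095e26, 3.87e28)
print(f"{pub:.2f}%  {peak:.3f}%  {sweep:.2f}%")  # → 45.66%  0.129%  12.92%
```

Small rounding differences against the tables (e.g. 45.66 vs. 45.65) reflect the unrounded inputs used for the published figures.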

The data are presented in six tables, ordered by decreasing training compute. This ordering makes it possible to track the evolution of model scale over time and to compare organizations’ relative investments in different AI capabilities. Note that training compute estimates for the most recent models are based on publicly available information and may be incomplete or imprecise.

Table 7.1: AI Model Training Compute Requirements (Part 1 of 6)

Model | Organization | Lab/Cloud | Train FLOPs | Parent Org Peak Annual FLOPs | Model/Public Models (%) | Model/Peak Annual (%) | Model/Peak w/100x (%)
Gemini 1.0 Ultra Google DeepMind Google DeepMind $5.00 \times 10^{25}$ $3.87 \times 10^{28}$ 45.65 0.129 12.93
Claude 3.5 Sonnet Anthropic Anthropic/Amazon $4.98 \times 10^{25}$ $2.27 \times 10^{28}$ 69.74 0.220 21.96
GPT-4o OpenAI Microsoft/OpenAI $3.81 \times 10^{25}$ $4.35 \times 10^{28}$ 53.36 0.088 8.75
Llama 3.1-405B Meta AI Meta AI $3.80 \times 10^{25}$ $5.65 \times 10^{28}$ 66.32 0.067 6.72
GPT-4 OpenAI Microsoft/OpenAI $2.10 \times 10^{25}$ $4.35 \times 10^{28}$ 29.41 0.048 4.82
Gemini 1.0 Pro Google DeepMind Google DeepMind $1.83 \times 10^{25}$ $3.87 \times 10^{28}$ 16.71 0.047 4.73
Claude 3 Opus Anthropic Anthropic/Amazon $1.64 \times 10^{25}$ $2.27 \times 10^{28}$ 22.97 0.072 7.23
Gemini 1.5 Pro Google DeepMind Google DeepMind $1.58 \times 10^{25}$ $3.87 \times 10^{28}$ 14.43 0.041 4.09
Llama 3-70B Meta AI Meta AI $7.86 \times 10^{24}$ $5.65 \times 10^{28}$ 13.72 0.014 1.39
GPT-4o mini OpenAI Microsoft/OpenAI $7.36 \times 10^{24}$ $4.35 \times 10^{28}$ 10.31 0.017 1.69
PaLM 2 Google Google DeepMind $7.34 \times 10^{24}$ $3.87 \times 10^{28}$ 6.70 0.019 1.90
Llama 3.3 Meta AI Meta AI $6.86 \times 10^{24}$ $5.65 \times 10^{28}$ 11.98 0.012 1.21
Amazon Nova Pro Amazon Anthropic/Amazon $6.00 \times 10^{24}$ $2.27 \times 10^{28}$ 8.40 0.026 2.65
Amazon Titan Amazon Anthropic/Amazon $4.80 \times 10^{24}$ $2.27 \times 10^{28}$ 6.72 0.021 2.12
Claude 2 Anthropic Anthropic/Amazon $3.87 \times 10^{24}$ $2.27 \times 10^{28}$ 5.41 0.017 1.70
Minerva (540B) Google Google DeepMind $2.74 \times 10^{24}$ $3.87 \times 10^{28}$ 2.50 0.007 0.71
GPT-3.5 (text-davinci-003) OpenAI Microsoft/OpenAI $2.58 \times 10^{24}$ $4.35 \times 10^{28}$ 3.61 0.006 0.59
U-PaLM (540B) Google Google DeepMind $2.53 \times 10^{24}$ $3.87 \times 10^{28}$ 2.31 0.007 0.65
PaLM (540B) Google Research Google DeepMind $2.53 \times 10^{24}$ $3.87 \times 10^{28}$ 2.31 0.007 0.65
Flan-PaLM 540B Google Google DeepMind $2.50 \times 10^{24}$ $3.87 \times 10^{28}$ 2.28 0.006 0.65
FLAN 137B Google Research Google DeepMind $2.05 \times 10^{24}$ $3.87 \times 10^{28}$ 1.87 0.005 0.53
Meta Movie Gen Video Meta AI Meta AI $1.65 \times 10^{24}$ $5.65 \times 10^{28}$ 2.88 0.003 0.29
Megatron-Turing NLG 530B Microsoft, NVIDIA Microsoft/OpenAI $1.17 \times 10^{24}$ $4.35 \times 10^{28}$ 1.64 0.003 0.27
Llama 2-70B Meta AI Meta AI $8.10 \times 10^{23}$ $5.65 \times 10^{28}$ 1.41 0.003 0.14
Gopher (280B) DeepMind Google DeepMind $6.31 \times 10^{23}$ $3.87 \times 10^{28}$ 0.58 0.002 0.16
Chinchilla DeepMind Google DeepMind $5.76 \times 10^{23}$ $3.87 \times 10^{28}$ 0.53 0.001 0.15
LLaMA-65B Meta AI Meta AI $5.50 \times 10^{23}$ $5.65 \times 10^{28}$ 0.96 0.001 0.10
OPT-175B Meta AI Meta AI $4.30 \times 10^{23}$ $5.65 \times 10^{28}$ 0.75 0.001 0.08
BlenderBot 3 McGill University, Meta AI, Mila Meta AI $4.30 \times 10^{23}$ $5.65 \times 10^{28}$ 0.75 0.001 0.08
Parti Google Research Google DeepMind $3.96 \times 10^{23}$ $3.87 \times 10^{28}$ 0.36 0.001 0.10
FunSearch Google DeepMind Google DeepMind $3.87 \times 10^{23}$ $3.87 \times 10^{28}$ 0.35 0.001 0.10

Table 7.2: AI Model Training Compute Requirements (Part 2 of 6)

Model | Organization | Lab/Cloud | Train FLOPs | Parent Org Peak Annual FLOPs | Model/Public Models (%) | Model/Peak Annual (%) | Model/Peak w/100x (%)
GLaM Google Google DeepMind $3.64 \times 10^{23}$ $3.87 \times 10^{28}$ 0.33 0.001 0.09
LaMDA Google Google DeepMind $3.55 \times 10^{23}$ $3.87 \times 10^{28}$ 0.32 0.001 0.09
AlphaGo Zero DeepMind Google DeepMind $3.41 \times 10^{23}$ $3.87 \times 10^{28}$ 0.31 0.001 0.09
Galactica Meta AI Meta AI $3.24 \times 10^{23}$ $5.65 \times 10^{28}$ 0.57 0.001 0.06
InstructGPT 175B OpenAI Microsoft/OpenAI $3.19 \times 10^{23}$ $4.35 \times 10^{28}$ 0.45 0.001 0.07
GPT-3 175B (davinci) OpenAI Microsoft/OpenAI $3.14 \times 10^{23}$ $4.35 \times 10^{28}$ 0.44 0.001 0.07
ST-MoE Google, Google Brain, Google Research Google DeepMind $2.90 \times 10^{23}$ $3.87 \times 10^{28}$ 0.26 0.001 0.07
Flamingo DeepMind Google DeepMind $2.19 \times 10^{23}$ $3.87 \times 10^{28}$ 0.20 0.001 0.06
AlexaTM 20B Amazon Anthropic/Amazon $2.04 \times 10^{23}$ $2.27 \times 10^{28}$ 0.29 0.001 0.09
AlphaGo Master DeepMind Google DeepMind $2.00 \times 10^{23}$ $3.87 \times 10^{28}$ 0.18 0.001 0.05
ViT-22B Google Google DeepMind $1.93 \times 10^{23}$ $3.87 \times 10^{28}$ 0.18 0.001 0.05
PaLI Google Google DeepMind $1.69 \times 10^{23}$ $3.87 \times 10^{28}$ 0.15 0.000 0.04
AlphaCode DeepMind Google DeepMind $1.64 \times 10^{23}$ $3.87 \times 10^{28}$ 0.15 0.000 0.04
Llama Guard Meta AI Meta AI $1.60 \times 10^{23}$ $5.65 \times 10^{28}$ 0.28 0.000 0.03
UL2 Google Research, Google Brain Google DeepMind $1.20 \times 10^{23}$ $3.87 \times 10^{28}$ 0.11 0.000 0.03
Meena Google Brain Google DeepMind $1.12 \times 10^{23}$ $3.87 \times 10^{28}$ 0.10 0.000 0.03
OpenVLA Stanford, UC Berkeley, Toyota, DeepMind, MIT Google DeepMind $1.10 \times 10^{23}$ $3.87 \times 10^{28}$ 0.10 0.000 0.03
Llama 2-7B Meta AI Meta AI $8.40 \times 10^{22}$ $5.65 \times 10^{28}$ 0.15 0.000 0.01
Switch Google Google DeepMind $8.22 \times 10^{22}$ $3.87 \times 10^{28}$ 0.08 0.000 0.02
mT5-XXL Google, Google Research Google DeepMind $8.20 \times 10^{22}$ $3.87 \times 10^{28}$ 0.07 0.000 0.02
ByT5-XXL Google, Google Research Google DeepMind $8.10 \times 10^{22}$ $3.87 \times 10^{28}$ 0.07 0.000 0.02
LLaVA 1.5 UW Madison, Microsoft Research Microsoft/OpenAI $7.81 \times 10^{22}$ $4.35 \times 10^{28}$ 0.11 0.000 0.02
LLaVA UW Madison, Microsoft, Columbia Microsoft/OpenAI $7.80 \times 10^{22}$ $4.35 \times 10^{28}$ 0.11 0.000 0.02
ProtT5-XXL TU Munich, Med AI, NVIDIA, Oak Ridge, Google Google DeepMind $7.37 \times 10^{22}$ $3.87 \times 10^{28}$ 0.07 0.000 0.02
ESM2-15B Meta AI, NYU, Stanford, MIT Meta AI $7.35 \times 10^{22}$ $5.65 \times 10^{28}$ 0.13 0.000 0.01
Codex OpenAI Microsoft/OpenAI $7.34 \times 10^{22}$ $4.35 \times 10^{28}$ 0.10 0.000 0.02
CoCa Google Research Google DeepMind $7.30 \times 10^{22}$ $3.87 \times 10^{28}$ 0.07 0.000 0.02
OpenAI Five OpenAI Microsoft/OpenAI $6.70 \times 10^{22}$ $4.35 \times 10^{28}$ 0.09 0.000 0.02
AlphaStar DeepMind Google DeepMind $5.93 \times 10^{22}$ $3.87 \times 10^{28}$ 0.05 0.000 0.02
ViT-G/14 Google Brain,Google Research Google DeepMind $5.85 \times 10^{22}$ $3.87 \times 10^{28}$ 0.05 0.000 0.02
XGLM-7.5B Meta AI, Facebook AI Research Meta AI $2.25 \times 10^{22}$ $5.65 \times 10^{28}$ 0.04 0.000 0.00

Table 7.3: AI Model Training Compute Requirements (Part 3 of 6)

Model | Organization | Lab/Cloud | Train FLOPs | Parent Org Peak Annual FLOPs | Model/Public Models (%) | Model/Peak Annual (%) | Model/Peak w/100x (%)
GraphCast Google DeepMind Google DeepMind $2.10 \times 10^{22}$ $3.87 \times 10^{28}$ 0.02 0.000 0.01
NLLB Meta AI Meta AI $1.75 \times 10^{22}$ $5.65 \times 10^{28}$ 0.03 0.000 0.00
RETRO-7B DeepMind Google DeepMind $1.68 \times 10^{22}$ $3.87 \times 10^{28}$ 0.02 0.000 0.00
Turing-NLG Microsoft Microsoft/OpenAI $1.57 \times 10^{22}$ $4.35 \times 10^{28}$ 0.02 0.000 0.00
Imagen Google Brain Google DeepMind $1.46 \times 10^{22}$ $3.87 \times 10^{28}$ 0.01 0.000 0.00
OpenAI Five Rerun OpenAI Microsoft/OpenAI $1.30 \times 10^{22}$ $4.35 \times 10^{28}$ 0.02 0.000 0.00
CLIP (ViT L/14@336px) OpenAI Microsoft/OpenAI $1.05 \times 10^{22}$ $4.35 \times 10^{28}$ 0.01 0.000 0.00
AudioGen Meta AI, Hebrew University Meta AI $9.50 \times 10^{21}$ $5.65 \times 10^{28}$ 0.02 0.000 0.00
T5-3B Google Google DeepMind $9.00 \times 10^{21}$ $3.87 \times 10^{28}$ 0.01 0.000 0.00
iGPT-L OpenAI Microsoft/OpenAI $8.91 \times 10^{21}$ $4.35 \times 10^{28}$ 0.01 0.000 0.00
ContextNet + Noisy Student Google Google DeepMind $8.16 \times 10^{21}$ $3.87 \times 10^{28}$ 0.01 0.000 0.00
Segment Anything Model Meta AI Meta AI $7.80 \times 10^{21}$ $5.65 \times 10^{28}$ 0.01 0.000 0.00
Conformer + Wav2vec 2.0 Google, Google Research, Google Brain Google DeepMind $7.60 \times 10^{21}$ $3.87 \times 10^{28}$ 0.01 0.000 0.00
GNMT Google Google DeepMind $6.62 \times 10^{21}$ $3.87 \times 10^{28}$ 0.01 0.000 0.00
ADM OpenAI Microsoft/OpenAI $6.20 \times 10^{21}$ $4.35 \times 10^{28}$ 0.01 0.000 0.00
XLNet CMU, Google Brain Google DeepMind $6.19 \times 10^{21}$ $3.87 \times 10^{28}$ 0.01 0.000 0.00
NUWA Microsoft Research, Peking University Microsoft/OpenAI $4.84 \times 10^{21}$ $4.35 \times 10^{28}$ 0.01 0.000 0.00
AlphaFold-Multimer Google DeepMind, DeepMind Google DeepMind $4.35 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
ViT-Huge/14 Google Brain,Google Research Google DeepMind $4.26 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
Whisper OpenAI Microsoft/OpenAI $4.21 \times 10^{21}$ $4.35 \times 10^{28}$ 0.01 0.000 0.00
Gato DeepMind Google DeepMind $4.02 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
ViT-G (model soup) UW, Columbia, Google, Meta, Tel Aviv Meta AI $3.40 \times 10^{21}$ $5.65 \times 10^{28}$ 0.01 0.000 0.00
ViT-G (model soup) UW, Columbia, Google, Meta, Tel Aviv Google DeepMind $3.40 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
ELECTRA Stanford, Google, Google Brain Google DeepMind $3.10 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
AlphaFold 2 DeepMind Google DeepMind $2.99 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00

Table 7.4: AI Model Training Compute Requirements (Part 4 of 6)

Model | Organization | Lab/Cloud | Train FLOPs | Parent Org Peak Annual FLOPs | Model/Public Models (%) | Model/Peak Annual (%) | Model/Peak w/100x (%)
ALBERT-xxlarge Toyota Tech Institute, Google Google DeepMind $2.39 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
NASv3 (CIFAR-10) Google Brain Google DeepMind $2.20 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
GPT-2 (1.5B) OpenAI Microsoft/OpenAI $1.92 \times 10^{21}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
EMDR Mila, McGill, DeepMind Google DeepMind $1.91 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
AlphaGo Lee DeepMind Google DeepMind $1.90 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
BigGAN-deep 512x512 Heriot-Watt, DeepMind Google DeepMind $1.80 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
MnasNet-A3 Google Google DeepMind $1.50 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
MnasNet-A1 + SSDLite Google Google DeepMind $1.50 \times 10^{21}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
Swin Transformer V2 Microsoft Research Asia Microsoft/OpenAI $1.10 \times 10^{21}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
JFT Google Research, CMU Google DeepMind $8.43 \times 10^{20}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
OpenAI TI7 DOTA 1v1 OpenAI Microsoft/OpenAI $6.05 \times 10^{20}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
BERT-Large-CAS (PTB+WT2+WT103) Amazon Anthropic/Amazon $5.21 \times 10^{20}$ $2.27 \times 10^{28}$ 0.00 0.000 0.00

Table 7.5: AI Model Training Compute Requirements (Part 5 of 6)

Model | Organization | Lab/Cloud | Train FLOPs | Parent Org Peak Annual FLOPs | Model/Public Models (%) | Model/Peak Annual (%) | Model/Peak w/100x (%)
Big Transformer for Back-Translation Facebook AI Research, Google Brain Google DeepMind $4.78 \times 10^{20}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
Xception Google Google DeepMind $4.36 \times 10^{20}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
AmoebaNet-A (F=448) Google Brain Google DeepMind $3.85 \times 10^{20}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
AlphaGo Fan DeepMind Google DeepMind $3.80 \times 10^{20}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
SNM-skip Google Google DeepMind $2.98 \times 10^{20}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
BERT-Large Google Google DeepMind $2.85 \times 10^{20}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
IMPALA DeepMind Google DeepMind $1.68 \times 10^{20}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
Mesh-TensorFlow Transformer 4.9B Google Brain Google DeepMind $1.62 \times 10^{20}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
Contriever Meta AI, UCL, PSL, Grenoble Meta AI $1.57 \times 10^{20}$ $5.65 \times 10^{28}$ 0.00 0.000 0.00
AlphaFold DeepMind Google DeepMind $1.00 \times 10^{20}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
EfficientNetV2-XL Google, Google Brain Google DeepMind $9.56 \times 10^{19}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
MoE-Multi Jagiellonian University, Google Brain Google DeepMind $9.39 \times 10^{19}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
Adaptive Input Transformer + RD Microsoft Research Asia, Soochow Microsoft/OpenAI $8.20 \times 10^{19}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
DeiT-B Meta AI, Sorbonne University Meta AI $7.88 \times 10^{19}$ $5.65 \times 10^{28}$ 0.00 0.000 0.00
BEIT-3 Microsoft Microsoft/OpenAI $7.00 \times 10^{19}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
Mesh-TensorFlow Transformer 2.9B Google Brain Google DeepMind $6.84 \times 10^{19}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
PNASNet-5 Johns Hopkins, Google AI, Stanford Google DeepMind $6.63 \times 10^{19}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
Sparse all-MLP Meta AI Meta AI $6.08 \times 10^{19}$ $5.65 \times 10^{28}$ 0.00 0.000 0.00
ConvS2S (ensemble of 8 models) Meta AI Meta AI $5.64 \times 10^{19}$ $5.65 \times 10^{28}$ 0.00 0.000 0.00
Seq2Seq LSTM Google Google DeepMind $5.60 \times 10^{19}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
MuZero DeepMind Google DeepMind $4.80 \times 10^{19}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
Population-based DRL DeepMind Google DeepMind $3.49 \times 10^{19}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
QT-Opt Google Brain, UC Berkeley Google DeepMind $3.49 \times 10^{19}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
LSTM (Hebbian, Cache, MbPA) DeepMind, UCL Google DeepMind $3.33 \times 10^{19}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
ResNet-200 Microsoft Research Asia Microsoft/OpenAI $2.97 \times 10^{19}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
Segatron-XL large, M=384 + HCP Microsoft Research, Waterloo Microsoft/OpenAI $2.65 \times 10^{19}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
MultiBand Diffusion Meta AI, Hebrew U, LORIA Meta AI $2.60 \times 10^{19}$ $5.65 \times 10^{28}$ 0.00 0.000 0.00
Transformer local-attention (NesT-B) Google Cloud, Google Research Google DeepMind $2.41 \times 10^{19}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
MSRA (C, PReLU) Microsoft Research Microsoft/OpenAI $2.40 \times 10^{19}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
Detic Meta AI, UT Austin Meta AI $2.34 \times 10^{19}$ $5.65 \times 10^{28}$ 0.00 0.000 0.00
GPT-1 OpenAI Microsoft/OpenAI $1.76 \times 10^{19}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
TransE UTC-CNRS, Google Google DeepMind $1.34 \times 10^{18}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00

Table 7.6: AI Model Training Compute Requirements (Part 6 of 6)

Model | Organization | Lab/Cloud | Train FLOPs | Parent Org Peak Annual FLOPs | Model/Public Models (%) | Model/Peak Annual (%) | Model/Peak w/100x (%)
KN-LM Google Google DeepMind $7.73 \times 10^{17}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
WeNet (Penn Treebank) Amazon Anthropic/Amazon $7.30 \times 10^{17}$ $2.27 \times 10^{28}$ 0.00 0.000 0.00
Unsupervised High-level Feature Learner Google Google DeepMind $6.00 \times 10^{17}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
CT-MoS (WT2) Google, National Tsing Hua University Google DeepMind $5.62 \times 10^{17}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
DistBelief Speech Google Google DeepMind $3.11 \times 10^{17}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
Mogrifier RLSTM (WT2) DeepMind Google DeepMind $1.40 \times 10^{17}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
ReLU-Speech Google, Toronto, NYU Google DeepMind $1.28 \times 10^{17}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
Large regularized LSTM NYU, Google Brain Google DeepMind $9.10 \times 10^{16}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
R-FCN Tsinghua,Microsoft Research Microsoft/OpenAI $6.15 \times 10^{16}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
ADAM (CIFAR-10) Amsterdam, OpenAI, Toronto Microsoft/OpenAI $6.05 \times 10^{16}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
Word2Vec (large) Google Google DeepMind $3.89 \times 10^{16}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
ENAS Google Brain, CMU, Stanford Google DeepMind $2.01 \times 10^{16}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
DARTS DeepMind, CMU Google DeepMind $1.10 \times 10^{16}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
NAS with base 8 and shared embeddings Google Brain Google DeepMind $1.05 \times 10^{16}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
ISS Duke University,Microsoft Microsoft/OpenAI $3.40 \times 10^{15}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00
Search-Proven Best LSTM Google Google DeepMind $3.34 \times 10^{15}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
DQN DeepMind Google DeepMind $2.30 \times 10^{15}$ $3.87 \times 10^{28}$ 0.00 0.000 0.00
RankNet Microsoft Research, Microsoft Microsoft/OpenAI $3.48 \times 10^{12}$ $4.35 \times 10^{28}$ 0.00 0.000 0.00

References

  1. Epoch AI. 2024. Data on machine learning hardware. Updated December 30, 2024.