Issue creating a new model #38

luarss · 2024-09-03T07:09:04Z

Hi all, thanks for providing this excellent example codebase for integrating Cog with vLLM.

I am facing an issue creating an initial model using the replicate/vllm page. The model I am using is meta-llama/Meta-Llama-3.1-8B-Instruct. Any ideas?

It errors out with: Training failed. Failed to create trained image after successful training run.

Logs:

Logging in to Hugging Face Hub...
Token is valid (permission: fineGrained).
Your token has been saved to /root/.cache/huggingface/token
Login successful
Using model meta-llama/Meta-Llama-3.1-8B-Instruct with SHA 5206a32e0bd3067aef1ce90f5528ade7d866253f
Downloading 17 files...
0%| | 0.00/29.9G [00:00<?, ?B/s]
0%| | 0.00/29.9G [00:00<?, ?B/s, file=.gitattributes, n=1/17]
0%| | 1.48k/29.9G [00:00<908:59:43, 9.82kB/s, file=LICENSE, n=2/17]
0%| | 8.93k/29.9G [00:00<260:50:55, 34.2kB/s, file=README.md, n=3/17]
0%| | 51.9k/29.9G [00:00<60:34:38, 147kB/s, file=USE_POLICY.md, n=4/17]
0%| | 56.5k/29.9G [00:00<70:15:33, 127kB/s, file=config.json, n=5/17]
0%| | 57.4k/29.9G [00:00<82:46:11, 108kB/s, file=generation_config.json, n=6/17]
0%| | 57.5k/29.9G [00:00<95:05:17, 93.9kB/s, file=model-00001-of-00004.safetensors, n=7/17]
0%| | 44.8M/29.9G [00:01<11:23, 46.9MB/s, file=model-00001-of-00004.safetensors, n=7/17]
1%| | 339M/29.9G [00:02<02:38, 201MB/s, file=model-00001-of-00004.safetensors, n=7/17]
2%|▏ | 658M/29.9G [00:03<02:00, 262MB/s, file=model-00001-of-00004.safetensors, n=7/17]
3%|▎ | 997M/29.9G [00:04<01:44, 299MB/s, file=model-00001-of-00004.safetensors, n=7/17]
4%|▍ | 1.30G/29.9G [00:05<01:36, 318MB/s, file=model-00001-of-00004.safetensors, n=7/17]
5%|▌ | 1.63G/29.9G [00:06<01:31, 331MB/s, file=model-00001-of-00004.safetensors, n=7/17]
7%|▋ | 1.96G/29.9G [00:07<01:28, 338MB/s, file=model-00001-of-00004.safetensors, n=7/17]
8%|▊ | 2.29G/29.9G [00:08<01:26, 342MB/s, file=model-00001-of-00004.safetensors, n=7/17]
9%|▊ | 2.61G/29.9G [00:09<01:25, 344MB/s, file=model-00001-of-00004.safetensors, n=7/17]
10%|▉ | 2.94G/29.9G [00:10<01:23, 347MB/s, file=model-00001-of-00004.safetensors, n=7/17]
11%|█ | 3.27G/29.9G [00:11<01:22, 349MB/s, file=model-00001-of-00004.safetensors, n=7/17]
12%|█▏ | 3.60G/29.9G [00:12<01:21, 345MB/s, file=model-00001-of-00004.safetensors, n=7/17]
13%|█▎ | 3.92G/29.9G [00:13<01:20, 347MB/s, file=model-00001-of-00004.safetensors, n=7/17]
14%|█▍ | 4.25G/29.9G [00:14<01:19, 348MB/s, file=model-00001-of-00004.safetensors, n=7/17]
15%|█▌ | 4.58G/29.9G [00:15<01:17, 349MB/s, file=model-00001-of-00004.safetensors, n=7/17]
15%|█▌ | 4.63G/29.9G [00:17<01:17, 349MB/s, file=model-00002-of-00004.safetensors, n=8/17]
16%|█▋ | 4.90G/29.9G [00:18<02:19, 192MB/s, file=model-00002-of-00004.safetensors, n=8/17]
17%|█▋ | 5.23G/29.9G [00:19<01:58, 223MB/s, file=model-00002-of-00004.safetensors, n=8/17]
19%|█▊ | 5.56G/29.9G [00:20<01:44, 250MB/s, file=model-00002-of-00004.safetensors, n=8/17]
20%|█▉ | 5.88G/29.9G [00:21<01:34, 273MB/s, file=model-00002-of-00004.safetensors, n=8/17]
21%|██ | 6.18G/29.9G [00:22<01:31, 278MB/s, file=model-00002-of-00004.safetensors, n=8/17]
22%|██▏ | 6.49G/29.9G [00:23<01:26, 292MB/s, file=model-00002-of-00004.safetensors, n=8/17]
23%|██▎ | 6.79G/29.9G [00:24<01:22, 300MB/s, file=model-00002-of-00004.safetensors, n=8/17]
24%|██▍ | 7.12G/29.9G [00:25<01:17, 314MB/s, file=model-00002-of-00004.safetensors, n=8/17]
25%|██▍ | 7.45G/29.9G [00:26<01:14, 326MB/s, file=model-00002-of-00004.safetensors, n=8/17]
26%|██▌ | 7.78G/29.9G [00:27<01:10, 335MB/s, file=model-00002-of-00004.safetensors, n=8/17]
27%|██▋ | 8.12G/29.9G [00:28<01:08, 341MB/s, file=model-00002-of-00004.safetensors, n=8/17]
28%|██▊ | 8.45G/29.9G [00:29<01:06, 346MB/s, file=model-00002-of-00004.safetensors, n=8/17]
29%|██▉ | 8.78G/29.9G [00:30<01:05, 346MB/s, file=model-00002-of-00004.safetensors, n=8/17]
30%|███ | 9.10G/29.9G [00:31<01:04, 347MB/s, file=model-00002-of-00004.safetensors, n=8/17]
31%|███ | 9.29G/29.9G [00:35<01:03, 347MB/s, file=model-00003-of-00004.safetensors, n=9/17]
31%|███▏ | 9.43G/29.9G [00:36<02:10, 169MB/s, file=model-00003-of-00004.safetensors, n=9/17]
33%|███▎ | 9.75G/29.9G [00:37<01:48, 199MB/s, file=model-00003-of-00004.safetensors, n=9/17]
34%|███▎ | 10.1G/29.9G [00:38<01:33, 228MB/s, file=model-00003-of-00004.safetensors, n=9/17]
35%|███▍ | 10.4G/29.9G [00:39<01:22, 254MB/s, file=model-00003-of-00004.safetensors, n=9/17]
36%|███▌ | 10.7G/29.9G [00:40<01:18, 263MB/s, file=model-00003-of-00004.safetensors, n=9/17]
37%|███▋ | 11.0G/29.9G [00:41<01:11, 284MB/s, file=model-00003-of-00004.safetensors, n=9/17]
38%|███▊ | 11.3G/29.9G [00:42<01:06, 299MB/s, file=model-00003-of-00004.safetensors, n=9/17]
39%|███▉ | 11.6G/29.9G [00:43<01:03, 310MB/s, file=model-00003-of-00004.safetensors, n=9/17]
40%|███▉ | 12.0G/29.9G [00:44<01:00, 317MB/s, file=model-00003-of-00004.safetensors, n=9/17]
41%|████ | 12.3G/29.9G [00:45<00:58, 324MB/s, file=model-00003-of-00004.safetensors, n=9/17]
42%|████▏ | 12.6G/29.9G [00:46<00:56, 329MB/s, file=model-00003-of-00004.safetensors, n=9/17]
43%|████▎ | 12.9G/29.9G [00:47<00:54, 333MB/s, file=model-00003-of-00004.safetensors, n=9/17]
44%|████▍ | 13.2G/29.9G [00:48<00:53, 336MB/s, file=model-00003-of-00004.safetensors, n=9/17]
45%|████▌ | 13.6G/29.9G [00:49<00:52, 335MB/s, file=model-00003-of-00004.safetensors, n=9/17]
46%|████▋ | 13.9G/29.9G [00:50<00:53, 321MB/s, file=model-00003-of-00004.safetensors, n=9/17]
46%|████▋ | 13.9G/29.9G [00:53<00:53, 321MB/s, file=model-00004-of-00004.safetensors, n=10/17]
47%|████▋ | 14.2G/29.9G [00:54<01:34, 180MB/s, file=model-00004-of-00004.safetensors, n=10/17]
48%|████▊ | 14.5G/29.9G [00:55<01:17, 213MB/s, file=model-00004-of-00004.safetensors, n=10/17]
50%|████▉ | 14.8G/29.9G [00:56<01:06, 243MB/s, file=model-00004-of-00004.safetensors, n=10/17]
50%|████▉ | 15.0G/29.9G [00:57<01:06, 243MB/s, file=model.safetensors.index.json, n=11/17]
50%|████▉ | 15.0G/29.9G [00:57<01:06, 243MB/s, file=original/consolidated.00.pth, n=12/17]
51%|█████ | 15.1G/29.9G [00:58<01:11, 221MB/s, file=original/consolidated.00.pth, n=12/17]
52%|█████▏ | 15.4G/29.9G [00:59<01:02, 250MB/s, file=original/consolidated.00.pth, n=12/17]
53%|█████▎ | 15.8G/29.9G [01:00<00:55, 273MB/s, file=original/consolidated.00.pth, n=12/17]
54%|█████▍ | 16.1G/29.9G [01:01<00:50, 293MB/s, file=original/consolidated.00.pth, n=12/17]
55%|█████▍ | 16.4G/29.9G [01:02<00:47, 308MB/s, file=original/consolidated.00.pth, n=12/17]
56%|█████▌ | 16.7G/29.9G [01:03<00:44, 320MB/s, file=original/consolidated.00.pth, n=12/17]
57%|█████▋ | 17.1G/29.9G [01:04<00:44, 313MB/s, file=original/consolidated.00.pth, n=12/17]
58%|█████▊ | 17.4G/29.9G [01:05<00:41, 324MB/s, file=original/consolidated.00.pth, n=12/17]
59%|█████▉ | 17.7G/29.9G [01:06<00:39, 332MB/s, file=original/consolidated.00.pth, n=12/17]
60%|██████ | 18.0G/29.9G [01:07<00:37, 338MB/s, file=original/consolidated.00.pth, n=12/17]
61%|██████▏ | 18.4G/29.9G [01:08<00:36, 343MB/s, file=original/consolidated.00.pth, n=12/17]
62%|██████▏ | 18.7G/29.9G [01:09<00:35, 340MB/s, file=original/consolidated.00.pth, n=12/17]
64%|██████▎ | 19.0G/29.9G [01:10<00:33, 345MB/s, file=original/consolidated.00.pth, n=12/17]
65%|██████▍ | 19.4G/29.9G [01:11<00:32, 348MB/s, file=original/consolidated.00.pth, n=12/17]
66%|██████▌ | 19.7G/29.9G [01:12<00:31, 349MB/s, file=original/consolidated.00.pth, n=12/17]
67%|██████▋ | 20.0G/29.9G [01:13<00:30, 348MB/s, file=original/consolidated.00.pth, n=12/17]
68%|██████▊ | 20.3G/29.9G [01:14<00:29, 350MB/s, file=original/consolidated.00.pth, n=12/17]
69%|██████▉ | 20.7G/29.9G [01:15<00:28, 352MB/s, file=original/consolidated.00.pth, n=12/17]
70%|███████ | 21.0G/29.9G [01:16<00:27, 354MB/s, file=original/consolidated.00.pth, n=12/17]
71%|███████▏ | 21.3G/29.9G [01:17<00:25, 355MB/s, file=original/consolidated.00.pth, n=12/17]
72%|███████▏ | 21.7G/29.9G [01:18<00:26, 337MB/s, file=original/consolidated.00.pth, n=12/17]
74%|███████▎ | 22.0G/29.9G [01:19<00:24, 342MB/s, file=original/consolidated.00.pth, n=12/17]
75%|███████▍ | 22.3G/29.9G [01:20<00:23, 346MB/s, file=original/consolidated.00.pth, n=12/17]
76%|███████▌ | 22.7G/29.9G [01:21<00:22, 342MB/s, file=original/consolidated.00.pth, n=12/17]
77%|███████▋ | 23.0G/29.9G [01:22<00:21, 346MB/s, file=original/consolidated.00.pth, n=12/17]
78%|███████▊ | 23.3G/29.9G [01:23<00:20, 349MB/s, file=original/consolidated.00.pth, n=12/17]
79%|███████▉ | 23.7G/29.9G [01:24<00:19, 351MB/s, file=original/consolidated.00.pth, n=12/17]
80%|████████ | 24.0G/29.9G [01:25<00:18, 352MB/s, file=original/consolidated.00.pth, n=12/17]
81%|████████▏ | 24.3G/29.9G [01:26<00:17, 354MB/s, file=original/consolidated.00.pth, n=12/17]
82%|████████▏ | 24.7G/29.9G [01:27<00:15, 355MB/s, file=original/consolidated.00.pth, n=12/17]
83%|████████▎ | 25.0G/29.9G [01:28<00:14, 356MB/s, file=original/consolidated.00.pth, n=12/17]
85%|████████▍ | 25.3G/29.9G [01:29<00:14, 353MB/s, file=original/consolidated.00.pth, n=12/17]
86%|████████▌ | 25.7G/29.9G [01:30<00:12, 355MB/s, file=original/consolidated.00.pth, n=12/17]
87%|████████▋ | 26.0G/29.9G [01:31<00:11, 356MB/s, file=original/consolidated.00.pth, n=12/17]
88%|████████▊ | 26.3G/29.9G [01:32<00:10, 355MB/s, file=original/consolidated.00.pth, n=12/17]
89%|████████▉ | 26.6G/29.9G [01:33<00:10, 338MB/s, file=original/consolidated.00.pth, n=12/17]
90%|█████████ | 27.0G/29.9G [01:34<00:09, 343MB/s, file=original/consolidated.00.pth, n=12/17]
91%|█████████▏| 27.3G/29.9G [01:35<00:08, 347MB/s, file=original/consolidated.00.pth, n=12/17]
92%|█████████▏| 27.6G/29.9G [01:36<00:07, 350MB/s, file=original/consolidated.00.pth, n=12/17]
93%|█████████▎| 28.0G/29.9G [01:37<00:05, 352MB/s, file=original/consolidated.00.pth, n=12/17]
95%|█████████▍| 28.3G/29.9G [01:38<00:04, 354MB/s, file=original/consolidated.00.pth, n=12/17]
96%|█████████▌| 28.6G/29.9G [01:39<00:03, 355MB/s, file=original/consolidated.00.pth, n=12/17]
97%|█████████▋| 29.0G/29.9G [01:40<00:02, 357MB/s, file=original/consolidated.00.pth, n=12/17]
98%|█████████▊| 29.3G/29.9G [01:41<00:01, 357MB/s, file=original/consolidated.00.pth, n=12/17]
99%|█████████▉| 29.6G/29.9G [01:42<00:00, 357MB/s, file=original/consolidated.00.pth, n=12/17]
100%|█████████▉| 29.9G/29.9G [01:51<00:00, 357MB/s, file=original/params.json, n=13/17]
100%|█████████▉| 29.9G/29.9G [01:51<00:00, 357MB/s, file=original/tokenizer.model, n=14/17]
100%|█████████▉| 29.9G/29.9G [01:52<00:00, 357MB/s, file=special_tokens_map.json, n=15/17]
100%|█████████▉| 29.9G/29.9G [01:52<00:00, 357MB/s, file=tokenizer.json, n=16/17]
100%|█████████▉| 29.9G/29.9G [01:52<00:00, 357MB/s, file=tokenizer_config.json, n=17/17]
100%|██████████| 29.9G/29.9G [01:52<00:00, 285MB/s, file=tokenizer_config.json, n=17/17]
Downloaded 17 files in 112.88 seconds
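
One thing the log does show: the progress counter goes from about 15.0G to 29.9G while fetching original/consolidated.00.pth, so roughly half of the download is the original-format checkpoint sitting next to the safetensors shards. Whether that has anything to do with the "Failed to create trained image" error is not established by this log; it is only a guess worth ruling out. As a minimal sketch, assuming glob-style ignore patterns (the kind that `huggingface_hub.snapshot_download` accepts through its `ignore_patterns` argument), this shows which of the 17 files would remain if `original/*` were skipped. The `select_files` helper is hypothetical and not part of replicate/vllm, and I don't know whether the training page exposes such a filter.

```python
from fnmatch import fnmatch

# File names copied from the download log above (17 files).
FILES = [
    ".gitattributes",
    "LICENSE",
    "README.md",
    "USE_POLICY.md",
    "config.json",
    "generation_config.json",
    "model-00001-of-00004.safetensors",
    "model-00002-of-00004.safetensors",
    "model-00003-of-00004.safetensors",
    "model-00004-of-00004.safetensors",
    "model.safetensors.index.json",
    "original/consolidated.00.pth",
    "original/params.json",
    "original/tokenizer.model",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]


def select_files(files, ignore_patterns):
    """Hypothetical helper: drop every file matching any glob pattern."""
    return [
        name
        for name in files
        if not any(fnmatch(name, pattern) for pattern in ignore_patterns)
    ]


# Skip the original-format checkpoint directory, keep everything else.
kept = select_files(FILES, ["original/*"])
print(f"{len(kept)} of {len(FILES)} files kept")
for name in kept:
    print(name)
```

For a local download, the same pattern could be passed as `snapshot_download(repo_id=..., revision=..., ignore_patterns=["original/*"])`; that call is not exercised here because it needs network access and a token with access to the gated repository.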
