This repository was archived by the owner on Mar 4, 2026. It is now read-only.
Commit 36d8b40
Sync with r3.1 (#1921)
* update dlrmv2 BKC (#1476)
* use merge-emb-cat for int8 since acc issue is fixed in IPEX (#1477)
* forcing numpy to use a specific version (#1474)
* Restrict vit training to single socket (#1484)
* update dlrm/bert distribute training BKC (#1486)
* change batch size for int8 resnet50 and ssd-resnet34 (#1487)
* Added command to remove existing logs in output_dir (#1475)
* Added command to remove existing logs in output_dir
* Fix condition to checkout OneAPI tools repository (#1490)
* Hz/dlrm ddp (#1496)
* fix dlrm ddp
* fix time computation
---------
Co-authored-by: Weizhuo Zhang <weizhuo.zhang@intel.com>
* fix dlrm-v1 int8 thp (#1497)
* also use merged-emb-cat in dlrm-v2 int8 thp (#1498)
* updated tpp files for 2.12.1 release (#1479)
* updated tpp files
* added yolo5
* P0 models list (#1500)
* P0 models list
* replace master w/ tag
* correct framework name
---------
Co-authored-by: Jitendra Patil <jitendra.patil@intel.com>
* another update to TPPs (#1503)
* Fixing SSD-Resnet34 training quickstart script to run right number of instances (#1493)
* Container GHA Pipeline Reformat (#1462)
* swap runner to mlops runner
* change ipex base image (#1440)
* rename tests yaml (#1450)
* New Test Runner (#1461)
* add execute perms to quickstart and add 140 tests to pytorch resnet with new runner
* add tests per new format
* add flex140 support
* Update Test Runner (#1467)
* Flex 140 tests for P0 (#1469)
* add previous m3 commits (#1478)
* GHA tests for flex 140 (#1499)
* add previous m3 commits (#1478)
* Added command to remove existing logs in output_dir (#1475)
* address PR review (#1501)
* remove makefile
* Remove caas reference (#1502)
* Add previous m3 commits in baremetal readme (#1480)
---------
Co-authored-by: Srikanth Ramakrishna <srikanth.ramakrishna@intel.com>
Co-authored-by: mahathis <36486206+Mahathi-Vatsal@users.noreply.github.com>
* refine dlrm ddp dataloader (#1504)
Co-authored-by: Weizhuo Zhang <weizhuo.zhang@intel.com>
* workaround oneccl bad termination issue for RN50 distributed training (#1508)
* Fix Test Pipeline (#1514)
* fix test pipeline
* Update container-pipeline-tester.yml
* Bump mlflow in /datasets/cloud_data_connector/samples/interoperability (#1492)
Bumps [mlflow](https://github.com/mlflow/mlflow) from 2.5.0 to 2.6.0.
* Bump mlflow in /datasets/cloud_data_connector/samples/azure (#1491)
Bumps [mlflow](https://github.com/mlflow/mlflow) from 2.5.0 to 2.6.0.
* Zufang/readme update for itex (#1485)
* add link to int8 PB for onednn graph
* refine readme for onednn graph option
* MaskRCNN GPU training (#1513)
* maskrcnn model demo zero-bkc
* update readme
* added license header
* added HW requirement
* update docs
* support bf32 for SD finetune (#1521)
* fix dlrm-v1 ddp hang (#1512)
* fix dlrm-v1 ddp hang
* comment out two more barrier
* modify multi-node scripts for resnet50, maskrcnn and stable diffusion (#1537)
* modify multi-node scripts for RNNT and ssd-resnet34 (#1539)
* Modify Test Runner Run Dir from GHA (#1541)
* Adjust test runner path to be MLOps root
* move to test-runner dir
* merge dir and parent_dir
* remove parent dir
* use full paths
* get test name for artifact upload
* Stop workload tests and security scans on open PR event (#1543)
* PVC P0 RN50 PYT Inference (#1494)
* updated to latest BKC to add multi-card multi-tile support
* PVC P0 PYT BERT Large (#1495)
* Add support for multi-card multi-tile
* PVC P0 PYT DLRM training (#1518)
* add dlrm training pvc support
* Fix LKG pipeline for the new ipex and itex conda installers (#1525)
* update LKG pipeline for the new ipex and itex conda installers
* remove version subdir from bom file
* TF RN50V1_5 P0 Max Inference (#1529)
Co-authored-by: Mahathi Vatsal <mahathi.vatsal.salopanthula@intel.com>
* modify SD inference scripts (#1553)
* Updates + New Transfer Learning Notebooks from TLT team (#1522)
* Update for all transfer learning notebooks
---------
Co-authored-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
* Rename the repo to Intel AI Reference Models for rebranding (#1473)
* Rename the repo for rebranding, remove k8s and tools directories
* Update README.md
---------
Co-authored-by: Clayne Robison <clayne.b.robison@intel.com>
* modify SD finetune scripts (#1555)
* fix CrossEntropyLoss target for dummy inputs (#1556)
* PVC P0 PYT DLRM inference (#1517)
* add dlrm pvc inference support
* Modified baremetal README(bert and rn50) for max series (#1520)
* fix itex (#1527)
* fix itex
* fix key error
* PVC BERT-Large P0 TF (#1534)
* adapt new BKC
* Modified Bert large TF baremetal readme for Mx series (#1557)
---------
Co-authored-by: Mahathi Vatsal <mahathi.vatsal.salopanthula@intel.com>
* Add optimizations for BF16 transformer inference (#1489)
* Change the order of operations from dense->concat->split_heads to dense->split_heads->concat (attention_layer.py)
* Change the order of operations when calculating encoder-decoder k, v caches to avoid using matmul ops between large matrices (transformer.py)
* Reduce number of occurrences of split_heads with encoder-decoder k, v caches by performing split_heads in transformer.py
* Update README.md (#1516)
Change BFloat16 model description to point to new frozen graph.
* make llama training max-step flexible (#1563)
* Fix drops logic to be avoided if any workload test fails (#1564)
* Create selective PR validations tags-based (#1549)
* Create selective PR validations tags-based
* Add edit_pull_request as trigger for the PR validations
---------
Co-authored-by: Wafaa Taie <wafaa.s.taie@intel.com>
* PYT CPU Automation (#1544)
* make initial changes
* add tests for new base container
* add more new tests
* remove env var
* add more tests
* more test added
* add dlrm inference build and test
* add more tests
* add more tests
* add another model test
* add final tests
* add devcatalog (#1566)
* Change success condition from previous jobs before doing drop (#1568)
* Added readme for DLRM pytorch MAX series (#1570)
* update docs for AI Tools (#1567)
* ViT Train : Enable multi instance training for Tensorflow Vision Transformer model (#1569)
* Revert "Restrict vit training to single socket (#1484)"
This reverts commit 8d4eb7a.
* Add multi instance support
* Update README.md
* Remove useless files and add license title for DLRM v2 (#1574)
* Gda/step url (#1560)
* Added step url to result table
* Remove continue on error from workload tests
* Run performance checks even when a workload test failed
* add driver setup doc (#1571)
Co-authored-by: Jitendra Patil <jitendra.patil@intel.com>
* LLM models using ipex.optimize_transformers for bf16/int8 (#1562)
* init pr
* revise v1, local test passed
* TF ResNet50v1.5 Fix (#1575)
* tf cpu r50 inf fixed
* actions.json added
* newline added for actions
* venv installation added
* pip install fixed
* venv pip install fixed
* test workflow reverted back
---------
Co-authored-by: Jitendra Patil <jitendra.patil@intel.com>
* added int8 support for graphsage (#1536)
* MEMREC DLRM Inference (#1547)
* Added omp_num_threads and cores_per_instance as env variable (#1573)
* resnet50v1.5 and bert large inference models
* TF Stable Diffusion: Download model files in start.sh and log latency & throughput (#1581)
* adding model download to start.sh to avoid failure during multi-instance execution
* downloading the clip tokenizer in start.sh also
* script changes to report both latency and throughput
* Doc fixes (#1582)
* add torchrec dlrm to the models table
* fix paths to run quickstart scripts
* Add RN50 INT8 Calibration file (#1545)
* Jupyter notebook for AI Reference models (#1583)
* Added jupyter notebook for AI Reference models
* Added README for AI_Reference jupyter notebook
* Supports Resnet50 v1.5 and mobilenet v1 inference workloads
* Upgrade Pillow version to 10.0.1 to fix high severity CVEs (#1584)
* remove workflows
* correct_release_tag (#1587)
* correct_release_tag
* revert a change
* Unexpose old TF models (#1593)
* tf cpu distilbert inf (#1612)
* 3D Unet MLPerf Inference Workload (#1595)
* 3D Unet MLPerf added
* docker compose added
* batch size corrected
* numactl added
* ubuntu Dockerfile updated
* output dir changed in tests
* yes flag added in Dockerfile
* default OS added
* BERT Large Inference CPU Workload Added (#1594)
* BERT Large Inf added
* TCMALLOC added
* ubuntu Dockerfile updated
* TCMalloc location updated
* test file updated
* yes flag added in Dockerfile
* TF CPU Bert Large Training Workload (#1596)
* bert large pretraining added
* extra OS removed from r50 inf service
* ubuntu Dockerfile updated
* ssh helper script added
* yum non-interactive update
* output dir fixed
* TF CPU DIEN Inference Workload (#1598)
* TF CPU MobileNet V1 Inference Workload (#1600)
* TF CPU ResNet v1.5 Training Workload (#1604)
* TF CPU SSD ResNet-34 Inference Workload (#1606)
* TF CPU SSD ResNet-34 Training Workload (#1607)
* TF CPU Transformer MLPerf Inference Workload (#1597)
* TF CPU Transformer MLPerf Training (#1608)
* TF CPU DistilBERT fixed (#1629)
* TF CPU SSD MobileNet Inference Workload (#1605)
* Fixed typo in readme for framework (#1631)
* Checkpoints added for TF CPU Workloads (#1637)
* TF CPU Dev Catalog READMEs Updated. (#1652)
* EMR PYT RN50 Infer (#1624)
---------
Co-authored-by: Jitendra Patil <jitendra.patil@intel.com>
* EMR PYT RN50 Train (#1625)
* build rn50 train centos
* comment conda lines
* comment conda lines
* remove fp16 test,add devcat and intel-openmp
* add changes to ubuntu
* remove commented lines
* add cpu to tag
* EMR PYT ResNext Infer (#1619)
* add initial commits for emr resnext
* add dockerfiles
* build resnext
* remove extra precisions
* add devcat,openmp and more tests
* add cpu to tag
* EMR PYT MaskRCNN Infer (#1621)
* add maskrcnn inference
* add cmake and bf32 tests
* add openmpi,more tests and devcat
* add cpu to tag
* rearrange pip installs
* EMR PYT MaskRCNN Train (#1622)
* build maskrcnn training
* add cmake
* correct image names
* add ld_preload,devcat and tests changes
* EMR PYT SSD-ResNet34 Train (#1618)
* add compose changes
* ssd-resnet34 build
* correct tests file
* add more tests and devcat
* EMR PYT SSD-ResNet34 Infer (#1617)
* build ssd-resnet34 images
* add bf32 tests and update DEVCATALOG.md
* rename devcatalogs (#1671)
* EMR PYT BERT Large Infer (#1623)
* add bert-large build
* correct paths
* add devcatalog
* add more tests
* Rename EMR_DEVCATALOG.md to DEVCATALOG.md
* Update DEVCATALOG.md
* PYT EMR BERT-Large Train (#1639)
* build bert-large training
* add pretrained model env
* remove idsid
* add more tests and devcatalog
* correct env and rename
* Delete EMR_DEVCATALOG.md
* Update DEVCATALOG.md
* EMR PYT Distilbert Infer (#1620)
* build distilbert images
* validate distilbert
* add more tests and devcatalog
* remove MZ reference
* modify env params
* uncomment and remove idsid
* clarify core per instance
* clarify hf_datasets
* remove void env var
* Rename EMR_DEVCATALOG.md to DEVCATALOG.md
* Update DEVCATALOG.md
* EMR PYT RNNT Inference (#1616)
* add dockerfiles for rnnt
* fix pytorch binding error
* copy diff file to inference
* add librosa
* add more tests and devcatalog
* correct realtime cmd
* Rename EMR_DEVCATALOG.md to DEVCATALOG.md
* Update DEVCATALOG.md
* EMR PYT RNNT Train (#1615)
* Rename EMR_DEVCATALOG.md to DEVCATALOG.md
* Update DEVCATALOG.md
* EMR PYT DLRM Infer (#1626)
* Update DEVCATALOG.md
* EMR PYT DLRM Train (#1628)
* build dlrm training
* add num_batch
* add tcmalloc
* add tcmalloc
* add devcatalog
* re-locate the file
* Rename EMR_DEVCATALOG.md to DEVCATALOG.md
* Update DEVCATALOG.md
* make batch flexible (#1635)
* Change dataset for Transfer Learning LLM Notebook (#1576)
* update llm notebook with code alpaca
* push updates
* fixed broken link
* Refactor Transfer Learning Notebook folder to match TLT structure + small diff (#1670)
* refactor to match TL structure + small diff
* fixed table structure
* add landing page doc (#1653)
* add landing page doc
* simplify and add r3.1
* add precisions
* add tf landing page
* add precisions
---------
Co-authored-by: Jitendra Patil <jitendra.patil@intel.com>
* r3.1 fixes (#1679)
* TF CPU ResNet 50 v1.5 Inference Model Checkpoints fixed (#1663)
* spr removed from workdir
* R50 Inf fixed
* fixed rn50 error
* fixed in docker compose yaml
---------
Co-authored-by: Sharvil Shah <sharvil.shah@intel.com>
* make minor corrections in devcatalog README (#1680)
* Remove old TF models (#1673)
* remove ResNet50, FasterRCNN, RFCN, NCF, Wide and deep Large dataset training, waveNet, Inception v4, mlperf GNMT models
* remove relevant unit tests and update coverage percentage
* refine changes based on feedback (#1684)
* fix typo (#1686)
* fixing docker images names
* fixing TF centos docker images link
* unset KMP AFFINITY for accuracy scripts (#1689)
* Bump mlflow in /datasets/cloud_data_connector/samples/azure (#1698)
Bumps [mlflow](https://github.com/mlflow/mlflow) from 2.6.0 to 2.8.1.
- [Release notes](https://github.com/mlflow/mlflow/releases)
- [Changelog](https://github.com/mlflow/mlflow/blob/master/CHANGELOG.md)
- [Commits](mlflow/mlflow@v2.6.0...v2.8.1)
---
updated-dependencies:
- dependency-name: mlflow
dependency-type: direct:production
* Bump mlflow in /datasets/cloud_data_connector/samples/interoperability (#1697)
Bumps [mlflow](https://github.com/mlflow/mlflow) from 2.6.0 to 2.8.1.
- [Release notes](https://github.com/mlflow/mlflow/releases)
- [Changelog](https://github.com/mlflow/mlflow/blob/master/CHANGELOG.md)
- [Commits](mlflow/mlflow@v2.6.0...v2.8.1)
---
updated-dependencies:
- dependency-name: mlflow
dependency-type: direct:production
* add optional args to devcatalog pages (#1750)
* add optional env for tf
* add optional args
* add v2 to table (#1762)
* remove space
* Corrected updating OMP NUM THREADS (#1759)
* validate omp_num_threads and cores_per_instance
* revert changes
* validate omp num threads and cores per instance (#1789)
* add omp_num_threads and cores_per_instance (#1809)
* wsl2 documentation (#1815)
* add wsl2 stable diffusion documentation
* add wsl2 stable diffusion documentation
* add wsl2 base doc
* make minor tabular changes
* make minor tabular changes
* add ssh instructions
* re-word example
* update torch version outputs
* add entry in main readme
---------
Co-authored-by: Jitendra Patil <jitendra.patil@intel.com>
* fix dlrm normal training to support SNC mode (#1699)
Co-authored-by: Weizhuo Zhang <weizhuo.zhang@intel.com>
* update torch-ccl branch (#1716)
* Pytorch ResNext32x16d baremetal EMR tests (#1640)
* Pytorch RN50 baremetal EMR inference and training tests (#1641)
* PyTorch SSD-Resnet34 EMR baremetal training and inference tests (#1647)
* PyTorch distilBERT baremetal tests (#1650)
* PyTorch BERT_LARGE_SQUAD inf baremetal tests (#1651)
* PyTorch BERT_LARGE Training baremetal tests (#1654)
* Pytorch MaskRCNN baremetal EMR tests (#1646)
* PyTorch DLRM baremetal tests (#1649)
* PyTorch RNN-T EMR baremetal tests (#1648)
* remove TF yolo v5, add cpu stable diffusion
* [Zero-BKC][ITEX][GPU]add itex stable diffusion,EfficientNet,wide and deep inference for ats-m (#1642)
Co-authored-by: XumingGai <xuming.gai@intel.com>
* remove pb files from the github repo (#1730)
* modify rn50 training script (#1732)
* Refactor new zero bkcs scripts for TF ResNet50 inf and Mask-RCNN to models_v2 (#1695)
* move new zero bkcs for TF resnet50 inf and maskrcnn to models_v2
* GHA tests for dGPU zero copy BKC format workloads (#1744)
---------
Co-authored-by: Mahathi Vatsal <mahathi.vatsal.salopanthula@intel.com>
* modify Stable Diffusion finetune script (#1746)
* add utilities to parse result for pytorch (#1729)
* Wliao2/add rn50 (#1693)
* add dgpu resnet 50
* update intel tf version to be the latest (#1748)
* Enable Inductor path for Bert_large inference and training (#1733)
* Init Bert-large files from inductor path
* cherry pick Enable int8-mixed-bf16 for 5 transformer models (#1720)
* modify README
* Enable Inductor path for Distilbert-base inference (#1734)
* Init Bert-large files from inductor path
* cherry pick Enable int8-mixed-bf16 for 5 transformer models (#1720)
* modify README
* Init Distilbert base models
* Init DLRM Script (#1739)
* Enable Inductor path for RN50 inference and training (#1718)
* Enable Inductor path for RN50 inference and training
* add bf32
* add README for Torch inductor
---------
Co-authored-by: leslie-fang-intel <leslie.fang@intel.com>
* Move and add license headers P1 ATS-M ITEX models (#1754)
* move and add license headers to stable diffusion model
* move and add license headers to efficientnet
* update maskrcnn inference
* move and update license headers for wide and deep model
* Remove MLFlow dependency. Updates on functional tests (#1756)
* add v2 to table (#1762)
* Weizhuoz/fix bert accuracy (#1761)
* fix bert-large accuracy read issue
* fix bert_large accuracy issue
* inductor int8 could not use model.eval()
* remove ssdmobilenet and yolov4 (#1757)
* update maskrcnn, bert-large training (#1666)
* update maskrcnn training
* update maskrcnn and add bert-large
* move maskrcnn training to model_v2, update license
* code review changes for bert large training
---------
Co-authored-by: Wafaa Taie <wafaa.s.taie@intel.com>
* update mlflow version to use the latest (#1768)
* [Zero-BKC][ITEX][GPU] Add resnet50 and 3d-unet training (#1664)
* add 3d-unet gpu training
* add gpu resnet50 training
* update license headers and move scripts to models_v2
* code review changes for 3d-unet training, update license, move to models_v2
* changes in readme for code review
---------
Co-authored-by: Wafaa Taie <wafaa.s.taie@intel.com>
* enable distributed training for DLRMv2 and some fix for inductor path (#1770)
* enable distributed training for DLRMv2 and some fix for inductor path
* add missing files
* remove license headers from .txt files in data-connector (#1774)
* fix data loader (#1780)
* Fix for TL Notebooks GHA (#1773)
* fixed file paths
* changed venv creation
* trying venv
* trying no venv
* reverted venv3
* testing apt get update
* added apt-get install
* venv3 --> venv
* uncommented apt
* without virtualenv
* added pip install virtualenv
* downgraded PyYaml to be compatible with tf models official
* 2.12.0 --> 2.12.1
* removed package versions
* added back tf official version
* flipped
* addressed review comments
* Fix inductor path int8 bf16 realtime issue (#1767)
* Update inference_performance.sh (#1784)
* fix acc (#1798)
* add warm up iter for inference throughput (#1799)
* Predownload weights for SDv2.1 (#1797)
* predownload weights for SDv2.1
* update hash
* fix DLRM V1 syntax error (#1760)
* fix DLRM V1 syntax error
* fix dlrm inductor numpy.bool_ error (#1448)
* fix DLRM V1 train_ld issue
* Update dlrm_s_pytorch.py
---------
Co-authored-by: Chunyuan WU <chunyuan.wu@intel.com>
Co-authored-by: zhuhaozhe <haozhe.zhu@intel.com>
* Fix bert large inductor int8 accuracy failure in last batch (#1803)
* fix bert large inductor int8 accuracy issue
* Format fixes
---------
Co-authored-by: jianan-gu <jianan.gu@intel.com>
* add bert (#1701)
* add bert IPEX
Co-authored-by: Wafaa Taie <wafaa.s.taie@intel.com>
Co-authored-by: Mahathi Vatsal <mahathi.vatsal.salopanthula@intel.com>
* Wliao2/add dlrm kaggle (#1711)
* add dlrm kaggle
* fix license issue
* Update README.md
* remove unused files
* update
* add licence header for modified files
---------
Co-authored-by: Mahathi Vatsal <mahathi.vatsal.salopanthula@intel.com>
* update bert-large for ARC (#1816)
* update scripts and Readme for ARC
* Update README.md
* Update quickstart scripts with env variables for Stable Diffusion and ResNet50v1.5 (#1791)
* adding env vars for SD and RN50
* updating accuracy quickstart and other minor changes
* using default values only if the env vars are not set from the cmd line
* fix for coverage tests
* Added GHA for ITEX wide deep large (#1820)
* Added GHA for ITEX wide deep large
* Added stable diffusion inference ITEX (#1819)
* Added stable diffusion inference
* Changed file permissions
* Added GHA for EfficientNet ITEX (#1821)
* Added GHA for EfficientNet ITEX
* Update run_test.sh
* Update setup.sh
* Update README.md
* Added GHA for ITEX bert large Training (#1822)
* ssdmobilenet int8 accuracy fix (#1811)
* ssdmobilenet int8 accuracy fix
* added change in quickstart accuracy script
* modified public bucket link
* modified new args in unit test
* fixed unit test
* add BKC for DLRM-V2 convergence test (#1824)
* wsl2 documentation (#1815)
* add wsl2 stable diffusion documentation
* add wsl2 stable diffusion documentation
* add wsl2 base doc
* make minor tabular changes
* make minor tabular changes
* add ssh instructions
* re-word example
* update torch version outputs
* add entry in main readme
---------
Co-authored-by: Jitendra Patil <jitendra.patil@intel.com>
* bert-large inductor uses int8_bf16 mix (#1792)
* Bert-large inductor use int8-bf16 mix
* ipex uses int8_bf16 mix in if
* merge develop
* inductor uses int8-bf16 mix
* Distilbert int8 optimization (#1830)
* optimize distilbert int8
* re-calibrate distilbert
* fix for inductor (#1834)
* TF- DistilBERT - Update quickstart scripts with env vars (#1818)
* Add Env variables to quickstart scripts
* Update # of cores for throughput script
* add distilbert (#1702)
* add distilbert
* Corrected refactored path for distilbert
* Added intel license header
* Update README.md
---------
Co-authored-by: Mahathi Vatsal <mahathi.vatsal.salopanthula@intel.com>
* Update dlrm_s_pytorch.py (#1843)
* Wliao2/add stable diffusion (#1705)
* add stable_diffusion
* update some typo
* fix license issue
* update stable diffusion
* update for acc
* verify the result
* Refactored to new folder
* Update README.md
* add support for ARC
---------
Co-authored-by: Mahathi Vatsal <mahathi.vatsal.salopanthula@intel.com>
* Fix Bert Large Int8 Latency Issue (#1859)
Co-authored-by: jianan-gu <jianan.gu@intel.com>
* [DistilBert] modify for masked_fill default value (#1868)
* Nhatle/bert large training x3 vs x1 (#1776)
* set num-inter-threads=2
* bert-large squad: Binding process to cores on 1 socket
* Enable multi-instance training for bert-large squad
* Fix in case users only run 1 instance
* Fix benchmark_command
* Molly/ddp bkc update (#1873)
* make num_iter flexible
* bugfix for bert-large ddp
* bkc for rn50 ddp training update
* bkc for rn50 ddp training update
* bkc for dlrm_v1 ddp training update
---------
Co-authored-by: WeizhuoZhang-intel <weizhuo.zhang@intel.com>
* Corrected IPEX installer version (#1878)
* Changed IPEX installer version
* Update dlrm_s_pytorch.py (#1879)
* Update AI Bundle version in tests setup files for CI/CD pipeline (#1881)
* doc: document models_v2 contribution guideline (#1855)
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
* Molly/inductor fp16 (#1875)
* make num_iter flexible
* bugfix for bert-large ddp
* bkc for rn50 ddp training update
* bkc for rn50 ddp training update
* bkc for dlrm_v1 ddp training update
* rn50 fp16 torch.compile enabled
* fp16 autocast fix
* fix RN50 for fp16 torch.compile (#1849)
* enable stable-diffusion fp16 inductor path
* vit, bert-large fp16 enable
* merge to latest transformers patch
* Update enable_ipex_for_transformers.diff
* Update enable_ipex_for_transformers.diff
---------
Co-authored-by: Cao E <e.cao@intel.com>
Co-authored-by: WeizhuoZhang-intel <weizhuo.zhang@intel.com>
* Fix in case mpi_num_processes_per_socket=1 (#1885)
* Fix in case mpi_num_processes_per_socket=1
* small fix
* Update dlrm_s_pytorch.py (#1890)
* Modified dataset path (#1894)
* added GHA for ITEX bert large
* Stable Diffusion PYT Flex and Max (#1853)
* validate sd pyt
* add max tests and dockerfile
* Bump scipy in /models_v2/pytorch/stable_diffusion/inference/gpu (#1847)
Bumps [scipy](https://github.com/scipy/scipy) from 1.9.1 to 1.11.1.
- [Release notes](https://github.com/scipy/scipy/releases)
- [Commits](scipy/scipy@v1.9.1...v1.11.1)
---
updated-dependencies:
- dependency-name: scipy
dependency-type: direct:production
* Bump gitpython in /models_v2/pytorch/distilbert/inference/gpu (#1846)
Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.30 to 3.1.41.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases)
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES)
- [Commits](gitpython-developers/GitPython@3.1.30...3.1.41)
---
updated-dependencies:
- dependency-name: gitpython
dependency-type: direct:production
* Bump transformers in /models_v2/pytorch/distilbert/inference/gpu (#1840)
Bumps [transformers](https://github.com/huggingface/transformers) from 4.25.1 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.25.1...v4.36.0)
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
* Bump transformers in /models_v2/pytorch/bert_large/inference/gpu (#1810)
Bumps [transformers](https://github.com/huggingface/transformers) from 4.11.0 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.11.0...v4.36.0)
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
* Bump transformers (#1786)
Bumps [transformers](https://github.com/huggingface/transformers) from 4.30.0 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.30.0...v4.36.0)
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
* Bump transformers (#1785)
Bumps [transformers](https://github.com/huggingface/transformers) from 4.30.0 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.30.0...v4.36.0)
---
updated-dependencies:
- dependency-name: transformers
dependency-type: direct:production
* Added GHA tests for stable diffusion (#1887)
* validate sd pyt
* update for ARC (#1781)
* update for ARC
* update log
* refactor to models_v2
* update path due to refactor
* sync with 2.1rc3
* Wliao2/add dlrm (#1704)
* add dlrm v2
* fix license issue
* Update dlrm_dataloader.py
* Update dist_models.py
* Update dlrm_dataloader.py
* Update dist_models.py
* refactored to a new folder
* update Readme
---------
Co-authored-by: Mahathi Vatsal <mahathi.vatsal.salopanthula@intel.com>
* Max 3D-Unet container support (#1832)
* add maskrcnn container support
* add 3d-unet container support
* Added GHA for 3d Unet Training ITEX (#1823)
* Added Resnet50v1.5 and maskrcnn train GHA test (#1751)
* Refactored resnet50v1_5 for Zero copy BKC format (#1897)
* Added Necessary Metadata and Bug Fixes for Transfer Learning Notebooks (#1825)
* fixed file paths
* changed venv creation
* added pip install virtualenv
* downgraded PyYaml to be compatible with tf models official
* 2.12.0 --> 2.12.1
* added back tf official version
* addressed review comments
* fixed version for fsspec, removed llm test
* changed accelerate version
* fix for tf-models-official
* added needed metadata
* made tests significantly less expensive
* fixed zip extract to tar extract
* fixed sms download
* fixed typo for csv path name
* addressed review comments
* simplified if statement
* Added oneapi path (#1902)
* validate sd pyt
* Update README.md with ipex version (#1903)
* Update README.md with ipex version
* Max MaskRCNN container support (#1831)
* add maskrcnn container support
* Max RN50 container validation (#1829)
* validate container for zero-bkc for rn50 max container
* Max BERT-Large container support (#1833)
* add bert-large container support
* Flex Wide and deep container (#1851)
* validate zero-copy bkc for itex stable diffusion
* validate zero-copy bkc for flex container (#1772)
* validate zero-copy bkc
* EfficientNet Container for flex (#1771)
* validate zero-copy bkc efficientnet
* TF MaskRCNN container for Flex GPU (#1755)
* adapt zero-copy bkc and validate maskrcnn
* validate bert-large inference PYT PVC (#1841)
* validate bert-large inference
* validate bert-large container PVC pytorch (#1838)
* validate bert-large container
* RN50 PYT Max container (#1904)
* validate refactor of zero-bkc training
* Latest updates to TF RN50 for Flex series (#1813)
* adapt zero-copy bkc for image build
* Update README.md for IPEX versions (#1907)
* Update README.md
* not cast and randomized crossnet bias for inductor and make warmup iters as an arg (#1906)
* resolve merge conflicts (#1911)
* Wliao2/add ssdmbv1 (#1817)
* add ssd-mobilenetv1
Co-authored-by: Mahathi Vatsal <mahathi.vatsal.salopanthula@intel.com>
* add IPEX Max 3dunet (#1706)
* add 3dunet for IPEX for 3DUnet for Max series
* Updated baremetal readme for distil bert IPEX (#1908)
* Updated baremetal readme distil bert for IPEX
* DistilBERT inference container PYT Flex and Max (#1854)
* add functional support
* docker: fix broken docker-compose.yml (#1913)
Fixes: 473d3b3 ("DistilBERT inference container PYT Flex and Max (#1854)")
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
* docker/flex: fix build and run for tf maskrcnn (#1896)
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
* Flex PYT DLRM-v1 inference (#1895)
* build dlrmv1 container
* Updated baremetal readme for DLRM v1 (#1909)
* Updated baremetal readme for DLRM v1
* Update README.md (#1914)
* remove extra test (#1916)
* release docs for containers (#1915)
* Update release container table
* Updated main README table (#1919)
* Updated main README table
* clean up workflows
* restore git submodules
---------
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Co-authored-by: jianan-gu <jianan.gu@intel.com>
Co-authored-by: zhuhaozhe <haozhe.zhu@intel.com>
Co-authored-by: Om Thakkar <om.thakkar@intel.com>
Co-authored-by: sachinmuradi <sachin.muradi@intel.com>
Co-authored-by: Cao E <e.cao@intel.com>
Co-authored-by: mahathis <36486206+Mahathi-Vatsal@users.noreply.github.com>
Co-authored-by: lerealno <112975902+lerealno@users.noreply.github.com>
Co-authored-by: DiweiSun <105627594+DiweiSun@users.noreply.github.com>
Co-authored-by: zengxian <xiangdong.zeng@intel.com>
Co-authored-by: Weizhuo Zhang <weizhuo.zhang@intel.com>
Co-authored-by: Jitendra Patil <jitendra.patil@intel.com>
Co-authored-by: Srikanth Ramakrishna <srikanth.ramakrishna@intel.com>
Co-authored-by: Mahmoud Abuzaina <mahmoud.abuzaina@intel.com>
Co-authored-by: Tyler Titsworth <tyler.titsworth@intel.com>
Co-authored-by: jiayisunx <jiayi.sun@intel.com>
Co-authored-by: zofia <110436990+zufangzhu@users.noreply.github.com>
Co-authored-by: Mahathi Vatsal <mahathi.vatsal.salopanthula@intel.com>
Co-authored-by: okhleif-IL <87550612+okhleif-IL@users.noreply.github.com>
Co-authored-by: Harsha Ramayanam <harsha.ramayanam@intel.com>
Co-authored-by: Clayne Robison <clayne.b.robison@intel.com>
Co-authored-by: jianyizh <jianyi.zhang@intel.com>
Co-authored-by: nhatle <105756286+nhatleSummer22@users.noreply.github.com>
Co-authored-by: gera-aldama <111396864+gera-aldama@users.noreply.github.com>
Co-authored-by: Real Novo, Luis <luis.real.novo@intel.com>
Co-authored-by: Sharvil Shah <shahsharvil96@gmail.com>
Co-authored-by: Ashiq Imran <ashiq.imran@intel.com>
Co-authored-by: Gopi Krishna Jha <96072995+gopikrishnajha@users.noreply.github.com>
Co-authored-by: leslie-fang-intel <leslie.fang@intel.com>
Co-authored-by: Sharvil Shah <sharvil.shah@intel.com>
Co-authored-by: Nick Camarena <91098672+nickcama@users.noreply.github.com>
Co-authored-by: xiangdong <40376367+zxd1997066@users.noreply.github.com>
Co-authored-by: wenjun liu <wenjun.liu@intel.com>
Co-authored-by: XumingGai <xuming.gai@intel.com>
Co-authored-by: wincent8 <wei.liao@intel.com>
Co-authored-by: Jesus Herrera Ledon <110855758+j3su5pro-intel@users.noreply.github.com>
Co-authored-by: XumingGai <108659240+XumingGai@users.noreply.github.com>
Co-authored-by: Chunyuan WU <chunyuan.wu@intel.com>
Co-authored-by: Syed Shahbaaz Ahmed <syed.shahbaaz.ahmed@intel.com>
Co-authored-by: Xuan Liao <xuan.liao@intel.com>
Co-authored-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
1 parent a36b1c2 commit 36d8b40
1,048 files changed
Lines changed: 68017 additions & 25603 deletions
File tree
- benchmarks
- common
- tensorflow
- diffusion
- tensorflow
- stable_diffusion
- inference
- bfloat16
- fp16
- fp32
- image_recognition/tensorflow/inceptionv4
- inference
- fp32
- int8
- language_modeling/tensorflow/bert_large/training
- bfloat16
- fp16
- fp32
- language_translation/tensorflow/mlperf_gnmt
- inference
- fp32
- object_detection/tensorflow
- faster_rcnn
- inference
- fp32
- int8
- rfcn
- inference
- fp32
- int8
- ssd-mobilenet/inference
- int8
- recommendation/tensorflow
- ncf
- inference
- fp32
- training
- bfloat16
- fp32
- wide_deep_large_ds/training
- fp32
- text_to_speech
- tensorflow
- wavenet
- inference
- fp32
- datasets/cloud_data_connector
- cloud_data_connector/int
- docker
- docker
- flex-gpu
- pytorch-distilbert-inference
- pytorch-dlrmv1-inference
- pytorch-stable-diffusion-inference
- tf-efficientnet-inference
- tf-maskrcnn-inference
- tf-resnet50v1-5-inference
- tf-stable-diffusion-inference
- tf-wide-deep-large-inference
- max-gpu
- pytorch-bert-large-inference
- pytorch-bert-large-training
- pytorch-distilbert-inference
- pytorch-resnet50v1-5-inference
- pytorch-resnet50v1-5-training
- pytorch-stable-diffusion-inference
- tf-3d-unet-training
- tf-bert-large-training
- tf-maskrcnn-training
- tf-resnet50v1-5-training
- pyt-cpu
- bert-large-inference
- bert-large-training
- distilbert-inference
- dlrm-inference
- dlrm-training
- maskrcnn-inference
- maskrcnn-training
- resnet50-inference
- resnet50-training
- resnext-32x16d-inference
- rnnt-inference
- rnnt-training
- ssd-resnet34-inference
- ssd-resnet34-training
- tf-cpu
- tf-3d-unet-mlperf-inference
- tf-bert-large-inference
- tf-bert-large-pretraining
- tf-dien-inference
- tf-distilbert-inference
- tf-mobilenet-v1-inference
- tf-resnet50v1-5-inference
- tf-resnet50v1-5-training
- tf-ssd-mobilenet-inference
- tf-ssd-resnet34-inference
- tf-ssd-resnet34-training
- tf-transformer-mlperf-inference
- tf-transformer-mlperf-training
- docs
- general
- pytorch
- notebooks/transfer_learning
- image_classification
- huggingface_image_classification
- pytorch_image_classification
- tf_image_classification
- object_detection/pytorch_object_detection
- pytorch_text_classification
- pytorch_text_generation
- question_answering
- text_classification
- bert_classifier_fine_tuning
- pytorch_text_classification
- tfhub_bert_text_classification
- text_generation/pytorch_text_generation
- models_v2
- common
- pytorch
- 3d_unet/inference/gpu
- 3d-unet
- bert_large
- inference/gpu
- training/gpu
- distilbert/inference/gpu
- demo-data
- training_configs
- dlrm/inference/gpu
- bench
- optim
- tricks
- resnet50v1_5
- inference/gpu
- training/gpu
- ssd-mobilenetv1/inference/gpu
- vision
- datasets
- nn
- ssd
- config
- transforms
- utils
- stable_diffusion/inference/gpu
- torchrec_dlrm
- inference/gpu
- data
- scripts
- sharding
- training/gpu
- data
- scripts
- sharding
- tensorflow
- 3d_unet/training/gpu
- bert_large/training/gpu
- efficientnet/inference/gpu
- maskrcnn
- inference/gpu
- training/gpu
- resnet50v1_5
- inference/gpu
- training
- gpu
- configure
- hvd_configure
- stable_diffusion/inference/gpu
- wide_deep_large_ds/inference/gpu
- models
- common/pytorch
- common
- consts
- patterns
- ddp
- inf
- train
- por
- inf
- train
- diffusion
- pytorch/stable_diffusion
- tensorflow
- stable_diffusion
- inference
- image_recognition
- pytorch/common
- tensorflow/inceptionv4/inference
- image_segmentation/tensorflow/maskrcnn/inference/gpu
- language_modeling
- pytorch
- bert_large
- inference/gpu
- training
- common
- tensorflow/distilbert_base/inference
- language_translation/tensorflow/mlperf_gnmt/fp32
- object_detection/tensorflow
- faster_rcnn
- inference
- fp32
- int8
- rfcn
- inference
- fp32
- int8
- ssd-mobilenet/inference
- int8
- recommendation
- pytorch
- dlrm/product
- torchrec_dlrm
- data_process
- ipex_optimized_model
- tensorflow
- ncf
- inference
- fp32
- training
- wide_deep_large_ds/training
- quickstart
- diffusion
- pytorch/stable_diffusion
- inference/cpu
- training/cpu
- tensorflow/stable_diffusion/inference/cpu
- generative-ai/pytorch/stable_diffusion/inference/gpu
- image_recognition
- pytorch
- resnet50
- inference/cpu
- bfloat16
- .docs
- fp32
- .docs
- training/cpu
- resnext-32x16d/inference/cpu
- tensorflow
- densenet169/inference/cpu/.docs
- inceptionv3/inference/cpu/.docs
- inceptionv4/inference/cpu
- .docs
- mobilenet_v1/inference/cpu
- resnet101/inference/cpu/.docs
- resnet50v1_5
- inference/cpu
- .docs
- training/cpu
- .docs
- resnet50/inference/cpu
- .docs
- image_segmentation/tensorflow/3d_unet_mlperf/inference/cpu
- language_modeling
- pytorch
- bert_large
- inference/cpu
- .docs
- training/cpu
- .docs
- distilbert_base/inference/cpu
- rnnt
- inference/cpu
- .docs
- training/cpu
- .docs
- tensorflow
- bert_large
- inference/cpu
- training/cpu
- distilbert_base/inference/cpu
- language_translation/tensorflow
- mlperf_gnmt/inference/cpu
- .docs
- transformer_lt_official/inference/cpu/.docs
- transformer_mlperf
- inference/cpu
- training/cpu
- object_detection
- pytorch
- maskrcnn
- inference/cpu
- training/cpu
- ssd-mobilenet/inference/gpu
- .docs
- ssd-resnet34
- inference/cpu
- training/cpu
- yolov4/inference/gpu
- .docs
- tensorflow
- faster_rcnn/inference/cpu
- fp32
- .docs
- int8
- .docs
- rfcn/inference/cpu
- .docs
- ssd-mobilenet/inference/cpu
- ssd-resnet34
- inference/cpu
- training/cpu
- recommendation
- pytorch
- dlrm
- inference/cpu
- .docs
- training/cpu
- .docs
- bfloat16
- .docs
- torchrec_dlrm
- inference/cpu
- training/cpu
- tensorflow
- dien/inference/cpu
- ncf/inference/cpu/fp32
- .docs
- wide_deep_large_ds
- inference/cpu
- .docs
- training/cpu
- .docs
- wide_deep/inference/cpu/.docs
- spr_base
- pytorch
- .docs
- tensorflow
- .docs
- text_to_speech/tensorflow/wavenet/inference/cpu/fp32
- .docs
- tests
- cicd
- IPEX-XPU
- stable_diffusion-inference
- ITEX-XPU
- 3d_unet-training
- bert_large-training
- efficientnet-inference
- maskrcnn-inference
- maskrcnn-training
- resnet50v1_5-inference
- resnet50v1_5-training
- stable_diffusion-inference
- wide_deep_large_ds-inference
- PyTorch
- bert-large-inference
- bert-large-training
- distilbert-inference
- dlrm-inference
- dlrm-training
- maskrcnn-inference
- maskrcnn-training
- resnet50-inference
- resnet50-training
- resnext32x16d-inference
- rnnt-inference
- rnnt-training
- ssd-resnet34-inference
- ssd-resnet34-training
- TensorFlow
- unit/common/tensorflow/tf_model_args