A guided tour on how to install optimized pytorch and optionally Apple's new MLX and Google's JAX on Apple Silicon Macs, and how to use HuggingFace large language models for your own experiments. Apple Silicon Macs show good performance for many machine learning tasks.
This guide was updated to Version 4: the main change is the use of the `uv` package manager. The previous version 3, which uses Python's standard `pip` package manager and explicit `venv` management, is still available here.
We will perform the following steps:

- Install `homebrew`
- Install `pytorch` with MPS (Metal Performance Shaders) support, which uses Apple Silicon GPUs
- Install Apple's new `mlx` framework
- Install `JAX` with Apple's Metal drivers (experimental at this point in time (2025-06), and not up-to-date)
- Install `tensorflow` with Apple's pluggable Metal driver optimizations
- Install `jupyter lab` to run notebooks
- Install `huggingface` and run some pre-trained language models using `transformers` and just a few lines of code within Jupyter Lab for a simple chat bot

(skip to 1. Preparations if you know which framework you are going to use)
Tensorflow, JAX, Pytorch, and MLX are deep-learning frameworks that provide the libraries required to perform optimized tensor operations used in training and inference. On a high level, the functionality of all four is equivalent. Huggingface builds on top of any of those frameworks and provides a large library of pretrained models for many different use-cases, ready to use or to customize, plus a number of convenience libraries and sample code to get started easily.
- Pytorch is the most general and currently most widely used deep learning framework. In case of doubt, use Pytorch. It supports many different hardware platforms (including Apple Silicon optimizations).
- JAX is a newer Google framework that is considered, especially by researchers, the better alternative to Tensorflow. It supports GPUs, TPUs, and Apple's Metal framework (still experimental) and is more 'low-level', especially when used without complementary neural-network layers such as flax. JAX on Apple Silicon is still 'exotic', hence for production projects, use Pytorch; for research projects, both JAX and MLX are interesting: MLX has more dynamic development (at this point in time), while JAX supports more hardware platforms (GPUs and TPUs) besides Apple Silicon, but development of the Apple `jax-metal` drivers often lags behind the latest versions of `JAX` and requires the use of old `JAX` versions (see below).
- MLX is Apple's new kid on the block, and thus overall support and documentation is (currently) much more limited than for the other main frameworks. It is beautiful and well designed (they took lessons learned from torch and tensorflow), yet it is closely tied to Apple Silicon. It's currently best for students who have Apple hardware and want to learn or experiment with deep learning. Things you learn with MLX transfer easily to Pytorch, yet be aware that conversion of models and porting of training and inference code is needed in order to deploy whatever you developed into the non-Apple universe.
- corenet is Apple's training library that utilizes PyTorch and the HuggingFace infrastructure, and additionally contains examples of how to migrate models to MLX. See the example: OpenElm (MLX).
- Tensorflow is the 'COBOL' of deep learning, and it has practically been silently EoL'ed by Google. Google themselves publish new models for PyTorch and JAX/Flax, not for Tensorflow. Unless you are forced to use Tensorflow because your organization already uses it, ignore it. If your organization uses TF, make a migration plan! Look at Pytorch for production and JAX for research. Another reason to still look into Tensorflow is embedded applications and Tensorflow's C library.
HuggingFace publishes an Overview of model-support for each framework. Currently, Pytorch is the de facto standard if you want to make use of existing models.
For the (probably too simplified) answer to the question "What's the fastest?", have a look at the Jupyter notebook 02-Benchmarks; once you've completed the installation, you can test your own environment. The notebook allows you to compare the speed of matrix multiplications between the different frameworks. However, the differences between frameworks when performing 'standard' model training or inference tasks will most likely be less pronounced.
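To give a flavor of what such a comparison looks like, here is a minimal timing sketch (not the 02-Benchmarks notebook itself, and assuming Pytorch is already installed as described below) that compares a large matrix multiplication on the CPU with the same operation on the MPS device:

```python
import time
import torch

def bench(device: str, n: int = 2048, repeats: int = 10) -> float:
    """Return the average time in seconds for an n x n matrix multiplication on `device`."""
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    for _ in range(repeats):      # warm-up pass, not timed
        _ = a @ b
    if device == "mps":
        torch.mps.synchronize()   # wait for queued GPU work before starting the clock
    t0 = time.time()
    for _ in range(repeats):
        _ = a @ b
    if device == "mps":
        torch.mps.synchronize()   # make sure all GPU work has finished before stopping the clock
    return (time.time() - t0) / repeats

print("cpu:", bench("cpu"))
if torch.backends.mps.is_available():
    print("mps:", bench("mps"))
```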
If you haven't done so, go to https://brew.sh/ and follow the instructions to install homebrew.
Once done, open a terminal and type brew --version
to check that it is installed correctly.
Now use `brew` to install more recent versions of `python`, `uv`, and `git`. The recommendation is to use Homebrew's Python 3.12, because that is currently the Python version that allows installation of all frameworks. The roadblock for version 3.13 is Tensorflow. Below, we'll install each framework separately using updated Python versions.
Use `python@3.12`, if you care about Tensorflow.
brew install python@3.13 uv git
Apple does not put too much energy into keeping macOS's python up-to-date. If you want to use an up-to-date default python, it makes sense to make homebrew's python the default system python. So, if you want to use homebrew's Python 3.12 or 3.13 in Terminal, the easiest way to do so (after `brew install python@3.13`) is:
Edit `~/.zshrc` and insert (again use `python@3.12`, if you care about Tensorflow):
# This is OPTIONAL and only required if you want to make homebrew's Python 3.13 the global version:
export PATH="/opt/homebrew/opt/python@3.13/bin:$PATH"
export PATH=/opt/homebrew/opt/python@3.13/libexec/bin:$PATH
(Restart your terminal to activate the path changes, or enter source ~/.zshrc
in your current terminal session.)
Now clone this project as a test project:
git clone https://github.com/domschl/HuggingFaceGuidedTourForMac
This clones the test-project into a directory HuggingFaceGuidedTourForMac
Now execute:
cd HuggingFaceGuidedTourForMac
uv sync
source .venv/bin/activate
This will install a virtual environment at `HuggingFaceGuidedTourForMac/.venv` using the python version defined in the project's `.python-version` file, install the dependencies defined in `pyproject.toml`, and finally activate that environment. Have a look at each of those locations and files to get an understanding of what `uv sync` installed. Check out the `uv` documentation for general information on `uv`.
You now have a virtual environment with all of the mentioned deep learning frameworks installed (look at `pyproject.toml` for the installed versions). This is only useful for a first overview; below, we will install each framework separately and step-by-step.
Execute:
uv run jupyter lab 00-SystemCheck.ipynb
This will open a jupyter notebook that will test each of the installed frameworks. Use shift-enter
to execute each notebook cell and verify that all tests complete successfully.
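If you prefer a quick check outside of Jupyter, a minimal sketch along the lines of the notebook's tests (not the notebook itself) is:

```python
# Quick sanity check of the combined environment created by `uv sync` above.
import torch
import mlx.core as mx
import jax
import tensorflow as tf

print("torch      :", torch.__version__, "| MPS available:", torch.backends.mps.is_available())
print("mlx        :", mx.__version__)
print("jax        :", jax.__version__, "| devices:", jax.devices())
print("tensorflow :", tf.__version__, "| GPUs:", tf.config.list_physical_devices("GPU"))
```

Run it with `uv run python` inside the project directory.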
A very unintuitive property of virtual environments is that, while you enter an environment by activating it in the subdirectory of your project (with `source .venv/bin/activate` or `uv sync`), the `venv` stays active when you leave the project folder and start working on something completely different, until you explicitly deactivate the `venv` with `deactivate`:
deactivate
There are a number of tools that modify the terminal system prompt to display the currently active `venv`, which is very helpful. Check out starship (recommended). Once `starship` is active, your terminal prompt will show the active Python version and the name of the virtual environment.
We will now perform a step-by-step installation for a new `pytorch` project. Check out <https://pytorch.org>, but here we will install Pytorch with `uv`.
Create a new directory for your test project and install Pytorch using the latest Python version:
mkdir torch_test
cd torch_test
uv init --python 3.13
uv venv
uv add torch numpy
source .venv/bin/activate
This creates a new project directory, enters it, initializes a new project using Python 3.13 (which is supported with Apple Metal acceleration), and installs `torch` and `numpy` in a new virtual environment.
We can now start python and enter a short test sequence to verify everything works:
python
Python 3.13.5 (main, Jun 11 2025, 15:36:57) [Clang 17.0.0 (clang-1700.0.13.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.backends.mps.is_available()
True
>>>
If you see `True` as the answer to `torch.backends.mps.is_available()`, Metal acceleration is working ok.
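To actually run a computation on the GPU, create your tensors on the `mps` device; a minimal sketch:

```python
import torch

# Select the MPS device if available, otherwise fall back to the CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

x = torch.randn(1000, 1000, device=device)
y = x @ x.T                # the matrix multiplication runs on the Apple Silicon GPU for 'mps'
print(y.device, y.norm())
```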
Enter code main.py
. This will open the default 'Hello, world' created by uv
.
Change the code to:
```python
import torch


def main():
    print("Hello from torch-test!")
    if torch.backends.mps.is_available():
        print("Excellent! MPS backend is available.")
    else:
        print("MPS backend is not available: Something went wrong! Are you running this on a Mac with Apple Silicon chip?")


if __name__ == "__main__":
    main()
```
Save and exit the editor and run with:
uv run main.py
Deactivate this environment with deactivate
, re-activate it with source .venv/bin/activate
.
In a similar fashion, we now create a new MLX project. Create a new directory for your test project and install MLX using the latest Python version:
mkdir mlx_test
cd mlx_test
uv init --python 3.13
uv venv
uv add mlx
source .venv/bin/activate
Again, start python
and enter:
import mlx.core as mx
print(mx.__version__)
This should print a version, such as `0.26.1` (2025-06).
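As a minimal sketch of what working with MLX looks like (lazy evaluation and automatic differentiation; the function used here is just an example):

```python
import mlx.core as mx

def f(x):
    return mx.sum(mx.square(x))

x = mx.arange(4, dtype=mx.float32)
y = f(x)                # computation is recorded lazily...
mx.eval(y)              # ...and only executed when evaluated
print(y)

grad_f = mx.grad(f)     # automatic differentiation, similar to JAX
print(grad_f(x))        # gradient is 2*x
```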
- Visit the Apple MLX project and especially mlx-examples!
- There is a vibrant MLX community on Huggingface that has ported many nets to MLX: Huggingface MLX-Community
- Apple's new corenet utilizes PyTorch and the HuggingFace infrastructure, and additionally contains examples of how to migrate models to MLX. See the example: OpenElm (MLX).
Deactivate with deactivate
.
JAX is an excellent choice if low-level optimization of algorithms and research beyond the boundaries of established deep-learning algorithms is your focus. Modelled after `numpy`, it supports automatic differentiation of 'everything' (for optimization problems), as well as vectorization and parallelization of python algorithms beyond mere deep learning. To get the functionality that is expected from other deep learning frameworks (layers, training-loop functions and similar 'high-level' constructs), consider installing an additional neural network library such as Flax NNX.
Unfortunately, the JAX Metal drivers have started to lag behind JAX releases, so you need to check the compatibility table for the supported versions of JAX that match the available `jax-metal` drivers. Currently, JAX is therefore pinned to the outdated version 0.4.34. Check the compatibility table for new `jax-metal` releases and updates.
mkdir jax_test
cd jax_test
uv init --python 3.13
uv venv
uv add jax==0.4.34 jax-metal
source .venv/bin/activate
Start python
and enter:
import jax
print(jax.devices()[0])
This should output something like:
Python 3.13.5 (main, Jun 11 2025, 15:36:57) [Clang 17.0.0 (clang-1700.0.13.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import jax
... print(jax.devices()[0])
...
Platform 'METAL' is experimental and not all JAX functionality may be correctly supported!
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1750594202.639458 5239992 mps_client.cc:510] WARNING: JAX Apple GPU support is experimental and not all JAX functionality is correctly supported!
Metal device set to: Apple M2 Max
systemMemory: 32.00 GB
maxCacheSize: 10.67 GB
I0000 00:00:1750594202.655851 5239992 service.cc:145] XLA service 0x600002f0c500 initialized for platform METAL (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1750594202.655998 5239992 service.cc:153] StreamExecutor device (0): Metal, <undefined>
I0000 00:00:1750594202.657356 5239992 mps_client.cc:406] Using Simple allocator.
I0000 00:00:1750594202.657365 5239992 mps_client.cc:384] XLA backend will use up to 22906109952 bytes on device 0 for SimpleAllocator.
METAL:0
Note: `uv` does a good job of resolving version dependencies between JAX and the required Metal drivers. If you plan to use `pip`, you will need to manually verify version compatibility: check the compatibility table for the supported versions of `JAX` that match the available `jax-metal` drivers.
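As a minimal sketch of what JAX itself is about (JIT compilation and automatic differentiation; the toy loss function here is just an example and should work on the Metal backend, although not all operations are supported there):

```python
import jax
import jax.numpy as jnp

@jax.jit                              # compile the function via XLA
def loss(w, x, y):
    pred = x @ w                      # simple linear model
    return jnp.mean((pred - y) ** 2)  # mean squared error

x = jnp.ones((8, 3))
y = jnp.zeros(8)
w = jnp.array([0.5, -0.2, 0.1])

print(loss(w, x, y))
print(jax.grad(loss)(w, x, y))        # gradient of the loss with respect to w
```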
Tensorflow is losing support fast, and not even Google publishes new models for Tensorflow anymore. A migration plan is recommended if you plan to use it.
Tensorflow supports Python 3.12, and the Metal drivers are only available for 3.12.
mkdir tensorflow_test
cd tensorflow_test
uv init --python 3.12
uv venv
uv add tensorflow tensorflow-metal
source .venv/bin/activate
To test that `tensorflow` is installed correctly, open a terminal, type `python`, and within the python shell, enter:
import tensorflow as tf
tf.config.list_physical_devices('GPU')
You should see a GPU-type device:
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
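For a slightly larger check, a minimal sketch that runs a matrix multiplication (with `tensorflow-metal` installed, TensorFlow places this on the GPU device automatically):

```python
import tensorflow as tf

a = tf.random.normal((1024, 1024))
b = tf.random.normal((1024, 1024))
c = tf.matmul(a, b)        # executed on the Metal GPU device if one is listed above
print(c.device, c.shape)
```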
Let's now create a full project and do a first experiment by implementing a chat-bot in a Jupyter notebook:
mkdir chat_test
cd chat_test
uv init --python 3.13
uv venv
uv add torch numpy jupyterlab ipywidgets transformers accelerate "huggingface_hub[cli]"
source .venv/bin/activate
To start Jupyter lab, type:
jupyter lab
Either copy the notebook 01-ChatBot.ipynb
into your project chat_test
, or enter the code into a new notebook:
```python
import torch
from transformers import pipeline

model_name = "Qwen/Qwen2.5-0.5B-Instruct"
pipeline = pipeline(task="text-generation", model=model_name, torch_dtype=torch.bfloat16, device_map="auto")  # or request device_map="mps"

chat = [
    {"role": "system", "content": "You are a super-intelligent assistant"},
]
first = True
while True:
    if first is True:
        first = False
        print("Please press enter (not SHIFT-enter) after your input, enter 'bye' to end:")
    try:
        input_text = input("> ")
        if input_text in ["", "bye", "quit", "exit"]:
            break
        print()
        chat.append({"role": "user", "content": input_text})
        response = pipeline(chat, max_new_tokens=512)
        print(response[0]["generated_text"][-1]["content"])
        chat = response[0]["generated_text"]
        print()
    except KeyboardInterrupt:
        break
```
That's all that is required to build a simple chat-bot with dialog history.
Try to:
- Change the chat model to larger versions: "Qwen/Qwen2.5-3B-Instruct", "meta-llama/Meta-Llama-3-8B-Instruct"
- Check out https://huggingface.co/models?pipeline_tag=text-generation&sort=trending for the latest chat models!
When experimenting with HuggingFace, you will download large models that will be stored in your home directory at `~/.cache/huggingface/hub`. You can remove these models at any time by deleting this directory or parts of its content.
`"huggingface_hub[cli]"` installs the huggingface command-line tools that are sometimes required to download (proprietarily licensed) models.
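Instead of the CLI, you can also use the `huggingface_hub` Python API for logging in and downloading; a minimal sketch (the model name is just an example):

```python
from huggingface_hub import login, snapshot_download

# login() is only needed for gated or proprietary models; it prompts for an access token.
login()

# Download a model snapshot into ~/.cache/huggingface/hub and return its local path.
path = snapshot_download("Qwen/Qwen2.5-0.5B-Instruct")
print(path)
```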
- The fast track to learning how neural networks, and specifically large language models, actually work is Andrej Karpathy's course on Youtube: The spelled-out intro to neural networks and backpropagation: building micrograd. If you know some python and how to multiply a matrix with numpy, this is the course that takes you all the way to being able to build your own large language model from scratch.
- 2025-06-22: (Guide version 4) Version updates and usage of the `uv` package manager. The previous v3 version using `pip` is available at v3.
- 2024-09-10: Version updates for the platforms.
- 2024-07-26: Version updates for the platforms.
- 2024-04-28: Added JAX installation with Metal support and quick-test.
- 2024-04-26: Apple's corenet
- 2024-04-22: Llama 3.
- 2024-02-24: (Guide version 3.0) Updates for Python 3.12 and Apple MLX framework, Tensorflow is legacy-option.
- 2023-12-14: Pin python version of homebrew to 3.11.
- 2023-10-30: Re-tested with macOS 14.1 Sonoma, Tensorflow 2.14, Pytorch 2.1. Next steps added for more advanced projects.
- 2023-09-25: (Guide version 2.0) Switched from `conda` to `pip` and `venv` for latest versions of tensorflow 2.13, Pytorch 2, macOS Sonoma; installation is now much simpler.
- 2023-03-16: Since `pytorch` v2.0 is now released, the channel `pytorch-nightly` can now be replaced by `pytorch` in the installation instructions. The `pytorch-nightly` channel is no longer needed for MPS support.