Download GPT-J
Use quantization (4-bit) or offload layers to CPU:
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(...).to("mps")
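To make the quantization and offload advice concrete, here is a minimal sketch that estimates how much memory the GPT-J weights need at different precisions and shows one possible way to request 4-bit loading or automatic device placement. It assumes the Hugging Face `transformers`, `accelerate`, and `bitsandbytes` packages, the `EleutherAI/gpt-j-6b` checkpoint id, and a rough count of 6.05 billion parameters; none of these are stated in the text above. The helper names `weight_memory_gib` and `load_gptj` are illustrative. Note that 4-bit loading through bitsandbytes has generally targeted CUDA GPUs, so it may not be available on "mps".

```python
# Rough parameter count for GPT-J-6B (assumption: about 6.05 billion).
GPTJ_PARAMS = 6_050_000_000


def weight_memory_gib(n_params: int, bits_per_param: int) -> float:
    """Estimate memory for the weights alone (no activations, no KV cache)."""
    return n_params * bits_per_param / 8 / 1024**3


def load_gptj(model_id: str = "EleutherAI/gpt-j-6b", four_bit: bool = True):
    """Sketch: load GPT-J either 4-bit quantized or in fp16 with offload.

    Imports are local so the estimate above works without these packages.
    """
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # device_map="auto" lets accelerate decide where each layer lives.
    kwargs = {"device_map": "auto"}
    if four_bit:
        kwargs["quantization_config"] = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.float16,
        )
    else:
        # Unquantized fp16: layers that do not fit on the GPU can be
        # placed on the CPU by accelerate.
        kwargs["torch_dtype"] = torch.float16
    return AutoModelForCausalLM.from_pretrained(model_id, **kwargs)


if __name__ == "__main__":
    for bits in (32, 16, 4):
        print(f"{bits:>2}-bit weights: {weight_memory_gib(GPTJ_PARAMS, bits):.1f} GiB")
```

The estimate covers weights only, so actual usage during generation will be higher.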
After fine-tuning, the adapter weights are tiny (a few MB) and can be merged into the base model.
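The adapter size can be checked with a little arithmetic. The sketch below assumes the adapter is a LoRA adapter trained with the `peft` library (the text does not name the method), with rank 8 applied to two projection matrices per layer, and uses GPT-J's 28 layers and hidden size of 4096. `adapter_dir` is a hypothetical path, and `lora_param_count` and `merge_adapter` are illustrative names.

```python
def lora_param_count(n_layers: int, d_model: int, rank: int,
                     matrices_per_layer: int) -> int:
    """Count LoRA parameters for square d_model x d_model weight matrices.

    Each adapted matrix gains A (rank x d_model) and B (d_model x rank).
    """
    return n_layers * matrices_per_layer * 2 * d_model * rank


def merge_adapter(base_model, adapter_dir: str):
    """Sketch: fold a trained adapter into the base weights using peft."""
    from peft import PeftModel  # local import: only needed when merging

    model = PeftModel.from_pretrained(base_model, adapter_dir)
    # merge_and_unload() adds the low-rank update into the base weights
    # and returns the plain base model without adapter wrappers.
    return model.merge_and_unload()


# Assumed configuration: rank 8 on two projections in each of 28 layers.
n = lora_param_count(n_layers=28, d_model=4096, rank=8, matrices_per_layer=2)
print(f"{n:,} adapter parameters, about {n * 2 / 1024**2:.0f} MiB in fp16")
```

Under these assumptions the adapter holds a few million parameters, which is consistent with a file of a few MB; a higher rank or more target matrices would scale the size proportionally.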
Use wget with --no-check-certificate (as a temporary workaround only) or update your CA certificates.
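For downloads scripted in Python rather than wget, the same two options can be sketched with the standard library. This is only an illustration: `make_ssl_context` and `fetch` are hypothetical helper names, and `cafile` stands for a path to an updated CA bundle that you supply yourself.

```python
import ssl
import urllib.request


def make_ssl_context(cafile=None, insecure=False):
    """Build a TLS context. Verification stays on unless insecure=True."""
    ctx = ssl.create_default_context(cafile=cafile)
    if insecure:
        # Similar in spirit to wget --no-check-certificate: temporary use only.
        ctx.check_hostname = False  # must be disabled before verify_mode
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch(url, dest, cafile=None, insecure=False):
    """Download url to dest, streaming in 1 MiB chunks."""
    ctx = make_ssl_context(cafile=cafile, insecure=insecure)
    with urllib.request.urlopen(url, context=ctx) as resp, open(dest, "wb") as out:
        while chunk := resp.read(1 << 20):
            out.write(chunk)
```

Passing an updated bundle through `cafile` keeps verification enabled, which is the safer of the two options; the `insecure` flag disables it entirely and should not remain in place once the certificates are fixed.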