
Democratizing Generative AI with CPU-based Inference

Blog: Oracle BPM

The generative AI market faces a significant challenge: hardware availability. Much of the world's expensive GPU capacity is consumed by Large Language Model (LLM) training, creating an availability crunch for users who want to deploy and evaluate foundation models in their own cloud tenancies or subscriptions for inference and fine-tuning. CPUs are a viable choice for many of these workloads. Below we share our experience working with CPUs, including performance test results.