Deploy Falcon-7B with NVIDIA TensorRT-LLM on OCI

Blog: Oracle BPM

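This blog details deploying the Falcon-7B large language model on Oracle Cloud Infrastructure (OCI) using NVIDIA's TensorRT-LLM framework. Falcon is released in 7B, 40B, and 180B parameter sizes. The 3.5 trillion token training figure applies to the largest model, Falcon-180B, which has been reported as comparable to Google's PaLM 2 and as falling between GPT-3.5 and GPT-4 on some benchmarks; Falcon-7B itself was trained on roughly 1.5 trillion tokens. The deployment serves the model through NVIDIA's Triton Inference Server running on OCI.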
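
As a rough illustration of what talking to such a deployment could look like, the sketch below builds and sends a request to Triton's HTTP generate endpoint. It is not taken from the original post. The host and port (`localhost:8000`), the model name `ensemble`, and the field names `text_input`, `max_tokens`, and `text_output` are assumptions based on the common TensorRT-LLM backend example layout, and may differ in an actual deployment.

```python
import json
import urllib.request


def build_generate_request(prompt, max_tokens=64,
                           host="localhost:8000", model="ensemble"):
    """Return (url, body) for a Triton generate call.

    Assumed values: host, model name, and the JSON field names follow the
    usual TensorRT-LLM backend examples and are not confirmed by the post.
    """
    url = f"http://{host}/v2/models/{model}/generate"
    payload = {"text_input": prompt, "max_tokens": max_tokens}
    return url, json.dumps(payload).encode("utf-8")


def generate(prompt, **kwargs):
    """Send the prompt to the server and return the decoded JSON response."""
    url, body = build_generate_request(prompt, **kwargs)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Requires a reachable Triton server; raises URLError otherwise.
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `generate("What is OCI?", max_tokens=32)` would return a dictionary; under the assumptions above, the generated text would be expected under the `text_output` key.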
