Selecting the Right LLM for Salesforce - Part 1: Basics

In the ever-evolving landscape of AI tools, choosing the right large language model (LLM) is crucial for enhancing user experience, reducing costs, and improving efficiency.

LLMs may seem simple on the surface, but discussions around them can get complex quickly. To keep things digestible, we've divided the content into two parts. This first part covers the basics for Salesforce decision-makers and enthusiasts, while the second will focus on advanced Salesforce LLM research and benchmarks. With this post, you'll build a solid foundation for following the Salesforce LLM benchmark discussion with ease.

With so many options on the market, understanding their nuances is key to informed decision-making. It's not about picking the most advanced model available; it's about aligning a model's features with your specific use case.

🔑 Three Key Factors for Assessing LLMs

When evaluating large language models, focus on three essential factors:

  1. Speed: How quickly can the model process input and generate a response?

  2. Memory: What is the context window size, and how much information can the model retain during interactions?

  3. Modality: What types of inputs can the model handle (text, images, audio, etc.)?

⚡ Factor One: Speed

Speed, or inference speed, is perhaps the most critical factor when selecting an LLM. It determines how quickly a model can respond to queries. Generally, there's a trade-off between a model's intelligence and its speed. Faster models tend to be less capable but more cost-effective, while slower models often exhibit greater intelligence at a higher price point.

Prioritizing speed is essential for real-time applications, such as customer service chatbots. Users typically prefer immediate responses, as delays can lead to frustration. For instance, if you operate an e-commerce platform, a fast model like Claude 3 Haiku can efficiently handle inquiries about order status or cancellation policies.

Here is a quick tabular representation of speed as a factor for common CRM/enterprise use cases.

| Use Case | Acceptable Latency | Criticality Level |
| --- | --- | --- |
| Real-time Chatbots | < 2 seconds | High |
| Email Drafting | 5–10 seconds | Medium |
| Batch Analytics | 1–5 minutes | Low |
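As a rough illustration, the latency budgets above can be encoded as a simple check you run while timing model calls. This is a minimal sketch: the thresholds come from the table, and the timed function is whatever client call you actually use.

```python
import time

# Latency budgets from the table above (in seconds)
LATENCY_BUDGETS = {
    "real_time_chatbot": 2,       # High criticality
    "email_drafting": 10,         # Medium criticality
    "batch_analytics": 5 * 60,    # Low criticality
}

def within_budget(use_case: str, elapsed_seconds: float) -> bool:
    """Return True if a measured response time fits the use case's budget."""
    return elapsed_seconds <= LATENCY_BUDGETS[use_case]

def timed_call(fn, *args):
    """Time any model call, returning its result plus elapsed seconds."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Running a pilot with a few representative prompts and feeding the measured times through `within_budget` quickly shows whether a candidate model is fast enough for your use case.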

🧠 Factor Two: Memory/Context Window Size

The memory or context window in LLMs refers to the amount of text (measured in tokens) the model can process and retain at once to generate coherent responses.

The context window size is crucial for maintaining the flow of conversation and retaining relevant information. A larger context window allows the model to process more extended dialogues without losing track of earlier exchanges. This capability is particularly important for applications like multi-turn chatbots or summarizing lengthy documents.

Without sufficient memory, models may struggle to recall past interactions, leading to fragmented conversations. Selecting a model with a large context window is advisable if your use case involves detailed discussions or extensive data processing.

Consider the following typical chat scenario between a customer (Neha) and a support agent (Aryan), and the number of tokens reported for it by OpenAI's Tokenizer playground: https://platform.openai.com/tokenizer

Neha: Hi! My order hasn’t arrived yet, and it was supposed to be delivered by 10 AM today.

Aryan: Sorry for the delay. Let me check the status of your order. Can you share your order ID?

Neha: Sure, it’s QC12345678.

Aryan: Thanks! One moment, please.

Aryan: Your order is delayed due to unexpected traffic. It’s out for delivery and should reach you within 30 minutes.

Neha: This isn’t the first time my delivery has been late. Can you ensure it doesn’t happen again?

Aryan: I understand your frustration. I’ve flagged this to our logistics team. As an apology, I’ve added a ₹100 voucher to your account.

Neha: Thanks. Can I use the voucher for my next grocery order?

Aryan: Yes, the voucher will apply automatically. Anything else I can help you with?

Neha: No, that’s all. Thanks!

Aryan: You’re welcome! Have a great day 😊

> A helpful rule of thumb is that one token generally corresponds to ~4 characters of text for common English text. This translates to roughly ¾ of a word (so 100 tokens ~= 75 words).
> — https://platform.openai.com/tokenizer
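That rule of thumb is easy to encode. Here is a minimal sketch assuming plain English text; a real tokenizer (such as the one behind OpenAI's playground) will give exact counts, but this is close enough for budgeting:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for common English."""
    return max(1, round(len(text) / 4))

def estimate_words(tokens: int) -> int:
    """100 tokens ~= 75 words, i.e. roughly 3/4 of a word per token."""
    return round(tokens * 0.75)
```

Running the Neha/Aryan transcript above through `estimate_tokens` gives a quick ballpark of how much of a model's context window one support conversation consumes.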

Pros and Cons of a Large Context Window in AI

Pros:

  1. Complete View: Processes all customer data at once for better personalization. Example: Recommending tailored products based on a customer’s entire purchase history.

  2. Handles Complexity: Easily generates detailed reports or insights from big datasets. Example: Analyzing a year’s worth of sales data to create a detailed pipeline forecast.

  3. Smooth Conversations: AI chatbots can manage long chats without losing track. Example: Resolving customer complaints spanning multiple tickets in one conversation.

Cons:

  1. Higher Costs: More computing power is needed, which can be expensive. Example: Running large AI queries on historical customer data may increase server costs.

  2. Not Always Needed: Many CRM tasks don’t require such large context windows. Example: Simple lead qualification doesn’t need data from unrelated customer interactions.

  3. Risk of Irrelevance: A large window may pull in unnecessary details, reducing accuracy. Example: Pulling outdated information during a sales pitch, confusing the rep or customer.

Large context windows are powerful but should align with your CRM goals and resources.
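One common way to live within a smaller context window is to keep only the most recent turns of a conversation. The sketch below reuses the 4-characters-per-token rule of thumb quoted earlier; production systems often summarize older turns instead of dropping them outright.

```python
def trim_history(messages: list[str], token_budget: int) -> list[str]:
    """Keep the most recent messages whose combined estimated token
    count fits within token_budget, dropping the oldest first."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):           # walk newest to oldest
        cost = max(1, round(len(msg) / 4))   # ~4 chars per token
        if used + cost > token_budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))              # restore chronological order
```

This keeps a chatbot coherent over long sessions on a modest context window, at the cost of forgetting the earliest exchanges.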

📷 Factor Three: Modality in LLMs

Modality refers to the types of inputs that a model can accept and process. Traditionally, LLMs were limited to text-based inputs. However, recent advancements have led to the development of multimodal models capable of handling various input formats, including images, audio, and video.

This capability can significantly enhance user interactions. For example, in a technical support scenario, a customer might describe a malfunctioning product while also providing an image of the issue. A multimodal model can analyze both the text and the visual input, generating comprehensive troubleshooting steps or initiating a return process based on the combined context.
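To make this concrete, most multimodal APIs accept a mixed list of content parts within a single message. Below is a minimal sketch in the style of OpenAI's Chat Completions message format; the model name and image URL are placeholders, so check your provider's documentation for the exact schema.

```python
def build_support_request(complaint_text: str, image_url: str) -> dict:
    """Build a single chat request combining the customer's text
    complaint with a photo of the defective product."""
    return {
        "model": "gpt-4o",  # placeholder: any vision-capable model
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": complaint_text},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
```

The model sees the complaint and the image together in one turn, which is what lets it ground its troubleshooting steps in what the photo actually shows.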

78% of customer interactions now involve multiple formats (e.g., support tickets with screenshots, voice notes about product defects, and call recordings in both Sales and Support).

How Does a Multimodal LLM Impact Salesforce CRM Use Cases?

| Use Case | Description | Traditional Approach | Multimodal LLM Advantage | Business Impact |
| --- | --- | --- | --- | --- |
| Customer Service | Resolving product returns | Agents manually match text tickets to inventory | Image analysis of defects + voice-to-text conversion of complaints | 65% faster case resolution |
| Sales Prospecting | Competitive deal analysis | Manual review of competitor PDFs/websites | PDF text + image extraction for auto-generated battlecards | 22% faster deal cycles |
| Field Service | Equipment repairs | Technicians describe issues via text | Video analysis of malfunctioning machinery | 40% fewer repeat visits |
| Commerce Cloud | Visual product search | Keyword-based search | Image-to-product matching from customer photos | 18% higher conversion rates |
| Marketing Cloud | Content localization | Manual translation + image swaps | Auto-translate text + region-specific image swaps | 30% faster campaign launches |

FAQs: LLM Selection

  • Is a bigger context window always better? Not necessarily. While a larger context window can improve continuity in long conversations, it can also require more computing resources. Balance your performance needs with infrastructure constraints.

  • How can I compare models on speed, memory, and modality? Look at independent benchmark results and run pilot projects. Metrics such as response time (speed), maximum token context (memory), and supported data types (modality) are usually published in model specifications or third-party evaluations.

  • Can one model excel at all three factors? Some advanced models perform well across speed, memory, and modality, but it often involves trade-offs in cost or resource usage. Always align the LLM's capabilities with your specific Salesforce/business goals before making a choice.

💬 Let's Talk


Drop us a note; we're happy to take the conversation forward.

Abhinav Gupta

First Indian Salesforce MVP, awarded eight times in a row, has been blogging about Salesforce, Cloud, AI, & Web3 since 2011. Founded the first Salesforce Dreamin event in India, called "Jaipur Dev Fest". A seasoned speaker at Dreamforce, Dreamin events, & local meets. Author of many popular GitHub repos featured in official Salesforce blogs, newsletters, and books.

https://abhinav.fyi