Selecting the Right LLM for Salesforce - Part 1: Basics

In the ever-evolving landscape of AI tools, choosing the right large language model (LLM) is crucial for enhancing user experience, reducing costs, and improving efficiency.

LLMs may seem simple on the surface, but discussions around them can get complex quickly. To keep things digestible, we've divided the content into two parts. This first part covers the basics for Salesforce decision-makers and enthusiasts, while the second will focus on advanced Salesforce LLM research and benchmarks. With this post, you'll build a solid foundation for following the Salesforce LLM benchmark discussion with ease.

With so many options on the market, understanding their nuances is key to informed decision-making. It's not about picking the most advanced model available; it's about aligning a model's features with your specific use case.

🔑 Three Key Factors for Assessing LLMs

When evaluating large language models, focus on three essential factors:

  1. Speed: How quickly can the model process input and generate a response?

  2. Memory: What is the context window size, and how much information can the model retain during interactions?

  3. Modality: What types of inputs can the model handle (text, images, audio, etc.)?

⚡ Factor One: Speed

Speed, or inference speed, is perhaps the most critical factor when selecting an LLM. It determines how quickly a model can respond to queries. Generally, there's a trade-off between a model's intelligence and its speed. Faster models tend to be less capable but more cost-effective, while slower models often exhibit greater intelligence at a higher price point.

Prioritizing speed is essential for real-time applications, such as customer service chatbots. Users typically prefer immediate responses, as delays can lead to frustration. For instance, if you operate an e-commerce platform, a fast model like Claude 3 Haiku can efficiently handle inquiries about order status or cancellation policies.

Here is a quick tabular representation of speed as a factor for common CRM/enterprise use cases.

| Use Case | Acceptable Latency | Criticality Level |
| --- | --- | --- |
| Real-time Chatbots | < 2 seconds | High |
| Email Drafting | 5–10 seconds | Medium |
| Batch Analytics | 1–5 minutes | Low |
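As a rough illustration, the latency budgets above can be encoded as a simple check you run while timing model calls. This is a minimal sketch: the thresholds come from the table, and the timed function is whatever client call you actually use.

```python
import time

# Latency budgets from the table above (in seconds)
LATENCY_BUDGETS = {
    "real_time_chatbot": 2,       # High criticality
    "email_drafting": 10,         # Medium criticality
    "batch_analytics": 5 * 60,    # Low criticality
}

def within_budget(use_case: str, elapsed_seconds: float) -> bool:
    """Return True if a measured response time fits the use case's budget."""
    return elapsed_seconds <= LATENCY_BUDGETS[use_case]

def timed_call(fn, *args):
    """Time any model call, returning its result plus elapsed seconds."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Running a pilot with a few representative prompts and feeding the measured times through `within_budget` quickly shows whether a candidate model is fast enough for your use case.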

🧠 Factor Two: Memory/Context Window Size

The memory or context window in LLMs refers to the amount of text (measured in tokens) the model can process and retain at once to generate coherent responses.

The context window size is crucial for maintaining the flow of conversation and retaining relevant information. A larger context window allows the model to process more extended dialogues without losing track of earlier exchanges. This capability is particularly important for applications like multi-turn chatbots or summarizing lengthy documents.

Without sufficient memory, models may struggle to recall past interactions, leading to fragmented conversations. Selecting a model with a large context window is advisable if your use case involves detailed discussions or extensive data processing.

Consider the following typical chat scenario between a customer (Neha) and a support agent (Aryan), and the number of tokens reported for it by OpenAI's Tokenizer playground: https://platform.openai.com/tokenizer

Neha: Hi! My order hasn’t arrived yet, and it was supposed to be delivered by 10 AM today.

Aryan: Sorry for the delay. Let me check the status of your order. Can you share your order ID?

Neha: Sure, it’s QC12345678.

Aryan: Thanks! One moment, please.

Aryan: Your order is delayed due to unexpected traffic. It’s out for delivery and should reach you within 30 minutes.

Neha: This isn’t the first time my delivery has been late. Can you ensure it doesn’t happen again?

Aryan: I understand your frustration. I’ve flagged this to our logistics team. As an apology, I’ve added a ₹100 voucher to your account.

Neha: Thanks. Can I use the voucher for my next grocery order?

Aryan: Yes, the voucher will apply automatically. Anything else I can help you with?

Neha: No, that’s all. Thanks!

Aryan: You’re welcome! Have a great day 😊

> A helpful rule of thumb is that one token generally corresponds to ~4 characters of text for common English text. This translates to roughly ¾ of a word (so 100 tokens ~= 75 words).
> — https://platform.openai.com/tokenizer
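That rule of thumb is easy to encode. Here is a minimal sketch assuming plain English text; a real tokenizer (such as the one behind OpenAI's playground) will give exact counts, but this is close enough for budgeting:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for common English."""
    return max(1, round(len(text) / 4))

def estimate_words(tokens: int) -> int:
    """100 tokens ~= 75 words, i.e. roughly 3/4 of a word per token."""
    return round(tokens * 0.75)
```

Running the Neha/Aryan transcript above through `estimate_tokens` gives a quick ballpark of how much of a model's context window one support conversation consumes.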

Pros and Cons of a Large Context Window in AI

Pros:

  1. Complete View: Processes all customer data at once for better personalization. Example: Recommending tailored products based on a customer’s entire purchase history.

  2. Handles Complexity: Easily generates detailed reports or insights from big datasets. Example: Analyzing a year’s worth of sales data to create a detailed pipeline forecast.

  3. Smooth Conversations: AI chatbots can manage long chats without losing track. Example: Resolving customer complaints spanning multiple tickets in one conversation.

Cons:

  1. Higher Costs: More computing power is needed, which can be expensive. Example: Running large AI queries on historical customer data may increase server costs.

  2. Not Always Needed: Many CRM tasks don’t require such large context windows. Example: Simple lead qualification doesn’t need data from unrelated customer interactions.

  3. Risk of Irrelevance: A large window may pull in unnecessary details, reducing accuracy. Example: Pulling outdated information during a sales pitch, confusing the rep or customer.

Large context windows are powerful but should align with your CRM goals and resources.
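One common way to live within a smaller context window is to keep only the most recent turns of a conversation. The sketch below reuses the 4-characters-per-token rule of thumb quoted earlier; production systems often summarize older turns instead of dropping them outright.

```python
def trim_history(messages: list[str], token_budget: int) -> list[str]:
    """Keep the most recent messages whose combined estimated token
    count fits within token_budget, dropping the oldest first."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):           # walk newest to oldest
        cost = max(1, round(len(msg) / 4))   # ~4 chars per token
        if used + cost > token_budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))              # restore chronological order
```

This keeps a chatbot coherent over long sessions on a modest context window, at the cost of forgetting the earliest exchanges.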

📷 Factor Three: Modality in LLMs

Modality refers to the types of inputs that a model can accept and process. Traditionally, LLMs were limited to text-based inputs. However, recent advancements have led to the development of multimodal models capable of handling various input formats, including images, audio, and video.

This capability can significantly enhance user interactions. For example, in a technical support scenario, a customer might describe a malfunctioning product while also providing an image of the issue. A multimodal model can analyze both the text and the visual input, generating comprehensive troubleshooting steps or initiating a return process based on the combined context.
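To make this concrete, most multimodal APIs accept a mixed list of content parts within a single message. Below is a minimal sketch in the style of OpenAI's Chat Completions message format; the model name and image URL are placeholders, so check your provider's documentation for the exact schema.

```python
def build_support_request(complaint_text: str, image_url: str) -> dict:
    """Build a single chat request combining the customer's text
    complaint with a photo of the defective product."""
    return {
        "model": "gpt-4o",  # placeholder: any vision-capable model
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": complaint_text},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
```

The model sees the complaint and the image together in one turn, which is what lets it ground its troubleshooting steps in what the photo actually shows.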

78% of customer interactions now involve multiple formats (e.g., support tickets with screenshots, voice notes about product defects, and call recordings in both Sales and Support).

How Does a Multimodal LLM Impact Salesforce CRM Use Cases?

| Use Case | Description | Traditional Approach | Multimodal LLM Advantage | Business Impact |
| --- | --- | --- | --- | --- |
| Customer Service | Resolving product returns | Agents manually match text tickets to inventory | Image analysis of defects + voice-to-text conversion of complaints | 65% faster case resolution |
| Sales Prospecting | Competitive deal analysis | Manual review of competitor PDFs/websites | PDF text + image extraction for auto-generated battlecards | 22% faster deal cycles |
| Field Service | Equipment repairs | Technicians describe issues via text | Video analysis of malfunctioning machinery | 40% fewer repeat visits |
| Commerce Cloud | Visual product search | Keyword-based search | Image-to-product matching from customer photos | 18% higher conversion rates |
| Marketing Cloud | Content localization | Manual translation + image swaps | Auto-translate text + region-specific image swaps | 30% faster campaign launches |

FAQs: LLM Selection

  • Is a bigger context window always better? Not necessarily. While a larger context window can improve continuity in long conversations, it can also require more computing resources. Balance your performance needs with infrastructure constraints.

  • How can I compare models on speed, memory, and modality? Look at independent benchmark results and run pilot projects. Metrics such as response time (speed), maximum token context (memory), and supported data types (modality) are usually published in model specifications or third-party evaluations.

  • Can one model excel at all three factors? Some advanced models perform well across speed, memory, and modality, but it often involves trade-offs in cost or resource usage. Always align the LLM's capabilities with your specific Salesforce/business goals before making a choice.

💬 Let's Talk


Drop us a note; we're happy to take the conversation forward.

Abhinav Gupta

First Indian Salesforce MVP, awarded eight times in a row, has been blogging about Salesforce, Cloud, AI, & Web3 since 2011. Founded the first Salesforce Dreamin event in India, called "Jaipur Dev Fest". A seasoned speaker at Dreamforce, Dreamin events, & local meets. Author of many popular GitHub repos featured in official Salesforce blogs, newsletters, and books.

https://abhinav.fyi