Research Question

Research the core feature sets of Granola, Otter.ai, Fireflies.ai, and Fathom in depth — covering transcription accuracy, note quality and formatting, AI summarization style, meeting bot behavior (visible vs. silent), integrations (CRM, Slack, Notion, etc.), and platform support (Zoom, Teams, Meet, in-person). Produce a detailed comparison table with one row per tool and columns for each major feature category. Note which features are genuinely differentiated vs. table stakes.

Granola’s device-audio capture creates a genuine privacy advantage that bot-first tools can match only by adding a separate bot-free capture mode. By transcribing directly from the audio passing through the user’s laptop or phone speakers and microphone in real time (and discarding the raw audio afterward), Granola never appears in the participant list and triggers no platform recording announcements. Because the AI works from the user’s own typed notes, the output reads like an extension of the user’s thinking rather than a third-party transcript dump.

  • Supports Zoom, Google Meet, Microsoft Teams, Webex, Slack Huddles, phone calls, and in-person conversations via system audio on macOS, Windows, and iOS.[1]
  • Independent testing shows 90-92% transcription accuracy in clean English environments; accuracy drops with heavy crosstalk or strong accents.[2]
  • Post-meeting AI blends the user’s rough typed notes with the transcript to generate structured summaries, action items, and follow-up emails using customizable “Recipes” or natural-language prompts.[3]

For anyone entering or competing in this space, a visible bot is now a liability in client-facing or sensitive conversations. Granola’s local-first approach also enables deeper privacy claims (no stored audio, GDPR-aligned), which sales and consulting teams cite as a decisive factor.

Otter.ai’s real-time collaborative layer and OtterPilot bot create a different kind of stickiness: live captions visible to everyone on the call plus searchable team knowledge bases. The tool pioneered live transcription that appears during meetings and has evolved into a shared workspace with Channels that group meetings by topic, project, or team.

  • Automatic OtterPilot bot joins Zoom, Google Meet, and Microsoft Teams; bot-free desktop/mobile recording is also available as an alternative.[4]
  • Claims 93-95% accuracy in good audio conditions with strong speaker identification; supports real-time captions in multiple languages.[5]
  • AI summaries include decisions and action items; Otter AI Chat lets users query across all past meetings and connected apps (Slack, Notion, Salesforce, HubSpot, Jira, Asana, Google Docs).[4]

Real-time collaboration and live captions remain Otter’s clearest differentiator. Most competitors still deliver notes only after the call ends; Otter’s live experience is essential for education, large team syncs, or any meeting where participants need to reference the transcript while speaking.

Fireflies.ai wins on conversation intelligence depth and CRM automation breadth, turning every meeting into searchable sales or operational data. Its “Fred” bot joins calls automatically via calendar integration and delivers not just transcripts but talk-time analytics, sentiment scores, topic trackers, and AI filters that surface patterns across hundreds of conversations.

  • Supports Zoom, Google Meet, Microsoft Teams, Webex plus desktop app, Chrome extension, and mobile for in-person or file uploads; 100+ languages with auto-detection.[6]
  • Accuracy consistently reported in the 90-95% range in typical business audio; strong multi-speaker labeling.[7]
  • Post-meeting summaries, Ask Fred chatbot, soundbites, and bookmarks; native sync to 5+ CRMs (HubSpot, Salesforce, etc.), Slack, project tools, and MCP for external AI agents.[8]

For sales teams, Fireflies’ ability to push structured insights directly into CRM fields and run cross-meeting analytics is the feature that justifies the cost. Pure note-taking is now table stakes; conversation intelligence layered on top is the real moat.

Fathom delivers the fastest, most generous free-tier experience with near-instant summaries and flexible capture modes that include true bot-free desktop recording. It processes calls in ~30 seconds and offers unlimited recordings and transcriptions on the free plan, making it the default choice for individuals and small teams who refuse to pay until they hit advanced needs.

  • Primarily Zoom, Google Meet, and Microsoft Teams; desktop app enables bot-free capture of system audio across any platform.[9]
  • Accuracy in the 85-95% range depending on conditions; summaries and action items appear almost immediately.[10]
  • Clean summaries, key moments, AI Scorecards for coaching, and Ask Fathom search across conversations; native HubSpot/Salesforce sync plus Slack, Asana, Gmail, and direct links to ChatGPT/Claude.[11]

Fathom proves that unlimited free recording plus speed can be a powerful acquisition engine. Most competitors still meter minutes or summaries on free plans; Fathom’s generosity forces the others to compete on depth rather than basic access.

The features that are now table stakes across all four tools include basic Zoom/Teams/Meet support, post-meeting AI summaries with action items, Slack/Notion sharing, and speaker identification. What remains genuinely differentiated is:

  • Truly silent/bot-free capture (Granola’s core advantage; Fathom offers it as an option).
  • Hybrid human + AI note enhancement (Granola only).
  • Live collaborative transcription and team Channels (Otter only).
  • Deep conversation intelligence + broad CRM field-level automation (Fireflies only).
  • Summaries in roughly 30 seconds and unlimited free recordings (Fathom only).
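
The split between table stakes and differentiators above can be made mechanical. The following Python sketch is illustrative only: the feature keys, the `classify` helper, and the three labels are hypothetical names introduced here, and the support sets simply transcribe the claims in this report's summary list rather than independently verified product data.

```python
from typing import Dict, FrozenSet, Set

# The four tools compared in this report.
TOOLS: FrozenSet[str] = frozenset({"Granola", "Otter.ai", "Fireflies.ai", "Fathom"})

# Assumption: each set lists the tools the summary list credits with the
# feature. These are the report's claims, not verified vendor data.
FEATURES: Dict[str, Set[str]] = {
    "zoom_teams_meet_support": set(TOOLS),
    "post_meeting_ai_summaries": set(TOOLS),
    "slack_notion_sharing": set(TOOLS),
    "speaker_identification": set(TOOLS),
    "bot_free_capture": {"Granola", "Fathom"},
    "hybrid_human_ai_notes": {"Granola"},
    "live_collaborative_transcription": {"Otter.ai"},
    "crm_field_level_automation": {"Fireflies.ai"},
    "unlimited_free_recordings": {"Fathom"},
}


def classify(features: Dict[str, Set[str]],
             tools: FrozenSet[str] = TOOLS) -> Dict[str, str]:
    """Label each feature by how many of the compared tools support it."""
    labels: Dict[str, str] = {}
    for name, supporters in features.items():
        if supporters >= tools:          # every tool supports it
            labels[name] = "table stakes"
        elif len(supporters) == 1:       # exactly one tool supports it
            labels[name] = "differentiated"
        else:                            # some, but not all, tools support it
            labels[name] = "contested"
    return labels


if __name__ == "__main__":
    for name, label in classify(FEATURES).items():
        print(f"{name:34s} {label}")
```

Under these assumptions, a feature supported by all four tools is labeled table stakes, one supported by a single tool is differentiated, and anything in between (here, bot-free capture) is contested.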

Anyone building or choosing a tool in 2026 must decide which axis matters most: privacy/invisibility, real-time collaboration, sales analytics depth, or frictionless free access. The days of “good enough transcription plus summary” as a standalone product are over; the winners are defined by the non-obvious layer they add on top of the transcript.


Recent Findings Supplement (May 2026)

Granola’s hybrid notepad model gained powerful natural-language editing and cross-tool connectivity in 2025–2026, turning user-written notes into the primary input rather than a raw transcript.[1]

By letting users type plain-English instructions (“make this shorter,” “rewrite in pirate voice,” or “fix the spelling of Niall”), Granola routes edits through its latest AI models while preserving the human structure the user already created. This mechanism eliminates post-meeting formatting drudgery and keeps the note owner in control—something bot-first tools cannot replicate because they start from a full transcript.

  • July 24, 2025 launch of natural-language editing; three core capabilities (tone, length, precise corrections) announced on LinkedIn and confirmed in product updates.[2]
  • December 4, 2025 introduction of “Recipes”: expert-written saved prompts that combine meeting notes with Claude, ChatGPT, or Cursor via the Model Context Protocol (MCP).[1]
  • February 2026 Series C ($125M at a $1.5B valuation), which coincided with the addition of team Spaces and personal/enterprise APIs; SOC 2 Type 2 compliance dates from July 2025.[3][4]
  • iOS App Store launch on April 30, 2025 for in-person meetings; free plan limited to the last 30 days of notes since the February 2026 rebrand.[5]

For competitors: Any tool still forcing users to edit raw transcripts or rely solely on bot-generated output now looks dated. The bar has moved to “AI that respects and augments human intent in real time.”

Fathom flipped from bot-only to fully flexible capture modes in April 2026, letting users choose bot-free, bot-free audio-only, or full video bot on a per-meeting basis while adding live summaries and account-wide AI search.[6]

The mechanism is a redesigned desktop app that records locally when bot-free is selected, streams live summaries as the call unfolds, and exposes the entire meeting corpus to “Ask Fathom” across personal, team, and organizational data. This directly addresses the privacy friction that visible bots create and matches Granola’s silent-capture advantage while retaining video optionality.

  • April 15, 2026 major platform update announced via Business Wire and TechCrunch: bot-free transcription/audio capture, live summaries, iOS app for in-person, MCP integrations with ChatGPT/Claude.[6][7]
  • October 2025: botless recording in beta, with the Asana integration, public API, and AI coaching tools already shipping.[8]
  • Post-call integration upgrades and account-wide search rolled out in beta in May 2026.[9]

For competitors: Fathom has removed the “pick one: privacy or video” trade-off. Tools locked into visible bots (Otter, Fireflies) or purely local capture (Granola) must now justify why they cannot offer the same per-meeting flexibility.

Fireflies extended its bot into proactive “Voice Agents” and “AI Skills” that run two-way conversations and auto-execute CRM workflows, moving beyond passive transcription.[10]

The mechanism lets users pre-configure GPT-powered agents that join calls, answer questions live, and push structured data (notes, action items, CRM fields) to HubSpot/Salesforce/Notion/Slack without manual steps. This is enabled by 100+ native integrations and per-meeting AI-skill configuration.

  • 2026 launches: Live Assist + Desktop App, Voice Agents for fully automated two-way calls, and AI Skills with GPT model selection and direct CRM routing.[10][11]
  • Auto-language detection and app integrations remain core, though the cited counts vary by source (30+ to 100+ languages, 50+ to 100+ integrations).[10]

For competitors: Pure note-takers without live voice agents or one-click CRM automation now face a widening gap in sales and customer-success workflows. The new baseline is “the AI participates, not just records.”

Otter.ai introduced real-time “Meeting Agents” capable of speaking inside calls, but 2026 independent reviews continue to highlight persistent speaker-identification and accuracy shortfalls compared with newer entrants.[12]

The agent listens, answers questions, and can respond verbally during the meeting. However, multiple 2026 hands-on tests report speaker misattribution rates around 30% in multi-person calls, accuracy dropping to 60–70% with noise or accents, and only four languages supported.

  • 2026 rollout of voice-activated Meeting Agents (Sales, Education, etc.) with real-time assistance.[12]
  • Reviews from April 2026 consistently grade live transcription B– and speaker ID D.[13][14]

For competitors: Otter’s agent capability is differentiated, but accuracy and speaker attribution are table stakes on which it trails, and Granola’s and Fathom’s local-capture approaches largely sidestep the problem. Closing this gap is now urgent.

Across the category, the decisive recent differentiator is no longer “does it transcribe” but “does it let the user choose capture style, edit with natural language, and push structured data downstream without friction.”[15]

Granola’s and Fathom’s bot-free/local options, Fathom’s live summaries, and Fireflies’ Voice Agents represent the new frontier; Otter’s Meeting Agents are the bot-native camp’s other counter-move, alongside Fireflies’ agents.

Implication for new entrants or incumbents: Any product still shipping a single-mode visible bot or requiring manual post-editing of raw transcripts is competing on features that have already been commoditized or surpassed in the last 12 months. The market now rewards architectural flexibility (capture choice + natural-language control + downstream automation) over raw transcription volume.