Binary file extensions supported in External Content Connectors

  • Release version: Yokohama
  • Updated October 28, 2025

Connector administrators can restrict the binary file types an external content connector retrieves by specifying file extensions in inclusion or exclusion filters.

    Binary file type              Supported file extensions
    Comma-Separated Values        .csv
    Extensible Markup Language    .xml
    Hypertext Markup Language     .htm, .html
    Microsoft Excel               .xls, .xlsm, .xlsx, .xltx
    Microsoft PowerPoint          .potx, .pps, .ppsx, .ppt, .pptm, .pptx
    Microsoft Word                .doc, .docm, .docx, .dotx
    Plain Text                    .txt
    Portable Document Format      .pdf
    Rich Text Format              .rtf

    Note:
    Some external content connectors include support for indexing searchable content and metadata from additional content formats, such as video transcriptions or source-specific native document formats. Examples of source-specific native document formats include .aspx pages in Microsoft SharePoint Online and Wiki pages in Atlassian Confluence Cloud.

    For details on defining file-extension inclusion and exclusion filters for external content connectors, see Configuring crawl settings for external content connectors.
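
To make the filter semantics concrete, the following Python sketch shows one way an extension check could behave. It is illustrative only and is not the connector's implementation or configuration syntax: the function name `should_retrieve`, its parameters, and the rule that an exclusion filter takes precedence over an inclusion filter are all assumptions. It also considers only the extensions listed in the table above, not the source-specific formats that some connectors add.

```python
from pathlib import PurePosixPath

# Extensions taken from the supported-extensions table above.
SUPPORTED_EXTENSIONS = frozenset({
    ".csv", ".xml", ".htm", ".html",
    ".xls", ".xlsm", ".xlsx", ".xltx",
    ".potx", ".pps", ".ppsx", ".ppt", ".pptm", ".pptx",
    ".doc", ".docm", ".docx", ".dotx",
    ".txt", ".pdf", ".rtf",
})


def should_retrieve(filename, include=None, exclude=None):
    """Return True if a file passes the (hypothetical) extension filters.

    include / exclude are optional sets of lowercase extensions such as
    {".pdf", ".docx"}. Assumption: exclusion wins over inclusion.
    """
    # Compare extensions case-insensitively: "Report.PDF" -> ".pdf".
    ext = PurePosixPath(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return False
    if exclude and ext in exclude:
        return False
    if include:
        return ext in include
    # No inclusion filter: every supported extension is allowed.
    return True
```

For example, `should_retrieve("Budget.xlsx", exclude={".xlsx"})` returns `False`, while `should_retrieve("Report.PDF", include={".pdf", ".docx"})` returns `True`.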

    Binary files may be retrieved as content items, or as attachments found on content items. The exact behavior depends on the connector type.
    Important:
    The maximum file size for binary files is 25 MB. Keyword indexing processes only the first 1 MB of text in a file. To index files containing between 1 MB and 25 MB of text, use semantic search.
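
The size limits above can be expressed as a small decision function. This is a hedged sketch rather than documented product behavior: the function name `indexing_coverage` and its return labels are hypothetical, the sketch assumes 1 MB means 1,048,576 bytes (the source does not say whether MB is decimal or binary), and it does not model what a connector actually does with a file over the limit.

```python
MB = 1024 * 1024            # assumption: binary megabytes
MAX_FILE_SIZE = 25 * MB     # maximum binary file size stated above
KEYWORD_TEXT_LIMIT = 1 * MB # keyword indexing covers the first 1 MB of text


def indexing_coverage(file_size_bytes, text_bytes):
    """Classify a file against the documented size limits.

    Returns one of three hypothetical labels:
      "exceeds_limit"   - file is larger than the 25 MB maximum
      "keyword_full"    - all extracted text fits within keyword indexing
      "keyword_partial" - keyword indexing sees only the first 1 MB;
                          semantic search is needed for the remainder
    """
    if file_size_bytes > MAX_FILE_SIZE:
        return "exceeds_limit"
    if text_bytes <= KEYWORD_TEXT_LIMIT:
        return "keyword_full"
    return "keyword_partial"
```

A 10 MB PDF that contains 3 MB of extracted text would be classified as `"keyword_partial"` under these assumptions, which signals that semantic search is needed to cover the text beyond the first 1 MB.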