The Microsoft SharePoint Online external content connector retrieves pages from sites in your Microsoft SharePoint Online source system and makes their content and metadata searchable in AI Search applications.
Search administrators can run or schedule two types of crawl: content crawls, which retrieve updated content and access permissions from your source system, and user permission crawls, which retrieve updated security principals. Both types of crawl feed their data to AI Search for indexing.
The indexed content and metadata are stored as records in a connector-specific indexed source. Search administrators can create search sources from this
indexed source and link them to search profiles to make the indexed records searchable in AI Search applications.
Microsoft SharePoint Online static and dynamic page content
Microsoft SharePoint Online pages built primarily with static text content authored in the Microsoft SharePoint Online editor produce searchable content that closely matches what users see in a web browser. By contrast, searchable content for pages that rely on dynamic web parts may not include all of the content that users see in a web browser.
Microsoft SharePoint Online pages are stored as .aspx files in a site's Site Pages library. These files can include static content in their CanvasContent1 and
WikiFields metadata fields, but they can also include scripts that call a server-side engine to dynamically render viewable content at request time. The exact content rendered depends on user context,
permissions, and web parts loaded as part of the page request.
Rendering a page's full viewable content requires an authenticated user session. The Microsoft SharePoint Online connector can't impersonate a user to trigger this rendering process. As a result, the connector can't capture the final HTML output that a web browser would display.
For each page retrieved, the Microsoft SharePoint Online connector queries the SharePoint REST API's _api/web/lists('<list-id>')/items(<item-id>) endpoint to access the page's underlying
list item. Via this API endpoint, the connector retrieves content primarily from the page's CanvasContent1 and WikiFields metadata fields, and also captures the page's other metadata fields such as title,
author, and modification date where available. Page content stored exclusively in dynamic web parts may be retrieved only partially or not at all, since that content doesn't exist in the list item
metadata.
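The retrieval described above can be approximated with a minimal Python sketch. It is an illustration under stated assumptions, not the connector's actual implementation: the helper names (build_item_url, retrievable_text), the sample site URL, and the sample list item are hypothetical, and the field names are taken as written in this topic. The sketch builds the list-item endpoint URL in the shape shown above and extracts the plain text held in the static content fields of an already-fetched list item. Authentication and the HTTP request itself are omitted.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)


def build_item_url(site_url, list_id, item_id):
    """Build the list-item endpoint URL in the shape described in this topic."""
    base = site_url.rstrip("/")
    return f"{base}/_api/web/lists('{list_id}')/items({item_id})"


def retrievable_text(item, content_fields=("CanvasContent1", "WikiFields")):
    """Return the plain text found in a list item's static content fields.

    `item` is a dict parsed from the endpoint's JSON response (assumed shape).
    Fields that are missing or empty contribute nothing, which mirrors how
    content that exists only in dynamic web parts is absent from the item.
    """
    parts = []
    for field in content_fields:
        value = item.get(field)
        if not value:
            continue
        extractor = _TextExtractor()  # fresh parser per field
        extractor.feed(value)
        extractor.close()
        parts.extend(extractor.parts)
    return " ".join(parts)


# Hypothetical list item: static text is present, the wiki field is empty.
sample_item = {
    "Title": "Welcome",
    "CanvasContent1": "<div><p>Static intro text.</p></div>",
    "WikiFields": None,
}

url = build_item_url("https://contoso.sharepoint.com/sites/hr/", "<list-id>", 7)
text = retrievable_text(sample_item)
```

For the sample item, only the static paragraph is returned. Anything a dynamic web part would render at request time never appears in the list item, so a sketch like this has nothing to extract for it.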
To learn how to view the portion of a page's content that can be retrieved using the Microsoft SharePoint Online connector, see View retrievable page content using the Microsoft SharePoint Online REST API.