Configure crawl settings for a GitLab external content connector
Specify the groups, projects, and repositories you want your GitLab external content connector to crawl. Select the issues, wikis, merge requests, tags, branches, and commits you want the crawl to retrieve and feed to AI Search for indexing.
Before you begin
A connector admin must have already created the GitLab external content connector that you want to configure crawl settings for. To learn about this procedure, see Create a GitLab external content connector.
Role required: sn_ext_conn.xcc_admin
About this task
Crawl settings for a GitLab external content connector include:
- Inclusion or exclusion filters for the subgroups to crawl when running content crawls
- Inclusion or exclusion filters for the projects/repositories to crawl when running content crawls
- Inclusion or exclusion filters for the types of content to retrieve from the source system when running content crawls
- Inclusion or exclusion filters for the branches to retrieve from the source system when running content crawls
Content is only retrieved from the source system if it passes all of your configured crawl setting filters. If any crawl setting filter excludes a content item, the external content connector doesn't retrieve it.
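The all-filters-must-pass rule can be illustrated with a minimal Python sketch. This is not the connector's implementation: the Filter class, the field names (subgroup, branch, content_type), and the use of glob patterns are hypothetical choices made only to show that a single excluding filter is enough to drop a content item.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Filter:
    """Hypothetical crawl filter: a field, a glob pattern, and a mode."""
    field: str
    pattern: str
    include: bool = True  # False turns this into an exclusion filter

    def passes(self, item: dict) -> bool:
        matched = fnmatchcase(str(item.get(self.field, "")), self.pattern)
        # An inclusion filter passes on a match;
        # an exclusion filter passes on a non-match.
        return matched if self.include else not matched


def is_retrieved(item: dict, filters: list[Filter]) -> bool:
    """An item is retrieved only if every configured filter passes it."""
    return all(f.passes(item) for f in filters)


# Illustrative settings (assumed values, not defaults of the product).
filters = [
    Filter("subgroup", "platform/*"),             # include only this subgroup tree
    Filter("branch", "archive-*", include=False), # exclude archived branches
    Filter("content_type", "issue"),              # include only issues
]

kept = {"subgroup": "platform/api", "branch": "main", "content_type": "issue"}
dropped = {"subgroup": "platform/api", "branch": "archive-2020", "content_type": "issue"}
```

In this sketch, is_retrieved(kept, filters) is true because all three filters pass, while is_retrieved(dropped, filters) is false because the branch exclusion filter alone rejects the item.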
By default, each external content connector can index up to ten million (10,000,000) content items from its source system. When a connector exceeds this limit, it continues to crawl the source system, but only sends content item deletions and updates to AI Search for indexing, ignoring new content items. The connector logs an error message for every 10,000 content items it crawls beyond the indexing limit.
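The indexing-limit behavior described above can be sketched as follows. The function names and the change-type labels ("new", "update", "delete") are hypothetical, and the sketch assumes that new items stop being sent once the indexed count reaches the limit; the actual connector logic may differ in detail.

```python
INDEX_LIMIT = 10_000_000  # default indexing limit stated above
LOG_INTERVAL = 10_000     # one error per this many items crawled beyond the limit


def should_send(change_type: str, indexed_count: int, limit: int = INDEX_LIMIT) -> bool:
    """Return True if a crawled change would be sent to AI Search for indexing.

    Assumption: below the limit everything is sent; at or above it,
    only updates and deletions are sent and new items are ignored.
    """
    if indexed_count < limit:
        return True
    return change_type in ("update", "delete")


def errors_logged(items_beyond_limit: int, interval: int = LOG_INTERVAL) -> int:
    """Number of error messages logged for items crawled past the limit."""
    return items_beyond_limit // interval
```

Under these assumptions, a connector that crawls 25,000 items beyond the limit would log two error messages, and it would keep sending updates and deletions for items that are already indexed.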
When a connector's indexed content item count exceeds 800,000, a warning message appears in the connector's UI to indicate that it's approaching the indexing limit. If the connector reaches the indexing limit, an error message appears in its UI.
External content connectors that support user permissions crawls can retrieve up to five hundred thousand (500,000) users.
If one of your connectors reaches the content indexing limit, you can update its crawl settings and file inclusion/exclusion filters to reduce the number of content items it retrieves. Alternatively, if you need a connector to index more than 10,000,000 content items or to retrieve more than 500,000 users, you can create a Customer Service and Support case at https://support.servicenow.com/now to request a limit increase for the connector.
Procedure
Result
The GitLab external content connector is updated with your modified crawl settings.
What to do next
To retrieve content from your GitLab source system using your modified crawl settings, create and run a one-time content crawl for your GitLab external content connector. To learn about creating and running one-time content crawls, see Create a content crawl for an external content connector.