[Artificial Intelligence] Features of Azure AI Search

Azure AI Search provides information retrieval and uses optional AI integration to extract more text and structure from content.

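The following tables summarize features by category. For more information about how Azure AI Search compares with other search technologies, see Compare search options.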

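There is feature parity across all Azure public, private, and sovereign clouds, but some features aren't supported in specific regions. For more information, see Product availability by region.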

Note

Looking for preview features? See the preview features list.

Indexing features

Category	Features
Data sources	Search indexes can accept text from any source, provided it's submitted as a JSON document. Indexers are a feature that automates data import from supported data sources to extract searchable content in primary data stores. Indexers handle JSON serialization for you and most support some form of change and deletion detection. You can connect to a variety of data sources, including OneLake, Azure SQL Database, Azure Cosmos DB, or Azure Blob storage.
Hierarchical and nested data structures	Complex types and collections allow you to model virtually any type of JSON structure within a search index. One-to-many and many-to-many cardinality can be expressed natively through collections, complex types, and collections of complex types.
Linguistic analysis	Analyzers are components used for text processing during indexing and search operations. By default, you can use the general-purpose Standard Lucene analyzer, or override the default with a language analyzer, a custom analyzer that you configure, or another predefined analyzer that produces tokens in the format you require. Language analyzers from Lucene or Microsoft are used to intelligently handle language-specific linguistics including verb tenses, gender, irregular plural nouns (for example, 'mouse' vs. 'mice'), word decompounding, word-breaking (for languages with no spaces), and more. Custom lexical analyzers are used for complex query forms such as phonetic matching and regular expressions.
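
The schema concepts in the table above (JSON documents, complex types, collections, and per-field analyzers) can be sketched as a plain Python dictionary shaped like a REST index definition. This is an illustration under assumptions, not a reference schema: the index name `hotels-sketch`, every field name, and the helper `field_paths` are hypothetical, and the `Edm.*` type names and the `en.microsoft` analyzer name should be checked against the current REST API reference before use.

```python
import json

# Hypothetical index definition; the index and field names are illustrative only.
index_definition = {
    "name": "hotels-sketch",
    "fields": [
        {"name": "hotelId", "type": "Edm.String", "key": True, "filterable": True},
        # A language analyzer overrides the default Standard Lucene analyzer.
        {"name": "description", "type": "Edm.String", "searchable": True,
         "analyzer": "en.microsoft"},
        # A simple collection: one document holds many tags.
        {"name": "tags", "type": "Collection(Edm.String)", "searchable": True,
         "filterable": True, "facetable": True},
        # A complex type: a nested object with its own subfields.
        {"name": "address", "type": "Edm.ComplexType", "fields": [
            {"name": "city", "type": "Edm.String", "filterable": True},
            {"name": "country", "type": "Edm.String", "filterable": True},
        ]},
        # A collection of complex types models a one-to-many relationship.
        {"name": "rooms", "type": "Collection(Edm.ComplexType)", "fields": [
            {"name": "type", "type": "Edm.String", "searchable": True},
            {"name": "baseRate", "type": "Edm.Double", "filterable": True},
        ]},
    ],
}

# A document that fits the schema; any source works once it is serialized as JSON.
document = {
    "hotelId": "1",
    "description": "Quiet hotel near the old town.",
    "tags": ["wifi", "parking"],
    "address": {"city": "Lisbon", "country": "Portugal"},
    "rooms": [
        {"type": "Single", "baseRate": 89.0},
        {"type": "Suite", "baseRate": 210.0},
    ],
}


def field_paths(fields, prefix=""):
    """Flatten nested field definitions into slash-separated paths."""
    paths = []
    for field in fields:
        path = prefix + field["name"]
        paths.append(path)
        paths.extend(field_paths(field.get("fields", []), path + "/"))
    return paths


serialized = json.dumps(document)
all_paths = field_paths(index_definition["fields"])
```

Flattening the definition into paths such as `rooms/baseRate` shows how nested subfields are addressed once a complex type is part of the schema.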

Vector and hybrid search

Category	Features
Vector indexing	Within a search index, add vector fields to support vector search scenarios. Vector fields can coexist with nonvector fields in the same search document.
Vector queries	Formulate single and multiple vector queries.
Vector search algorithms	Use Hierarchical Navigable Small World (HNSW) or exhaustive K-Nearest Neighbors (KNN) to find similar vectors in a search index.
Vector filters	Apply filters before or after query execution for greater precision during information retrieval.
Hybrid information retrieval	Search for concepts and keywords in a single hybrid query request. Hybrid search consolidates vector and text search, with optional semantic ranking and relevance tuning for best results.
Integrated data chunking and vectorization (preview)	Native data chunking through Text Split skill. Native vectorization through vectorizers and embedding skills such as AzureOpenAIEmbeddingModel, Azure AI Vision multimodal, and the AML skill that you can use to connect to endpoints in the Azure AI Studio model catalog. Integrated vectorization (preview) provides an end-to-end indexing pipeline from source files to queries.
Integrated vector compression and quantization	Use built-in scalar quantization to reduce vector index size in memory and on disk. You can also forego storage of vectors you don't need, or assign narrow data types to vector fields for reduced storage requirements.
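
To make the vector rows above more concrete, here is a minimal pure-Python sketch of three ideas: exhaustive KNN (every vector is scored, with no approximation as in HNSW), a filter applied before scoring, and scalar quantization of floats to 8-bit integers. It illustrates the techniques only and does not show how the service implements them. All function names, the sample documents, and the choice of cosine similarity are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exhaustive_knn(query, documents, k, prefilter=None):
    """Score every candidate against the query and return the k best ids."""
    # Pre-filtering narrows the candidate set before any similarity is computed.
    candidates = [d for d in documents if prefilter is None or prefilter(d)]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(query, d["vector"]),
        reverse=True,
    )
    return [d["id"] for d in ranked[:k]]


def quantize_int8(vector, low, high):
    """Scalar quantization: map floats in [low, high] to integers in [-128, 127]."""
    scale = 255 / (high - low)
    return [max(-128, min(127, round((x - low) * scale) - 128)) for x in vector]


# Hypothetical two-dimensional embeddings; real embeddings have many more dimensions.
documents = [
    {"id": "a", "category": "x", "vector": [1.0, 0.0]},
    {"id": "b", "category": "y", "vector": [0.9, 0.1]},
    {"id": "c", "category": "x", "vector": [0.0, 1.0]},
]
query = [1.0, 0.0]

nearest = exhaustive_knn(query, documents, k=2)
nearest_in_x = exhaustive_knn(
    query, documents, k=2, prefilter=lambda d: d["category"] == "x"
)
compact = quantize_int8([-1.0, 0.0, 1.0], low=-1.0, high=1.0)
```

Comparing `nearest` with `nearest_in_x` shows why filter placement matters: filtering first guarantees k results from the matching subset, whereas filtering after ranking could return fewer.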

Applied AI and knowledge mining

Category	Features
AI processing during indexing	AI enrichment refers to embedded image and natural language processing in an indexer pipeline that extracts text and information from content that can't otherwise be indexed for full text search. AI processing is achieved by adding and combining skills in a skillset, which is then attached to an indexer. Skills can be either built-in skills from Microsoft, such as text translation or Optical Character Recognition (OCR), or custom skills that you provide.
Storing enriched content for analysis and consumption in non-search scenarios	Knowledge store is persistent storage of enriched content, intended for non-search scenarios like knowledge mining and data science processing. A knowledge store is defined in a skillset, but created in Azure Storage as objects or tabular rowsets.
Cached enrichments	Incremental enrichment (preview) refers to cached enrichments that can be reused during skillset execution. Caching is particularly valuable in skillsets that include OCR and image analysis, which are expensive to process.
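
The idea of combining skills in a skillset can be sketched as a Python dictionary shaped like a skillset definition, where the output of one skill feeds the input of the next. Treat this as a sketch under assumptions: the skillset name, the `ocrText` target name, and the helper `unresolved_inputs` are hypothetical, and the `@odata.type` values and input/output names should be verified against the built-in skills reference. The helper also assumes skills run in list order, which is a simplification for illustration.

```python
# Hypothetical skillset: OCR text out of images, then translate that text.
skillset = {
    "name": "enrichment-sketch",
    "description": "Extract text from images with OCR, then translate it",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
            "context": "/document/normalized_images/*",
            "inputs": [
                {"name": "image", "source": "/document/normalized_images/*"}
            ],
            "outputs": [{"name": "text", "targetName": "ocrText"}],
        },
        {
            "@odata.type": "#Microsoft.Skills.Text.TranslationSkill",
            "context": "/document/normalized_images/*",
            "defaultToLanguageCode": "en",
            "inputs": [
                {"name": "text", "source": "/document/normalized_images/*/ocrText"}
            ],
            "outputs": [
                {"name": "translatedText", "targetName": "translatedText"}
            ],
        },
    ],
}


def unresolved_inputs(definition, known_paths):
    """Return input sources that neither the document nor an earlier skill provides."""
    available = set(known_paths)
    missing = []
    for skill in definition["skills"]:
        for item in skill["inputs"]:
            if item["source"] not in available:
                missing.append(item["source"])
        # Each output becomes a new node under the skill's context.
        for item in skill["outputs"]:
            available.add(skill["context"] + "/" + item["targetName"])
    return missing


# Paths assumed to exist before any skill runs (produced by document cracking).
missing = unresolved_inputs(skillset, {"/document/normalized_images/*"})
```

An empty `missing` list means every skill input is satisfied either by the source document or by an upstream skill, which is the chaining behavior the table describes.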

Full text and other query forms

Category	Features
Free-form text search	Full-text search is a primary use case for most search-based apps. Queries can be formulated using a supported syntax. Simple query syntax provides logical operators, phrase search operators, suffix operators, and precedence operators. Full Lucene query syntax includes all operations in simple syntax, with extensions for fuzzy search, proximity search, term boosting, and regular expressions.
Relevance	Simple scoring is a key benefit of Azure AI Search. Scoring profiles are used to model relevance as a function of values in the documents themselves. For example, you might want newer products or discounted products to appear higher in the search results. You can also build scoring profiles using tags for personalized scoring based on customer search preferences you've tracked and stored separately. Semantic ranker is a premium feature that reranks results based on semantic relevance to the query. Depending on your content and scenario, it can significantly improve search relevance with minimal configuration or effort.
Geospatial search	Geospatial functions filter over and match on geographic coordinates. You can match on distance or by inclusion in a polygon shape.
Filters and facets	Faceted navigation is enabled through a single query parameter. Azure AI Search returns a faceted navigation structure you can use as the code behind a categories list, for self-directed filtering (for example, to filter catalog items by price-range or brand). Filters can be used to incorporate faceted navigation into your application's UI, enhance query formulation, and filter based on user- or developer-specified criteria. Create filters using the OData syntax.
User experience	Autocomplete can be enabled for type-ahead queries in a search bar. Search suggestions also work from partial text inputs in a search bar, but the results are actual documents in your index rather than query terms. Synonyms associate equivalent terms that implicitly expand the scope of a query, without the user having to provide the alternate terms. Hit highlighting applies text formatting to a matching keyword in search results. You can choose which fields return highlighted snippets. Sorting is offered for multiple fields via the index schema and then toggled at query time with a single search parameter. Paging and throttling your search results is straightforward with the finely tuned control that Azure AI Search offers over your search results.
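
Several rows above (full Lucene syntax, OData filters, geospatial matching, facets, highlighting, sorting, and paging) come together in a single query request. The following is a minimal sketch of such a request body built in Python, assuming the parameter names of the REST search request. The helpers `build_search_request` and `within_km`, and the fields `brand`, `price`, `description`, and `location`, are hypothetical.

```python
def within_km(field, longitude, latitude, km):
    """OData geo.distance filter; the point is written longitude first."""
    return f"geo.distance({field}, geography'POINT({longitude} {latitude})') le {km}"


def build_search_request(text, page, page_size, filters=()):
    """Assemble a search request body; paging is expressed through top and skip."""
    body = {
        "search": text,
        "queryType": "full",  # full Lucene syntax: fuzzy, proximity, boosting
        "count": True,
        "facets": ["brand", "price,interval:50"],
        "highlight": "description",
        "orderby": "price asc",
        "top": page_size,
        "skip": (page - 1) * page_size,
    }
    if filters:
        # Individual OData clauses are combined with a logical "and".
        body["filter"] = " and ".join(filters)
    return body


# Fuzzy match on "mouse" plus a boosted term, third page of ten results,
# limited to items under 200 within 5 km of a hypothetical point.
request = build_search_request(
    "description:mouse~1 AND wireless^2",
    page=3,
    page_size=10,
    filters=["price le 200", within_km("location", -122.13, 47.68, 5)],
)
```

Keeping filter clauses as separate strings until the last step makes it easier to add or remove facet selections from a UI without string surgery on a finished expression.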

Security features

Category	Features
Data encryption	Microsoft-managed encryption-at-rest is built into the internal storage layer and is irrevocable. Customer-managed encryption keys that you create and manage in Azure Key Vault can be used for supplemental encryption of indexes and synonym maps. For services created after August 1, 2020, CMK encryption extends to data on temporary disks, for full double encryption of indexed content.
Endpoint protection	IP rules for inbound firewall support allow you to set up IP ranges from which the search service accepts requests. Create a private endpoint using Azure Private Link to force all requests through a virtual network.
Inbound access	Role-based access control assigns roles to users and groups in Microsoft Entra ID for controlled access to search content and operations. You can also use key-based authentication if you don't have role assignments.
Outbound security (indexers)	Data access through private endpoints allows an indexer to connect to Azure resources that are protected through Azure Private Link. Data access using a trusted identity means that connection strings to external data sources can omit user names and passwords. When an indexer connects to the data source, the resource allows the connection if the search service was previously registered as a trusted service.
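
Two of the inbound controls above lend themselves to a short sketch: checking a caller's address against allowed IP ranges, and choosing between a role-based bearer token and an API key when building request headers. This is an illustration only, with assumptions stated: the function names are hypothetical, the IP ranges come from the reserved documentation blocks, and the firewall check models the concept rather than the service's own enforcement logic.

```python
import ipaddress


def is_allowed(client_ip, allowed_ranges):
    """Return True when the client address falls inside any allowed CIDR range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(r) for r in allowed_ranges)


def build_auth_headers(api_key=None, bearer_token=None):
    """Prefer a role-based token; fall back to key-based authentication."""
    headers = {"Content-Type": "application/json"}
    if bearer_token:
        headers["Authorization"] = "Bearer " + bearer_token
    elif api_key:
        headers["api-key"] = api_key
    else:
        raise ValueError("Provide either a bearer token or an API key")
    return headers


# Hypothetical firewall rule and callers.
rules = ["203.0.113.0/24"]
inside = is_allowed("203.0.113.25", rules)
outside = is_allowed("198.51.100.7", rules)

token_headers = build_auth_headers(bearer_token="<token>")
key_headers = build_auth_headers(api_key="<key>")
```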

Portal features

Category	Features
Tools for prototyping and inspection	Add index is an index designer in the portal that you can use to create a basic schema consisting of attributed fields and a few other settings. After saving the index, you can populate it using an SDK or the REST API to provide the data. Import data wizard creates indexes, indexers, skillsets, and data source definitions. If your data exists in Azure, this wizard can save you significant time and effort, especially for proof-of-concept investigation and exploration. Import and vectorize data creates a full indexing pipeline that includes data chunking and vectorization. The wizard creates all of the objects and configuration settings. Search explorer is used to test queries and refine scoring profiles. Create demo app is used to generate an HTML page that can be used to test the search experience. Debug Sessions is a visual editor that lets you debug a skillset interactively. It shows you dependencies, output, and transformations.
Monitoring and diagnostics	Enable monitoring features to go beyond the metrics-at-a-glance that are always visible in the portal. Metrics on queries per second, latency, and throttling are captured and reported in portal pages with no extra configuration required.

Programmability

Category	Features
REST	Service REST API is for data plane operations, including all operations related to indexing, queries, and AI enrichment. You can also use this API to retrieve system information and statistics. Management REST API is for service creation and provisioning through Azure Resource Manager. You can also use this API to manage keys and capacity.
Azure SDK for .NET	Azure.Search.Documents is for data plane operations, including all operations related to indexing, queries, and AI enrichment. You can also use this client library to retrieve system information and statistics. Microsoft.Azure.Management.Search is for service creation and provisioning through Azure Resource Manager. You can also use this API to manage keys and capacity.
Azure SDK for Java	com.azure.search.documents is for data plane operations, including all operations related to indexing, queries, and AI enrichment. You can also use this client library to retrieve system information and statistics. com.microsoft.azure.management.search is for service creation and provisioning through Azure Resource Manager. You can also use this API to manage keys and capacity.
Azure SDK for Python	azure-search-documents is for data plane operations, including all operations related to indexing, queries, and AI enrichment. You can also use this client library to retrieve system information and statistics. azure-mgmt-search is for service creation and provisioning through Azure Resource Manager. You can also use this API to manage keys and capacity.
Azure SDK for JavaScript/TypeScript	@azure/search-documents is for data plane operations, including all operations related to indexing, queries, and AI enrichment. You can also use this client library to retrieve system information and statistics. @azure/arm-search is for service creation and provisioning through Azure Resource Manager. You can also use this API to manage keys and capacity.
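
The split between data plane and management plane in the table above shows up directly in the endpoints each REST API uses. The sketch below builds one URL of each kind with the standard library only. It assumes the Azure public cloud host names and an API version of `2023-11-01`; the function names and all resource names are hypothetical, and sovereign clouds use different domains.

```python
from urllib.parse import quote

API_VERSION = "2023-11-01"  # assumed version; confirm against the current reference


def data_plane_url(service, index):
    """Query endpoint on the search service itself (indexing, queries, enrichment)."""
    return (
        f"https://{quote(service)}.search.windows.net"
        f"/indexes/{quote(index)}/docs/search?api-version={API_VERSION}"
    )


def management_url(subscription_id, resource_group, service):
    """Azure Resource Manager endpoint (provisioning, keys, capacity)."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{quote(subscription_id)}"
        f"/resourceGroups/{quote(resource_group)}"
        f"/providers/Microsoft.Search/searchServices/{quote(service)}"
        f"?api-version={API_VERSION}"
    )


# Hypothetical resource names for illustration.
query_url = data_plane_url("demo-svc", "hotels-sketch")
service_url = management_url(
    "00000000-0000-0000-0000-000000000000", "demo-rg", "demo-svc"
)
```

The SDK packages listed in the table wrap these same two surfaces, so the choice of package follows the same question: whether the task operates on content inside the service or on the service resource itself.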

Article URL

https://architect.pub

Published

Thursday, June 27, 2024 - 12:02

Last modified

Thursday, June 27, 2024 - 12:02
