Yuri's Blog


How Various Websites Explain Word Embeddings

https://www.ibm.com/topics/embedding

Embedding is a means of representing objects like text, images and audio as points in a continuous vector space where the locations of those points in space are semantically meaningful to machine learning (ML) algorithms.

https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/embeddings?tabs=console

An embedding is a special format of data representation that can be easily utilized by machine learning models and algorithms. The embedding is an information dense representation of the semantic meaning of a piece of text. Each embedding is a vector of floating point numbers, such that the distance between two embeddings in the vector space is correlated with semantic similarity between two inputs in the original format.
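Microsoft's point — that distance in the vector space correlates with semantic similarity — can be sketched with a toy cosine-similarity computation. The three-dimensional vectors below are invented for illustration; real embeddings have hundreds or thousands of learned dimensions:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for three words (values made up for illustration).
cat = [0.90, 0.80, 0.10]
dog = [0.85, 0.75, 0.20]
car = [0.10, 0.20, 0.90]

print(cosine_similarity(cat, dog))  # high: semantically close
print(cosine_similarity(cat, car))  # lower: semantically distant
```

Cosine similarity is the metric most embedding APIs recommend because it ignores vector magnitude and compares only direction.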

https://aws.amazon.com/cn/what-is/embeddings-in-machine-learning/

Embeddings are numerical representations of real-world objects that machine learning (ML) and artificial intelligence (AI) systems use to understand complex knowledge domains the way humans do. For example, computational algorithms understand that the difference between 2 and 3 is 1, which indicates that 2 and 3 are more closely related than 2 and 100. But real-world data contains more complex relationships: a bird's nest and a lion's den are an analogous pair, while day and night are opposites. Embeddings convert real-world objects into complex mathematical representations that capture the inherent properties and relationships of real-world data. The whole process is automated — an AI system creates its embeddings during training and uses them as needed to complete new tasks.

Machine learning models cannot interpret information in its raw format and require numerical data as input. They use neural network embeddings to convert real-world information into numerical representations called vectors. Vectors are numerical values that represent information in a multi-dimensional space, and they help ML models find similarities among sparsely distributed items.
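Finding "similarities among sparsely distributed items" in a multi-dimensional space amounts to a nearest-neighbor search over vectors. A minimal sketch, with made-up 2-D embeddings echoing the bird-nest / lion-den example above:

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points in the vector space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 2-D embeddings (real systems learn these during training).
embeddings = {
    "bird_nest": [0.80, 0.90],
    "lion_den":  [0.70, 0.85],
    "day":       [0.10, 0.20],
    "night":     [0.15, 0.10],
}

def nearest(word, table):
    """Return the item closest to `word` in embedding space, excluding itself."""
    return min((k for k in table if k != word),
               key=lambda k: euclidean(table[k], table[word]))

print(nearest("bird_nest", embeddings))  # lion_den
```

Production systems replace this linear scan with approximate nearest-neighbor indexes, but the geometric idea is the same.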

https://openai.com/index/introducing-text-and-code-embeddings/

Embeddings are numerical representations of concepts converted to number sequences, which make it easy for computers to understand the relationships between those concepts.

Embeddings that are numerically similar are also semantically similar.

https://huggingface.co/blog/getting-started-with-embeddings

An embedding is a numerical representation of a piece of information, for example, text, documents, images, audio, etc. The representation captures the semantic meaning of what is being embedded, making it robust for many industry applications.

… Since this list captures the meaning, we can do exciting things, like calculating the distance between different embeddings to determine how well the meaning of two sentences matches.

https://datascience.stackexchange.com/a/101720

Although the word thus originally meant the mapping from one space to another, it has metonymically shifted to mean the resulting dense vector in the latent space, and it is in this sense that we currently use the word.

《大语言模型》 (Large Language Models), pp. 2–3, https://llmbook-zh.github.io/

In an early work [6], Turing Award winner Yoshua Bengio introduced the concept of distributed word representation and built a target-word prediction function based on aggregated context features (i.e., distributed word vectors). Distributed word representations use low-dimensional dense vectors to capture word semantics, which differs fundamentally from the sparse, vocabulary-space-based one-hot representation and can express much richer latent semantic features. Moreover, the non-zero entries of dense vectors make them well suited to building complex language models and effectively overcome the data-sparsity problem of statistical language models. Distributed word vectors are also known as "word embeddings" (Word Embedding).
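The contrast between sparse one-hot vectors and dense distributed vectors can be made concrete in a few lines. The dense values below are invented for illustration; in practice they are learned during training:

```python
vocab = ["king", "queen", "apple"]

def one_hot(word):
    """Sparse representation: a single 1 at the word's vocabulary index."""
    return [1.0 if w == word else 0.0 for w in vocab]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Any two distinct one-hot vectors are orthogonal (dot product 0), so this
# representation carries no notion of similarity between words at all.
print(dot(one_hot("king"), one_hot("queen")))  # 0.0

# Dense (distributed) vectors: low-dimensional, mostly non-zero, learned so
# that related words end up close together (values hypothetical).
dense = {
    "king":  [0.52, 0.83, -0.11],
    "queen": [0.48, 0.80, -0.05],
    "apple": [-0.60, 0.10, 0.75],
}
print(dot(dense["king"], dense["queen"]) > dot(dense["king"], dense["apple"]))  # True
```

Note also the dimensionality: a one-hot vector grows with the vocabulary (often 10⁵ entries or more), while a dense embedding stays at a fixed, small width regardless of vocabulary size.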