Large Concept Models

In modern large language models (LLMs), a tokenizer breaks a given input (i.e., a prompt) into smaller units called tokens. For example, the sentence "The OEM's vehicle features are defined by software" would be split into tokens, roughly one token per word. The LLM processes these tokens and generates a response over the same tokenized vocabulary. This approach, however, differs from how humans analyze information. Humans apply multiple levels of abstraction, such as visual representations like schematic diagrams.
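To make the tokenization step concrete, here is a minimal sketch using the Hugging Face transformers library; GPT-2 is chosen purely as an illustrative tokenizer, not because it matches any particular production model:

```python
# Minimal tokenization sketch (assumes the Hugging Face `transformers`
# package is installed; GPT-2 is used only as an illustrative tokenizer).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

sentence = "The OEM's vehicle features are defined by software"
tokens = tokenizer.tokenize(sentence)   # subword strings
ids = tokenizer.encode(sentence)        # integer IDs the model actually sees

print(tokens)  # e.g. ['The', 'ĠOEM', "'s", 'Ġvehicle', 'Ġfeatures', ...]
print(ids)
```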

This is similar to creating a presentation slide deck, where visuals often replace dense blocks of text. These layers of abstraction can be referred to as concepts. According to the paper Large Concept Models: Language Modeling in a Sentence Representation Space, a concept represents an idea or action that is not tied to a specific language or modality. The paper provides an illustrative example of this idea.

Figure from the paper (source: arXiv:2412.08821v2).

One key advantage of this approach is improved handling of long-context information: because large amounts of text are compressed into a much smaller set of concepts, processing becomes more efficient. Additionally, concepts enable hierarchical reasoning, which is particularly useful in image-processing tasks, where relationships between elements must be understood at different levels.
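As a rough, back-of-the-envelope illustration of this compression (the toy document, the 1.3 tokens-per-word ratio, and the crude sentence splitter below are assumptions for illustration, not figures from the paper):

```python
# Rough illustration of sequence-length reduction when operating on
# sentence-level "concepts" instead of tokens.
import re

document = " ".join(["This is a sentence about software-defined vehicles."] * 100)

n_concepts = len(re.findall(r"[.!?]", document))      # ~100 sentences/concepts
n_tokens = int(len(document.split()) * 1.3)           # crude estimate: ~1.3 tokens per word

print(f"tokens:   ~{n_tokens}")
print(f"concepts: ~{n_concepts}")
print(f"sequence-length reduction: ~{n_tokens / n_concepts:.0f}x")
```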

Concepts can be viewed as a form of compression, where words (or word vectors) are mapped into a concept space through dimensionality reduction. This transformation can also be achieved using neural networks, leading to what is known as neural compression (though this topic is beyond the scope of this discussion).
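As a simplified sketch of this mapping, the snippet below embeds sentences with an off-the-shelf encoder and projects the vectors into a lower-dimensional space with PCA. The encoder, the two-dimensional target, and PCA itself are illustrative stand-ins; the paper works in a fixed sentence-embedding space (SONAR) rather than a PCA-reduced one:

```python
# Sketch: mapping sentences into a lower-dimensional "concept space".
# Assumes `sentence-transformers` and `scikit-learn` are installed;
# the encoder and the 2-dim target are toy choices for illustration.
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA

sentences = [
    "The OEM's vehicle features are defined by software.",
    "Over-the-air updates let manufacturers ship new functions after delivery.",
    "A schematic diagram conveys the same idea without dense text.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim sentence embeddings
embeddings = encoder.encode(sentences)             # shape: (3, 384)

pca = PCA(n_components=2)                          # toy dimensionality reduction
concepts = pca.fit_transform(embeddings)           # shape: (3, 2)

print(concepts)
```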

Now, let's consider inference. In a traditional LLM, a sequence of tokens is used to predict the next token; here, the sequence of concepts predicts the next concept instead. The paper illustrates this with diagrams and further expands on techniques such as diffusion (de-noising) and multi-tower (separation-of-concerns) architectures.

Figure from the paper (source: arXiv:2412.08821v2).
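To make the inference step tangible, here is a minimal PyTorch sketch of next-concept prediction: a small autoregressive model consumes a sequence of concept vectors and regresses the next one. The tiny GRU, the dimensions, and the MSE objective are illustrative assumptions, not the transformer-decoder, diffusion, or quantized variants described in the paper:

```python
# Minimal sketch of next-concept prediction: given a sequence of concept
# vectors, predict the vector of the next concept. Model, dimensions, and
# loss are illustrative assumptions only.
import torch
import torch.nn as nn

CONCEPT_DIM = 256  # assumed embedding size of one "concept" (sentence)

class NextConceptPredictor(nn.Module):
    def __init__(self, dim: int = CONCEPT_DIM, hidden: int = 512):
        super().__init__()
        self.rnn = nn.GRU(dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, dim)

    def forward(self, concept_seq: torch.Tensor) -> torch.Tensor:
        # concept_seq: (batch, seq_len, CONCEPT_DIM)
        hidden_states, _ = self.rnn(concept_seq)
        return self.head(hidden_states[:, -1])  # predict the next concept vector

model = NextConceptPredictor()

# Toy data: 4 "documents", each a sequence of 7 concept vectors,
# with the 8th concept as the regression target.
seq = torch.randn(4, 7, CONCEPT_DIM)
target = torch.randn(4, CONCEPT_DIM)

pred = model(seq)
loss = nn.functional.mse_loss(pred, target)
loss.backward()
print(f"predicted concept shape: {tuple(pred.shape)}, loss: {loss.item():.4f}")
```

At inference time, the predicted concept vector would then be decoded back into text by a frozen sentence decoder, which is how generated concepts become readable sentences again.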
