1 repository
Visualizing high-dimensional document vectors in 2D space to analyze cluster distribution.
Distinct from 2D Histogram Visualizers: that category covers histograms and 2D game assets, not vector-space projections of documents.
Explore 1 GitHub repository matching Data & Databases · Document Embedding Projections.
BERTopic is a topic modeling library used to extract interpretable themes from collections of text documents and images. It functions as a document clustering framework that transforms unstructured data into numerical vectors to group semantically similar content. The project distinguishes itself through a multimodal embedding tool that allows for joint clustering of text and images in a shared vector space. It also features a class-based TF-IDF representation engine to identify representative words for clusters and an integrated system for using large language models to generate natural lang
Plots individual documents in a 2D plane to verify topic assignments and cluster distributions.
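As a rough illustration of what plotting documents in a 2D plane involves, the sketch below projects a handful of toy document vectors onto their two leading principal directions using plain Python. This is not BERTopic's code: the vectors, the labels, and the helper names (`center`, `top_component`, `deflate`, `project_2d`) are all hypothetical, and PCA by power iteration is used here only as a simple, dependency-free stand-in for whatever dimensionality reduction a real library applies.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def center(vectors):
    """Subtract the per-dimension mean from every vector."""
    n, dim = len(vectors), len(vectors[0])
    means = [sum(v[j] for v in vectors) / n for j in range(dim)]
    return [[v[j] - means[j] for j in range(dim)] for v in vectors]


def top_component(rows, iterations=200, seed=0):
    """Estimate the leading principal direction by power iteration on X^T X."""
    rng = random.Random(seed)
    dim = len(rows[0])
    w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(w, w))
    w = [x / norm for x in w]
    for _ in range(iterations):
        # Compute X^T (X w) without building the covariance matrix.
        scores = [dot(r, w) for r in rows]
        w_new = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(dot(w_new, w_new))
        if norm == 0.0:
            break  # no variance left in the data
        w = [x / norm for x in w_new]
    return w


def deflate(rows, w):
    """Remove the component along w from every row."""
    return [[r[j] - dot(r, w) * w[j] for j in range(len(w))] for r in rows]


def project_2d(vectors):
    """Map each vector to (x, y) coordinates on the first two principal axes."""
    rows = center(vectors)
    w1 = top_component(rows, seed=0)
    w2 = top_component(deflate(rows, w1), seed=1)
    return [(dot(r, w1), dot(r, w2)) for r in rows]


# Hypothetical 5-dimensional document vectors forming two loose groups.
docs = {
    "cats-1": [0.9, 0.8, 0.1, 0.0, 0.1],
    "cats-2": [1.0, 0.9, 0.0, 0.1, 0.0],
    "cats-3": [0.8, 1.0, 0.1, 0.1, 0.0],
    "cars-1": [0.1, 0.0, 0.1, 0.9, 1.0],
    "cars-2": [0.0, 0.1, 0.0, 1.0, 0.9],
    "cars-3": [0.1, 0.1, 0.1, 0.8, 0.9],
}

labels = list(docs.keys())
coords = project_2d(list(docs.values()))

for label, (x, y) in zip(labels, coords):
    print(f"{label}: x={x:+.3f}, y={y:+.3f}")
```

In this toy setup the two groups should land on opposite sides of the first axis, which is the kind of visual check a document projection is meant to support. The resulting `coords` could be handed to any scatter-plot tool, with points colored by assigned topic.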