Will DePue
left a comment
Today, I'm announcing Alexandria, an open-source initiative to embed the internet.
To start, we're releasing the embeddings for every research paper on arXiv. That's over 4m items, 600m tokens, and 3.07 billion vector dimensions.
We're not stopping here.
A significant number of the world's problems are just search, clustering, recommendation, or classification; all things embeddings are...
The Alexandria Index
Massive internet datasets, embedded, open-sourced and free
Vector embeddings are incredibly powerful, compressing text into a low-dimensional 'meaning space', but they're also shockingly cheap. Alexandria is an open-source project to embed and release (for free) large datasets in research, law, medicine and more.
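As a rough illustration of why search becomes simple once text lives in a 'meaning space', here is a minimal sketch of nearest-neighbor lookup by cosine similarity. The tiny 3-dimensional vectors, the paper titles, and the helper names `cosine_similarity` and `top_k` are all made up for this example; they are not taken from the Alexandria release, whose vectors are far larger and whose file format is not described here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, corpus, k=2):
    """Return the k titles whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query, vec), title) for title, vec in corpus.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:k]]

# Hypothetical toy embeddings (real ones have hundreds of dimensions).
corpus = {
    "paper on graph neural networks": [0.9, 0.1, 0.0],
    "paper on protein folding": [0.0, 0.2, 0.9],
    "paper on message passing in graphs": [0.8, 0.3, 0.1],
}

# Hypothetical embedding of a query such as "learning on graphs".
query = [1.0, 0.2, 0.0]

results = top_k(query, corpus, k=2)
print(results)
```

The two graph-related titles rank above the protein-folding one because their vectors point in nearly the same direction as the query. Clustering, recommendation, and classification can be built on the same similarity measure. At the scale of millions of vectors, an approximate nearest-neighbor index would likely replace this brute-force loop.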