GenAI Architectures

LLM Architecture Summaries

https://a16z.com/emerging-architectures-for-llm-applications/
https://github.com/ray-project/llm-numbers - Numbers every LLM Developer should know

Architecture Deep Dives

TODO

Cloud Offerings

https://aws.amazon.com/generative-ai/

Vector Databases

Overview: https://book.premai.io/state-of-open-source-ai/vector-db/
https://weaviate.io/
https://www.pinecone.io/
https://qdrant.tech/ & https://cloud.qdrant.io/ & https://github.com/qdrant/qdrant
https://milvus.io/ - Vector database built for scalable similarity search
https://github.com/facebookresearch/faiss - FAISS, a library for efficient similarity search and clustering of dense vectors
! https://github.com/chroma-core/chroma & https://www.trychroma.com/ & https://blog.langchain.dev/langchain-chroma/ & https://twitter.com/atroyn/status/1628557389762007040
? https://github.com/marqo-ai/marqo & https://www.marqo.ai/
? https://github.com/activeloopai/deeplake
https://opensearch.org/platform/search/vector-database.html
https://redis.com/solutions/use-cases/vector-database/
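
Note: the core operation all of the vector databases above provide is nearest-neighbour search over embedding vectors. A minimal sketch of the idea, assuming brute-force cosine similarity over a tiny in-memory index. The names (cosine_similarity, top_k, TOY_INDEX) and the three-dimensional vectors are made up for illustration and are not any product's API; real systems use approximate indexes rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Score every stored vector against the query and return the k best (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy index: document id -> embedding (real embeddings have hundreds of dimensions).
TOY_INDEX = {
    "doc-cats": [0.9, 0.1, 0.0],
    "doc-dogs": [0.8, 0.3, 0.1],
    "doc-tax": [0.0, 0.2, 0.9],
}

results = top_k([1.0, 0.0, 0.0], TOY_INDEX, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

The full scan is O(n) per query, which is the cost that dedicated vector indexes are designed to avoid on large collections.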

Embeddings

https://txt.cohere.com/embedding-archives-wikipedia/ & https://huggingface.co/Cohere?ref=txt.cohere.com

RAG (Retrieval-Augmented Generation)

https://arxiv.org/abs/2005.11401 - Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (arXiv paper)
https://github.com/intellabs/fastrag - Build and explore efficient retrieval-augmented generative models and applications
https://haystack.deepset.ai/tutorials/22_pipeline_with_promptnode - Tutorial: Creating a Generative QA Pipeline with Retrieval-Augmentation
https://github.com/whitead/paper-qa - LLM Chain for answering questions from documents with citations
https://gpt-index.readthedocs.io/en/latest/end_to_end_tutorials/low_level/root.html - Building RAG from Scratch
https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1

Function calling / Schema enforcement

Fine-tuning

Ludwig.ai, Classification example, Predibase, GitHub - declarative deep learning framework
Horovod.ai, GitHub - distributed deep learning training framework
Replicate:
- https://replicate.com/docs/guides/fine-tune-a-language-model
- https://replicate.com/blog/fine-tune-llama-2

Hallucinations

https://www.pinecone.io/learn/ai-hallucinations/
https://zapier.com/blog/ai-hallucinations/
https://arxiv.org/abs/2202.03629 - Survey of Hallucination in Natural Language Generation

Running models locally (macOS M1, etc.)

Articles

Tools

Serving Production LLMs

https://github.com/vllm-project/vllm - Easy, fast, and cheap LLM serving for everyone
https://haystack.deepset.ai/ - Open-source LLM framework to build production-ready applications
https://huggingface.co/blog/optimize-llm - Optimizing your LLM in production

Testing / AIOps / ChatOps

https://www.confident-ai.com/