GenAI Architectures
LLM Architecture Summaries
- https://a16z.com/emerging-architectures-for-llm-applications/
- https://github.com/ray-project/llm-numbers - Numbers every LLM Developer should know
Architecture Deep Dives
- TODO
Cloud Offerings
Vector Databases
- Overview: https://book.premai.io/state-of-open-source-ai/vector-db/
- https://weaviate.io/
- https://www.pinecone.io/
- https://qdrant.tech/ & https://cloud.qdrant.io/ & https://github.com/qdrant/qdrant
- https://milvus.io/ - Vector database built for scalable similarity search
- FAISS - A library for efficient similarity search and clustering of dense vectors.
- ! https://github.com/chroma-core/chroma & https://www.trychroma.com/ & https://blog.langchain.dev/langchain-chroma/ & https://twitter.com/atroyn/status/1628557389762007040
- ? https://github.com/marqo-ai/marqo & https://www.marqo.ai/
- ? https://github.com/activeloopai/deeplake
- https://opensearch.org/platform/search/vector-database.html
- https://redis.com/solutions/use-cases/vector-database/
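What these databases and libraries have in common is similarity search over dense vectors. The following is a minimal brute-force sketch of that idea in plain Python, not how any of the listed products work internally (they rely on approximate indexes to scale). The names `cosine_similarity`, `top_k`, and `toy_index` are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Score every stored vector against the query and return the k best (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings"; real embeddings have hundreds or thousands of dimensions.
toy_index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

results = top_k([1.0, 0.0, 0.0], toy_index, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 4))
```

The exhaustive scan is linear in the number of stored vectors, which is the cost that dedicated vector indexes are designed to avoid.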
Embeddings
- https://txt.cohere.com/embedding-archives-wikipedia/ & https://huggingface.co/Cohere?ref=txt.cohere.com
RAG (Retrieval-Augmented Generation)
- arXiv paper: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (https://arxiv.org/abs/2005.11401)
- https://github.com/intellabs/fastrag - Build and explore efficient retrieval-augmented generative models and applications
- https://haystack.deepset.ai/tutorials/22_pipeline_with_promptnode
- https://github.com/whitead/paper-qa - LLM Chain for answering questions from documents with citations
- https://gpt-index.readthedocs.io/en/latest/end_to_end_tutorials/low_level/root.html - Building RAG from Scratch
- https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1
Function calling / Schema enforcement
- https://blog.simonfarshid.com/native-json-output-from-gpt-4
- https://github.com/openai/openai-cookbook/blob/main/examples/How_to_call_functions_with_chat_models.ipynb
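Whatever mechanism produces the structured output, the calling code usually still checks it before acting on it. The sketch below shows one way to validate a model's JSON arguments using only the standard library. The schema, the field names (`location`, `unit`), and the function `parse_arguments` are all hypothetical, and no model API is called: the raw string stands in for whatever the model returned.

```python
import json

# Hypothetical argument schema for an imagined "get_weather" function:
# field name -> expected Python type.
SCHEMA = {"location": str, "unit": str}
REQUIRED = {"location"}


def parse_arguments(raw, schema=SCHEMA, required=REQUIRED):
    """Parse a JSON string of arguments and reject anything that does not match the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = required - set(data)
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    for key, value in data.items():
        if key not in schema:
            raise ValueError(f"unexpected field: {key}")
        if not isinstance(value, schema[key]):
            raise ValueError(f"field {key!r} should be {schema[key].__name__}")
    return data


# Stand-in for a model response; a real one would come from the chat API.
args = parse_arguments('{"location": "Berlin", "unit": "celsius"}')
print(args["location"], args["unit"])
```

On a `ValueError`, a common pattern is to send the error message back to the model and ask it to retry, rather than failing the whole request.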
Fine-tuning
- Ludwig.ai, Classification example, Predibase, github - Tutorial: Creating a Generative QA Pipeline with Retrieval-Augmentation
- Horovod.ai, github - distributed deep learning training framework
- Replicate.ai
Hallucinations
- https://www.pinecone.io/learn/ai-hallucinations/
- https://zapier.com/blog/ai-hallucinations/
- https://arxiv.org/abs/2202.03629 - Survey of Hallucination in Natural Language Generation
Running models locally (macOS on Apple Silicon, e.g. M1)
Articles
Tools
- https://ollama.ai/library
- https://lmstudio.ai/ (Proprietary 😱)
- https://localai.io/
- https://github.com/oobabooga/text-generation-webui
- https://github.com/ggerganov/llama.cpp
- https://github.com/lostruins/koboldcpp
Serving Production LLMs
- https://github.com/vllm-project/vllm - Easy, fast, and cheap LLM serving for everyone
- https://haystack.deepset.ai/ - Open-source LLM framework to build production-ready applications
- https://huggingface.co/blog/optimize-llm - Optimizing your LLM in production