Abstract and keywords
Abstract (English):
The rise of generative artificial intelligence, from large language models to diffusion and adversarial architectures, has made the strategic role of data explicit. Models create new content, but the quality of their outputs depends on how training and auxiliary datasets are collected, stored, indexed, and delivered. Traditional relational DBMSs and widely used NoSQL stores remain the foundation of many projects; generative systems, however, add requirements of their own: storing vector representations (embeddings), performing similarity-based (semantic) retrieval, and sustaining high-throughput reads during inference. This monograph systematizes database architectures applicable to generative scenarios, reviews indexing methods and their integration with models, and analyzes risks and future directions.

Keywords:
generative models; databases; vector databases; embeddings; semantic search; RAG; indexing; machine learning; data security; federated learning