Amazing post, <3
Learned a lot.
Great 2nd week content as well! Learned about the repository pattern in this one, pretty neat!
Gonna try to implement the next items before the week 3 blog comes out.
Amazing! Really hope that you are enjoying it :)
I really did. I had been looking for a project to learn from and build for some time, and I'm glad I found these blogs at the right time. I also really appreciate how much best-practice work you put into each part of the project; that isn't something you see in most tutorials and blogs available online.
Why are the settings in a Python script rather than a YAML file? Is there any specific reasoning behind it?
We are using a .env file, so the settings are imported from there. The config file just validates the settings.
Even if we used YAML, we would still need a config to manage the settings that get imported.
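The reply above describes settings being read from a .env file, with the config module only validating them. A minimal Python sketch of that idea, using only the standard library, might look like the following. The names `parse_env_file`, `Settings`, `POSTGRES_HOST`, and `POSTGRES_PORT` are hypothetical and are not taken from the repository, which may well use a different library or layout.

```python
from dataclasses import dataclass


def parse_env_file(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


@dataclass(frozen=True)
class Settings:
    """Validated view of the raw values (hypothetical field names)."""

    postgres_host: str
    postgres_port: int

    @classmethod
    def from_mapping(cls, raw: dict) -> "Settings":
        host = raw.get("POSTGRES_HOST", "localhost")
        port_text = raw.get("POSTGRES_PORT", "5432")
        if not host:
            raise ValueError("POSTGRES_HOST must not be empty")
        try:
            port = int(port_text)
        except ValueError:
            raise ValueError(f"POSTGRES_PORT is not a number: {port_text!r}") from None
        if not 0 < port < 65536:
            raise ValueError(f"POSTGRES_PORT is out of range: {port}")
        return cls(postgres_host=host, postgres_port=port)
```

The same `Settings.from_mapping` step would still be needed if the raw values came from a YAML file instead, which is the point made in the reply: the file format only changes how the raw mapping is loaded, not the validation.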
Where can I get the code
It's here: https://github.com/jamwithai/arxiv-paper-curator
Thank you sir
Looking forward to seeing week 3
Let us know your feedback 😃
I'm interested in how you integrate OpenSearch into your RAG. Looking forward to that part of the series.
Thank you for creating this amazing hands-on project where we can learn production-level RAG.
I'm facing an issue while running the Week 2 notebook. I have set up the infrastructure as per Week 1 and have all services running via Docker, but the Airflow service does not start, even when I try to start it manually from Docker Desktop. Also, in part '4. Database Storage Testing', I'm getting a Postgres connection error: 'could not translate host name "postgres" to address: nodename nor servname provided, or not known'. Any suggestions for resolving this would be appreciated.
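One possible cause of the 'could not translate host name' error quoted above, offered as an assumption and not a confirmed diagnosis: a Docker Compose service name such as "postgres" normally resolves only from inside the Compose network, so a notebook running directly on the host machine cannot look it up. A small Python sketch of a fallback check follows. The function name `pick_db_host` and the "localhost" fallback are hypothetical, and the fallback only helps if the Postgres port is published to the host.

```python
import socket


def pick_db_host(service_name="postgres", fallback="localhost",
                 resolver=socket.gethostbyname):
    """Return service_name if it resolves, otherwise the fallback host.

    The resolver is injectable so the behavior can be checked without
    a running Docker network.
    """
    try:
        resolver(service_name)
        return service_name
    except OSError:
        # socket.gaierror (name resolution failure) is a subclass of OSError.
        return fallback
```

Calling `pick_db_host()` from the notebook and passing the result as the database host would show which of the two names is actually reachable from where the notebook runs.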
First of all, thanks for offering such great content for free. I was able to fetch 10 papers and run the whole pipeline, but I don't know why I can't fetch more papers when I try 100 or more. I just wanted to know whether there is a limit on how many papers we can fetch in a day.