Great & meaningful project... As a novice, I tried following your detailed writeup and faced some obstacles while installing and configuring the dependency packages (mainly sentence-transformers and torch), since for the Mac silicon processor there is a separate URL... but somehow I could install almost all of them. Finally, when I ran Welcome.py and tried to use the chat, I got no response... I checked that Ollama is installed with the stated version.
I checked the logs and found the error 'Unexpected chunk format in response stream.' Any quick help is appreciated.
Hey @rajashekar - thanks for raising the issue. Could you debug and check what is getting passed into the LLM?
For simplicity you could just print the output of the functions in the terminal.
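To make that debugging concrete, here is a minimal sketch of inspecting what comes back from Ollama, assuming the app streams NDJSON from the `/api/chat` endpoint (the `parse_chunk` helper and its error messages are illustrative, not the project's actual code):

```python
import json

def parse_chunk(raw_line: bytes) -> str:
    """Parse one NDJSON line from Ollama's /api/chat stream.

    A normal chunk carries the token text under chunk["message"]["content"];
    anything else is surfaced loudly so the source of the
    'Unexpected chunk format' error can be pinpointed.
    """
    chunk = json.loads(raw_line)
    if isinstance(chunk.get("message"), dict) and "content" in chunk["message"]:
        return chunk["message"]["content"]
    if "error" in chunk:
        raise ValueError(f"Ollama returned an error chunk: {chunk['error']}")
    raise ValueError(f"Unexpected chunk format: {chunk!r}")

# Usage against a running Ollama server:
# import requests
# resp = requests.post(
#     "http://127.0.0.1:11434/api/chat",
#     json={"model": "llama3.2:1b", "stream": True,
#           "messages": [{"role": "user", "content": "Hello"}]},
#     stream=True,
# )
# for line in resp.iter_lines():
#     if line:
#         print(parse_chunk(line), end="", flush=True)
```

Printing the raw `line` before parsing it would show exactly which chunk trips the error.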
Please find the log and debug output recorded below... The main problem is that I am not getting any output on the web interface.
Log
2025-01-23 22:43:01,273 - INFO - Applied custom CSS styling.
2025-01-23 22:43:01,355 - INFO - Logo displayed.
2025-01-23 22:43:01,355 - INFO - Displayed sidebar content.
2025-01-23 22:43:01,355 - INFO - Displayed main welcome content.
2025-01-23 22:43:08,142 - INFO - Custom CSS applied.
2025-01-23 22:43:08,143 - INFO - OpenSearch client initialized.
2025-01-23 22:43:08,144 - INFO - Index configuration loaded from src/index_config.json.
2025-01-23 22:43:08,155 - INFO - HEAD http://localhost:9200/documents [status:200 request:0.011s]
2025-01-23 22:43:08,155 - INFO - Index documents already exists.
2025-01-23 22:43:08,170 - INFO - Logo displayed.
2025-01-23 22:43:08,170 - INFO - Sidebar configured with headers and footer.
2025-01-23 22:43:08,171 - INFO - Loading embedding model from path: sentence-transformers/all-mpnet-base-v2
2025-01-23 22:43:08,180 - INFO - Use pytorch device_name: mps
2025-01-23 22:43:08,180 - INFO - Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2
2025-01-23 22:43:10,551 - INFO - HTTP Request: GET http://127.0.0.1:11434/api/tags "HTTP/1.1 200 OK"
2025-01-23 22:43:10,551 - INFO - Model llama3.2:1b not found locally. Pulling the model...
2025-01-23 22:43:11,572 - INFO - HTTP Request: POST http://127.0.0.1:11434/api/pull "HTTP/1.1 200 OK"
2025-01-23 22:43:11,573 - INFO - Model llama3.2:1b has been pulled and is now available locally.
2025-01-23 22:43:11,575 - INFO - Embedding model loaded.
2025-01-23 22:43:47,390 - INFO - Custom CSS applied.
2025-01-23 22:43:47,391 - INFO - OpenSearch client initialized.
2025-01-23 22:43:47,392 - INFO - Index configuration loaded from src/index_config.json.
2025-01-23 22:43:47,400 - INFO - HEAD http://localhost:9200/documents [status:200 request:0.007s]
2025-01-23 22:43:47,400 - INFO - Index documents already exists.
2025-01-23 22:43:47,422 - INFO - Logo displayed.
2025-01-23 22:43:47,422 - INFO - Sidebar configured with headers and footer.
2025-01-23 22:43:47,423 - INFO - User input received.
2025-01-23 22:43:47,423 - INFO - Performing hybrid search.
2025-01-23 22:43:47,579 - INFO - OpenSearch client initialized.
2025-01-23 22:43:47,614 - INFO - POST http://localhost:9200/documents/_search?search_pipeline=nlp-search-pipeline [status:200 request:0.034s]
2025-01-23 22:43:47,614 - INFO - Hybrid search completed for query 'Hello How are you doing?' with top_k=5.
2025-01-23 22:43:47,614 - INFO - Hybrid search completed.
2025-01-23 22:43:47,614 - INFO - Prompt constructed with context and conversation history.
2025-01-23 22:43:47,614 - INFO - Streaming response from LLaMA model.
2025-01-23 22:43:50,259 - INFO - HTTP Request: POST http://127.0.0.1:11434/api/chat "HTTP/1.1 200 OK"
2025-01-23 22:43:50,259 - ERROR - Unexpected chunk format in response stream.
2025-01-23 22:43:51,311 - INFO - Response generated and displayed.
Debug
aj9297@Mac RAG % streamlit run Welcome.py --logger.level=debug
2025-01-23 22:43:00.747 Starting server...
2025-01-23 22:43:00.747 Serving static content from /Users/raj9297/.pyenv/versions/3.13.0/lib/python3.13/site-packages/streamlit/static
2025-01-23 22:43:00.750 Server started on port 8501
2025-01-23 22:43:00.750 Runtime state: RuntimeState.INITIAL -> RuntimeState.NO_SESSIONS_CONNECTED
You can now view your Streamlit app in your browser.
Local URL: http://localhost:8501
Network URL: http://192.168.1.139:8501
For better performance, install the Watchdog module:
$ xcode-select --install
$ pip install watchdog
2025-01-23 22:43:00.803 Setting up signal handler
2025-01-23 22:43:01.251 AppSession initialized (id=43aca887-7b97-4146-9ce2-77ed0073d5c1)
2025-01-23 22:43:01.251 Created new session for client 4333124448. Session ID: 43aca887-7b97-4146-9ce2-77ed0073d5c1
2025-01-23 22:43:01.251 Runtime state: RuntimeState.NO_SESSIONS_CONNECTED -> RuntimeState.ONE_OR_MORE_SESSIONS_CONNECTED
2025-01-23 22:43:01.251 Received the following back message:
rerun_script {
widget_states {
}
}
2025-01-23 22:43:01.252 Beginning script thread
2025-01-23 22:43:01.252 Running script RerunData(widget_states=)
2025-01-23 22:43:01.252 Disconnecting files for session with ID 43aca887-7b97-4146-9ce2-77ed0073d5c1
2025-01-23 22:43:01.252 Sessions still active: dict_keys([])
2025-01-23 22:43:01.252 Files: 0; Sessions with files: 0
2025-01-23 22:43:01.354 Adding media file 4f088e3beb9f196b20cb6f917a4abe63685fc1c690d258ce1c093183
2025-01-23 22:43:01.355 Removing orphaned files...
2025-01-23 22:43:01.375 MediaFileHandler: GET 4f088e3beb9f196b20cb6f917a4abe63685fc1c690d258ce1c093183.png
2025-01-23 22:43:01.375 MediaFileHandler: Sending image/png file 4f088e3beb9f196b20cb6f917a4abe63685fc1c690d258ce1c093183.png
2025-01-23 22:43:01.382 Script run finished successfully; removing expired entries from MessageCache (max_age=2)
2025-01-23 22:43:04.683 Received the following back message:
rerun_script {
widget_states {
}
page_script_hash: "a3f8e21bde273a4e75ea4cbb45393401"
}
2025-01-23 22:43:04.683 Beginning script thread
2025-01-23 22:43:04.684 Running script RerunData(widget_states=, page_script_hash='a3f8e21bde273a4e75ea4cbb45393401')
2025-01-23 22:43:04.684 Disconnecting files for session with ID 43aca887-7b97-4146-9ce2-77ed0073d5c1
2025-01-23 22:43:04.684 Sessions still active: dict_keys([])
2025-01-23 22:43:04.684 Files: 1; Sessions with files: 0
2025-01-23 22:43:08.170 Creating new ResourceCache (key=8df00b2421b608697e107b7831789ed5)
2025-01-23 22:43:08.170 Cache key: d41d8cd98f00b204e9800998ecf8427e
2025-01-23 22:43:10.546 Creating new ResourceCache (key=fcd3a59ac6a479e9d2bb42fff2f14349)
2025-01-23 22:43:10.546 Cache key: 114e66cd853e340fbb5d5f29e6427994
2025-01-23 22:43:11.577 Removing orphaned files...
2025-01-23 22:43:11.718 Examining the path of torch.classes raised: Tried to instantiate class '__path__._path', but it does not exist! Ensure that it is registered via torch::class_
2025-01-23 22:43:11.741 Script run finished successfully; removing expired entries from MessageCache (max_age=2)
2025-01-23 22:43:47.386 Received the following back message:
rerun_script {
widget_states {
widgets {
id: "$$ID-5fb5d8410eab05a2d7b363c6444eec89-None"
bool_value: true
}
widgets {
id: "$$ID-623db747e6c57f859f77470f3dec4efe-None"
int_value: 5
}
widgets {
id: "$$ID-90479d2a325ba8899aef9831666ed39d-None"
double_array_value {
data: 0.7
}
}
widgets {
id: "$$ID-5c029bfbe44c6cddf90c5ecfb80ce815-None"
string_trigger_value {
data: "Hello How are you doing?"
}
}
}
page_script_hash: "a3f8e21bde273a4e75ea4cbb45393401"
}
2025-01-23 22:43:47.388 Beginning script thread
2025-01-23 22:43:47.388 Running script RerunData(widget_states=widgets {
id: "$$ID-5fb5d8410eab05a2d7b363c6444eec89-None"
bool_value: true
}
widgets {
id: "$$ID-623db747e6c57f859f77470f3dec4efe-None"
int_value: 5
}
widgets {
id: "$$ID-90479d2a325ba8899aef9831666ed39d-None"
double_array_value {
data: 0.7
}
}
widgets {
id: "$$ID-5c029bfbe44c6cddf90c5ecfb80ce815-None"
string_trigger_value {
data: "Hello How are you doing?"
}
}
, page_script_hash='a3f8e21bde273a4e75ea4cbb45393401')
2025-01-23 22:43:47.388 Disconnecting files for session with ID 43aca887-7b97-4146-9ce2-77ed0073d5c1
2025-01-23 22:43:47.388 Sessions still active: dict_keys([])
2025-01-23 22:43:47.388 Files: 1; Sessions with files: 0
2025-01-23 22:43:47.424 Cache key: d41d8cd98f00b204e9800998ecf8427e
Batches: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 6.56it/s]
2025-01-23 22:43:51.311 Removing orphaned files...
2025-01-23 22:43:51.446 Script run finished successfully; removing expired entries from MessageCache (max_age=2)
Hey! This is a pretty cool project. Thank you!
It got me thinking: since the model is going through a lot of docs and of course will handle them smartly, a nice option would be to also throw in a citation window?
It could indicate the line/paragraph/page number of the doc the answer is fetched from.
Happy to hear your thoughts on this.
Cheers!
Hey, yes definitely! There are sooo many ways you can change things! Maybe you can upgrade the project and create a PR ;)
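For anyone picking up the citation idea: if each chunk stored its origin at indexing time, the search step could surface it with every hit. A rough sketch over OpenSearch-style hit dicts (the `source_file` and `page` fields are hypothetical and would need to be added to the index mapping):

```python
def format_citations(hits):
    """Turn search hits into citation lines.

    hits: list of hit dicts as found under response["hits"]["hits"];
    assumes each chunk was indexed with hypothetical "source_file"
    and "page" fields in its _source.
    """
    lines = []
    for hit in hits:
        src = hit.get("_source", {})
        lines.append(
            f'[{src.get("source_file", "?")}, p. {src.get("page", "?")}] '
            f'score={hit.get("_score", 0):.2f}'
        )
    return "\n".join(lines)
```

The resulting lines could be rendered in a sidebar expander next to each answer.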
I use the paper from the government and government-adjacent entities to light fires to heat my cave.
But thanks this is cool.
Are we anywhere near to some open format for the semantic web?
My idea is that every web host should be able to vectorize their core content and serve it to the public in a way that can be used to build a modern 'semantic web'.
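One way that could look: each host publishes a manifest that names the embedding model it used and maps its URLs to vectors, so any client that embeds a query with the same model can search across hosts. The manifest shape and field names here are purely a made-up illustration, not an existing standard:

```python
def build_manifest(pages, embed,
                   model_name="sentence-transformers/all-mpnet-base-v2"):
    """Build a hypothetical 'semantic web' embedding manifest.

    pages: {url: text}; embed: callable mapping text -> list[float].
    Declaring the model name lets clients embed their queries the
    same way and compare against the published vectors.
    """
    return {
        "model": model_name,
        "vectors": {url: embed(text) for url, text in pages.items()},
    }

# A host would serve this dict as JSON at a well-known path,
# e.g. /.well-known/embeddings.json (again, an invented convention).
```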
Actually love this thought! I don't see it yet, but this would be amazing.
I was actually thinking from the user's perspective.
Imagine a future where we have a user persona as an embedding: everything about that user. It would be like their own credential, but they would be the owner of it and could plug and play to personalize content when interacting with the outside world.