Upload 7 files

- README.md +110 -12
- __init__.py +0 -0
- app.py +354 -64
- graph_demo_ui.py +87 -0
- requirements.txt +10 -1
- webui-test-graph.py +283 -0
- webui-test.py +354 -0
README.md
CHANGED
@@ -1,12 +1,110 @@
# Easy-RAG

A RAG (Retrieval-Augmented Generation) system built for learning, everyday use, and self-directed extension; it can also go online for AI-powered search!



Update history

2024/9/04 Added AI web search with live internet queries
2024/9/04 Optimized asynchronous calls in the web UI for faster responses
2024/8/21 Added Elasticsearch support; enable it in config
2024/7/23 Added a real-time knowledge-graph extraction tool modeled on the meet-libai project; it currently only extracts, without storing anything (graph_demo_ui.py)
2024/7/11 Added FAISS vector-database support; Chroma and FAISS are both supported now
2024/7/10 Updated the rerank retrieval mode
2024/7/09 First release



1. Features available today

Knowledge base (currently supports txt, csv, pdf, md, doc, docx, mp3, mp4, wav, and Excel files):

1. Create a knowledge base (Chroma, FAISS, and Elasticsearch are supported)
2. Update a knowledge base
3. Delete a single file from a knowledge base
4. Delete a knowledge base
5. Vectorize a knowledge base
6. Transcribe audio and video to text, then vectorize the transcript

Speech-to-text uses FunASR. On first launch the model is downloaded from ModelScope, which can be slow; after that the cached model loads automatically.
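As a rough sketch of what that transcription step looks like with FunASR (the model name and audio path below are illustrative assumptions; the project's own loader may differ):

```python
# Minimal FunASR speech-to-text sketch (assumed usage, not the project's
# exact wrapper). The model is pulled from ModelScope on first run and
# cached locally afterwards.
from funasr import AutoModel

asr = AutoModel(model="paraformer-zh")            # hypothetical model choice
result = asr.generate(input="example_audio.wav")  # placeholder audio path
print(result[0]["text"])                          # transcript, ready to embed
```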
Chat

1. Multi-turn chat with the plain LLM
2. Knowledge-base Q&A, with three recall modes: ["复杂召回方式" (complex recall), "简单召回方式" (simple recall), "rerank"]

AI web search

Web search is supported; you can tune the prompt to change how heavily results are summarized.
The LLM runs on Ollama, so different models can be selected.
Note: internet access is provided by SearxNG, so start that project first, locally or as a service; I run it with Docker.
See https://github.com/searxng/searxng-docker
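The web-search tab simply queries a SearxNG instance's JSON API. A quick way to confirm your instance is reachable (the URL matches the default used in app.py below; the query string is only an example):

```python
# Sanity-check a local SearxNG instance; app.py expects JSON output
# from http://localhost:8080/search.
import requests

resp = requests.get("http://localhost:8080/search",
                    params={"q": "Easy-RAG", "format": "json"})
resp.raise_for_status()
for hit in resp.json()["results"][:3]:
    print(hit["title"])
    print(hit["content"])
```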


3. Re-rank retrieved results with a rerank model to improve retrieval quality

The rerank mode uses the bge-reranker-large model; download it locally, then set its path in rag/rerank.py.
Model: https://hf-mirror.com/BAAI/bge-reranker-large
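For orientation, a cross-encoder rerank step usually looks something like the sketch below. This is an assumed implementation using sentence-transformers (which is not in requirements.txt), not necessarily what rag/rerank.py does, and the model path is a placeholder:

```python
# Hedged sketch of re-ranking with bge-reranker-large as a cross-encoder.
# rag/rerank.py may be implemented differently; the path is a placeholder.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("/path/to/bge-reranker-large")

def rerank(query, docs, top_k=3):
    # Score every (query, doc) pair, then keep the best-scoring documents.
    scores = reranker.predict([(query, d) for d in docs])
    ranked = sorted(zip(docs, scores), key=lambda p: p[1], reverse=True)
    return [d for d, _ in ranked[:top_k]]
```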

2. Roadmap

Knowledge base:

0. Support more vector databases such as Elasticsearch, Milvus, and MongoDB


Chat:

1. Add spoken (voice) answers
2. Route questions to the appropriate knowledge base automatically


Installation and usage

Install Ollama: pick the installer that fits your machine from the page below; setup is point-and-click.

https://ollama.com/download

Pull the two models the project needs; run these in a terminal:

ollama run qwen2:7b
ollama run mofanke/acge_text_embedding:latest

Download the bge-reranker-large model and set its path in rag/rerank.py:

https://hf-mirror.com/BAAI/bge-reranker-large

Choose the vector database you want to use (currently Chroma, FAISS, or Elasticsearch).

Configure your chosen vector database in Config/config.py.
If you choose Elasticsearch, start Elasticsearch first; I use Docker:

docker run -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" -e "xpack.security.http.ssl.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.12.1

Remember to update es_url.
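The selector in Config/config.py is a plain integer switch; the mapping below is taken from how app.py reads the config, while the exact values shown are placeholder assumptions:

```python
# Assumed shape of Config/config.py, inferred from the imports in app.py.
# VECTOR_DB picks the store: 1 = Chroma, 2 = FAISS, 3 = Elasticsearch.
VECTOR_DB = 1
DB_directory = "./chroma_db"        # persistence directory for Chroma/FAISS
es_url = "http://localhost:9200"    # only used when VECTOR_DB == 3
```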
Set up the Python environment:

conda create -n Easy-RAG python=3.10.9
conda activate Easy-RAG

The project was developed on Python 3.10.9; in testing, Python 3.8 and above all work.

git clone https://github.com/yuntianhe2014/Easy-RAG.git

Install the dependencies:

pip3 install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple

Deploy SearxNG, which the web-search feature depends on.
See https://github.com/searxng/searxng-docker

Start the project:

python webui.py

Real-time knowledge-graph extraction tool:

python graph_demo_ui.py



For more background, see the WeChat public account: 世界大模型



Project references:
https://github.com/BinNong/meet-libai
https://github.com/searxng/searxng-docker
__init__.py
ADDED
File without changes
app.py
CHANGED
@@ -1,64 +1,354 @@
````python
import gradio as gr
import threading
import asyncio
import logging
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache
import requests
import json

# These are the project's own modules; adjust them to your setup
from Config.config import VECTOR_DB, DB_directory
from Ollama_api.ollama_api import *
from rag.rag_class import *

# Logging setup
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Pick the vector database according to VECTOR_DB
if VECTOR_DB == 1:
    from embeding.chromadb import ChromaDB as vectorDB
    vectordb = vectorDB(persist_directory=DB_directory)
elif VECTOR_DB == 2:
    from embeding.faissdb import FaissDB as vectorDB
    vectordb = vectorDB(persist_directory=DB_directory)
elif VECTOR_DB == 3:
    from embeding.elasticsearchStore import ElsStore as vectorDB
    vectordb = vectorDB()

# Uploaded files
uploaded_files = []

@lru_cache(maxsize=100)
def get_knowledge_base_files():
    cl_dict = {}
    cols = vectordb.get_all_collections_name()
    for c_name in cols:
        cl_dict[c_name] = vectordb.get_collcetion_content_files(c_name)
    return cl_dict

knowledge_base_files = get_knowledge_base_files()

def upload_files(files):
    if files:
        new_files = [file.name for file in files]
        uploaded_files.extend(new_files)
        update_knowledge_base_files()
        logger.info(f"Uploaded files: {new_files}")
        return update_file_list(), new_files, "<div style='color: green; padding: 10px; border: 2px solid green; border-radius: 5px;'>Upload successful!</div>"
    update_knowledge_base_files()
    return update_file_list(), [], "<div style='color: red; padding: 10px; border: 2px solid red; border-radius: 5px;'>Upload failed!</div>"

def delete_files(selected_files):
    global uploaded_files
    uploaded_files = [f for f in uploaded_files if f not in selected_files]
    if selected_files:
        update_knowledge_base_files()
        logger.info(f"Deleted files: {selected_files}")
        return update_file_list(), "<div style='color: green; padding: 10px; border: 2px solid green; border-radius: 5px;'>Delete successful!</div>"
    update_knowledge_base_files()
    return update_file_list(), "<div style='color: red; padding: 10px; border: 2px solid red; border-radius: 5px;'>Delete failed!</div>"

def delete_collection(selected_knowledge_base):
    if selected_knowledge_base and selected_knowledge_base != "创建知识库":
        vectordb.delete_collection(selected_knowledge_base)
        update_knowledge_base_files()
        logger.info(f"Deleted collection: {selected_knowledge_base}")
        return update_knowledge_base_dropdown(), "<div style='color: green; padding: 10px; border: 2px solid green; border-radius: 5px;'>Collection deleted successfully!</div>"
    return update_knowledge_base_dropdown(), "<div style='color: red; padding: 10px; border: 2px solid red; border-radius: 5px;'>Delete collection failed!</div>"

async def async_vectorize_files(selected_files, selected_knowledge_base, new_kb_name, chunk_size, chunk_overlap):
    if selected_files:
        if selected_knowledge_base == "创建知识库":
            knowledge_base = new_kb_name
            vectordb.create_collection(selected_files, knowledge_base, chunk_size=chunk_size, chunk_overlap=chunk_overlap)
        else:
            knowledge_base = selected_knowledge_base
            vectordb.add_chroma(selected_files, knowledge_base, chunk_size=chunk_size, chunk_overlap=chunk_overlap)

        if knowledge_base not in knowledge_base_files:
            knowledge_base_files[knowledge_base] = []
        knowledge_base_files[knowledge_base].extend(selected_files)

        logger.info(f"Vectorized files: {selected_files} for knowledge base: {knowledge_base}")
        await asyncio.sleep(0)  # yield control so other tasks can run
        return f"Vectorized files: {', '.join(selected_files)}\nKnowledge Base: {knowledge_base}\nUploaded Files: {', '.join(uploaded_files)}", "<div style='color: green; padding: 10px; border: 2px solid green; border-radius: 5px;'>Vectorization successful!</div>"
    return "", "<div style='color: red; padding: 10px; border: 2px solid red; border-radius: 5px;'>Vectorization failed!</div>"

def update_file_list():
    return gr.update(choices=uploaded_files, value=[])

def search_knowledge_base(selected_knowledge_base):
    if selected_knowledge_base in knowledge_base_files:
        kb_files = knowledge_base_files[selected_knowledge_base]
        return gr.update(choices=kb_files, value=[])
    return gr.update(choices=[], value=[])

def update_knowledge_base_files():
    global knowledge_base_files
    get_knowledge_base_files.cache_clear()  # drop the memoized result so the listing is re-read
    knowledge_base_files = get_knowledge_base_files()

# Chat-message handling
chat_history = []

def safe_chat_response(model_dropdown, vector_dropdown, chat_knowledge_base_dropdown, chain_dropdown, message):
    try:
        return chat_response(model_dropdown, vector_dropdown, chat_knowledge_base_dropdown, chain_dropdown, message)
    except Exception as e:
        logger.error(f"Error in chat response: {str(e)}")
        return f"<div style='color: red;'>Error: {str(e)}</div>", ""

def chat_response(model_dropdown, vector_dropdown, chat_knowledge_base_dropdown, chain_dropdown, message):
    global chat_history
    if message:
        chat_history.append(("User", message))
        if chat_knowledge_base_dropdown == "仅使用模型":
            rag = RAG_class(model=model_dropdown, persist_directory=DB_directory)
            answer = rag.mult_chat(chat_history)
        if chat_knowledge_base_dropdown and chat_knowledge_base_dropdown != "仅使用模型":
            rag = RAG_class(model=model_dropdown, embed=vector_dropdown, c_name=chat_knowledge_base_dropdown, persist_directory=DB_directory)
            if chain_dropdown == "复杂召回方式":
                questions = rag.decomposition_chain(message)
                answer = rag.rag_chain(questions)
            elif chain_dropdown == "简单召回方式":
                answer = rag.simple_chain(message)
            else:
                answer = rag.rerank_chain(message)

        response = f" {answer}"
        chat_history.append(("Bot", response))
    return format_chat_history(chat_history), ""

def clear_chat():
    global chat_history
    chat_history = []
    return format_chat_history(chat_history)

def format_chat_history(history):
    formatted_history = ""
    for user, msg in history:
        if user == "User":
            formatted_history += f'''
            <div style="text-align: right; margin: 10px;">
                <div style="display: inline-block; background-color: #DCF8C6; padding: 10px; border-radius: 10px; max-width: 60%;">
                    {msg}
                </div>
                <b>:User</b>
            </div>
            '''
        else:
            if "```" in msg:  # does the message contain a code snippet?
                code_content = msg.split("```")[1]
                formatted_history += f'''
                <div style="text-align: left; margin: 10px;">
                    <b>Bot:</b>
                    <div style="display: inline-block; background-color: #F1F0F0; padding: 10px; border-radius: 10px; max-width: 60%;">
                        <pre><code>{code_content}</code></pre>
                    </div>
                </div>
                '''
            else:
                formatted_history += f'''
                <div style="text-align: left; margin: 10px;">
                    <b>Bot:</b>
                    <div style="display: inline-block; background-color: #F1F0F0; padding: 10px; border-radius: 10px; max-width: 60%;">
                        {msg}
                    </div>
                </div>
                '''
    return formatted_history

def clear_status():
    upload_status.update("")
    delete_status.update("")
    vectorize_status.update("")
    delete_collection_status.update("")

def handle_knowledge_base_selection(selected_knowledge_base):
    if selected_knowledge_base == "创建知识库":
        return gr.update(visible=True, interactive=True), gr.update(choices=[], value=[]), gr.update(visible=False)
    elif selected_knowledge_base == "仅使用模型":
        return gr.update(visible=False, interactive=False), gr.update(choices=[], value=[]), gr.update(visible=False)
    else:
        return gr.update(visible=False, interactive=False), search_knowledge_base(selected_knowledge_base), gr.update(visible=True)

def update_knowledge_base_dropdown():
    global knowledge_base_files
    choices = ["创建知识库"] + list(knowledge_base_files.keys())
    return gr.update(choices=choices)

def update_chat_knowledge_base_dropdown():
    global knowledge_base_files
    choices = ["仅使用模型"] + list(knowledge_base_files.keys())
    return gr.update(choices=choices)


# SearxNG search helper
def search_searxng(query):
    searxng_url = 'http://localhost:8080/search'  # replace with your SearxNG instance URL
    params = {
        'q': query,
        'format': 'json'
    }
    response = requests.get(searxng_url, params=params)
    response.raise_for_status()
    return response.json()


# Ollama summarization helper
def summarize_with_ollama(model_dropdown, text, question):
    prompt = """
    根据下边的内容,回答用户问题,
    内容为:'{0}'\n
    问题为:{1}
    """.format(text, question)
    ollama_url = 'http://localhost:11434/api/generate'  # replace with your Ollama instance URL
    data = {
        'model': model_dropdown,
        "prompt": prompt,
        "stream": False
    }
    response = requests.post(ollama_url, json=data)
    response.raise_for_status()
    return response.json()


# AI web-search handler
def ai_web_search(model_dropdown, user_query):
    # Search with SearxNG
    search_results = search_searxng(user_query)
    search_texts = [result['title'] + "\n" + result['content'] for result in search_results['results']]
    combined_text = "\n\n".join(search_texts)

    # Summarize with Ollama
    summary = summarize_with_ollama(model_dropdown, combined_text, user_query)
    # print(summary)
    # Return the answer text
    return summary['response']

# An earlier stub of the AI web-search handler, kept for reference
# def ai_web_search(model_dropdown, query):
#     try:
#         # The real web-search and AI logic would go here;
#         # this is only an example, implement it to fit your setup
#         search_result = f"搜索结果: {query}"
#         ai_response = f"AI回答: 基于搜索结果,对于'{query}'的回答是..."
#         return f"{search_result}\n\n{ai_response}"
#     except Exception as e:
#         logger.error(f"Error in AI web search: {str(e)}")
#         return f"<div style='color: red;'>Error: {str(e)}</div>"

# Build the Gradio UI
with gr.Blocks() as demo:
    with gr.Column():
        # Title
        title = gr.HTML("<h1 style='text-align: center; font-size: 32px; font-weight: bold;'>RAG精致系统</h1>")
        # Announcement banner
        announcement = gr.HTML("<div style='text-align: center; font-size: 18px; color: red;'>公告栏: RAG精致系统,【检索增强生成】系统!<br/>莫大大</div>")

        with gr.Tabs():
            with gr.TabItem("知识库"):
                knowledge_base_dropdown = gr.Dropdown(choices=["创建知识库"] + list(knowledge_base_files.keys()),
                                                      label="选择知识库")
                new_kb_input = gr.Textbox(label="输入新的知识库名称", visible=False, interactive=True)
                file_input = gr.Files(label="Upload files")
                upload_btn = gr.Button("Upload")
                file_list = gr.CheckboxGroup(label="Uploaded Files")
                delete_btn = gr.Button("Delete Selected Files")
                with gr.Row():
                    chunk_size_dropdown = gr.Dropdown(choices=[50, 100, 200, 300, 500, 700], label="chunk_size", value=200)
                    chunk_overlap_dropdown = gr.Dropdown(choices=[20, 50, 100, 200], label="chunk_overlap", value=50)
                vectorize_btn = gr.Button("Vectorize Selected Files")
                delete_collection_btn = gr.Button("Delete Collection")
                upload_status = gr.HTML()
                delete_status = gr.HTML()
                vectorize_status = gr.HTML()
                delete_collection_status = gr.HTML()

            with gr.TabItem("Chat"):
                with gr.Row():
                    model_dropdown = gr.Dropdown(choices=get_llm(), label="模型")
                    vector_dropdown = gr.Dropdown(choices=get_embeding_model(), label="向量")
                    chat_knowledge_base_dropdown = gr.Dropdown(choices=["仅使用模型"] + vectordb.get_all_collections_name(), label="知识库")
                    chain_dropdown = gr.Dropdown(choices=["复杂召回方式", "简单召回方式", "rerank"], label="chain方式", visible=False)
                chat_display = gr.HTML(label="Chat History")
                chat_input = gr.Textbox(label="Type a message")
                chat_btn = gr.Button("Send")
                clear_btn = gr.Button("Clear Chat History")

            with gr.TabItem("AI网络搜索"):
                with gr.Row():
                    web_search_model_dropdown = gr.Dropdown(choices=get_llm(), label="模型")
                    web_search_output = gr.Textbox(label="搜索结果和AI回答", lines=10)
                    web_search_input = gr.Textbox(label="输入搜索查询")

                web_search_btn = gr.Button("搜索")

    def handle_upload(files):
        upload_result, new_files, status = upload_files(files)
        threading.Thread(target=clear_status).start()
        return upload_result, new_files, status, update_chat_knowledge_base_dropdown()

    def handle_delete(selected_knowledge_base, selected_files):
        tmp = []
        cols_files_tmp = vectordb.get_collcetion_content_files(c_name=selected_knowledge_base)
        for i in selected_files:
            if i in cols_files_tmp:
                tmp.append(i)
        del cols_files_tmp
        if tmp:
            vectordb.del_files(tmp, c_name=selected_knowledge_base)
        del tmp
        delete_result, status = delete_files(selected_files)
        threading.Thread(target=clear_status).start()
        return delete_result, status, update_chat_knowledge_base_dropdown()

    def handle_vectorize(selected_files, selected_knowledge_base, new_kb_name, chunk_size, chunk_overlap):
        vectorize_result, status = asyncio.run(async_vectorize_files(selected_files, selected_knowledge_base, new_kb_name, chunk_size, chunk_overlap))
        threading.Thread(target=clear_status).start()
        return vectorize_result, status, update_knowledge_base_dropdown(), update_chat_knowledge_base_dropdown()

    def handle_delete_collection(selected_knowledge_base):
        result, status = delete_collection(selected_knowledge_base)
        threading.Thread(target=clear_status).start()
        return result, status, update_chat_knowledge_base_dropdown()

    knowledge_base_dropdown.change(
        handle_knowledge_base_selection,
        inputs=knowledge_base_dropdown,
        outputs=[new_kb_input, file_list, chain_dropdown]
    )
    upload_btn.click(handle_upload, inputs=file_input, outputs=[file_list, file_list, upload_status, chat_knowledge_base_dropdown])
    delete_btn.click(handle_delete, inputs=[knowledge_base_dropdown, file_list], outputs=[file_list, delete_status, chat_knowledge_base_dropdown])
    vectorize_btn.click(handle_vectorize, inputs=[file_list, knowledge_base_dropdown, new_kb_input, chunk_size_dropdown, chunk_overlap_dropdown],
                        outputs=[gr.Textbox(visible=False), vectorize_status, knowledge_base_dropdown, chat_knowledge_base_dropdown])
    delete_collection_btn.click(handle_delete_collection, inputs=knowledge_base_dropdown,
                                outputs=[knowledge_base_dropdown, delete_collection_status, chat_knowledge_base_dropdown])

    chat_btn.click(chat_response, inputs=[model_dropdown, vector_dropdown, chat_knowledge_base_dropdown, chain_dropdown, chat_input], outputs=[chat_display, chat_input])
    clear_btn.click(clear_chat, outputs=chat_display)

    chat_knowledge_base_dropdown.change(
        fn=lambda selected: gr.update(visible=selected != "仅使用模型"),
        inputs=chat_knowledge_base_dropdown,
        outputs=chain_dropdown
    )

    # Wire up the web-search button
    web_search_btn.click(
        ai_web_search,
        inputs=[web_search_model_dropdown, web_search_input],
        outputs=web_search_output
    )

demo.launch(debug=True, share=True)
````
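Both summarize_with_ollama above and the chat models go through a local Ollama server, so a single request against its generate endpoint confirms the service and a model are available (this assumes you pulled qwen2:7b as the README suggests):

```python
# Sanity-check the local Ollama endpoint that summarize_with_ollama targets.
import requests

resp = requests.post("http://localhost:11434/api/generate",
                     json={"model": "qwen2:7b", "prompt": "Say hi in one word.", "stream": False})
resp.raise_for_status()
print(resp.json()["response"])
```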
graph_demo_ui.py
ADDED
@@ -0,0 +1,87 @@
````python
# -*- coding: utf-8 -*-
from flask import Flask, render_template, request, jsonify
import json
from dotenv import load_dotenv
from langchain_community.llms import Ollama


load_dotenv()

app = Flask(__name__)

# Tested llama3:8b, gemma2:9b, qwen2:7b, glm4:9b, and arcee-ai/arcee-agent:latest;
# so far qwen2:7b gives the best results
llm = Ollama(model="qwen2:7b")


json_example = {'edges': [{'data': {'color': '#FFA07A',
                                    'label': 'label 1',
                                    'source': 'source 1',
                                    'target': 'target 1'}},
                          {'data': {'color': '#FFA07A',
                                    'label': 'label 2',
                                    'source': 'source 2',
                                    'target': 'target 2'}}
                          ],
                'nodes': [{'data': {'color': '#FFC0CB', 'id': 'id 1', 'label': 'label 1'}},
                          {'data': {'color': '#90EE90', 'id': 'id 2', 'label': 'label 2'}},
                          {'data': {'color': '#87CEEB', 'id': 'id 3', 'label': 'label 3'}}]}


__retriever_prompt = f"""
您是一名专门从事知识图谱创建的人工智能专家,目标是根据给定的输入或请求捕获关系。
基于各种形式的用户输入,如段落、电子邮件、文本文件等。
你的任务是根据输入创建一个知识图谱。
nodes必须具有label参数,并且label是来自输入的词语或短语,nodes必须具有id参数,id的格式是"id_数字",不能重复。
edges还必须有一个label参数,其中label是输入中的直接词语或短语,edges中的source和target取自nodes中的id。
仅使用JSON进行响应,其格式可以在python中进行jsonify,并直接输入cy.add(data),包括"color"属性,以在前端显示图形。
您可以参考给定的示例:{json_example}。存储node和edge的数组中,最后一个元素后边不要有逗号,
确保边的目标和源与现有节点匹配。
不要在JSON的上方和下方包含markdown三引号,直接用花括号括起来。
"""


def generate_graph_info(raw_text: str) -> str | None:
    """
    Generate graph info from raw text.
    :param raw_text:
    :return:
    """
    messages = [
        {"role": "system", "content": "你现在扮演信息抽取的角色,要求根据用户输入和AI的回答,正确提取出信息,记得不要对实体进行翻译。"},
        {"role": "user", "content": raw_text},
        {"role": "user", "content": __retriever_prompt}
    ]
    print("解析中....")
    for i in range(3):
        graph_info_result = llm.invoke(messages)
        if len(graph_info_result) < 10:
            # The model occasionally returns an essentially empty answer; retry up to three times
            print("-------", i, "-------------------")
            continue
        else:
            break
    print(graph_info_result)
    return graph_info_result


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/update_graph', methods=['POST'])
def update_graph():
    raw_text = request.json.get('text', '')
    try:
        result = generate_graph_info(raw_text)
        if '```' in result:
            graph_data = json.loads(result.split('```', 2)[1].replace("json", ''))
        else:
            graph_data = json.loads(result)
        return graph_data
    except Exception as e:
        return {'error': f"Error parsing graph data: {str(e)}"}


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=7860)
````
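With the Flask app running, the /update_graph endpoint can be exercised directly; a small client-side check (the sample sentence is only an example):

```python
# Post raw text to the running graph_demo_ui.py service and print the
# extracted graph. Port 7860 matches app.run() above.
import requests

resp = requests.post("http://localhost:7860/update_graph",
                     json={"text": "李白是唐代诗人,创作了《静夜思》。"})
print(resp.json())  # Cytoscape-style {'nodes': [...], 'edges': [...]} or {'error': ...}
```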
requirements.txt
CHANGED
@@ -1 +1,10 @@
```
gradio==4.29.0
langchain-community==0.2.6
langchain==0.2.6
langchain-core==0.2.11
requests
transformers==4.41.1
unstructured==0.7.12
funasr==1.0.24
modelscope
chromadb
```
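Since langchain 0.2.6 is pinned here, the chunk_size / chunk_overlap dropdowns in the web UI most likely feed a LangChain text splitter. A hedged sketch of that step (the splitter choice is an assumption; the real logic lives in the project's embeding modules):

```python
# Hedged sketch: how chunk_size / chunk_overlap typically drive splitting in
# LangChain. The project's embeding/ modules may use a different splitter.
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=50)
sample = "Easy-RAG splits documents into overlapping chunks before embedding. " * 10
chunks = splitter.split_text(sample)
print(len(chunks), "chunks; first:", chunks[0][:60])
```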
webui-test-graph.py
ADDED
@@ -0,0 +1,283 @@
webui-test-graph.py is an earlier, synchronous variant of app.py that adds optional knowledge-graph extraction during vectorization. Apart from the pieces shown below, its functions and Gradio wiring match app.py line for line, minus the logging, the @lru_cache, the Elasticsearch branch, safe_chat_response, the async vectorization, and the AI网络搜索 tab; its announcement banner reads 公告栏: 欢迎使用RAG精致系统. The parts unique to this file:

```python
import gradio as gr
import threading
from Config.config import VECTOR_DB, DB_directory

if VECTOR_DB == 1:
    from embeding.chromadb import ChromaDB as vectorDB
    vectordb = vectorDB(persist_directory=DB_directory)
elif VECTOR_DB == 2:
    from embeding.faissdb import FaissDB as vectorDB
    vectordb = vectorDB(persist_directory=DB_directory)
from Ollama_api.ollama_api import *
from rag.rag_class import *

# ... uploaded_files, get_knowledge_base_files (uncached here), upload_files,
#     delete_files, delete_collection: same as app.py, without the logger calls ...

def create_graph(selected_files):
    # Imported lazily so the UI can start without a Neo4j connection
    from Neo4j.neo4j_op import KnowledgeGraph
    from Neo4j.graph_extract import update_graph
    from Config.config import neo4j_host, neo4j_name, neo4j_pwd
    import tqdm

    kg = KnowledgeGraph(neo4j_host, neo4j_name, neo4j_pwd)
    data = kg.split_files(selected_files)
    for doc in tqdm.tqdm(data):
        text = doc.page_content
        try:
            res = update_graph(text)
            # Create the nodes in bulk
            nodes = kg.create_nodes("node", res["nodes"])

            # Create the relationships in bulk
            relationships = kg.create_relationships([
                ("node", {"name": edge["source"]}, "node", {"name": edge["target"]}, edge["label"])
                for edge in res["edges"]
            ])
        except Exception:
            print("错误----------------------------------")


def vectorize_files(selected_files, selected_knowledge_base, new_kb_name, choice_graph, chunk_size, chunk_overlap):
    if selected_files:
        if selected_knowledge_base == "创建知识库":
            knowledge_base = new_kb_name
            vectordb.create_collection(selected_files, knowledge_base, chunk_size=chunk_size, chunk_overlap=chunk_overlap)
            if choice_graph == '是':
                create_graph(selected_files)
        else:
            knowledge_base = selected_knowledge_base
            vectordb.add_chroma(selected_files, knowledge_base, chunk_size=chunk_size, chunk_overlap=chunk_overlap)
            if choice_graph == '是':
                create_graph(selected_files)
        if knowledge_base not in knowledge_base_files:
            knowledge_base_files[knowledge_base] = []
        knowledge_base_files[knowledge_base].extend(selected_files)

        return f"Vectorized files: {', '.join(selected_files)}\nKnowledge Base: {knowledge_base}\nUploaded Files: {', '.join(uploaded_files)}", "<div style='color: green; padding: 10px; border: 2px solid green; border-radius: 5px;'>Vectorization successful!</div>"
    return "", "<div style='color: red; padding: 10px; border: 2px solid red; border-radius: 5px;'>Vectorization failed!</div>"

# ... chat_response, clear_chat, format_chat_history, clear_status,
#     handle_knowledge_base_selection, dropdown updaters: same as app.py ...

with gr.Blocks() as demo:
    # ... title, announcement, and the 知识库 tab as in app.py, plus this radio
    #     placed right after new_kb_input:
    choice_graph = gr.Radio(choices=["否", "是"], value="否", label="是否同时提取知识图谱(会比较慢)")

    # ... Chat tab and remaining handlers as in app.py; vectorization here is
    #     synchronous and threads choice_graph through:
    def handle_vectorize(selected_files, selected_knowledge_base, new_kb_name, choice_graph, chunk_size, chunk_overlap):
        vectorize_result, status = vectorize_files(selected_files, selected_knowledge_base, new_kb_name, choice_graph, chunk_size, chunk_overlap)
        threading.Thread(target=clear_status).start()
        return vectorize_result, status, update_knowledge_base_dropdown(), update_chat_knowledge_base_dropdown()

    vectorize_btn.click(handle_vectorize,
                        inputs=[file_list, knowledge_base_dropdown, new_kb_input, choice_graph, chunk_size_dropdown, chunk_overlap_dropdown],
                        outputs=[gr.Textbox(visible=False), vectorize_status, knowledge_base_dropdown, chat_knowledge_base_dropdown])

demo.launch(debug=True, share=True)
```
webui-test.py
ADDED
@@ -0,0 +1,354 @@
webui-test.py duplicates app.py above line for line; the only difference is the announcement banner, which here reads:

```python
announcement = gr.HTML("<div style='text-align: center; font-size: 18px; color: red;'>公告栏: 欢迎使用RAG精致系统,一个适合学习、使用、自主扩展的【检索增强生成】系统!<br/>公众号:世界大模型</div>")
```