demo.ragflow.io upload pdf file, response not expected #1168
Unanswered
glider2019cn
asked this question in
Q&A
Replies: 6 comments 2 replies
-
Could you make an example?
-
Hi Kevin,
Thanks for your reply! I really appreciate your help!
I have installed RAGFlow on my local machine and also tested with demo.ragflow.io, using the same PDF file in both: https://docs.fortinet.com/document/fortigate/7.4.2/fortios-release-notes/760203/introduction-and-supported-models
I have attached the PDF version.
On the local machine, some parts are missing; please see attachments local-bce-Manual-chunk-1 to local-bce-Manual-chunk-3.
For demo.ragflow.io, please see attachments demo.ragflow-bce-General-chunk-1 and demo.ragflow-bce-General-chunk-2.
It looks like both of them are missing page 4 and page 5.
I also chatted in demo.ragflow.io. The weird thing is that I ask for Known Issues, but sometimes it gives both Known Issues and Resolved Issues, and sometimes only Resolved Issues.
From your documentation, it looks like you support storing the data using three methods. I like that, but the results are not so good. I am trying to understand what is being missed. Is it related to the selected mode? Thanks!
If the PDF is converted to Markdown, will that keep the structure? Thanks!
Best regards,
Yi DONG
On Tuesday 18 June 2024 at 03:33:48 pm AEST, KevinHuSh wrote:
Could you make an example?
-
Hi Kevin,
Thanks for your reply!
The question I asked was about Proxy-related known issues.
In demo.ragflow, I got both Known Issues and Resolved Issues, but I expected only Known Issues. You can see this in the attached demo.ragflow-chat-2.png.
In my own RAGFlow build, it did not give me any output.
I am guessing the difference is because the mode is Manual, or because I use ollama qwen2 locally.
This is what I mean by not good enough.
What I expect is chunking based on titles. For example, in the table of contents on page 5, Known issues is a first-level title starting on page 62, and the next first-level title, Built-in AV engine, starts on page 71. The big chunk should then span pages 62-71 and get a vector for Known issues. Inside that chunk, second-level titles are used: Proxy is on page 66, so this smaller chunk gets a vector for Proxy under the Known issues vector. Something like this. A search would then return only Known issues, not both Known issues and Resolved issues.
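The title-based chunking described above could be sketched roughly as follows. This is only an illustration of the idea, not how RAGFlow works: the `Heading` and `Chunk` types, the `chunk_by_titles` function, and the `last_page` value are all hypothetical, and the sketch assumes the heading levels and start pages are already known (for example, read from the table of contents).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Heading:
    level: int  # 1 = first-level title, 2 = second-level title, ...
    title: str
    page: int   # page on which the heading starts


@dataclass
class Chunk:
    path: Tuple[str, ...]  # titles from the first level down to this heading
    start_page: int
    end_page: int          # page where the next same-or-higher-level heading starts


def chunk_by_titles(headings: List[Heading], last_page: int) -> List[Chunk]:
    """Build one chunk per heading, nested under its parent titles.

    Hypothetical helper: headings must be given in document order.
    """
    chunks: List[Chunk] = []
    ancestors: List[Heading] = []
    for i, heading in enumerate(headings):
        # Drop ancestors that are not parents of the current heading.
        while ancestors and ancestors[-1].level >= heading.level:
            ancestors.pop()
        ancestors.append(heading)
        # The section runs until the next heading of the same or a higher level.
        end_page = last_page
        for following in headings[i + 1:]:
            if following.level <= heading.level:
                end_page = following.page
                break
        path = tuple(h.title for h in ancestors)
        chunks.append(Chunk(path, heading.page, end_page))
    return chunks


def under_title(chunks: List[Chunk], top_title: str) -> List[Chunk]:
    """Keep only chunks whose first-level title matches top_title."""
    return [c for c in chunks if c.path and c.path[0] == top_title]
```

With the headings from the example (Known issues on page 62, Proxy on page 66, Built-in AV engine on page 71), the Proxy chunk gets the path `("Known issues", "Proxy")`, so a search restricted with `under_title(chunks, "Known issues")` would never return a chunk from another first-level section.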
The good thing is that RAGFlow gives me the output for Proxy, unlike other tools, which give me Explicit Proxy. That is amazing; I do not know how you do it. Maybe it is more accurate because you use all three methods: Elasticsearch, vectors, and SQL.
By the way, I tried logging in to the MySQL database. SELECT * FROM rag_flow.knowledgebase shows a chunk_num of 115. How can I dump the chunk content? Thanks!
Best regards,
Yi DONG
On Saturday 22 June 2024 at 04:04:50 am AEST, KevinHuSh wrote:
Pages 4 and 5 are the table of contents, which is removed by default since it is usually not useful.
Which question did you enter that gave a result that was not good enough?
-
Hi Kevin,
Thanks for your update! I will try to check in ES.
By the way, is it possible to chunk based on the different title levels and to vectorize with the title level? It would give more accurate results. Thanks!
Best regards,
Yi DONG
On Monday 24 June 2024 at 05:39:35 pm AEST, KevinHuSh wrote:
The chunk content is in ES. You need to write an ES DSL query to fetch it, like:
{
"term": {"kb_id": "xxxxxxxxxxxxxxxxxxxxx"}
}
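As a minimal sketch of how that clause might be used: in an Elasticsearch search request, a `term` clause sits under a top-level `query` key, and the request goes to the `_search` endpoint of an index. The function names, the `es_url` and `index` parameters, and the default `size` below are assumptions for illustration; the actual index name and any authentication depend on the deployment and are not covered in this thread.

```python
import json
import urllib.request


def build_chunk_query(kb_id: str, size: int = 100) -> dict:
    """Wrap the term clause in a full search body.

    `size` caps the number of hits returned; Elasticsearch returns
    only 10 hits when it is omitted.
    """
    return {"query": {"term": {"kb_id": kb_id}}, "size": size}


def fetch_chunks(es_url: str, index: str, kb_id: str, size: int = 100) -> list:
    """POST the query to <es_url>/<index>/_search and return each hit's _source.

    Assumes an unsecured cluster; a secured one would also need an
    Authorization header, which this sketch does not add.
    """
    body = json.dumps(build_chunk_query(kb_id, size)).encode("utf-8")
    request = urllib.request.Request(
        f"{es_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        hits = json.load(response)["hits"]["hits"]
    return [hit["_source"] for hit in hits]
```

For a knowledge base with 115 chunks, a call such as `fetch_chunks(es_url, index, kb_id, size=200)` would return them in one request, with `es_url`, `index`, and `kb_id` filled in for the local installation.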
-
We're working on this. But defining the title level is a hard problem in NLP ^^
-
Hi Kevin,
Thanks a lot!
Agreed, it is really hard, but it would be really useful if it worked this way. No rush, I have followed you. :-) Thanks!
Best regards,
Yi DONG
On Thursday 27 June 2024 at 11:01:26 am AEST, KevinHuSh wrote:
We're working on this. But defining the title level is a hard problem in NLP ^^
-
Hi All,
When I use demo.ragflow.io, uploading a PDF file succeeds. But when I ask a question based on the document, it does not provide the correct response.
When I ask about known issues, it returns both known issues and resolved issues, and the resolved issues are only partial: for example, the resolved issues span 2 pages, but only the last page is returned.
So how can I make it return only known issues? Thanks!