[improvement][docs]Revise certain descriptions in README

jerryjzhang
2023-12-16 22:12:36 +08:00
parent 59c21ea19a
commit 3db443f9b1
2 changed files with 7 additions and 7 deletions

@@ -2,18 +2,18 @@
 # SuperSonic (超音数)
-**SuperSonic is a new-generation data analytics platform that integrates ChatBI and HeadlessBI**. SuperSonic provides a chat interface that empowers users to query data using natural language and visualize the results with suitable charts. To enable such an experience, the only thing necessary is to build logical semantic models (definitions of entities/metrics/dimensions/tags, along with their meaning, context and relationships) on top of physical data models, and **no data modification or copying** is required. Meanwhile, SuperSonic is designed to be **highly extensible**, allowing custom functionalities to be added and configured with Java SPI.
+**SuperSonic is the next-generation LLM-powered data analytics platform that integrates ChatBI and HeadlessBI**. SuperSonic provides a chat interface that empowers users to query data using natural language and visualize the results with suitable charts. To enable such an experience, the only thing necessary is to build logical semantic models (definitions of entities/metrics/dimensions/tags, along with their meaning, context and relationships) on top of physical data models, and **no data modification or copying** is required. Meanwhile, SuperSonic is designed to be **highly extensible**, allowing custom functionalities to be added and configured with Java SPI.
 <img src="./docs/images/supersonic_demo.gif" height="100%" width="100%" align="center"/>
 ## Motivation
-The emergence of Large Language Models (LLMs) like ChatGPT is reshaping the way information is retrieved. In the field of data analytics, both academia and industry are primarily focused on leveraging LLMs to convert natural language into SQL (so-called Text2SQL or NL2SQL). While some works exhibit promising results, their **reliability** is inadequate for real-world applications.
+The emergence of Large Language Models (LLMs) like ChatGPT is reshaping the way information is retrieved. In the field of data analytics, both academia and industry are primarily focused on leveraging LLMs to convert natural language into SQL (so-called Text2SQL or NL2SQL). While some approaches exhibit promising results, their **reliability** and **efficiency** are insufficient for real-world applications.
 From our perspective, the key to filling the real-world gap lies in three aspects:
-1. Introduce a semantic layer (so-called HeadlessBI) encapsulating underlying data context (joins, formulas, etc.) to reduce **complexity**.
-2. Augment the LLM with schema mappers (as a kind of preprocessor) and semantic correctors (as a kind of postprocessor) to mitigate **hallucination**.
-3. Utilize heuristic rules when necessary to improve **efficiency** (in terms of latency and cost).
+1. Integrate ChatBI with HeadlessBI, which encapsulates underlying data context (joins, keys, formulas, etc.), to **reduce complexity**.
+2. Augment the LLM with schema mappers (as a kind of preprocessor) and semantic correctors (as a kind of postprocessor) to **mitigate hallucination**.
+3. Utilize rule-based schema parsers when necessary to **improve efficiency** (in terms of latency and cost).
 With these ideas in mind, we develop SuperSonic as a practical reference implementation and use it to power our real-world products. Additionally, to facilitate further development of ChatBI, we decide to open source SuperSonic as an extensible framework.

@@ -9,7 +9,7 @@
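 The emergence of large language models (LLMs) such as ChatGPT is reshaping the way information is retrieved. In the field of data analytics, academia and industry focus mainly on using deep learning models to convert natural-language queries into SQL queries. Although some work shows promising results, its reliability still falls short of what production use requires.
 In our view, three points are key to delivering value in real-world scenarios:
-1. Introduce a semantic model layer that encapsulates the context of the underlying data (joins, formulas, etc.) to reduce the **complexity** of SQL generation.
+1. Integrate HeadlessBI, using a unified semantic layer to encapsulate the details of the underlying data (joins, keys, formulas, etc.) and reduce the **complexity** of SQL generation.
 2. Place a schema mapper before the LLM and a semantic corrector after it to mitigate the **hallucination** that LLMs commonly exhibit.
 3. Design heuristic rules to improve the **efficiency** of semantic parsing in certain scenarios.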
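@@ -47,7 +47,7 @@ The overall architecture and main process flow of SuperSonic are shown in the figure below
 SuperSonic ships with sample semantic models and chat conversations, so you can try it out in just three steps:
 - Download the pre-built release package from the [release page](https://github.com/tencentmusic/supersonic/releases)
-- Run "bin/supersonic-daemon.sh" to start the services (one Java process and one Python process)
+- Run "assembly/bin/supersonic-daemon.sh start" to start the Java service in standalone mode
 - Visit http://localhost:9080 in your browser to start exploring
 ## How to Build and Deploy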