mirror of
https://github.com/tencentmusic/supersonic.git
synced 2025-12-10 02:46:56 +00:00
[improvement][docs]Update README and wechat group
This commit is contained in:
@@ -2,18 +2,18 @@
|
||||
|
||||
# SuperSonic (超音数)
|
||||
|
||||
**SuperSonic is an out-of-the-box yet highly extensible framework for building a data chatbot**. SuperSonic provides a chat interface that empowers users to query data using natural language and visualize the results with suitable charts. To enable such experience, the only thing necessary is to build logical semantic models (definition of metrics/dimensions/entities, along with their meaning, context and relationships) on top of physical data models, and no data modification or copying is required. Meanwhile, SuperSonic is designed to be pluggable, allowing new functionalities to be added through plugins and core components to be integrated with other systems.
|
||||
**SuperSonic is an out-of-the-box yet highly extensible framework for building ChatBI**. SuperSonic provides a chat interface that empowers users to query data using natural language and visualize the results with suitable charts. To enable such experience, the only thing necessary is to build logical semantic models (definition of metrics/dimensions/entities, along with their meaning, context and relationships) on top of physical data models, and no data modification or copying is required. Meanwhile, SuperSonic is designed to be pluggable, allowing new functionalities to be added through plugins and core components to be integrated with other systems.
|
||||
|
||||
<img src="./docs/images/supersonic_demo.gif" height="100%" width="100%" align="center"/>
|
||||
|
||||
## Motivation
|
||||
|
||||
The emergence of Large Language Model (LLM) like ChatGPT is reshaping the way information is retrieved. In the field of data analytics, both academia and industry are primarily focused on leveraging LLM to convert natural language queries into SQL queries. While some works show promising results, they are still not applicable to real-world scenarios.
|
||||
The emergence of Large Language Model (LLM) like ChatGPT is reshaping the way information is retrieved. In the field of data analytics, both academia and industry are primarily focused on leveraging LLM to convert natural language into SQL (so called text2sql or nl2sql). While some works show promising results, they are still not applicable to real-world scenarios.
|
||||
|
||||
From our perspective, the key to filling the real-world gap lies in three aspects:
|
||||
1. Complement the LLM-based semantic parser with rule-based semantic parsers to improve **efficiency**(in terms of latency and cost).
|
||||
1. Introduce a semantic layer encapsulating underlying data context(joins, formulas, etc) to reduce **complexity**.
|
||||
2. Augment semantic parsing with schema mappers(as a kind of preprocessor) and semantic correctors(as a kind of postprocessor) to improve **accuracy** and **stability**.
|
||||
3. Introduce a semantic layer encapsulating underlying data context(joins, formulas, etc) to reduce **complexity**.
|
||||
3. Complement the LLM-based semantic parser with rule-based semantic parsers to improve **efficiency**(in terms of latency and cost).
|
||||
|
||||
With these ideas in mind, we develop SuperSonic as a practical reference implementation and use it to power our real-world products. Additionally, to facilitate further development of data chatbot, we decide to open source SuperSonic as an extensible framework.
|
||||
|
||||
|
||||
@@ -9,9 +9,9 @@
|
||||
大型语言模型(LLMs)如ChatGPT的出现正在重塑信息检索的方式。在数据分析领域,学术界和工业界主要关注利用深度学习模型将自然语言查询转换为SQL查询。虽然一些工作显示出有前景的结果,但它们还并不适用于实际场景。
|
||||
|
||||
在我们看来,为了在实际场景发挥价值,有三个关键点:
|
||||
1. 在基于大模型语义解析器基础上,增加基于规则的解析器,提升语义解析的**效率**。
|
||||
1. 引入语义模型层,封装底层数据的上下文(关联、公式等),降低语义解析的**复杂性**。
|
||||
2. 加入模式映射器和语义修正器,来增强语义解析能力,提升语义解析的**准确性**和**稳定性**。
|
||||
3. 引入语义模型层,封装底层数据的上下文(关联、公式等),降低语义解析的**复杂性**。
|
||||
3. 在基于大模型语义解析器基础上,增加基于规则的解析器,提升语义解析的**效率**。
|
||||
|
||||
为了验证上述想法,我们开发了超音数项目,并将其应用在实际的内部产品中。与此同时,我们将超音数作为一个可扩展的框架开源,希望能够促进数据问答对话领域的进一步发展。
|
||||
|
||||
@@ -38,7 +38,7 @@
|
||||
|
||||
- **语义修正器(Semantic Corrector):** 检查语义信息的合法性,对不合法的信息做修正和优化处理。
|
||||
|
||||
- **语义模型层(Semantic Layer):** 根据语义信息生成物理SQL执行查询。
|
||||
- **语义翻译器(Semantic Layer):** 根据语义信息生成物理SQL执行查询。
|
||||
|
||||
- **问答插件(Chat Plugin):** 通过第三方工具扩展功能。给定所有配置的插件及其功能描述和示例问题,大语言模型将选择最合适的插件。
|
||||
|
||||
|
||||
Binary file not shown.
|
Before Width: | Height: | Size: 155 KiB After Width: | Height: | Size: 282 KiB |
Reference in New Issue
Block a user