diff --git a/README.md b/README.md
index 6bded88f0..4fe9598b3 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,14 @@
# SuperSonic (超音数)
-**SuperSonic is the next-generation LLM-powered data analytics platform that integrates ChatBI and HeadlessBI**. SuperSonic provides a chat interface that empowers users to query data using natural language and visualize the results with suitable charts. To enable such experience, the only thing necessary is to build logical semantic models (definition of entities/metrics/dimensions/tags, along with their meaning, context and relationships) with semantic layer, and **no data modification or copying** is required. Meanwhile, SuperSonic is designed to be **highly extensible**, allowing custom functionalities to be added and configured with Java SPI.
+SuperSonic is a next-generation BI platform that integrates **Chat BI** (powered by LLMs) and **Headless BI** (powered by a semantic layer). Both paradigms benefit from the integration:
+
+- Chat BI's Text2SQL capability is enhanced by semantic data models.
+- Headless BI's query interface is augmented with natural language support.
+
+
+
+SuperSonic provides a chat interface that lets users query data in natural language and visualize the results with suitable charts. To enable this experience, you only need to build logical semantic models (definitions of metrics, dimensions, entities, and tags, along with their meanings and relationships) in the semantic layer; **no data modification or copying** is required. SuperSonic is also designed to be **highly extensible**, allowing custom functionality to be added and configured through Java SPI.
@@ -13,17 +20,16 @@
The emergence of Large Language Model (LLM) like ChatGPT is reshaping the way information is retrieved. In the field of data analytics, both academia and industry are primarily focused on leveraging LLM to convert natural language into SQL (so called Text2SQL or NL2SQL). While some approaches exhibit promising results, their **reliability** and **efficiency** are insufficient for real-world applications.
From our perspective, the key to filling the real-world gap lies in three aspects:
-1. Integrate ChatBI with HeadlessBI encapsulating underlying data context (joins, keys, formulas, etc) to **reduce complexity**.
-
-2. Augment the LLM with schema mappers(as a kind of preprocessor) and semantic correctors(as a kind of postprocessor) to **mitigate hallucination**.
-3. Utilize rule-based schema parsers when necessary to **improve efficiency**(in terms of latency and cost).
+1. Incorporate data semantics (such as business terms and column values) into the prompt, so that the LLM understands the data better and **hallucination is reduced**.
+2. Offload the generation of advanced SQL syntax (such as joins and formulas) from the LLM to the semantic layer to **reduce complexity**.
+3. Use rule-based semantic parsers when necessary to **improve efficiency** (in terms of latency and cost).
-With these ideas in mind, we develop SuperSonic as a practical reference implementation and use it to power our real-world products. Additionally, to facilitate further development of ChatBI, we decide to open source SuperSonic as an extensible framework.
+With these ideas in mind, we developed SuperSonic as a practical reference implementation and use it to power our real-world products. Additionally, to facilitate further development, we decided to open-source SuperSonic as an extensible framework.
## Out-of-the-box Features
-- Built-in ChatBI interface for *business users* to enter natural language queries
-- Built-in HeadlessBI interface for *analytics engineers* to build semantic models
+- Built-in Chat BI interface for *business users* to enter natural language queries
+- Built-in Headless BI interface for *analytics engineers* to build semantic models
- Built-in GUI for *system administrators* to manage chat agents and third-party plugins
- Support input auto-completion as well as query recommendation
- Support multi-turn conversation and history context management
diff --git a/README_CN.md b/README_CN.md
index b85af8c8c..c03ee2a12 100644
--- a/README_CN.md
+++ b/README_CN.md
@@ -1,6 +1,13 @@
# SuperSonic (超音数)
-**SuperSonic融合ChatBI和HeadlessBI打造新一代的数据分析平台**。通过SuperSonic的问答对话界面,用户能够使用自然语言查询数据,系统会选择合适的可视化图表呈现结果。SuperSonic不需要修改或复制数据,只需要在物理数据模型之上构建逻辑语义模型(指标/维度/实体的定义,以及他们的业务含义、相互间关系等),即可开启数据问答体验。与此同时,SuperSonic被设计为可插拔的框架,采用Java SPI机制来扩展定制功能。
+**SuperSonic融合Chat BI(由LLM驱动)和Headless BI(由语义层驱动)打造新一代的BI平台**。两种BI新范式都从融合中获得收益:
+
+- Chat BI的Text2SQL能力通过语义数据模型得到增强。
+- Headless BI的查询接口通过支持自然语言得到拓展。
+
+
+
+通过SuperSonic的问答对话界面,用户能够使用自然语言查询数据,系统会选择合适的可视化图表呈现结果。SuperSonic不需要修改或复制数据,只需要在物理数据模型之上构建逻辑语义模型(定义指标/维度/实体/标签,以及它们的业务含义、相互关系等),即可开启数据问答体验。与此同时,SuperSonic被设计为可插拔的框架,采用Java SPI机制来扩展定制功能。
@@ -9,18 +16,16 @@
大型语言模型(LLMs)如ChatGPT的出现正在重塑信息检索的方式。在数据分析领域,学术界和工业界主要关注利用深度学习模型将自然语言查询转换为SQL查询。虽然一些工作显示出有前景的结果,但它们的可靠性还达不到生产可用的要求。
在我们看来,为了在实际场景发挥价值,有三个关键点:
-1. 融合HeadlessBI,通过统一语义层封装底层数据细节(关联、键值、公式等),降低SQL生成的**复杂度**。
-
-
-2. 通过一前一后的模式映射器和语义修正器,来缓解LLM常见的**幻觉**现象。
-3. 设计启发式的规则,在一些特定场景提升语义解析的**效率**。
+1. 通过在提示词中增加数据语义(如业务术语、列取值等)使LLM对语义有更好的理解,以减少**幻觉**。
+2. 将高级SQL语法(如连接、公式等)的生成从LLM卸载到语义层,以降低**复杂性**。
+3. 在某些特定场景使用基于启发式规则的语义解析器,以提升**效率**。
为了验证上述想法,我们开发了SuperSonic项目,并将其应用在实际的内部产品中。与此同时,我们将SuperSonic作为一个可扩展的框架开源,希望能够促进数据问答对话领域的进一步发展。
## 开箱即用的特性
-- 内置ChatBI界面以便*业务用户*输入数据查询。
-- 内置HeadlessBI界面以便*分析工程师*构建语义模型。
+- 内置Chat BI界面以便*业务用户*输入数据查询。
+- 内置Headless BI界面以便*分析工程师*构建语义模型。
- 内置图形用户界面以便*系统管理员*管理第三方插件和对话助理。
- 支持文本输入的联想和查询问题的推荐。
- 支持多轮对话,根据语境自动切换上下文。