Teradata EXPLAIN

Last modified: 2021-01-11 11:33:26    Author: Mango

The EXPLAIN command returns the Parsing Engine (PE) execution plan for a query, translated into plain English. It can be used with any SQL statement except another EXPLAIN command.

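When a query is preceded by the EXPLAIN command, the Parsing Engine's execution plan is returned to the user instead of being sent to the AMPs. The explain plan shows clearly how the optimizer will execute the query.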

Before running a new query on a Teradata system, it is good practice to analyze its explain plan.

An explain plan can be obtained in two ways: by adding the EXPLAIN keyword in front of the query, or by pressing the F6 key in the query tool.

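The explain plan breaks a query down into its lowest-level steps, which helps in analyzing performance problems. It provides a lot of information, such as:

  • Access path: whether the data is fetched with a full table scan or through an index (primary index path, secondary index path, or any other index).
  • Confidence level: whether the optimizer can use statistics or statistics are missing.
  • Join information: which kind of join will take place.
  • Time estimate: the estimated time to complete the query.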

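When a query is prefixed with EXPLAIN (or F6 is pressed), the optimizer reports one of the following confidence levels for its estimates:

  • High confidence: statistics are available on the index or column.
  • Low confidence: a random sample of an index is available, or statistics are available but the condition involves AND/OR.
  • No confidence: the estimate is based on random AMP sampling of row counts; no statistics have been collected.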
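
As a rough illustration of how these levels relate to statistics, the sketch below collects statistics on a column of the Employee table defined later in this article and then re-runs EXPLAIN. Filtering on Department_No is an assumption made for the example, and the confidence wording actually reported depends on the system and the data.

-- Before statistics are collected, the estimate for this filter
-- may be reported with low or no confidence.
EXPLAIN SELECT * FROM Employee WHERE Department_No = 3;

-- Collect statistics on the filtered column.
COLLECT STATISTICS ON Employee COLUMN (Department_No);

-- Re-check the plan; the row estimate may now be reported
-- with high confidence.
EXPLAIN SELECT * FROM Employee WHERE Department_No = 3;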

EXPLAIN plan keywords

To read an EXPLAIN plan, we should know the following keywords.

| Keyword | Explanation |
|---|---|
| Locking Pseudo Table | Serial lock on a symbolic table. Every table has one. It is used to prevent deadlock situations between users. |
| Locking table for | Indicates that an ACCESS, READ, WRITE, or EXCLUSIVE lock has been placed on the table. |
| Locking rows for | Indicates that an ACCESS, READ, or WRITE lock is placed on the rows read or written. |
| Do an ABORT test | Guarantees that a transaction is not in progress for this user. |
| All AMPs retrieve | All AMPs receive the AMP steps and are involved in providing the answer set. |
| By way of an all-rows scan | Rows are read sequentially on all AMPs. |
| By way of the primary index | Rows are read using the primary index column(s). |
| By way of index number | Rows are read using a secondary index; the number is the one reported by HELP INDEX. |
| BMSMS | Bit Map Set Manipulation Step, an alternative direct access technique used when multiple NUSI columns are referenced in the WHERE clause. |
| Residual conditions | WHERE clause conditions other than those of a join. |
| Eliminating duplicate rows | Producing unique values, usually as the result of DISTINCT, GROUP BY, or a subquery. |
| Where unknown comparison will be ignored | Indicates that NULL values will not compare to TRUE or FALSE. Seen in a subquery using NOT IN or NOT = ALL, because no rows will be returned on the ignored comparison. |
| Nested join | The fastest join possible. It uses a UPI to retrieve a single row after using a UPI or a USI in the WHERE clause to reduce the join to a single row. |
| Merge join | Rows of one table are matched to the other table on common domain columns after being sorted into the same sequence, normally row hash. |
| Product join | Rows of one table are matched to all rows of another table with no concern for domain match. |
| ROWID join | A very fast join. It uses the ROWID of a UPI to retrieve a single row after using a UPI or a USI in the WHERE clause to reduce the join to a single row. |
| Duplicated on all AMPs | Participating rows for one table of a join are duplicated on all AMPs. |
| Hash redistributed on all AMPs | Participating rows of a join are hashed on the join column and sent to the AMP that stores the matching row of the table to join. |
| SMS | Set Manipulation Step, the result of an INTERSECT, UNION, EXCEPT, or MINUS operation. |
| Last use | The spool file is no longer needed after the step, and its space is released. |
| Built locally on the AMPs | As rows are read, they are put into spool on the same AMP. |
| Aggregate Intermediate Results computed locally | The aggregation values are all on the same AMP, so there is no need to redistribute them to work with rows on other AMPs. |
| Aggregate Intermediate Results computed globally | The aggregation values are not all on the same AMP and must be redistributed to one AMP to be combined with the same value from the other AMPs. |

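How EXPLAIN works

The EXPLAIN request modifier in front of any SQL request causes Teradata Database to display the execution plan for that request. The request itself is not submitted for execution.

  • When we run EXPLAIN on an SQL request, the request is parsed and optimized.
  • The access and join plans generated by the optimizer are returned as text that describes the (possibly parallel) steps used to execute the request.
  • The output also includes the relative cost of completing the request, given the statistics the optimizer had to work with.
  • If the statistics are not reasonably accurate, the cost estimate may not be accurate either.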
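
Because the request is only parsed and optimized, EXPLAIN can be placed in front of a data-modifying statement without changing any rows. A minimal sketch, assuming the Employee table defined later in this article and an arbitrary department number chosen for illustration:

-- Returns the plan for the DELETE; no rows are deleted.
EXPLAIN DELETE FROM Employee WHERE Department_No = 3;

-- The row count is the same before and after the EXPLAIN above.
SELECT COUNT(*) FROM Employee;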

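Benefits of using EXPLAIN

Some basic benefits of using the explain plan are:

  • EXPLAIN helps to evaluate complex queries and to develop alternative, more efficient processing strategies.
  • We can get a better plan by collecting more statistics on more columns or by defining additional indexes.
  • Knowing the actual demographics of the data makes it possible to spot row-count estimates that look badly wrong, and helps to pinpoint areas where additional statistics may be needed.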
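
One way to act on the second point is sketched below: define a secondary index on a filtered column and compare the plans before and after. The choice of Department_No as the indexed column is an assumption for illustration, and the optimizer may still prefer a full table scan, for example on a very small table or when no statistics exist for the index.

-- Plan before the index exists.
EXPLAIN SELECT * FROM Employee WHERE Department_No = 3;

-- Define a non-unique secondary index (NUSI) on the filtered column.
CREATE INDEX (Department_No) ON Employee;

-- List the table's indexes together with their index numbers.
HELP INDEX Employee;

-- Plan after the index exists; compare the access path with the first plan.
EXPLAIN SELECT * FROM Employee WHERE Department_No = 3;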

EXPLAIN examples

Consider the Employee table with the following definition.

CREATE SET TABLE EMPLOYEE, FALLBACK (
   Employee_Id INTEGER,
   First_Name VARCHAR(10),
   Last_Name VARCHAR(10),
   DOB DATE FORMAT 'YYYY-MM-DD',
   Department_No BYTEINT
)
UNIQUE PRIMARY INDEX ( Employee_Id );

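Full table scan (FTS)

If no condition is specified in the SELECT statement, the optimizer may choose a full table scan, in which every row of the table is accessed.

The following is a query for which the optimizer may choose FTS.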

EXPLAIN SELECT * FROM Employee;

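When the above query is executed, it produces the following output. As can be seen, the optimizer chooses to access all AMPs and all rows within each AMP.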

1. First, we lock a distinct TDUSER."pseudo table" for reading on a RowHash to prevent global deadlock for TDUSER.Employee.  
2. Next, we lock TDUSER.Employee to read.   
3. We do an all-AMPs RETRIEVE step from TDUSER.Employee by way of an all-rows scan with no residual conditions into Spool 1 (group_amps) built locally on the AMPs.  The size of Spool 1 is estimated with low confidence to be 2 rows (116 bytes). The estimated time for this step is 0.03 seconds.
4. Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. 

//The contents of Spool 1 are sent back to the user as the result of statement 1.
//The total estimated time is 0.03 seconds.
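
For comparison, a condition on a column that is not indexed usually still requires reading every row; in that case the condition would be expected to appear in the plan as a residual condition on the all-rows scan. The query below is a sketch with an arbitrary date chosen for illustration; its output is not shown because the exact wording varies by system.

-- DOB has no index, so the filter is likely applied while scanning all rows.
EXPLAIN SELECT First_Name, Last_Name
FROM Employee
WHERE DOB < DATE '1990-01-01';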

Unique primary index

When a row is accessed using the unique primary index, it is a single-AMP operation.

EXPLAIN SELECT * FROM Employee WHERE Employee_Id = 1001;

When the above query is executed, it results in a single-AMP retrieval, and the optimizer uses the unique primary index to access the row.

1. First, we do a single-AMP RETRIEVE step from TDUSER.Employee by way of the unique primary index "TDUSER.Employee.Employee_Id = 1001" with no residual conditions.  

//The row is sent directly back to the user as the result of statement 1.  
//The total estimated time is 0.01 seconds.

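Unique secondary index

When a row is accessed using a unique secondary index, it is a two-AMP operation.

Consider the Salary table with the following definition.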

CREATE SET TABLE SALARY, FALLBACK 
( 
   Employee_Id INTEGER, 
   Gross INTEGER, 
   Deduction INTEGER, 
   NetPay INTEGER 
)
PRIMARY INDEX (Employee_Id) 
UNIQUE INDEX (Employee_Id);

Note that this definition places the primary index and the unique secondary index on the same column. For the unique secondary index path shown below to be chosen, the primary index would normally be defined on a different column.

Consider the following SELECT statement.

EXPLAIN SELECT * FROM Salary WHERE Employee_Id = 1001;

When the above query is executed, the optimizer retrieves the row in a two-AMP operation using the unique secondary index.

1. First, we do a two-AMP RETRIEVE step from TDUSER.Salary by way of unique index # 4 "TDUSER.Salary.Employee_Id = 1001" with no residual conditions.   

//The row is sent directly back to the user as the result of statement 1.  
//The total estimated time is 0.01 seconds.
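
The join keywords from the keyword table can be explored in the same way. The two queries below are a sketch using the Employee and Salary tables defined above; the table aliases and the selected columns are chosen for illustration. Because both tables use Employee_Id as their primary index, matching rows hash to the same AMP, so the first plan may show a merge join without redistribution. The second query has no join condition, so a product join with one table duplicated on all AMPs is the likely outcome. The actual plans depend on the system, the data, and the available statistics.

-- Join on the shared primary index column.
EXPLAIN SELECT e.Employee_Id, e.First_Name, s.NetPay
FROM Employee e
INNER JOIN Salary s
   ON e.Employee_Id = s.Employee_Id;

-- No join condition: every Employee row is paired with every Salary row.
EXPLAIN SELECT e.Employee_Id, s.NetPay
FROM Employee e
CROSS JOIN Salary s;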