在数据分析中,将不同表中的数据合并起来是一项常见任务。这就像将拼图块拼在一起,让能够一次性分析和报告所有信息。本文将探讨如何使用SQL查询语句,如JOIN、UNION等,来合并表。
首先,需要创建表并插入示例数据。以下是创建员工表和部门表的SQL语句。
创建员工表:
CREATE TABLE employees (
employee_id INT,
employee_name VARCHAR(50),
department_id INT
);
向员工表插入数据:
INSERT INTO employees (employee_id, employee_name, department_id) VALUES
(1, 'Alice', 1),
(2, 'Bob', 2),
(3, 'Carol', 1),
(4, 'David', 3),
(5, 'Eve', 2);
创建部门表:
CREATE TABLE departments (
department_id INT,
department_name VARCHAR(50)
);
向部门表插入数据:
INSERT INTO departments (department_id, department_name) VALUES
(1, 'HR'),
(2, 'IT'),
(3, 'Sales'),
(4, 'Marketing');
1. INNER JOIN:INNER JOIN检索两个表中具有匹配值的记录。
SELECT e.employee_id, e.employee_name, d.department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id;
2. LEFT JOIN:LEFT JOIN检索左表(employees)的所有记录,以及右表(departments)中匹配的记录。
SELECT e.employee_id, e.employee_name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id;
3. RIGHT JOIN:RIGHT JOIN检索右表的所有记录和左表中匹配的记录。
SELECT e.employee_id, e.employee_name, d.department_name
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.department_id;
4. FULL OUTER JOIN:FULL OUTER JOIN检索左表或右表中所有匹配的记录。
SELECT e.employee_id, e.employee_name, d.department_name
FROM employees e
FULL OUTER JOIN departments d ON e.department_id = d.department_id;
5. CROSS JOIN:CROSS JOIN检索两个表的笛卡尔积。
SELECT e.employee_id, e.employee_name, d.department_name
FROM employees e
CROSS JOIN departments d;
6. SELF JOIN:SELF JOIN将表与自身进行连接。
SELECT e1.employee_id, e1.employee_name, e2.employee_name AS manager_name
FROM employees e1
LEFT JOIN employees e2 ON e1.department_id = e2.department_id AND e1.employee_id != e2.employee_id;
7. SEMIJOIN:SEMI JOIN检索左表中在右表中至少有一个匹配的行。
SELECT e.employee_id, e.employee_name
FROM employees e
WHERE EXISTS (
SELECT 1
FROM departments d
WHERE e.department_id = d.department_id
);
8. ANTI JOIN:ANTI JOIN检索左表中在右表中没有匹配的行。
SELECT e.employee_id, e.employee_name
FROM employees e
WHERE NOT EXISTS (
SELECT 1
FROM departments d
WHERE e.department_id = d.department_id
);
9. UNION:UNION将两个或多个SELECT语句的结果集合并,获取不重复的行。
SELECT employee_id, employee_name, NULL AS department_name
FROM employees
UNION
SELECT NULL AS employee_id, NULL AS employee_name, department_name
FROM departments;
10.UNIONALL:UNION ALL将两个SELECT语句的结果集合并,包括重复的行。
SELECT employee_id, employee_name, department_id
FROM employees
UNION ALL
SELECT department_id AS employee_id, department_name AS employee_name, NULL AS department_id
FROM departments;
11. INTERSECT:INTERSECT检索两个SELECT语句的交集。
SELECT employee_id, employee_name, department_id
FROM employees
INTERSECT
SELECT department_id AS employee_id, department_name AS employee_name, NULL AS department_id
FROM departments;
Q1. INNER JOIN和OUTER JOIN有什么区别?
A. INNER JOIN仅检索两个表中匹配的记录,而OUTER JOIN检索匹配的记录以及一个或两个表中的所有记录,具体取决于类型(LEFT、RIGHT、FULL)。
Q2. 什么时候应该使用CROSS JOIN?
A. 当需要两个表的笛卡尔积时,即第一个表的每一行都与第二个表的每一行配对。
Q3. UNION和UNION ALL有什么不同?
A. UNION移除重复记录,而UNION ALL包含所有重复记录。
Q4. SEMI JOIN和ANTI JOIN分别用于什么?
A. SEMIJOIN返回左表中至少有一个匹配的行。ANTI JOIN返回左表中没有匹配的行。