Machine Learning Interviews: LeetCode SQL Practice Problems and Answers

Complete interview preparation for data scientists and algorithm engineers: github.com/LongxingTan/Machine-learning-interview

  • https://dbeaver.io/download/
  • primary key: unique and not null
  • foreign key: refers to the primary key of another table
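
The two key definitions above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table layout and rows are made up for illustration, and SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`. It shows the uniqueness and reference checks; the "not null" half is left out because SQLite handles NULL in primary keys differently from most engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.execute("CREATE TABLE Person (personId INTEGER PRIMARY KEY, lastName TEXT)")
conn.execute(
    "CREATE TABLE Address ("
    " addressId INTEGER PRIMARY KEY,"
    " personId INTEGER REFERENCES Person(personId),"
    " city TEXT)"
)
conn.execute("INSERT INTO Person VALUES (1, 'Wang')")
conn.execute("INSERT INTO Address VALUES (10, 1, 'New York')")  # parent row exists


def rejected(sql):
    """Return True if the statement violates a key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Reusing personId 1 breaks primary-key uniqueness.
duplicate_key_rejected = rejected("INSERT INTO Person VALUES (1, 'Alice')")
# personId 99 does not exist in Person, so the foreign key has nothing to refer to.
orphan_row_rejected = rejected("INSERT INTO Address VALUES (11, 99, 'Leetcode')")

print(duplicate_key_rejected, orphan_row_rejected)  # True True
```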

>>> Basic

1050. Actors and Directors Who Cooperated At Least Three Times

SELECT actor_id, director_id
FROM ActorDirector
GROUP BY actor_id, director_id
HAVING COUNT(*) >= 3;

1076. Project Employees II

-- TOP 1 WITH TIES (SQL Server) also returns every project tied for first place.
SELECT TOP 1 WITH TIES project_id
FROM Project
GROUP BY project_id
ORDER BY COUNT(employee_id) DESC;

1082. Sales Analysis I

SELECT TOP 1 WITH TIES seller_id
FROM Sales
GROUP BY seller_id
ORDER BY SUM(price) DESC;

1141. User Activity for the Past 30 Days I

SELECT activity_date as day, COUNT(DISTINCT user_id) as active_users
FROM Activity
WHERE activity_date between '2019-06-28' and '2019-07-27'
GROUP BY activity_date;

1148. Article Views I

SELECT DISTINCT author_id as id
FROM Views
WHERE author_id = viewer_id
ORDER BY id;

1149. Article Views II

SELECT DISTINCT viewer_id as id
FROM Views
GROUP BY viewer_id, view_date
HAVING COUNT(DISTINCT article_id) > 1
ORDER BY id;

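182. Duplicate Emails

Aggregate functions such as COUNT are usually used together with a GROUP BY clause, and any condition on their result belongs in the HAVING clause. Using an aggregate function directly in the WHERE clause is an error.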

SELECT email as email
FROM Person
GROUP BY email
HAVING COUNT(email) > 1;
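
The HAVING versus WHERE rule can be checked with a small sqlite3 sketch. The rows are made up for illustration, and the exact error text differs between database engines, so the example only records that an error occurred.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO Person (id, email) VALUES (?, ?)",
    [(1, "a@b.com"), (2, "c@d.com"), (3, "a@b.com")],
)

# Aggregate condition in HAVING: applied after the rows are grouped.
duplicates = conn.execute(
    "SELECT email FROM Person GROUP BY email HAVING COUNT(email) > 1"
).fetchall()

# The same condition in WHERE is rejected: WHERE filters single rows
# before any grouping happens, so COUNT has nothing to count yet.
try:
    conn.execute("SELECT email FROM Person WHERE COUNT(email) > 1")
    where_error = None
except sqlite3.Error as exc:
    where_error = str(exc)

print(duplicates)               # [('a@b.com',)]
print(where_error is not None)  # True
```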

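511. Game Play Analysis I

For aggregate queries, MIN is the more general solution and works in every SQL database. TOP 1 (SQL Server syntax) is better suited to non-aggregate queries that pick the first row after sorting.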

SELECT player_id, MIN(event_date) as first_login
FROM Activity
GROUP BY player_id;
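
A rough sqlite3 sketch of the difference, with made-up rows. SQLite has no TOP, so `ORDER BY ... LIMIT 1` stands in for `TOP 1` here; that substitution is an assumption of this example, not part of the solution above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Activity (player_id INTEGER, event_date TEXT)")
conn.executemany(
    "INSERT INTO Activity VALUES (?, ?)",
    [
        (1, "2016-03-01"),
        (1, "2016-05-02"),
        (2, "2017-06-25"),
        (3, "2016-03-02"),
        (3, "2018-07-03"),
    ],
)

# MIN with GROUP BY: one first-login row per player.
per_player = conn.execute(
    "SELECT player_id, MIN(event_date) AS first_login "
    "FROM Activity GROUP BY player_id ORDER BY player_id"
).fetchall()

# Sort-and-take-first: a single row for the whole table.
first_row = conn.execute(
    "SELECT player_id, event_date FROM Activity ORDER BY event_date LIMIT 1"
).fetchall()

print(per_player)  # [(1, '2016-03-01'), (2, '2017-06-25'), (3, '2016-03-02')]
print(first_row)   # [(1, '2016-03-01')]
```

Getting a per-player answer from the sort-based form would need a correlated subquery or a window function, which is why MIN is the simpler fit for this problem.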

578. Get Highest Answer Rate Question

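-- COUNT(answer_id) counts answered rows; COUNT(*) - COUNT(answer_id) counts rows with no answer.
-- Ties are broken by the smallest question_id.
SELECT TOP 1 question_id AS survey_log
FROM survey_log
GROUP BY question_id
ORDER BY COUNT(answer_id) * 1.0 / (COUNT(*) - COUNT(answer_id)) DESC, question_id;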

584. Find Customer Referee

SELECT name
FROM Customer
WHERE referee_id != 2 OR referee_id IS NULL;

586. Customer Placing the Largest Number of Orders

SELECT customer_number
FROM orders
GROUP BY customer_number
ORDER BY COUNT(*) DESC
LIMIT 1;

595. Big Countries

SELECT name, population, area
FROM World
WHERE area >= 3000000 OR population >= 25000000;

596. Classes More Than 5 Students

SELECT class
FROM Courses
GROUP BY class
HAVING COUNT(*) >= 5;

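619. Biggest Single Number

The extra outer SELECT makes the query return NULL when no number qualifies, for example when the table is empty.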

SELECT (
    SELECT num
    FROM MyNumbers
    GROUP BY num
    HAVING COUNT(*) = 1
    ORDER BY num DESC
    LIMIT 1
) as num;
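
The effect of the outer SELECT can be seen in a short sqlite3 sketch. The data is chosen so that no number appears exactly once; the rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyNumbers (num INTEGER)")
# Every number appears twice, so no "single" number exists.
conn.executemany("INSERT INTO MyNumbers VALUES (?)", [(8,), (8,), (3,), (3,)])

inner = """
    SELECT num
    FROM MyNumbers
    GROUP BY num
    HAVING COUNT(*) = 1
    ORDER BY num DESC
    LIMIT 1
"""

# On its own, the query returns no rows at all.
bare_rows = conn.execute(inner).fetchall()

# Used as a scalar subquery, an empty result becomes a single NULL value.
wrapped_rows = conn.execute(f"SELECT ({inner}) AS num").fetchall()

print(bare_rows)     # []
print(wrapped_rows)  # [(None,)]
```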

620. Not Boring Movies

SELECT *
FROM cinema
WHERE description != 'boring' AND id % 2 = 1
ORDER BY rating DESC;

>>> Case-when

610. Triangle Judgement

SELECT *, 
    (CASE WHEN x + y > z AND x + z > y AND y + z > x THEN 'Yes' ELSE 'No' END) AS triangle
FROM Triangle;

627. Swap Salary

-- SELECT id, name,
--     (CASE WHEN sex = 'f' THEN 'm' ELSE 'f' END) AS sex, salary
-- FROM salary;

UPDATE salary
SET
    sex = CASE sex
        WHEN 'm' THEN 'f'
        ELSE 'm'
    END;

1126. Active Businesses

WITH tb1 AS (
    -- Attach the average occurrence count of each event type to every row.
    SELECT *, AVG(occurences * 1.0) OVER (PARTITION BY event_type) AS avg_oc
    FROM Events
)
SELECT business_id
FROM tb1
GROUP BY business_id
HAVING SUM(CASE WHEN occurences > avg_oc THEN 1 ELSE 0 END) > 1;

1142. User Activity for the Past 30 Days II


1158. Market Analysis I

SELECT u.user_id AS buyer_id, u.join_date AS join_date, 
    SUM(CASE WHEN YEAR(o.order_date) = 2019 THEN 1 ELSE 0 END) AS orders_in_2019
FROM Users u
LEFT JOIN Orders o 
ON u.user_id = o.buyer_id
GROUP BY u.user_id, u.join_date;

1159. Market Analysis II


1173. Immediate Food Delivery I

1174. Immediate Food Delivery II


>>> JOIN

A quick review of merge in pandas:

  • The default is how='inner'; left_on and right_on did not behave the way I had expected
  • https://pandas.pydata.org/docs/reference/api/pandas.merge.html
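
A small sketch of two merge behaviors that often surprise people coming from SQL. Whether these are the surprises meant above is a guess; the frames and column names are made up for illustration. The default inner join silently drops unmatched rows, and when left_on and right_on name different columns, both key columns are kept in the result.

```python
import pandas as pd

person = pd.DataFrame({"personId": [1, 2], "lastName": ["Wang", "Alice"]})
address = pd.DataFrame({"person_id": [2, 3], "city": ["New York City", "Leetcode"]})

# Default how="inner": only personId 2 has a match, so one row survives.
inner = pd.merge(person, address, left_on="personId", right_on="person_id")

# how="left" mirrors SQL's LEFT JOIN: every person is kept,
# with missing values where no address matched.
left = pd.merge(person, address, how="left", left_on="personId", right_on="person_id")

print(list(inner.columns))  # ['personId', 'lastName', 'person_id', 'city']
print(len(inner), len(left))  # 1 2
```

Dropping the duplicate key afterwards, for example with `left.drop(columns="person_id")`, brings the result closer to what the SQL LEFT JOIN with an explicit column list returns.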

175. Combine Two Tables

SELECT l.firstName, l.lastName, r.city, r.state
FROM Person l
LEFT JOIN Address r
ON l.personId = r.personId;

181. Employees Earning More Than Their Managers

A self join requires giving the table two different aliases.

SELECT e1.name AS Employee
FROM Employee e1
LEFT JOIN Employee e2
ON e1.managerId = e2.id 
WHERE e1.salary > e2.salary;
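
A self-contained sqlite3 sketch of the self join, with illustrative rows. It uses a plain JOIN rather than LEFT JOIN; for this query the result is the same, because employees without a manager have no e2.salary to compare against and the WHERE clause drops them either way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    " id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, managerId INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [
        (1, "Joe", 70000, 3),
        (2, "Henry", 80000, 4),
        (3, "Sam", 60000, None),
        (4, "Max", 90000, None),
    ],
)

# e1 plays the role of the employee, e2 the role of that employee's manager.
rows = conn.execute(
    "SELECT e1.name AS Employee "
    "FROM Employee e1 "
    "JOIN Employee e2 ON e1.managerId = e2.id "
    "WHERE e1.salary > e2.salary"
).fetchall()

print(rows)  # [('Joe',)]
```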

183. Customers Who Never Order

SELECT c.name AS Customers
FROM Customers c 
LEFT JOIN Orders o
ON c.id = o.customerId
WHERE o.id IS NULL;

577. Employee Bonus

SELECT e.name, b.bonus
FROM Employee e 
LEFT JOIN Bonus b 
ON e.empId = b.empId 
WHERE b.bonus < 1000 OR b.bonus IS NULL;

613. Shortest Distance in a Line

>>> WINDOW FUNCTION

603. Consecutive Available Seats

>>> PIVOT
