Commit ffca226

feat: add solutions to lc problem: No.3554 (#4418)
1 parent 741bea8 commit ffca226

4 files changed: 174 additions and 2 deletions
solution/3500-3599/3554.Find Category Recommendation Pairs/README.md

Lines changed: 61 additions & 1 deletion
@@ -171,14 +171,74 @@ product_id is the unique primary key of this table.

 <!-- solution:start -->

-### Solution 1
+### Solution 1: Join + Group Aggregation
+
+We first join the `ProductPurchases` table with the `ProductInfo` table on `product_id` to get a table `user_category` made up of `user_id` and `category`. Next, we self-join `user_category` to get every category pair purchased by each user. We then group these pairs, count the users for each pair, and keep only the pairs with at least 3 users.
+
+Finally, we sort by user count in descending order, then by `category1` and `category2` in ascending order, to get the final result.

 <!-- tabs:start -->

 #### MySQL

 ```sql
+# Write your MySQL query statement below
+WITH
+    user_category AS (
+        SELECT DISTINCT
+            user_id,
+            category
+        FROM
+            ProductPurchases
+            JOIN ProductInfo USING (product_id)
+    ),
+    pair_per_user AS (
+        SELECT
+            a.user_id,
+            a.category AS category1,
+            b.category AS category2
+        FROM
+            user_category AS a
+            JOIN user_category AS b ON a.user_id = b.user_id AND a.category < b.category
+    )
+SELECT category1, category2, COUNT(DISTINCT user_id) AS customer_count
+FROM pair_per_user
+GROUP BY 1, 2
+HAVING customer_count >= 3
+ORDER BY 3 DESC, 1, 2;
+```

+#### Pandas
+
+```python
+import pandas as pd
+
+
+def find_category_recommendation_pairs(
+    product_purchases: pd.DataFrame, product_info: pd.DataFrame
+) -> pd.DataFrame:
+    df = product_purchases[["user_id", "product_id"]].merge(
+        product_info[["product_id", "category"]], on="product_id", how="inner"
+    )
+    user_category = df.drop_duplicates(subset=["user_id", "category"])
+    pair_per_user = (
+        user_category.merge(user_category, on="user_id")
+        .query("category_x < category_y")
+        .rename(columns={"category_x": "category1", "category_y": "category2"})
+    )
+    pair_counts = (
+        pair_per_user.groupby(["category1", "category2"])["user_id"]
+        .nunique()
+        .reset_index(name="customer_count")
+    )
+    result = (
+        pair_counts.query("customer_count >= 3")
+        .sort_values(
+            ["customer_count", "category1", "category2"], ascending=[False, True, True]
+        )
+        .reset_index(drop=True)
+    )
+    return result
 ```

 <!-- tabs:end -->
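
The join, self-join, and group-aggregation steps described in this README change can be traced on a small input without a database or pandas. The following is a minimal plain-Python sketch, not the committed solution: the function name `category_pairs`, its `min_customers` parameter, the input shapes, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def category_pairs(purchases, product_category, min_customers=3):
    """Plain-Python sketch of the join + group aggregation idea.

    purchases: iterable of (user_id, product_id) tuples (hypothetical input shape).
    product_category: dict mapping product_id -> category (hypothetical input shape).
    """
    # Step 1: "join" and de-duplicate, giving each user's set of categories.
    categories_by_user = defaultdict(set)
    for user_id, product_id in purchases:
        if product_id in product_category:  # inner-join semantics
            categories_by_user[user_id].add(product_category[product_id])

    # Step 2: per-user "self-join" that keeps only category1 < category2.
    users_by_pair = defaultdict(set)
    for user_id, categories in categories_by_user.items():
        for category1, category2 in combinations(sorted(categories), 2):
            users_by_pair[(category1, category2)].add(user_id)

    # Step 3: count users per pair, keep pairs with enough customers, then sort.
    rows = [
        (category1, category2, len(users))
        for (category1, category2), users in users_by_pair.items()
        if len(users) >= min_customers
    ]
    rows.sort(key=lambda row: (-row[2], row[0], row[1]))
    return rows


# Made-up sample data: users 1 to 3 buy from Books and Toys, user 4 only from Books.
product_category = {101: "Books", 102: "Toys", 103: "Books", 104: "Garden"}
purchases = [(1, 101), (1, 102), (2, 103), (2, 102), (3, 101), (3, 102), (3, 104), (4, 101)]
print(category_pairs(purchases, product_category))
```

Holding a set of users per pair plays the role of `COUNT(DISTINCT user_id)`; lowering `min_customers` is a quick way to inspect the pairs that the threshold would otherwise hide.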

solution/3500-3599/3554.Find Category Recommendation Pairs/README_EN.md

Lines changed: 61 additions & 1 deletion
@@ -170,14 +170,74 @@ Each row assigns a category and price to a product.

 <!-- solution:start -->

-### Solution 1
+### Solution 1: Join + Group Aggregation
+
+First, we join the `ProductPurchases` table and the `ProductInfo` table on `product_id` to obtain a `user_category` table consisting of `user_id` and `category`. Next, we self-join the `user_category` table to get all category pairs purchased by each user. Then we group these category pairs, count the number of users for each pair, and keep only the pairs with at least 3 users.
+
+Finally, we sort the result by customer count in descending order, then by `category1` in ascending order, and then by `category2` in ascending order.

 <!-- tabs:start -->

 #### MySQL

 ```sql
+# Write your MySQL query statement below
+WITH
+    user_category AS (
+        SELECT DISTINCT
+            user_id,
+            category
+        FROM
+            ProductPurchases
+            JOIN ProductInfo USING (product_id)
+    ),
+    pair_per_user AS (
+        SELECT
+            a.user_id,
+            a.category AS category1,
+            b.category AS category2
+        FROM
+            user_category AS a
+            JOIN user_category AS b ON a.user_id = b.user_id AND a.category < b.category
+    )
+SELECT category1, category2, COUNT(DISTINCT user_id) AS customer_count
+FROM pair_per_user
+GROUP BY 1, 2
+HAVING customer_count >= 3
+ORDER BY 3 DESC, 1, 2;
+```

+#### Pandas
+
+```python
+import pandas as pd
+
+
+def find_category_recommendation_pairs(
+    product_purchases: pd.DataFrame, product_info: pd.DataFrame
+) -> pd.DataFrame:
+    df = product_purchases[["user_id", "product_id"]].merge(
+        product_info[["product_id", "category"]], on="product_id", how="inner"
+    )
+    user_category = df.drop_duplicates(subset=["user_id", "category"])
+    pair_per_user = (
+        user_category.merge(user_category, on="user_id")
+        .query("category_x < category_y")
+        .rename(columns={"category_x": "category1", "category_y": "category2"})
+    )
+    pair_counts = (
+        pair_per_user.groupby(["category1", "category2"])["user_id"]
+        .nunique()
+        .reset_index(name="customer_count")
+    )
+    result = (
+        pair_counts.query("customer_count >= 3")
+        .sort_values(
+            ["customer_count", "category1", "category2"], ascending=[False, True, True]
+        )
+        .reset_index(drop=True)
+    )
+    return result
 ```

 <!-- tabs:end -->
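
The join condition `a.category < b.category` (and the matching `category_x < category_y` filter in the pandas version) is what makes each unordered category pair appear once per user, with `category1` sorting before `category2`. A small plain-Python sketch of that point follows; the `categories` set is a made-up example and `itertools.product` merely stands in for the self-join.

```python
from itertools import combinations, product

# Hypothetical set of distinct categories bought by a single user.
categories = {"Toys", "Books", "Garden"}

# Self-join of the user's categories with themselves, keeping rows where a < b.
self_join_pairs = sorted((a, b) for a, b in product(categories, repeat=2) if a < b)

# The same pairs, generated directly as 2-element combinations of the sorted categories.
combination_pairs = list(combinations(sorted(categories), 2))

print(self_join_pairs)
print(self_join_pairs == combination_pairs)

# Using != instead of < would emit every pair twice, once in each order.
both_orders = [(a, b) for a, b in product(categories, repeat=2) if a != b]
print(len(both_orders), len(self_join_pairs))
```

With a strict inequality, a category is also never paired with itself, so no separate filter for `category1 = category2` is needed.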
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+import pandas as pd
+
+
+def find_category_recommendation_pairs(
+    product_purchases: pd.DataFrame, product_info: pd.DataFrame
+) -> pd.DataFrame:
+    df = product_purchases[["user_id", "product_id"]].merge(
+        product_info[["product_id", "category"]], on="product_id", how="inner"
+    )
+    user_category = df.drop_duplicates(subset=["user_id", "category"])
+    pair_per_user = (
+        user_category.merge(user_category, on="user_id")
+        .query("category_x < category_y")
+        .rename(columns={"category_x": "category1", "category_y": "category2"})
+    )
+    pair_counts = (
+        pair_per_user.groupby(["category1", "category2"])["user_id"]
+        .nunique()
+        .reset_index(name="customer_count")
+    )
+    result = (
+        pair_counts.query("customer_count >= 3")
+        .sort_values(
+            ["customer_count", "category1", "category2"], ascending=[False, True, True]
+        )
+        .reset_index(drop=True)
+    )
+    return result
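
The final `sort_values` call orders rows by `customer_count` descending and then by the two category names ascending. As a rough illustration of that ordering outside pandas, the sketch below sorts made-up `(category1, category2, customer_count)` tuples with a key that negates the count; the rows are hypothetical and not taken from the problem's test data.

```python
# Hypothetical (category1, category2, customer_count) rows, in arbitrary order.
rows = [
    ("Books", "Toys", 3),
    ("Books", "Garden", 5),
    ("Garden", "Toys", 3),
    ("Books", "Clothing", 3),
]

# Negating the count sorts it descending while the category names stay ascending.
ordered = sorted(rows, key=lambda row: (-row[2], row[0], row[1]))

for row in ordered:
    print(row)
```

Ties on the count are broken by `category1` and then by `category2`, which is what makes the output order deterministic.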
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# Write your MySQL query statement below
+WITH
+    user_category AS (
+        SELECT DISTINCT
+            user_id,
+            category
+        FROM
+            ProductPurchases
+            JOIN ProductInfo USING (product_id)
+    ),
+    pair_per_user AS (
+        SELECT
+            a.user_id,
+            a.category AS category1,
+            b.category AS category2
+        FROM
+            user_category AS a
+            JOIN user_category AS b ON a.user_id = b.user_id AND a.category < b.category
+    )
+SELECT category1, category2, COUNT(DISTINCT user_id) AS customer_count
+FROM pair_per_user
+GROUP BY 1, 2
+HAVING customer_count >= 3
+ORDER BY 3 DESC, 1, 2;
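
The query de-duplicates `(user_id, category)` rows with `SELECT DISTINCT` before the self-join. One way to see why that matters is to count pair rows with and without the de-duplication step. The sketch below does this in plain Python; the `joined_rows` data and the `pair_rows` helper are hypothetical, and `itertools.product` stands in for the self-join.

```python
from collections import Counter
from itertools import product

# Hypothetical (user_id, category) rows after the join; user 1 bought two Books products.
joined_rows = [(1, "Books"), (1, "Books"), (1, "Toys"), (2, "Books"), (2, "Toys")]


def pair_rows(rows):
    """Self-join rows on user_id, keeping only category1 < category2."""
    return [
        (user_a, cat_a, cat_b)
        for (user_a, cat_a), (user_b, cat_b) in product(rows, repeat=2)
        if user_a == user_b and cat_a < cat_b
    ]


# Without de-duplication, user 1 contributes the (Books, Toys) pair twice.
raw_counts = Counter((c1, c2) for _, c1, c2 in pair_rows(joined_rows))

# De-duplicating first (the SELECT DISTINCT step) leaves one row per user and pair.
distinct_rows = sorted(set(joined_rows))
deduped_counts = Counter((c1, c2) for _, c1, c2 in pair_rows(distinct_rows))

print(raw_counts[("Books", "Toys")])      # row count inflated by the repeat purchase
print(deduped_counts[("Books", "Toys")])  # one row per user
```

In the committed query, `COUNT(DISTINCT user_id)` would absorb such repeats on its own, so the earlier `SELECT DISTINCT` mainly keeps the self-join small.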
