
Why does the re.findall result contain extra , " and similar characters?

blackantt


#1, posted 2021-04-20 19:14

import requests
import re

url = 'http://www.shubang.net/book/66_2151.html'
# Send a browser-like User-Agent with the request
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36'}
web_data = requests.get(url, headers=headers)
web_data.encoding = 'utf-8'
txt = web_data.text
# Two capture groups joined by |: one for the English line, one for the Chinese line
items = re.findall(r'line_en\" \>(.*)<|line_cn\" title=\"(.*)\"', txt)
for item in items:
    print(item)

The output looks like this:
...
('"It doesn't look new. It looks old," one of the boys said.', '')
('', '“房子一点也不新，旧死了，”其中一个男孩说。')
('It just couldn't be.', '')
('', '绝对不可能。')
('The other members of his family turned to stare at me.', '')
('', '其他人都把目光转向了我。')
...

My questions:
1. Where do the ') , ( in the output above come from?
2. Why did couldn't turn into couldn' ?
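For the first question, a minimal sketch may help. It runs the pattern from the post against a small made-up HTML snippet (the real page markup is not shown in the thread, so `sample_html` is an assumption). When a pattern has more than one capture group, `re.findall` returns one tuple per match, and a group that did not take part in the match comes back as an empty string. `print(item)` then shows the tuple's repr, which is where the quotes, commas and parentheses come from.

```python
import re

# Hypothetical markup: the real page's HTML is not shown in the post.
sample_html = (
    '<p class="line_en" >It just couldn\'t be.</p>\n'
    '<p class="line_cn" title="绝对不可能。"></p>\n'
)

# Same pattern as in the post: two capture groups joined by "|".
pattern = r'line_en\" \>(.*)<|line_cn\" title=\"(.*)\"'

# With more than one group, findall returns one tuple per match;
# the group that did not participate in the match is ''.
items = re.findall(pattern, sample_html)

# Printing a tuple shows its repr (quotes, commas, parentheses).
# Picking the non-empty group yields the bare text instead.
lines = [en or cn for en, cn in items]
for line in lines:
    print(line)
```

Picking the non-empty element with `en or cn` is only one option; a single group with an alternation inside the prefix would also avoid the tuples.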


blackantt


#2, posted 2021-04-21 11:24

Figured it out: I need to use the replace function to do the substitution.
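The reply does not say what was replaced, so the following is only a sketch of how `str.replace` could be applied to each matched string. The `&#39;` entity and the `clean` helper are assumptions made for illustration, not something confirmed by the thread.

```python
# Hypothetical findall results; the '&#39;' entity is an assumed example
# and is not taken from the original post.
items = [("It just couldn&#39;t be.", ''), ('', '绝对不可能。')]

def clean(text):
    # str.replace returns a new string; the original is left unchanged.
    return text.replace('&#39;', "'")

# Take the non-empty group of each tuple, then clean it.
cleaned = [clean(en or cn) for en, cn in items]
for line in cleaned:
    print(line)
```

If the unwanted text really is HTML entities, `html.unescape` from the standard library handles them in general, without one `replace` call per entity.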
