Web Scraping

Saving structured data from the browser console to a file (code found online)

(function (console) {
    // Adds console.save(data, filename): downloads `data` as a file.
    console.save = function (data, filename) {
        if (!data) {
            console.error('Console.save: No data')
            return;
        }
        if (!filename) filename = 'console.json'
        // Objects are serialized as JSON with 4-space indentation.
        if (typeof data === "object") {
            data = JSON.stringify(data, undefined, 4)
        }
        var blob = new Blob([data], { type: 'text/json' }),
            e = document.createEvent('MouseEvents'),
            a = document.createElement('a')
        a.download = filename
        a.href = window.URL.createObjectURL(blob)
        a.dataset.downloadurl = ['text/json', a.download, a.href].join(':')
        // Simulate a click on the hidden link to trigger the download.
        e.initMouseEvent('click', true, false, window, 0, 0, 0, 0, 0, false, false, false, false, 0, null)
        a.dispatchEvent(e)
    }
})(console)

// Usage: console.save(obj)

Encoding conversion when scraping with Python

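import requests
from bs4 import BeautifulSoup

# Placeholder URL; replace with the site to scrape.
website = "https://example.com"
r = requests.get(website)

# Detect the page's encoding and convert the content to UTF-8.
# requests reports ISO-8859-1 when the response header declares no charset,
# so in that case look for a charset declared in the HTML itself.
if r.encoding == 'ISO-8859-1':
    encodings = requests.utils.get_encodings_from_content(r.text)
    if encodings:
        encoding = encodings[0]
    else:
        encoding = r.apparent_encoding
else:
    # r.encoding can be None; fall back to the detected encoding.
    encoding = r.encoding or r.apparent_encoding
encode_content = r.content.decode(
    encoding, 'replace').encode('utf-8', 'replace')
# The bytes are now UTF-8, so tell BeautifulSoup not to trust the page's own <meta> charset.
soup = BeautifulSoup(encode_content, features="html.parser", from_encoding="utf-8")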
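
The fallback order used above (the charset from the response header, then the charset declared in the HTML, then a guess) can be tried out without any network request. The following is only a minimal sketch using the standard library: `pick_encoding` and `to_utf8` are hypothetical helper names, the regular expression is a simplified stand-in for what `requests.utils.get_encodings_from_content` looks for, and the final fallback is a fixed default rather than the statistical guess that `r.apparent_encoding` makes.

```python
import re

# Simplified pattern (an assumption, not the one requests uses) covering
# <meta charset="..."> and <meta ... content="text/html; charset=...">.
_META_CHARSET = re.compile(rb'<meta[^>]+charset=["\']?([\w-]+)', re.IGNORECASE)


def pick_encoding(header_encoding, raw, fallback="utf-8"):
    """Choose an encoding for raw HTML bytes (hypothetical helper)."""
    # requests reports ISO-8859-1 for text/* responses whose Content-Type
    # header has no charset, so treat it (and a missing value) as unknown.
    if header_encoding and header_encoding.upper() != "ISO-8859-1":
        return header_encoding
    match = _META_CHARSET.search(raw)
    if match:
        return match.group(1).decode("ascii")
    return fallback


def to_utf8(header_encoding, raw):
    """Decode with the chosen encoding and re-encode as UTF-8 bytes."""
    encoding = pick_encoding(header_encoding, raw)
    return raw.decode(encoding, "replace").encode("utf-8", "replace")


# A GBK-encoded page served without a charset in the header.
page = '<html><head><meta charset="gbk"></head><body>爬虫</body></html>'.encode("gbk")
print(pick_encoding("ISO-8859-1", page))
print(to_utf8("ISO-8859-1", page).decode("utf-8"))
```

Keeping the decision in a small function makes it easy to check against saved pages before pointing the scraper at a live site.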