Environment: Python 2.7
Install the lxml module:
pip install lxml
Example:
from lxml import etree
text = '''
<div>
<ul>
<li class="item-0"><a href="link1.html">first item</a></li>
<li class="item-1"><a href="link2.html">second item</a></li>
<li class="item-inactive"><a href="link3.html">third item</a></li>
<li class="item-1"><a href="link4.html">fourth item</a></li>
<li class="item-0"><a href="link5.html">fifth item</a>
</ul>
</div>
'''
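html = etree.HTML(text)  # html is an Element object (printing it shows a memory address)
result = etree.tostring(html)  # serialize back to source; missing tags are filled in, such as the <body> tag in the output
print(result)
Output: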
<html>
<body>
<div>
<ul>
<li class="item-0"><a href="link1.html">first item</a></li>
<li class="item-1"><a href="link2.html">second item</a></li>
<li class="item-inactive"><a href="link3.html">third item</a></li>
<li class="item-1"><a href="link4.html">fourth item</a></li>
<li class="item-0"><a href="link5.html">fifth item</a></li>
</ul>
</div>
</body>
</html>
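The tag completion described above can also be checked programmatically instead of by reading the printed source. The following is a minimal sketch, assuming a shortened, deliberately malformed fragment (the `broken` string is made up for illustration and is not the author's full example):

```python
from lxml import etree

# Hypothetical fragment: the <li> is never closed, and there is no <html>/<body>.
broken = '<div><ul><li class="item-0"><a href="link5.html">fifth item</a></ul></div>'

root = etree.HTML(broken)                      # lenient HTML parser
fixed = etree.tostring(root).decode('utf-8')   # bytes -> text, works on Python 2 and 3

print(root.tag)           # the root element that the parser wrapped around the fragment
print('</li>' in fixed)   # whether the missing closing tag was added
print(fixed)
```

Decoding the result of `etree.tostring` keeps the snippet usable on both Python 2.7 and Python 3, where `tostring` returns bytes by default.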
# Read the content from a file
from lxml import etree
html = etree.parse('hello.html')
result = etree.tostring(html, pretty_print=True)
print(result)
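Note that `etree.parse` uses an XML parser by default, so it may raise `XMLSyntaxError` if the file is not well-formed (for example, if it has an unclosed `<li>`). One possible workaround, sketched here under the assumption that the file contains ordinary HTML, is to pass an `etree.HTMLParser()` explicitly. The temporary file and the `sample` string below are stand-ins for `hello.html`, not the author's actual file:

```python
import os
import tempfile
from lxml import etree

# Hypothetical stand-in for hello.html; the second <li> is left unclosed on purpose.
sample = ('<div><ul>'
          '<li class="item-0"><a href="link1.html">first item</a></li>'
          '<li class="item-1"><a href="link2.html">second item</a>'
          '</ul></div>')

path = os.path.join(tempfile.mkdtemp(), 'hello.html')
with open(path, 'w') as f:
    f.write(sample)

parser = etree.HTMLParser()          # lenient parser that tolerates broken markup
tree = etree.parse(path, parser)     # returns an ElementTree, not a bare Element
items = tree.xpath('//li')

print(tree.getroot().tag)
print(len(items))
```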
Get the li elements:
html = etree.parse('hello.html')
print(type(html))
result = html.xpath('//li')
print(result)
print(len(result))
print(type(result))
print(type(result[0]))
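Beyond selecting whole `li` elements, XPath can also return attribute values and text directly. The sketch below uses a shortened copy of the example markup (three items instead of five, an assumption made to keep it brief) and parses it from a string so that it runs without `hello.html`:

```python
from lxml import etree

text = '''
<div><ul>
<li class="item-0"><a href="link1.html">first item</a></li>
<li class="item-1"><a href="link2.html">second item</a></li>
<li class="item-inactive"><a href="link3.html">third item</a></li>
</ul></div>
'''
html = etree.HTML(text)

classes = html.xpath('//li/@class')    # class attribute of every <li>
links = html.xpath('//li/a/@href')     # href of every <a> directly under an <li>
texts = html.xpath('//li/a/text()')    # text content of those <a> elements
# A predicate in square brackets filters elements by attribute value.
inactive = html.xpath('//li[@class="item-inactive"]/a/text()')

print(classes)
print(links)
print(texts)
print(inactive)
```

Each of these queries returns a list of strings, whereas `//li` alone returns a list of Element objects.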
Reference article: Http://cuiqinGCai.com/2621.html
Note: I wrote this post only for my own study of the lxml module, so it is not carefully written.
Title: Python's lxml module
Link: https://lsjlt.com/news/183943.html (please credit the source link when reposting)