Welcome to 16892 Developer Community-Open, Learning,Share

I need the body text of several articles on a page. Each article body is written in a section tag that contains one p tag per paragraph, like this:

<section class="...">
 <div>...</div>
 <figure>...</figure>
 <p id='...' class='...'></p>
 <p id='...' class='...'></p>
 <p id='...' class='...'></p>
</section>

<section class="...">
 <div>...</div>
 <figure>...</figure>
 <p id='...' class='...'></p>
 <p id='...' class='...'></p>
 <p id='...' class='...'></p>
</section>

If I use the code below:

import requests
import re
from bs4 import BeautifulSoup

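r = requests.get('url')
# Parse the response body; without this line `soup` is undefined
soup = BeautifulSoup(r.text, 'html.parser')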

all_bodies = soup.find_all('section')
for i in range(len(all_bodies)):
    print(all_bodies[i])

It returns the complete content of each section. If I pass the p tag to find_all instead, it returns every p tag as a separate element of the list, but I want all the p tags of a section together in one list element.



1 Answer

Add an inner loop that finds all <p> tags inside each section:

for i in all_bodies:
    for p in i.find_all('p'):
        print(p)

Or, as an alternative, use a CSS selector to avoid the additional loop:

for p in soup.select('section p'):
    print(p) 
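
Both snippets above still handle the <p> tags one at a time. If the goal is one list element per section, as the question asks, the result of the inner find_all can be collected per section instead of printed. A minimal sketch, assuming markup shaped like the one in the question; the sample HTML, its texts, and the name paragraphs_per_section are made up for illustration, and 'html.parser' is used only so that no extra parser has to be installed:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modelled on the question; the texts are invented.
html = '''
<section class="a">
 <div>header</div>
 <p>first</p>
 <p>second</p>
</section>
<section class="b">
 <figure>image</figure>
 <p>third</p>
</section>
'''

soup = BeautifulSoup(html, 'html.parser')

# One inner list per <section>, each holding only that section's <p> tags.
paragraphs_per_section = [
    section.find_all('p') for section in soup.find_all('section')
]

for paragraphs in paragraphs_per_section:
    print(paragraphs)
```

Each element of paragraphs_per_section is the list of <p> tags belonging to one section, so the <div> and <figure> siblings are left out.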

Example with the additional for loop:

from bs4 import BeautifulSoup

html = '''
<section class="...">
 <div>...</div>
 <figure>...</figure>
 <p id='...' class='...'></p>
 <p id='...' class='...'></p>
 <p id='...' class='...'></p>
</section>

<section class="...">
 <div>...</div>
 <figure>...</figure>
 <p id='...' class='...'></p>
 <p id='...' class='...'></p>
 <p id='...' class='...'></p>
</section>
'''
soup = BeautifulSoup(html, 'lxml')

all_bodies = soup.find_all('section')

for i in all_bodies:
    for p in i.find_all('p'):
        print(p)

Output

<p class="..." id="..."></p>
<p class="..." id="..."></p>
<p class="..." id="..."></p>
<p class="..." id="..."></p>
<p class="..." id="..."></p>
<p class="..." id="..."></p>
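
The sample paragraphs here are empty, so the output shows only the tags. For real articles the text is usually what matters. A rough sketch, assuming each section's paragraphs should be merged into a single string per article; the sample HTML and the name article_bodies are hypothetical, and the newline separator is just one possible choice:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; real pages will carry their own ids and classes.
html = '''
<section>
 <div>ignored</div>
 <p>Alpha one.</p>
 <p>Alpha two.</p>
</section>
<section>
 <p>Beta one.</p>
</section>
'''

soup = BeautifulSoup(html, 'html.parser')

article_bodies = []
for section in soup.find_all('section'):
    # Text of every <p> in this section, with surrounding whitespace stripped.
    texts = [p.get_text(strip=True) for p in section.find_all('p')]
    # Join the paragraphs so each article becomes one list element.
    article_bodies.append('\n'.join(texts))

print(article_bodies)
```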
