[Python] Crawling Press Releases with Python

IT STUDY LOG


roheerumi 2023. 6. 8. 12:52

# source code

import time

import requests
from bs4 import BeautifulSoup

# Output file name ('보도자료' means "press releases")
OUTPUT_PATH = '보도자료.txt'

# Create the output file, or empty it if it already exists
with open(OUTPUT_PATH, 'w', encoding='utf-8'):
    pass

# Pagination: list pages 1 through 13
for page in range(1, 14):
    response = requests.get(f"https://www.*****.or.kr/*****/bbs/i-414/list.do?pageIndex={page}&searchCondition=&pageItm=10")
    soup = BeautifulSoup(response.text, "html.parser")

    # All articles on the page
    articles = soup.select("div.board_cont")
    for article in articles:
        # Only anchors that carry the article id, so the lookup below cannot raise KeyError
        links = article.select("a[data-ntt-sn]")

        # Follow each article link to get the date, title, and body
        for link in links:
            url = "https://www.****.or.kr/*****/bbs/i-414/detail.do?ntt_sn=" + link["data-ntt-sn"]

            response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
            # Separate name so the list-page soup is not overwritten
            detail = BeautifulSoup(response.text, "html.parser")

            subject = detail.select_one("div.subject").text.replace('\t', '').replace('\n', '')
            content = detail.select_one("div.view_cont").text
            date = detail.select_one("ul.board_info_list li").text

            # Append to the text file ('제목' = title, '본문' = body)
            with open(OUTPUT_PATH, 'a', encoding='utf-8') as f:
                f.write(f"{date}\n")
                f.write(f"제목 : {subject}\n")
                f.write(f"본문 : {content}\n")
            # Pause between requests to avoid overloading the server
            time.sleep(0.5)
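
The title cleanup above removes tabs and newlines with chained `replace` calls. That can glue two words together when a line break was the only thing separating them, and it leaves leading and trailing spaces in place. Below is a minimal sketch of an alternative, assuming it is acceptable to collapse every run of whitespace into a single space. `clean_text` is a hypothetical helper name, and the sample string is made up for illustration; neither comes from the original script.

```python
def clean_text(raw: str) -> str:
    """Collapse every run of whitespace (spaces, tabs, newlines) into one space."""
    # str.split() with no arguments splits on any whitespace run and drops
    # leading and trailing whitespace, so joining with " " normalizes the string.
    return " ".join(raw.split())


# Hypothetical raw title, shaped like the .text of an indented <div>.
raw_subject = "\n\t\t  Press release\n\ttitle  \n"
print(clean_text(raw_subject))
```

In the script this could wrap the title lookup, e.g. `clean_text(soup.select_one("div.subject").text)`. BeautifulSoup's `get_text(" ", strip=True)` is another way to get a similar result without a helper.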

# result

License: Attribution, NonCommercial, NoDerivatives
