[Python] - Web Crawling: Parsing/Download (beautifulsoup4 Example)

1984 2022. 9. 1. 16:33

import os
from bs4 import BeautifulSoup
from urllib import request
from urllib.error import HTTPError

# 1) os: work with the download directory
# 2) bs4 (beautifulsoup4): parse the HTML
# 3) urllib.request / urllib.error: download files and handle HTTP errors

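# Download archive page. It is not fetched by this script; main() parses the local file HTML_PATH.
ARCHIVE_URL = "https://www.atlassian.com/software/confluence/download-archives"
BINARY_URL = "https://www.atlassian.com/software/confluence/downloads/binary/"
HTML_PATH = "C:/test/html.txt"
DOWNLOAD_ROOT = "C:/workspace/"


def get_download(url, fname, directory):
    """Download url to directory/fname, creating the directory if needed."""
    os.makedirs(directory, exist_ok=True)
    try:
        request.urlretrieve(url, os.path.join(directory, fname))
        print('Download complete: ' + fname + '\n')
    except HTTPError as e:
        # Report the failed URL and keep going with the next version.
        print('error', e.code, url)


def main():
    # Parse the HTML file and close it as soon as parsing is done.
    with open(HTML_PATH, "r", encoding='UTF-8') as file:
        soup = BeautifulSoup(file, "html.parser")

    # <a> tags whose class attribute is exactly "product-versions accordion"
    getA = soup.find_all('a', "product-versions accordion")

    # Collect the versions in a set so that duplicates are dropped.
    versions = set()
    for getLink in getA:
        data = getLink.get("data-version")
        if data:  # skip tags that have no data-version attribute
            versions.add(data)

    for data in sorted(versions):
        # e.g. https://www.atlassian.com/software/confluence/downloads/binary/atlassian-confluence-7.19.1-x64.exe
        downloadFileName = "atlassian-confluence-" + data + "-x64.exe"
        downloadURL = BINARY_URL + downloadFileName
        # Group the files by major version, e.g. C:/workspace/7
        downloadDir = DOWNLOAD_ROOT + data.split('.')[0]
        get_download(downloadURL, downloadFileName, downloadDir)

    print(versions)


if __name__ == "__main__":
    main()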
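
The loop in main() builds downloadURL, downloadFileName, and downloadDir from each version string by concatenation. As a rough sketch, that step can be pulled into a small function that is easy to check without touching the network. The names build_target and version_key are hypothetical helpers introduced here for illustration and are not part of the original script. The URL pattern and the C:/workspace root are taken from the script above. The numeric sort is an optional extra, assuming version strings are dot-separated like 7.19.1.

```python
BINARY_URL = "https://www.atlassian.com/software/confluence/downloads/binary/"


def build_target(version, root="C:/workspace"):
    """Return (url, filename, directory) for one version string.

    Hypothetical helper: mirrors the string concatenation done in main().
    """
    filename = "atlassian-confluence-" + version + "-x64.exe"
    url = BINARY_URL + filename
    # Files are grouped by major version, e.g. "7.19.1" -> C:/workspace/7
    directory = root + "/" + version.split(".")[0]
    return url, filename, directory


def version_key(version):
    """Sort key comparing numeric parts as integers (hypothetical helper).

    Non-numeric parts sort after numeric ones and are compared as text,
    so mixed input never raises a TypeError.
    """
    return [(0, int(p), "") if p.isdigit() else (1, 0, p)
            for p in version.split(".")]


if __name__ == "__main__":
    # A set drops duplicates; sorting makes the download order predictable.
    for v in sorted({"7.9.0", "7.19.1", "6.15.10", "7.19.1"}, key=version_key):
        print(build_target(v))
```

A plain sorted() on strings would place 7.19.1 before 7.9.0, which is why the sketch compares the parts as integers.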
