“Python Arabic Web Scraping” Ответ

Python Arabic Web Scraping

import urllib.request,sys,time
from bs4 import BeautifulSoup 
import requests
import pandas as pd

pagesToGet = 1

for page in range(1,pagesToGet+1):
    print('processing page :', page)
    url = 'http://norumors.net/?post_type=rumors/?page=' + str(page)
    print(url)

    #an exception might be thrown, so the code should be in a try-except block
    try:
        #use the browser to get the url. This is suspicious command that might blow up.
        page = requests.get(url)                             # this might throw an exception if something goes wrong.

    except Exception as e:                                   # this describes what to do if an exception is thrown
        error_type, error_obj, error_info = sys.exc_info()      # get the exception information
        print('ERROR FOR LINK:',url)                          #print the link that cause the problem
        print(error_type, 'Line:', error_info.tb_lineno)     #print error info and line that threw the exception
        continue                                              #ignore this page. Abandon this and go back.

    soup = BeautifulSoup(page.text,'html.parser')
    texts = []
    links = []
    filename = "NEWS.csv"
    f = open(filename,"w", encoding = 'utf-8')

    Statement = soup.find("div",attrs={'class':'row d-flex'})
    divs = Statement.find_all("div",attrs={'class':'col-lg-4 col-md-4 col-sm-6 col-xs-6'})

    for div in divs:
        txt = div.find("img",attrs={'class':'rumor__thumb'})
        texts.append(txt['alt'])
        lnk = div.find("a",attrs={'class':'rumor--archive'})
        links.append(lnk['href'])

data = pd.DataFrame(list(zip(texts, links)), columns=['Statement', 'Link'])
data.to_csv(f, encoding='utf-8', index=False)
f.close()

Frightened Fowl

Ответы похожие на “Python Arabic Web Scraping”

Привязки Python 2 для RPM необходимы для этого модуля. Если вам нужна поддержка Python 3, используйте вместо этого модуль `dnf` ansible .. Модуль Python 2 Yum необходим для этого модуля. Если вам нужна поддержка Python 3, используйте вместо этого модуль `dnf`.

Вопросы похожие на “Python Arabic Web Scraping”

Больше похожих ответов на “Python Arabic Web Scraping” по Python

Смотреть популярные ответы по языку

Смотреть другие языки программирования

Shell/Bash

C++

CSS

HTML

Java

JavaScript

Objective-C

PHP

Python

Sql

Swift

Ruby

TypeScript

Kotlin

Assembly

VBA

Scala

Rust

Dart

Elixir

Clojure

Haskell

Matlab

Erlang

Cobol

Fortran

Scheme

Perl

Groovy

Lua

Julia

Delphi

Abap

Lisp

Prolog

Pascal

ActionScript

Basic

Solidity

PowerShell

GDScript

Excel