
So I'm writing a script that parses sites and writes retrieved data to a CSV file



File: Screenshot (19).png (62KB, 1360x765px)
So I'm writing a script that parses sites and writes the retrieved data to a CSV file.
Here is the code:
from bs4 import BeautifulSoup
import requests
import csv

r = requests.get('http://www.mediadata.it/en/aziende-comunicatori/elenco/{}/')
data = r.text
soup = BeautifulSoup(data, "html.parser")

with open('mbsmediadata.csv', 'w') as csvfile:
    fieldnames = ['nome', 'responsabili', 'email', 'posizione']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    # NOTE: zip() here is given a single find_all but four names are unpacked,
    # and writeheader() runs on every iteration, so the header repeats per row
    for i, j, z, y in zip(soup.find_all('h5', attrs={'class': 'ng-binding'})):
        writer.writeheader()
        writer.writerow({'nome': i.text, 'responsabili': j.text, 'email': z.text, 'posizione': y.text})

but the output format is shit tier. I've tried reading a lot of documentation and previous questions, but even though .format() doesn't throw syntax errors, it doesn't actually format anything.
The second issue is that the fieldnames end up written in every row, and Google Sheets only imports those fieldnames.
Do you know how to fix this?

pic related, it's the shitty format output
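For the repeated-fieldnames problem, the usual fix is to call writer.writeheader() once, before the loop, so the header row is written a single time. A minimal sketch with the same fieldnames; the dummy row below just stands in for whatever the scraping loop actually yields:

import csv

# dummy data standing in for the scraped tuples -- not real output
rows = [('Acme', 'Mario Rossi', 'mario@example.com', 'CEO')]

with open('mbsmediadata.csv', 'w', newline='') as csvfile:
    fieldnames = ['nome', 'responsabili', 'email', 'posizione']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()  # called once, before the loop, so the header appears only in the first row
    for nome, resp, email, pos in rows:
        writer.writerow({'nome': nome, 'responsabili': resp, 'email': email, 'posizione': pos})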
>>
>>62279848
this is the retarded way of doing it
Read the page source and find the actual data source. The page is Angular, so there's obviously some sort of REST endpoint providing the data. Find the endpoint, scrape the endpoint.
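Roughly like this — the endpoint URL and the JSON keys below are guesses for illustration only; you'd copy the real XHR URL from the Network tab of the browser dev tools while the page loads:

import csv
import requests

# hypothetical endpoint -- NOT the real one; grab the actual request URL from the Network tab
url = 'http://www.mediadata.it/api/aziende-comunicatori'
records = requests.get(url).json()  # Angular apps usually fetch their data as JSON

with open('mbsmediadata.csv', 'w', newline='') as csvfile:
    fieldnames = ['nome', 'responsabili', 'email', 'posizione']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for rec in records:
        # keep only the declared columns; missing keys become empty cells
        writer.writerow({k: rec.get(k, '') for k in fieldnames})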
>>
>>62279893
I pasted the wrong code, bro:
from bs4 import BeautifulSoup
import requests
import csv

r = requests.get('https://www.paginegialle.it/ricerca/pizzerie/Milano?mr=50')
data = r.text
soup = BeautifulSoup(data, "html.parser")

with open('mbsprprova.csv', 'w') as csvfile:
    fieldnames = ['nome', 'indirizzo', 'telefono']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    # zip the three matched node lists so each row pairs a name, an address and a phone
    for i, j, z in zip(soup.find_all('span', attrs={'itemprop': 'name'}),
                       soup.find_all('span', attrs={'class': 'street-address'}),
                       soup.find_all('div', attrs={'class': 'tel elementPhone'})):
        writer.writeheader()  # still inside the loop, so the header repeats per row
        writer.writerow({'nome': i.text, 'indirizzo': j.text, 'telefono': z.text})
>>
I like your chaining solution here, but I'm not sure how you'll fix the address like that.
Pastebin: ZRfd5Kch
>>
here make something of yourself kiddo
from bs4 import BeautifulSoup
import requests

data = requests.get('https://www.paginegialle.it/ricerca/pizzerie/Milano?mr=50')
soup = BeautifulSoup(data.text, "lxml")

businesses = []
# map the CSS class of each address <span> to a nicer field name
mapping = {
    'street-address': 'address',
    'postal-code': 'postcode',
    'locality': 'city',
    'region': 'state'
}

for i, j, z in zip(soup.find_all('span', attrs={'itemprop': 'name'}),
                   soup.find_all('div', attrs={'itemprop': 'address'}),
                   soup.find_all('div', attrs={'class': 'tel elementPhone'})):
    data = {}
    data['name'] = i.text.strip()

    # each address component sits in its own <span>; its first class says which field it is
    for addressfield in j.find_all('span'):
        tomap = str(addressfield.attrs['class'][0])
        data[mapping[tomap]] = addressfield.text.strip()

    # a listing can have several phone numbers separated by commas
    data['telephones'] = [x.strip() for x in z.text.strip().split(',')]
    # print(z.text)
    print(data)
    businesses.append(data)
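And if the goal is still a CSV that Google Sheets imports cleanly, something along these lines should work on top of that businesses list — header written once, phone list flattened into a single cell (the output filename is just an example):

import csv

# assumes `businesses` is the list of dicts built by the loop above
fieldnames = ['name', 'address', 'postcode', 'city', 'state', 'telephones']

with open('pizzerie.csv', 'w', newline='') as csvfile:  # example filename
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()  # single header row
    for b in businesses:
        row = {k: b.get(k, '') for k in fieldnames}
        row['telephones'] = '; '.join(b.get('telephones', []))  # flatten the phone list into one cell
        writer.writerow(row)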