Coders Rescue NASA's Earth Science Data.

Thread replies: 37
Thread images: 1

File: BerkeleyDataTA.jpg (249KB, 1164x873px)
https://www.wired.com/2017/02/diehard-coders-just-saved-nasas-earth-science-data/

ON SATURDAY MORNING, the white stone buildings on UC Berkeley’s campus radiated with unfiltered sunshine. The sky was blue, the campanile was chiming. But instead of enjoying the beautiful day, 200 adults had willingly sardined themselves into a fluorescent-lit room in the bowels of Doe Library to rescue federal climate data.

Like similar groups across the country, in more than 20 cities, they believe that the Trump administration might want to disappear this data down a memory hole. So these hackers, scientists, and students are collecting it to save outside government servers.

But now they’re going even further. Groups like DataRefuge and the Environmental Data and Governance Initiative, which organized the Berkeley hackathon to collect data from NASA’s earth sciences programs and the Department of Energy, are doing more than archiving. Diehard coders are building robust systems to monitor ongoing changes to government websites. And they’re keeping track of what’s already been removed, because yes, the pruning has already begun.

Tag It, Bag It

The data collection is methodical, mostly. About half the group immediately sets web crawlers on easily-copied government pages, sending their text to the Internet Archive, a digital library made up of hundreds of billions of snapshots of webpages. They tag more data-intensive projects—pages with lots of links, databases, and interactive graphics—for the other group. Called “baggers,” these coders write custom scripts to scrape complicated data sets from the sprawling, patched-together federal websites.

It’s not easy. “All these systems were written piecemeal over the course of 30 years. There’s no coherent philosophy to providing data on these websites,” says Daniel Roesler, chief technology officer at UtilityAPI and one of the volunteer guides for the Berkeley bagger group.

<cont...>
>>
>>111490
One coder who goes by Tek ran into a wall trying to download multi-satellite precipitation data from NASA’s Goddard Space Flight Center. Starting in August, access to Goddard Earth Science Data required a login. But with a bit of totally legal digging around the site (DataRefuge prohibits outright hacking), Tek found a buried link to the old FTP server. He clicked and started downloading. By the end of the day he had data for all of 2016 and some of 2015. It would take at least another 24 hours to finish.

The non-coders hit dead-ends too. Throughout the morning they racked up “404 Page not found” errors across NASA’s Earth Observing System website. And they more than once ran across databases that had already been emptied out, like the Global Change Data Center’s reports archive and one of NASA’s atmospheric CO2 datasets.

And this is where the real problem lies. They can’t be sure when this data disappeared (or if anyone backed it up first). Scientists who understand it better will have to go back and take a look. But meantime, DataRefuge and EDGI understand that they need to be monitoring those changes and deletions. That’s more work than a human could do.

So they’re building software that can do it automatically.
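The article does not say how the monitoring software works. As a rough sketch of the idea only, one could compare two snapshots of a site and report which pages were removed, added, or changed. The function names and the snapshot shape (a dict mapping URL to page text) are assumptions made for illustration, not anything DataRefuge or EDGI is known to use.

```python
import hashlib
from typing import Dict, List


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the page text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compare_snapshots(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two hypothetical snapshots (URL -> page text).

    Reports URLs that disappeared, URLs that are new, and URLs whose
    content fingerprint differs between the two snapshots.
    """
    old_urls, new_urls = set(old), set(new)
    changed = sorted(
        url for url in old_urls & new_urls
        if fingerprint(old[url]) != fingerprint(new[url])
    )
    return {
        "removed": sorted(old_urls - new_urls),
        "added": sorted(new_urls - old_urls),
        "changed": changed,
    }
```

Calling `compare_snapshots(last_week, today)` on two such dicts gives a machine-checkable list of deletions, which is the part the article says no human could keep up with by hand.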

Future Farming

Later that afternoon, two dozen or so of the most advanced software builders gathered around whiteboards, sketching out tools they’ll need. They worked out filters to separate mundane updates from major shake-ups, and explored blockchain-like systems to build auditable ledgers of alterations. Basically it’s an issue of what engineers call version control—how do you know if something has changed? How do you know if you have the latest? How do you keep track of the old stuff?
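The whiteboard designs are not described in any detail, so the following is only a guess at what the two ideas could look like in miniature: a similarity filter that labels a change "minor" or "major", and a hash-chained log in which each entry commits to the one before it, so that later tampering is detectable. The 0.8 threshold, the entry fields, and all function names are assumptions for illustration.

```python
import hashlib
import json
from difflib import SequenceMatcher
from typing import Dict, List

GENESIS_HASH = "0" * 64          # placeholder "previous hash" for the first entry
MAJOR_CHANGE_THRESHOLD = 0.8     # assumed cutoff; a real tool would tune this


def classify_change(old_text: str, new_text: str,
                    threshold: float = MAJOR_CHANGE_THRESHOLD) -> str:
    """Label a change 'minor' if the two texts are still mostly similar."""
    similarity = SequenceMatcher(None, old_text, new_text).ratio()
    return "minor" if similarity >= threshold else "major"


def _entry_hash(url: str, kind: str, prev_hash: str) -> str:
    """Hash the entry body in a stable (sorted-key) JSON encoding."""
    body = {"url": url, "kind": kind, "prev_hash": prev_hash}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(ledger: List[Dict[str, str]], url: str, kind: str) -> Dict[str, str]:
    """Append a record that commits to the hash of the previous record."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    entry = {"url": url, "kind": kind, "prev_hash": prev_hash,
             "hash": _entry_hash(url, kind, prev_hash)}
    ledger.append(entry)
    return entry


def verify_ledger(ledger: List[Dict[str, str]]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in ledger:
        expected = _entry_hash(entry["url"], entry["kind"], entry["prev_hash"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True
```

This is a plain hash chain, not a blockchain: there is no consensus or distribution, only the property that rewriting an old entry invalidates everything after it.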

<cont...>
>>
>>111493
There wasn’t enough time for anyone to start actually writing code, but a handful of volunteers signed on to build out tools. That’s where DataRefuge and EDGI organizers really envision their movement going—a vast decentralized network from all 50 states and Canada. Some volunteers can code tracking software from home. And others can simply archive a little bit every day.

By the end of the day, the group had collectively loaded 8,404 NASA and DOE webpages onto the Internet Archive, effectively covering the entirety of NASA’s earth science efforts. They’d also built backdoors in to download 25 gigabytes from 101 public datasets, and were expecting even more to come in as scripts on some of the larger datasets (like Tek’s) finished running. But even as they celebrated over pints of beer at a pub on Euclid Street, the mood was somber.

There was still so much work to do. “Climate change data is just the tip of the iceberg,” says Eric Kansa, an anthropologist who manages archaeological data archiving for the non-profit group Open Context. “There are a huge number of other datasets being threatened with cultural, historical, sociological information.” A panicked friend at the National Parks Service had tipped him off to a huge data portal that contains everything from park visitation stats to GIS boundaries to inventories of species. While he sat at the bar, his computer ran scripts to pull out a list of everything in the portal. When it’s done, he’ll start working his way through each quirky dataset.
>>
>>111490
>The pruning has begun.....

The Truth must be making certain people nervous. Department Of Truth (Orwell) anyone?
>>
So a bunch of people got together to download shit from websites. How exciting.

Also, do they understand how FOIA works?
>>
>>111668
I live in Canada, our previous Conservative government tried to do this shit.

The objective is to control information. Instead of making it easily accessible, you have to force people to go through a long FOIA process to get the data, which can be heavily redacted for political reasons.

This is a major end-run around the Trump regime, they can't pull that bullshit now because alternative servers can just be set up to distribute the data.
>>
>>111490
>epicenter of violent fascist "antifascist" protest
>now the beacon of easily marketed hope for liberal Climate Change(TM)

All they need to do now is convert a dorm building into housing for 20+ old "juvenile refugees" and they'll have the mainstream-simp trifecta going on.
>>
>>111744
Why are you supporting government censorship?
>>
>>111749
That's not censorship.

Actively attempting to destroy the data, punishing anyone who trades or possesses it, and making it a crime to talk about or mention the data or what's in it would be censorship.

Not storing/funding the data =/= censorship. Sort of like how not paying for birth control=/= denying someone's right to birth control.

You people need to stop being so hyperbolic, you're starting to make the Tea Party look well adjusted.
>>
>>111761
Actually, denying access to data that has been openly available in the past and that will be stored regardless (one would hope) is about as close to censorship as you can get in the U.S.
Also, where exactly is the line between "not storing the data" (data can't be "funded") and " actively trying to destroy the data"?

What's more, people actually do risk punishment for trying to talk about it: if the only ones who can access the data are govt. employees, and they are ordered not to speak publicly about their work (which is the case), the only people in a position to talk about it are under threat of punishment in the form of dismissal.

Do you realize the extreme double standard you've got going here? If Obama had ever attempted anything half as brazen as this in relation to an issue dear to conservatives, the outcry on the right would have been massive and you know it.
>>
>>111763
>denying access to data that has been openly available in the past and that will be stored regardless (one would hope) is about as close to censorship as you can get in the U.S.
Except that's not censorship, anymore than YouTube refusing to host a video it doesn't want to host is censorship. The data is still freely available.

>Also, where exactly is the line between "not storing the data" (data can't be "funded")
Yes it can, seeing as how it needs things like storage space and electricity to even be accessible. You're

>and " actively trying to destroy the data"?

>What's more, people actually do risk punishment for trying to talk about it: if the only ones who can access the data are govt. employees,
Which they're not.

>and they are ordered not to speak publicly about their work (which is the case), the only people in a position to talk about it are under threat of punishment in the form of dismissal.
They're not the only people in a position to talk about it, and not talking about it while on the job doesn't mean you can't talk about it. It means you can't use your position to influence the topic. Sort of like how businesses might not want you going to a protest in your work uniform. Simple concept to grasp. Refusing to do so doesn't make it go away.

>Do you realize the extreme double standard you've got going here?
Do you realize you're being a hyperbolic retard here?

>If Obama had ever attempted anything half as brazen as this in relation to an issue dear to conservatives, the outcry on the right would have been massive and you know it.
You mean like ordering a CDC study on firearms and then immediately quashing the results of the study when they didn't support his predetermined conclusions, wasting shitloads of taxpayer money?
The outcry on the right WAS massive, and you all swept it under the rug because Dear leader Obama could do no wrong.

Why are you proud to be a useful idiot?
>>
>>111770
>anymore than youtube
Stopped reading there. The government is not a private corporation and must not behave like one.
>>
>>111787
Actually I kept reading after all. Let me make a few points and try to avoid getting caught in my own bias.

First, I have no interest in defending Obama's every action. I don't think he was the greates president ever, he was anti-press and in many other ways stood in the way of government transparency. He abused the espionage act and had a demonstrably ineffective to downright atrocious take on foreign policy. He got sucked into the political correctness/tranny bathrooms bullshit despite his academic aspirations. The guns study was obviously a bad call and he should have owned the results. All this notwithstanding, he was a much more capable politician than Trump could ever hope to be. That much should be obvious by this point in Trump's administration.

Now to the actual topic:
You seem to be arguing on multiple conflicting levels. On one hand, you say the data is freely available and there is no problem. On the other hand you say the government is and should be free to "drop" the data from its servers altogether, ostensibly in the interest of fiscal responsibility. Which is it? Are they trying to make the data disappear, at least from public view, or is it still publicly available for the foreseeable future? Do you acknowledge that making previously acquired data disappear is a very questionable policy decision, no matter what we're talking about?
>They're not the only people in a position to talk about it, and not talking about it while on the job doesn't mean you can't talk about it. It means you can't use your position to influence the topic.
If the data is not available to the public anymore (which is clearly the administration's goal, although they're on track to miss it), who else but the people inside the administration can access it? Without the efforts outlined in the OP, I can't think of anyone. And you seem to be attacking those efforts' merit, although I can't see exactly what your argument against it is.
cont.
>>
>>111787
>Stopped reading there.
So you admit defeat.

>The government is not a private corporation and must not behave like one
It's not behaving like a corporation. It's behaving like any other institution that self regulates.

Again, if you're going to continue being a hyperbolic retard, you're going to continue driving people away from your "cause".
>>
>>111799
>Actually I kept reading after all. Let me make a few points and try to avoid getting caught in my own bias.
Way too late for that.

>On one hand, you say the data is freely available and there is no problem.
>On the other hand you say the government is and should be free to "drop" the data from its servers altogether, ostensibly in the interest of fiscal responsibility.
>Which is it?
Both, because they're not mutually exclusive concepts.

Moreover, you want to see "conflicting arguments", so now you're going back and trying to find/create evidence that supports this conclusion. Stop thinking reductively.

>If the data is not available to the public anymore (which is clearly the administration's goal, although they're on track to miss it)
No it is not their goal. You're just outright lying and poisoning the well now. They're not supporting it with public funds and infrastructure. This is perfectly valid, and does not impede access to otherwise still freely available data.

>cont.
Don't bother. You can't stop being a hyperbolic retard and I've already fallen into the trap of having to repeat the same argument to you over and over to someone who keeps thinking they can invent something new to explain criticism away.
>>
>>111761
>It's not censorship, we're just making the public jump through innumerable bureaucratic hoops just so they can access data that they paid for with their own tax dollars!
Like I said, this has been tried before in my country. Forcing people through FOIA requests allows political staffers to censor documents released by redacting them for bullshit political reasons.

>But muh server costs!
The government has a fuck ton of servers and is going to pay for the bandwidth and electricity regardless of what is being hosted on the servers.
>>
Also, there's literally nothing to stop the government from deleting the data once no one is looking.

It's been done in the past by other governments that want to skew climate science to serve the big oil agenda:

http://www.macleans.ca/news/canada/vanishing-canada-why-were-all-losers-in-ottawas-war-on-data/
>>
>>111799
cont.
Moreover, how exactly do you expect those government employees to talk about the data without implicitly "using their position to influence the topic"? You know as well as I that access to the data is inextricably linked to their jobs. They can't help but "use their position" to make the data and their professional opinion - which they were originally being paid to formulate - public.

The fundamental issue at hand here is that politics and science are being conflated. I firmly believe that scientists working for government agencies should be free to research their area of expertise and publicize the results, whatever they might be. The fact is that the results of NASA's climate studies are diametrically incompatible with the administration's political stances. I can't see any reason besides that for defunding NASA's earth sciences division. If climate change actually wasn't real, or wasn't anthropogenic, shouldn't they be able to find some unbiased researchers to confirm that fact scientifically? Trying to stifle the debate isn't making them look very good or innocent of political motive at all.

>>111802
>They're not supporting it with public funds and infrastructure. This is perfectly valid, and does not impede access to otherwise still freely available data.
If they are the only ones hosting the data, deleting/"defunding"/dropping it, definitely does impede access to it and to assert anything else is mental gymnastics at its best.

And please, for fuck's sake, stop throwing around the phrase "hyperbolic retard". I was not being hyperbolic at any point in my posts and you're being a dick for no reason.
>>
Fuck this climate conspiracy shit. I hope Trump destroys all the servers on this hoax and Pruitt permanently cripples the EPA. Maybe then, politicians in the future will get the right idea and stop investing in this waste of money.
>>
>>111809
>Trying to stifle the debate isn't making them look very good or innocent of political motive at all.
Totally. The Harper government up here tried to do it and it came around to bite them in the end. People became disgusted with the petty authoritarianism and neurotic secrecy. Furthermore climate data is used for more than just "ebul liberal propaganda," many businesses (shipping, agriculture, software development) rely on accurate climate data to plan for future investments and development and the Conservative restrictions made doing business up here more difficult. It made people who never voted angry at the government, and they turned out in 2015 to elect the most recognizable face that wasn't Conservative.

People who are part of the resistance should start filing these FOIA requests, it's a good way to expose the government's unAmerican secrecy
>>
>>111814
>Totally. The Harper government up here tried to do it and it came around to bite them in the end.
More like liberal lies brought him down. People will believe anything nasty about the guy in charge, and left are better liars than the right.

>People became disgusted with the petty authoritarianism and neurotic secrecy.
No, that's the natural state of human societies. Besides, it's not like FDR, Johnson, JFK, and Obama, all "liberals," weren't some of the most authoritarian and secretive world leaders in history.

>Furthermore climate data is used for more than just "ebul liberal propaganda," many businesses
That's pretty much all it's used for. It's literally politicized science designed to line the pockets of liberal elites like Soros while stomping down on hard working conservatives like the Kochs. It is purely partisan, and a huge money sink. Also, big green is a big scam and will never be as good as oil.

>(shipping, agriculture, software development) rely on accurate climate data to plan for future investments and development and the Conservative restrictions made doing business up here more difficult.
Then they should fund their own research if they're so worried about it, but they don't because they know it's a waste of money.

>It made people who never voted angry at the government, and they turned out in 2015 to elect the most recognizable face that wasn't Conservative.
It's because swing voters will believe anything negative about their current leader. Independents are morons and will blame shit like a stubbed toe on the president.

>People who are part of the resistance should start filing these FOIA requests, it's a good way to expose the government's unAmerican secrecy
Haha, yeah. Let's see how you can FOIA this when we pull the plug on the data servers. These servers take money, our money, and that money can be better spent elsewhere.
>>
>>111818
>we
There's that word again. Who are your allies that you're planning on unplugging those servers with?
Also,
>Kochs
>hard working conservatives [as opposed to liberal megadonors]
Yeah man right on, keep drinking that kool aid.
>>
>>111818
tl;dr: Newfag anons are literally advocating government censorship of public data

>More like liberal lies brought him down. People will believe anything nasty about the guy in charge, and left are better liars than the right.
[citation needed]

>No, that's the natural state of human societies. Besides, it's not like FDR, Johnson, JFK, and Obama, all "liberals," weren't some of the most authoritarian and secretive
People don't like being misled and lied to.

>That's pretty much all it's used for.
This is a lie. I've already proved that it's used for more than just climate science, which you admit in the next sentence:

>Then they should fund their own research if they're so worried about it, but they don't because they know it's a waste of money.
They don't because it's ridiculously expensive to gather and governments have done this work for public safety (aeronautics, shipping) and national security (programming) reasons.

Why do you want to make air travel and shipping more dangerous to Americans who use and work on planes and vessels?

>It's because swing voters will believe anything negative about their current leader. Independents are morons and will blame shit like a stubbed toe on the president.
Insulting the average voter is a surefire way to re-election, yes sir

>Haha, yeah. Let's see how you can FOIA this when we pull the plug on the data servers. These servers take money, our money, and that money can be better spent elsewhere.
The data still exists, and even if it isn't connected to the Internet the U.S. public has a RIGHT to access it as it is public property, paid for by tax dollars
>>
>>111821
>>Kochs
>>hard working conservatives [as opposed to liberal megadonors]
>Yeah man right on, keep drinking that kool aid.
Don't the Kochs earn most of their money through government contracts anyways? I know that a lot of their business interests deal in road construction.
>>
>>111490
Tell me when they start making publicly funded research freely available to the public. Damn journal paywalls are ridiculous.
>>
Anyway, anyone actually concerned with the government censoring public data for political reasons should get themselves acquainted with the Electronic Frontier Foundation, one of the most consistently pro-liberty organizations fighting for digital freedom
>https://www.eff.org/

Phone calls and letter writing to all levels of government are also needed to put the brakes on this type of shit. Representatives listen to who talks to them. The best letter communicates quite clearly that you have voted for that representative before, but you can't in good conscience vote for him again if your congressman or senator votes for anti-freedom legislation or defends censorship on the Internet.
>>
>>111862
Never, since corporate society wants good research but doesn't want to pay for it with 0.001% of their corporate profits.
>>
>>111803
On top of that, the WIDOA (at least) uses shitty software (Oracle) that none of the employees like, but which costs twice as much or more than software they do like. Kickbacks.
>>
>>111490
Government is waged by governors (limiters). They drag down and hold back society because it's filled with liars (psychopaths) who make rules they don't believe apply to themselves.
>>
>>111812
Maybe Anon should stop investing in this waste of server memory.
>>
>>111818
Wew laddie.

But
>left are better liars than the right

I hope this is just a troll attempt. Conservatives wouldn't have a voting demographic if they didn't lie their tits off to uneducated hicks, who would directly benefit from liberal policy. I'm in no way defending the likes of Hillary Rottenbottom, but conservatives literally must lie about everything they stand for to get a decent amount of votes.
>>
>>111951
>denying that the left is the party of dishonesty

This is why it's pointless to talk politics with a lefty. You will not admit to any faults within your party. The closest you will get is to say, "yes, we're all bad together." Most of the time, a lefty won't even ADMIT they're a leftist to begin with.

You're nuts, go die in a volcano.
>>
>>111919
Since all of the data is now in the hands of private citizens, hopefully they can use this opportunity to reform whatever they can to make the data more usable (if that is even possible)
>>
>>112145
Man, who knew the militant right could get triggered by words so easily?
>>
>>112152
It's not about getting triggered.

Your progressive opinions are garbage.
>>
>>112118
I never denied it. Neoliberalism is cancer, because it's founded in high places. Liberal populism is the only political philosophy that actually benefits the worker.
>>
>>112197
Are you trying to muddy the waters?

>>112181

>I never denied it
>but I deny being a neoliberal
>I'm a REAL liberal

LOL/ROFL


This is a 4chan archive - all of the content originated from that site.
This means that RandomArchive shows their content, archived.