
Big Data/Data Science General?


Thread replies: 25
Thread images: 7

File: cover.jpg (37KB, 566x744px)
Let's see if we can get something started. Anybody interested in diving into this area? Let's help each other out. I'll be posting the books I'm reading right now. All of them can be found on Library Genesis btw.
>>
File: cover.jpg (84KB, 550x690px)
Anybody working in this field? I wanna hear your stories, how did you get in, self-taught or degree-holder, what techs to learn, etc.
>>
File: thickAss.webm (2MB, 800x770px)
>>58690728
Once and for all:

Is big data a meme?

Is it just some statistics plus operating on data with a programming language?
>>
File: cover.jpg (106KB, 580x727px)
I have a PhD in Cognitive Psychology and tons of experience analyzing experimental data, but now I'm trying to make the jump into Big Data. Using pic related to see where I stand on the stats/math side, nothing new so far.

I have experience with SPSS, Matlab and R, will try to learn Hadoop and all the Hadoop-related platforms.
>>
File: cover.jpg (82KB, 600x738px)
>>58690752
>Is it only some statistic + operationg on data with programming language?

I'm reading pic related as a sort of theoretical intro to Big Data (no stats or programming in the book), and I'd say there's a bit more to it. Lots of database management, which is a lot more complex than just having a huge Excel sheet or even an SQL-type database, due to the size and scale of it.
>>
File: cover.jpg (80KB, 666x1000px)
Pic related is a good read to get an idea of where the field is at the moment. Most essays are rather superficial, but they're good to give you a general idea. One guy argues that the future of Big Data is not just to analyze the data to answer a question, but build a decision-making and execution process on top of it to automate some activity.

The example is GPS units that show you a route (trivial today) vs. self-driving cars, where the GPS is just the base, and on top of that there's live-data processing to actually move the car from A to B.
>>
>>58690800
I am working as a Java Spring/Hibernate monkey right now and I am looking for something worth learning for the future. Big data or AI seems like the best option.
>>
What's the difference between data and big data?
Why does this buzzword exist?
>>
>>58690926
>data
mostly SQL-type databases, where everything is neatly organized, all nicely formatted, ideally you have no missing data

>big data
NoSQL (Not only SQL), which means data is whatever comes in, you don't always have predefined categories. Big data is analyzing video streams together with text, or analyzing YouTube comments for thousands of videos, not stuff you could do with Excel or SQL.
Also, 3 V's: VELOCITY, VOLUME, VARIETY. You get tons of data coming in really fast and it's all kinds of stuff, not a pre-made form like "Name, age, gender, zip code, phone number".
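To make the "neat table vs. whatever comes in" contrast concrete, here's a minimal sketch in Python. Everything in it is made up for illustration (the `people` table, the `raw_events` records, the `field_counts` helper), and it only shows the "variety" part, not volume or velocity. The first half is the tidy SQL case where every row has the same columns; the second half takes JSON records with no agreed set of fields and just counts which fields show up.

```python
import json
import sqlite3

# Hypothetical "plain data": a fixed schema, every row has the same columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER, zip TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Ann", 31, "10001"), ("Bob", 45, "94105")],
)
avg_age = conn.execute("SELECT AVG(age) FROM people").fetchone()[0]

# Hypothetical "variety": records arrive as JSON with no agreed set of fields.
raw_events = [
    '{"type": "comment", "video": "abc", "text": "nice"}',
    '{"type": "view", "video": "abc", "seconds": 42}',
    '{"type": "comment", "video": "xyz", "text": "first", "lang": "en"}',
]


def field_counts(lines):
    """Count how often each field name appears across schemaless records."""
    counts = {}
    for line in lines:
        for key in json.loads(line):
            counts[key] = counts.get(key, 0) + 1
    return counts


counts = field_counts(raw_events)
```

With the table you can ask for an average right away because the schema is known up front. With the loose records you first have to find out what fields exist at all, which is roughly the extra work the "variety" V is pointing at.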

Read the 1st chapter of >>58690800 for a better description.
>>
there are some nice vids about big data on Udacity
>>
nobody here doing big data?
>>
>>58692391
>"doing big data"
Are you?

I work with systems gathering data at the petabyte scale each day. It is hardly big data, just a lot of data.
>>
There is a site with big data exercises, Hadoop, etc.

Does anybody know the website's name?
>>
>>58693229
no, I'm not, but trying to get into it. Tell us your education background, languages/skills to learn, any advice or tips
>>
>>58693619
Here's a tip: stop making inane posts on /g/ and educate yourself. 98% of people on /g/ are idiots with neither education nor jobs.


I have a degree in CS and work with surveillance systems for electricity, water and gas grids. They are wireless sensors connecting to base stations (Ethernet, GPRS, and a few other varieties). Everything gets sent back to us for analysis, then back to the customer in a heavily condensed format. The purpose is to detect damage (leaks, broken equipment, etc.) or generally measure performance.
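
For anyone wondering what "heavily condensed" could mean in practice, here's a toy sketch in Python. It is not the system described above, just an assumed illustration: the sensor ids, the readings, the `condense` and `flag_suspects` helpers, and the expected value and tolerance are all hypothetical. It reduces raw readings to one summary per sensor and flags sensors whose average is far from what you'd expect, e.g. a pipe whose flow dropped.

```python
from statistics import mean


def condense(readings):
    """Reduce raw (sensor_id, value) readings to one summary per sensor."""
    by_sensor = {}
    for sensor_id, value in readings:
        by_sensor.setdefault(sensor_id, []).append(value)
    return {
        sid: {"count": len(vals), "mean": mean(vals), "min": min(vals), "max": max(vals)}
        for sid, vals in by_sensor.items()
    }


def flag_suspects(summary, expected_mean, tolerance):
    """Return ids of sensors whose mean differs from expected_mean by more than tolerance."""
    return sorted(
        sid for sid, s in summary.items()
        if abs(s["mean"] - expected_mean) > tolerance
    )


# Hypothetical readings: (sensor id, measured flow)
readings = [
    ("pipe-1", 5.0), ("pipe-1", 5.2),
    ("pipe-2", 1.1), ("pipe-2", 0.9),
    ("pipe-3", 4.9),
]
summary = condense(readings)
suspects = flag_suspects(summary, expected_mean=5.0, tolerance=1.0)
```

A real pipeline would presumably work on time windows and per-sensor baselines rather than one global expected value, but the shape is the same: lots of raw readings in, a small summary plus a short list of suspects out.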
>>
>>58693716
I am educating myself, working through all of the books I posted in the OP and following posts. Just looking for advice.

What framework do you use to manage your data, is it Hadoop or something of your own?
>>
>>58694168
>framework
Our own
(super old)
>>
>>58690744

Self-taught. Was working in a molecular biology lab at the time, and we had started some big sequencing projects. Nobody in the lab knew any bioinformatics, so I learned how to go through our data myself.

Good thing RStudio makes it so easy for beginners, but too bad it's so bloated. Once I was experienced enough I switched to working within vim completely.
>>
>>58696009
But do you still use R, or have you switched to something better?

That's kind of my thing too, in my psych lab I was doing all the stats and the little bit of programming we had to do (run some experiments on Matlab, minor Python shit), so now I'm trying to capitalize on that and learn a bit more.
>>
>>58696009
What is your background? Degree, etc. I work in a molecular biology lab as an undergraduate, thinking of going into big data genomics. But everyone seems to have a PhD in CS or Math.
>>
>>58696139

I still use R. Since it can be slow for large jobs, I kinda want to learn some faster languages (the current consensus among R package devs seems to be to use C++ for the slower parts).

>>58696177

All my background is in molecular bio. I have never taken a CS course in my life, which unfortunately may have led to some bad habits that my colleagues like to make fun of.
>>
File: 1460235670185.jpg (2MB, 3264x2448px)
>>58690728
>take big data and machine learning module
>oh boy, im sure ill be writing cool software that does cool machine learning shit
>nope
>3 months of Weka and Excel
>i took comp sci for this instead of just shaving my legs and being a slutty shitposter with actual income
>>
>>58696484
Dude you gay.
>>
>>58696484
dont worry man, the person that posts all those gay images has a god awful ugly face. anyone have the picture?
>>
>>58696631
because i took comp sci?

>>58696630
yup

>>58696644
doesnt make their legs any less smooth looking


This is a 4chan archive - all of the content originated from that website. If you need information about a Poster - contact 4chan. This project is not affiliated in any way with 4chan.