Data Crunching Server Build

Thread replies: 39
Thread images: 5

File: 1447148745375_rn10-2013-Thema3g.jpg (814KB, 1600x774px)
So I'm building a server at work for crunching massive amounts of data (up to 1.5 TB at a time). Our normal data-crunching computers, which run Windows 10 with 512 GB RAM, two Titan X's, and 2 CPUs (Intel, forget what kind, but they're good), aren't getting the work done in a timely manner. After a single tweak to a data set, it takes our computers about 5 minutes before we see the effect of the tweak.

So what I'm asking you all is: what equipment should go into a data-processing server that could crunch these types of data sets? My budget is under $200K.

Pic related
>>
With those specs maybe you should be looking at your algorithm efficiency instead of trying to upgrade your hardware.
>>
>>8480744
I call bull. No one working with that much data would be using Titans, they'd use something created specifically for GPGPU.
>>
>>8480744
I'd just like to interject for a moment. What you're referring to as Windows 10, is in fact, GNU/Windows 10, or as I've recently taken to calling it, GNU plus Windows 10. Windows 10 is not an operating system unto itself, but rather another free component of a fully functioning GNU system made useful by the GNU corelibs, shell utilities and vital system components comprising a full OS as defined by POSIX.
Many computer users run a modified version of the GNU system every day, without realizing it. Through a peculiar turn of events, the version of GNU which is widely used today is often called Windows 10, and many of its users are not aware that it is basically the GNU system, developed by the GNU Project.
There really is a Windows 10, and these people are using it, but it is just a part of the system they use. Windows 10 is the kernel: the program in the system that allocates the machine's resources to the other programs that you run. The kernel is an essential part of an operating system, but useless by itself; it can only function in the context of a complete operating system. Windows 10 is normally used in combination with the GNU operating system: the whole system is basically GNU with Windows 10 added, or GNU/Windows 10. All the so-called Windows 10 distributions are really distributions of GNU/Windows 10!
>>
>data crunching
>windows 10
what the FUCK are you doing, anon

anyway, with those file sizes, SSDs are an absolute requirement if you don't want file IO to kill your runtime. also, i don't know what your problem looks like, but if you can make your computations easily parallelizable, you're a good candidate for GPU computation.

but to be absolutely honest - don't build this shit yourself. take the money you'd spend on this in-house equipment, and get a subscription to an actual server farm. something like https://aws.amazon.com/hpc/
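
A rough back-of-the-envelope sketch of the file IO point above. The 1.5 TB figure is from the OP; the three read speeds are assumed round numbers picked for illustration, not measurements of any real drive, so treat the output as an order-of-magnitude guide only.

```python
# Rough estimate of how long one full pass over a data set takes at a given
# sequential read speed. Ignores seeks, caching, and any compute time.

def read_time_seconds(dataset_bytes: float, throughput_bytes_per_s: float) -> float:
    """Time to stream the whole data set once at a constant read speed."""
    return dataset_bytes / throughput_bytes_per_s

TB = 1e12
GB = 1e9

dataset = 1.5 * TB  # size quoted in the OP

# Hypothetical sustained read speeds (assumed, not benchmarked).
assumed_speeds = {
    "spinning disk": 0.15 * GB,
    "SATA SSD": 0.5 * GB,
    "PCIe SSD": 2.5 * GB,
}

estimates = {
    name: read_time_seconds(dataset, speed)
    for name, speed in assumed_speeds.items()
}

for name, seconds in estimates.items():
    print(f"{name}: about {seconds / 60:.0f} minutes per full pass")
```

If a single pass over the data already takes minutes at the storage layer, a faster GPU alone won't shorten the wait after each tweak, unless the working set can stay in RAM between tweaks.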
>>
File: 1331517992187.jpg (79KB, 291x308px)
>>8480783
For the cost, Titan X's are a better choice than Quadro 6000's.

>>8480803

We ARE using SSDs, mostly. We recently purchased PCIe drives, and we are focused on maximizing GPU performance for computations.

We can't use server farms as the acquired data must remain within closed networks.

So my question remains: what components would I need to build a server that would do the heavy lifting of the data crunching?
>>
>>8481112
If you're spending 200k you shouldn't be asking us what parts to get.
>>
File: 1301254519448.jpg (10KB, 288x306px)
>>8481134
most useless fucks here, I swear.
>>
>>8480744
What specifically are you processing? If your image in the OP is any indication, you're running finite element simulations, which is most definitely *not* "data crunching." The type of hardware you need will vary hugely depending on wtf you're doing.
>>
>data crunching computers
>run Windows 10
What the fuck are you doing
>>
File: 1319230811289.jpg (12KB, 303x320px)
>>8481399
Windows 7 has a max ram of 192 GB
Windows 10 has 512 GB

>>8481395

Computed Tomography. It is not for finite element analysis. It is most assuredly data crunching because I said it is. How else would you recreate a 3D image from thousands of 2D X-ray images?

Can someone just answer the fucking question without questioning the question, jesus fuck
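
For anyone wondering what "recreate a 3D image from 2D X-ray images" involves, here is a toy sketch of the general idea in 2D. It assumes parallel beams, nearest-bin sampling, and plain unfiltered backprojection, which is a deliberate simplification: it is not what VGStudio Max does, and real packages use filtered or iterative methods on cone-beam data. All function names here are made up for the example.

```python
import math

def detector_bin(x, y, theta, center, n_bins):
    """Detector bin hit by a parallel ray through pixel (x, y) at angle theta."""
    t = (x - center) * math.cos(theta) + (y - center) * math.sin(theta)
    return int(math.floor(t + 0.5)) + n_bins // 2

def project(image, angles):
    """Forward step: sum pixel values along rays, one detector row per angle."""
    n = len(image)
    center, n_bins = n // 2, 2 * n + 1
    sinogram = [[0.0] * n_bins for _ in angles]
    for a, theta in enumerate(angles):
        for y in range(n):
            for x in range(n):
                sinogram[a][detector_bin(x, y, theta, center, n_bins)] += image[y][x]
    return sinogram

def backproject(sinogram, angles, n):
    """Reconstruction step: smear every detector reading back along its ray."""
    center, n_bins = n // 2, 2 * n + 1
    recon = [[0.0] * n for _ in range(n)]
    for a, theta in enumerate(angles):
        for y in range(n):
            for x in range(n):
                recon[y][x] += sinogram[a][detector_bin(x, y, theta, center, n_bins)]
    return recon

n = 16
phantom = [[0.0] * n for _ in range(n)]
phantom[10][5] = 1.0  # one bright pixel at column x=5, row y=10

angles = [k * math.pi / 8 for k in range(8)]  # 8 views over 180 degrees
sinogram = project(phantom, angles)
recon = backproject(sinogram, angles, n)

# Brightest pixel of the (blurry) reconstruction, as (value, x, y).
peak = max((recon[y][x], x, y) for y in range(n) for x in range(n))
print("brightest reconstructed pixel (x, y):", peak[1:])
```

The nested loops show where the cost comes from: the work scales with (number of views) x (number of pixels). In 3D that becomes (number of projections) x (number of voxels), which is why large scans are usually pushed onto GPUs.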
>>
>>8481414
What software package are you using for model reconstruction? They usually offer guidelines for supported hardware/software.
>>
>>8481414
>Did you expect me to use Windows 7!?
...how can you miss the point so hard.
>>
>>8481421
They sure do, but now I want something better. Hence my question about servers that can handle large amounts of data.

>>8481431
Apparently you missed my point numbnuts. We can only use those two OS's.
>>
>>8481414
>CT data processing / reconstruction is not "data crunching".

Whaat?
>>
>>8481433
>We can only use those two OS's.
Yes, that was why I made my first post, numbnuts.
>>
>>8481433
>Only choose 2 OSes
Modern CPUs can virtualize operating systems quite well if you need to.
>>
>>8481433
...

So what software package is it so that we can help you? FFS, it might not even support multicore or GPU's depending on what it is
>>
>>8481449

VGStudio Max

I said in my OP that we use 2 GPUs and 2 CPUs.
>>
>>8481456
It uses OpenCL to do the reconstruction, so you'll want to buy an AMD GPGPU. A Titan X would just be an expensive office warmer in this case, since Nvidia mainly supports CUDA.
>>
>>8481456
>>8481465
Overall, you'll want an SSD, high memory bandwidth, and a powerful OpenCL-enabled GPU.
>>
>>8481465
OpenCL is supported on both AMD and Nvidia. The only thing is that CUDA is Nvidia's in-house alternative to OpenCL and as such it runs faster on Nvidia hardware.

Also, most people prefer using CUDA over OpenCL because CUDA is just easier to write.

Titan X's are normally fine. That shit is listed as a recommended card in their manual (page 3).
http://www.volumegraphics.com/fileadmin/user_upload/flyer/vgstudiomax30_system_requirements_en.pdf

OP, I suggest contacting the VGStudio people and asking them directly. There may be optimizations you can do for your data set with regard to how it's sent to the GPU. I don't know shit about VGStudio so I can't help you.
>>
>>8481478
>OpenCL is supported on both AMD and Nvidia. The only thing is that CUDA is Nvidia's in-house alternative to OpenCL and as such it runs faster on Nvidia hardware.
OpenCL support on Nvidia hardware is garbage. It exists, but you'll spend twice as much on a GPU with a third the performance.

>Also, most people prefer using CUDA over OpenCL because CUDA is just easier to write.
Who gives a fuck? That's not what the software supports.

Even if you were a retard and went with Nvidia, you would want one of their Tesla GPGPU's, not some NEET gaymer shit.
>>
>>8481433
>We can only use those two OS's.
What the fuck shithole company are you working at that won't let you install a *NIX variant on your custom server.
>>
>>8481478
CUDA is closer to the hardware in the sense that you can assume more about the underlying architecture, which lets you tune the code for efficiency more precisely.
>>
>>8481492
I said earlier, on our computers we can only use those 2 (for security reasons). The server is different, am I wrong?
>>
>>8481495
There's literally no security reason why you couldn't use a UNIX/LINUX system on your desktops. That's a sign of an incompetent and/or lazy systems admin.
>>
>>8481482
>a third the performance.
Citation please. I know CUDA outperforms OpenCL on Nvidia cards but I don't think it's by that much anywhere.

>NEET gaymer shit
Say what you will but the Titan X is way cheaper than a Tesla GPGPU and it has a decent amount of ram for consumer grade data crunching.

It is strange that OP's work chose Titan X's for their computers, but I still think it's a better idea to first contact VGStudio and ask if there's some configuration they're overlooking or if this performance is typical for the current setup. Also, they could probably provide better feedback as to what hardware to upgrade to (even OpenCL can be finicky when it comes to running on lots of different hardware, so performance may vary in arbitrary and unintuitive ways).
>>
>>8481493
Yes. That was what I was implying. CUDA will always outperform OpenCL on Nvidia simply by virtue of it being designed for that hardware.
>>
>>8481482
>>8481497

We use Titan X's because we've experimented with them and found they have comparable performance for what we're doing but are significantly cheaper than the Quadro 6000's we were using. Plus we have 2 GPUs per computer times 10 computers, so we save a lot of money going with Titan X's.
>>
>>8481516
why the hell is calculating a tomography so computationally expensive?
Are you using some new, experimental and horribly non-optimized algorithm?
Hire an assembly/parallel processing programmer to fix this.
>>
>>8481523
>t. retard

>>8481516
>Nvidia
Maybe that's why you're not satisfied with the current level of performance though.

>>8481497
According to benchmarks, the R9 Fury performs comparably to the Titan X on OpenCL workloads. The difference is that the R9 Fury costs $500 and the Titan X costs $2000.

Op should just get a whole bunch of AMD GPU's for cheap.
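
To put numbers on that, here is a quick sketch of the arithmetic using only figures quoted in this thread: 2 GPUs per computer, 10 computers, and the per-card prices claimed above. The prices are the poster's claims and are not verified, so the totals are only as good as those inputs.

```python
# All inputs are figures quoted in the thread, not verified prices.
gpus_per_machine = 2
machines = 10
price_titan_x = 2000   # USD per card, as claimed above
price_r9_fury = 500    # USD per card, as claimed above

total_gpus = gpus_per_machine * machines
cost_titan = total_gpus * price_titan_x
cost_fury = total_gpus * price_r9_fury

print(f"{total_gpus} GPUs: Titan X ${cost_titan}, R9 Fury ${cost_fury}, "
      f"difference ${cost_titan - cost_fury}")
```

The difference only matters if the cheaper card really does match performance in the actual reconstruction software, which is something to benchmark rather than assume.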
>>
>>8481544
>calls me a retard
>uses terms such as "data crunching computer" and "crunching 1.5TB at a time"
>uses some stupidly unoptimized off-the-shelf software without knowing anything about it
>scales operations with the said software
>asks 4chan on how to spend $200k on equipment
>doesn't even know the class of intel CPUs they currently use

Clearly you aren't the person responsible to make the choice, so suck my dick and go back to not knowing how to write a better and faster reconstruction algorithm.
>industrial x-ray tomography of an undamaged car
I bet you're a QC retard or worse.
>>
>>8481574
>implying I'm him
>implying you can just make NP-complete problems not NP-complete with a little bit of assembly code
>hurr I know everything because I took java 101
>>
>>8481582
>completeness being relevant to this conversation
kys you dumb CS freshman
The optimization class is next semester.
>>
>>8481594
Just ignore him. He's retarded
>>
>>8481594
If it's so easy then why don't you go out and write your own CT postprocessor that can somehow run on a regular laptop? You would make millions and save tons of lives.

Oh right, you can't because you don't know what you're doing.
>>
>>8481594
>>8481595
You have to stitch millions of noisy 1D X-ray interferometry images into a 3D reconstruction of the object. No shit it's going to be challenging to process.

You should both lose your computer science license for not realizing this
>>
File: 1473957733278.jpg (31KB, 480x547px)
>>8481523
>>8481574
>>8481594
>>8481595
This is a 4chan archive - all of the content originated from that site.