[Boards: 3 / a / aco / adv / an / asp / b / bant / biz / c / can / cgl / ck / cm / co / cock / d / diy / e / fa / fap / fit / fitlit / g / gd / gif / h / hc / his / hm / hr / i / ic / int / jp / k / lgbt / lit / m / mlp / mlpol / mo / mtv / mu / n / news / o / out / outsoc / p / po / pol / qa / qst / r / r9k / s / s4s / sci / soc / sp / spa / t / tg / toy / trash / trv / tv / u / v / vg / vint / vip / vp / vr / w / wg / wsg / wsr / x / y ] [Search | Free Show | Home]

Hello /sci/. Here's the deal. I'm working in something

This is a blue board which means that it's for everybody (Safe For Work content only). If you see any adult content, please report it.

Thread replies: 26
Thread images: 1

File: karat.png (497KB, 1309x656px) Image search: [Google]
karat.png
497KB, 1309x656px
Hello /sci/.

Here's the deal. I'm working in something real nice. And I'm hitting a computational power block. Basically, I need to multiply polynomials fast, and I can't risk losing accuracy. The polynomials have about 10^5 or so terms, 15 variables and are of degree about 10~20 in each variable.

For this matter I was considering FFT, which seems no good because of precision problems, and NTT and Karatsuba, which don't seem to improve anything with such sparse polynomials.

So, what would you do sci?
>>
did you try wolfram alpha?
>>
>>8226126
I have tried mathematica, yes. Simple "solve and eliminate", groebner bases, and an old implementation of Wu-Ritt. It falls quite short. I made my own C++ implementation with Wu-Ritt but it's struggling.
>>
Have you considered overclocking your computer?
It might help.
>>
>>8226168
a constant improvement like that isn't going to help. i'm going to need at least sqrt(n) improvements or so
>>
Why would FFT have precision problems?
>>
Quick tip for writing some hella efficient code to further combat your problem. Use assembley language.
>>
>>8226214
What's your idea? I'm working in Q. How would you implement an arbitrary precision FFT?

>>8226242
It's going to be hell to code a BigRational library in assembly. It was already heavy in C++.
>>
>>8226270
assembly is easy
>>
How about you show us how you are calculating it now?
>>
>>8227356
no, not worth the small speedup imo

>>8227363
calculating what? polynomial multiplication? just the naive O(n^2) algorithm, using binary search trees to store the terms
>>
I might be completely off base because I haven't looked this kind of thing but improving time by a factor of sqrt(n) sounds like a lot. Have you considered cuda or renting amazon cpu time and doing it in parallel?
>>
>>8227416
Katarsuba is a sqrt(n) improvement usually, but with very sparse polynomials, especially multivariate ones, it doesn't help at all

I did consider using GPU and parallelization techniques. I'm installing intel OpenCL right now and I'll see how much improvement I can get.
>>
>>8227421
You probably know this but if a cheap gpu does well, a more expensive one will do quite a bit better. Titan X has 3k cores. Good luck anon.
>>
>>8227437
>Hello /g/, requesting the Gaming Rig guide :^)
>>
>>8227442
That's legit not like me at all. I come on this site maybe 30 mins a week and half the time I post I fuck up and I get shit like this. Is there a problem with small posts like i did here if I think it can help someone?
>>
>>8227465
your post was good, the reply was good-mannered banter.
lighten up bro
>>
>>8227467
Righto no worries then.
>>
>>8227421
can you tell more about how you use karatsuba for multivariate polynomials? i don't see why it would work badly
Let's say i have polynomials P and Q in variables x, y1, ... yk, with degree 2n in variable x
then i forget about all the other variables, write P = Ax^n + B, Q=Cx^n + D, and proceed as usually. also choose different variables when iterating
>>
>>8227501
yes, the fact that the polynomials are multivariate itself is not that big of a blunder

the problem is how sparse they are. for example, at one point I have polynomials P with 5000 terms, Q with 500 terms, ,and I get polynomial P*Q with 250 000 terms. on average, there are 10 terms summed up per monomial in the answer. this isn't what you expect. the speedup for single variable happens because, if you have polynomials P of degree n and Q of degree m, the answer will have n*m terms and degree n+m. So terms will "collide" a lot and a lot of sums are involved.

the sparcity, again, is huge. there are 16 variables of degree about 5 or more. this means 5^16 (about 10^11) possible terms, with only about 10^3 of them being populated.
>>
>>8226136
Are you leveraging your GPU using a library like, for example, CUDA?
>>
>>8227670
...no
how do I into gpu
>>
>>8227776
Continuing with the Nvidia/CUDA example since that is what I am currently working with:

https://developer.nvidia.com/how-to-cuda-c-cpp
>>
>>8226126
wew lad
>>
>>8227932
i dont have an nvidia card though, I guess it's OpenCL for intel?
>>
>>8228213
OpenCL should work fine and there are those that will argue that it is a better idea. I've been using the CUDA toolkit for years now and it is comfy to me. Just pick whatever works with your hardware and run with it.
Thread posts: 26
Thread images: 1


[Boards: 3 / a / aco / adv / an / asp / b / bant / biz / c / can / cgl / ck / cm / co / cock / d / diy / e / fa / fap / fit / fitlit / g / gd / gif / h / hc / his / hm / hr / i / ic / int / jp / k / lgbt / lit / m / mlp / mlpol / mo / mtv / mu / n / news / o / out / outsoc / p / po / pol / qa / qst / r / r9k / s / s4s / sci / soc / sp / spa / t / tg / toy / trash / trv / tv / u / v / vg / vint / vip / vp / vr / w / wg / wsg / wsr / x / y] [Search | Top | Home]

I'm aware that Imgur.com will stop allowing adult images since 15th of May. I'm taking actions to backup as much data as possible.
Read more on this topic here - https://archived.moe/talk/thread/1694/


If you need a post removed click on it's [Report] button and follow the instruction.
DMCA Content Takedown via dmca.com
All images are hosted on imgur.com.
If you like this website please support us by donating with Bitcoins at 16mKtbZiwW52BLkibtCr8jUg2KVUMTxVQ5
All trademarks and copyrights on this page are owned by their respective parties.
Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.
This is a 4chan archive - all of the content originated from that site.
This means that RandomArchive shows their content, archived.
If you need information for a Poster - contact them.