7 nm in 2018
5 nm in 2020
can't stop the progress train
Some exec at Samsung has been saying since ~2009 that they'd encounter no issues scaling down to 5nm. Pretty positive outlook for the industry. The really important aspect of all this is that the industry is expecting solid growth. IoT and emerging markets spell billions more in revenue for the industry. There'll be an IC in every light fixture before you know it.
If energy consumption doesn't fall directly in line with density, then pulling heat out of a die becomes much harder and self heating becomes more and more pronounced. It'll be a major area tackled by foundries in the next few years.
Man, I sure do miss the days when hardware pretty much doubled in speed every 6-8 months... sort of. It did make it impossible to invest in computer hardware, but it was interesting at least.
I am curious how cooling will work. Perhaps by that timeframe the graphene/carbon nanotubes will be manufacturable enough to be used in consumer chips. I thought that was the way they talked about drawing the heat out, anyway.
A smaller die lets you get more dies out of a wafer. It drives down costs.
The area scaling benefits also allow you to pack more transistors into a die which equates to performance or added functionality.
Smaller gates require less voltage to switch which can lower power consumption and heat generated
Depending on the libraries used, you can reach higher clocks with equal or less voltage
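The cost argument above can be sketched with the standard dies-per-wafer estimate. This is a first-order approximation only; real yields also depend on defect density, scribe lines, and so on, and the die areas below are made-up examples.

```python
import math

def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Classic first-order estimate: gross wafer area over die area,
    minus a correction for partial dies lost at the wafer edge."""
    r = wafer_diameter_mm / 2
    return int(math.pi * r**2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

# Halving die area on a 300 mm wafer more than doubles the die count,
# because edge loss matters relatively less for smaller dies.
print(dies_per_wafer(300, 600))  # → 90, big GPU-class die
print(dies_per_wafer(300, 300))  # → 197, same design after ~2x area scaling
```

Notice the count more than doubles: that extra margin is one of the reasons shrinks drive down per-die cost.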
TSMC, the company that brought us 4+ years of 28nm GPUs due to fucking up 20nm so bad and taking forever with 16nm, is promising 7nm in 2 years?
Please excuse me while I dial for an ambulance, my sides have terminally been lost.
They didn't fuck anything up with their 20nm process, planar gates were just at their natural end of life, and they wanted to squeeze a bit more revenue out before doubling down on FinFET R&D.
There is a little thing called the short channel effect, and if you want to understand anything at all about transistors, this needs to be the singular thing you wrap your head around. A transistor is a little device sitting between two comparatively giant pillars known as the source and drain wells. The area between the wells, which the transistor sits atop, is the channel: the region of silicon in which electrons flow from source to drain. Where the gate comes in is altering the resistance of the channel to stop the flow of electrons. That's your transistors 101 primer.
Now the short channel effect:
a shorter channel is harder for a gate to control
the shorter a channel is the more leakage current there will be
when a channel becomes short enough the device is impossible to turn "off" and there will always be a relatively high level of leakage current
even where a shorter channel boosts drive current, the increase in relative leakage current can offset the gained performance
The reason why FinFETs exist is to combat this. They grow the gate's effective area, wrapping it around the channel to increase its efficacy, while the channel itself can continue shrinking. You get all of the area scaling benefits without the increased leakage. In fact FinFETs control the channel so well that leakage can be reduced to less than 1% of what you'd get from a comparable planar device.
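The exponential flavor of the leakage problem above can be shown with a toy subthreshold model. The slope factor and threshold voltages here are assumed round numbers, not data for any real process; only the ratio is meaningful.

```python
import math

VT = 0.026  # thermal voltage kT/q at room temperature, in volts
n = 1.5     # subthreshold slope factor (assumed, planar-ish device)

def relative_leakage(vth):
    """Toy model: off-state current scales as exp(-Vth / (n*kT/q)).
    Absolute currents need real device parameters; only ratios matter here."""
    return math.exp(-vth / (n * VT))

# Short-channel Vth roll-off: say the effective threshold drops from
# 0.45 V to 0.30 V as the channel shrinks. Leakage rises ~47x even
# though nothing else about the device changed.
ratio = relative_leakage(0.30) / relative_leakage(0.45)
print(f"{ratio:.0f}x more leakage")
```

That exponential sensitivity to threshold voltage is why losing gate control over a short channel hurts so badly, and why FinFETs restoring that control matters.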
20nm planar nodes were literally never going to be used for large die high performance parts; in that respect, neither TSMC nor Samsung fucked anything up. They were just squeezing a bit of life out of conventional processes for some ARM chips, for vendors who paid top dollar for the highest binning.
When conventional scaling has come to a definite end things like photonic gates and quantum junctions will be bleeding edge process tech.
Also worth stating that GAAs can operate in the realm of ~0.1 V, and if Si filler is added around the completed nanowire it has excellent heat dissipation out of the die. That lends them to 3D stacking. Area scaling wouldn't improve, but transistor counts would.
thanks for the longwinded primer, pajeet, but I'm already familiar with the fundamentals of transistor geometries in modern lithographic processes.
no high performance version of a node, due to no FinFET yet or whatever reason = failure
terror attacks on semiconductor fabrication plants when?
he's talking about Gate All Around, where the gate wraps the channel completely.
they've been made in labs, but the industrial processes for making them economically in bulk for ASICs is still a ways off
Transistor size is somewhat decoupled from the node label. Node labels are more so marketing terms nowadays
yes that is the case with 20nm->16nm, I should have been more clear in my post. Node labels do represent generational improvements, but are somewhat decoupled from scaling. Foundries have implemented various techniques to obtain power and thermal improvements without having to lean heavily on physical scaling.
so basically it's the 20nm process with 16nm long FinFET gates, and calling it 16nm?
I just wish all these processes would have meaningful bottom-line labels, like how many fucking 6T SRAM cells you can fit in a mm^2 or something.
Intel has been using 300mm wafers since 2002, and TSMC has plenty of fabs using 300mm. 450mm is supposed to come soon (~2020).
it is basically their 20nm process with FinFETs. I would assume nodes are labeled as Xnm out of tradition, and to communicate that each new node meets a few metrics (~35% power reduction at the same frequency, plus area scaling) that used to come with pure physical scaling. Far easier to communicate improvements with a single number than to explain to the average consumer why it meets a whole pile of metrics.
>so basically it's the 20nm process with 16nm long FinFET gates, and calling it 16nm?
Yes, Samsung did the same thing actually, though they did squeeze some back end features to help give a considerable area scaling advantage over TSMC's process. Both TSMC and Samsung/GloFo are offering a transitional FinFET process. Going this route saved them some substantial expenditures in getting a whole new node online.
Bottom line however is that process names comply with established ASML guidelines, so it's not the fabs just pulling things out of their asses.
That is the back end; it's the metal stack that delivers power to the logic fabbed on the front end. It ultimately controls area scaling, but not necessarily front end feature size.
Your contacted poly pitch could be 80nm on a given process where the gate length is actually 16nm. A process shouldn't ever be reduced to what amounts to a marketing name anyway.
smaller gate lengths don't get me closer to parallax occlusion mapped nipples in 8k, guys.
lower power draw is nice and all, but consumer grade ASICs can't get much bigger, so it's all about actual transistor density.
Well unfortunately we still have to carry electrical signals, and you can't be willy nilly with that. Isolation and signal integrity are vital. The amount of time spent designing the metal stack is substantial, and it includes ridiculously complex metallurgy to ensure everything works as intended. Not many metallurgists work with isotopes, but designing a BEOL in the semiconductor industry, you do.
5nm is just where companies dealing with process IP have laid out a clear path to as of a couple years ago. It's not some magical limit where area scaling ends, though totally uninformed posters here would have you believe otherwise.
>This transistor could be said to be a 180 picometer transistor, the Van der Waals radius of a phosphorus atom; though its covalent radius bound to silicon is likely smaller. Making transistors smaller than this will require either using elements with smaller atomic radii, or using subatomic particles—like electrons or protons—as functional transistors.
Do you expect we could get to this point for commercial processors aimed at the consumer market?
Do you think we could actually engineer at subatomic level on a mass scale?
The point where that would be necessary is well beyond this decade, and the field of quantum junctions is already being explored.
For economic reasons it wouldn't necessarily be advantageous to pursue area scaling beyond a point; if you can increase transistor count through more conventional means without accelerating self heating, then you're able to deliver products without a hitch.
> if you can increase transistor count without accelerating self heating through more conventional means
And how do you do that?
Increasing the size of the die would result in more heat right?
Isn't that all that it comes down to:
- die size
- fab size
(- number of new rules added by decreasing the fab size)
note that i am not an engineer
>can't stop the progress train
technically you can't, but can you really call it progress if you're arbitrarily describing an asymptotic slowdown as linear-or-better progress?
> 7 nm
> 5 nm
> having anything to do with 7 or 5 nm feature sizes
tippity toppest kek, OP
>no high performance version of a node, due to no FinFET yet or whatever reason = failure
this.....I want fast flagship cards and CPUs not chips to automate my toilet
Has anybody thought about how Optane might be a way for Intel to keep up with Moore's Law functionally without actually attaining transistor count requirements?
It seems like a real game-changer for compute intensive applications
3D stacking logic dies with an inert thermally conductive fill material between them, or building vertically oriented structures from the get-go.
3D stacking and die stacking to create more specialized MCMs yields huge benefits, making production way more economical as well as extracting more performance.
So you're saying you actually know what TSMC's 7nm gate length will be? I'd love to hear it.
optane is little more than flash over a DDR4-like interface.
you'll see more IOPS at low queue depths, but it'll still be substantially slower than SDRAM DIMMs, which are already dogshit slow from a processor's point of view.
2.5D and full 3D DRAM integration are better approaches to general computation speedup.
Sounds good but wouldn't there be a communication overhead if you use many layers of transistors?
Or a sync problem like in dual GPU cards, because as far as I understand, what you are talking about is multiple processors in a small area that are glued together. Am I right?
> MCMs yields
what is that?
as said elsewhere, gate length is becoming an increasingly worthless metric for describing semiconductor manufacturing.
in any case, the entire industry is already going along with naming each 2-3 year generation a 29% smaller number regardless of any bearing in reality:
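That "29% smaller number" cadence is just the old 0.7x linear shrink convention carried forward (0.7^2 ≈ 0.5, i.e. density doubles per node), applied to the label whether or not the features follow. A quick sketch:

```python
# The traditional cadence: each node name is ~0.7x the previous one,
# because a 0.7x linear shrink halves area (0.7^2 ~= 0.49) -- whether
# or not any physical feature actually shrinks that much anymore.
node = 28.0
names = []
for _ in range(5):
    node = round(node * 0.71)  # ~29% smaller number per generation
    names.append(node)
print(names)  # → [20, 14, 10, 7, 5], roughly the marketing sequence
```

Which is exactly why the industry lands on 7nm and 5nm as names regardless of what the transistors measure.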
>Sounds good but wouldn't there be a communication overhead if you use many layers of transistors?
There can be; it depends on what parts you're integrating and how it's done. If you segmented a dual core or a quad core processor then stacked the dies on top of one another, you wouldn't want to have them all routed through the same TSVs. But if you build a series of pillars, each die in the stack can use its own without any interfering signals. Something like a 4x4 grid of copper bumps: core 1 on the bottom layer uses the first row of 4 bumps, the second core uses the second row, and so on. Each core could handle cache coherency by sharing one bump.
That's an extremely simplistic explanation, but a proven concept. Stacked DRAM such as HBM uses common bumps, and the command processor in the stack figures out how to signal across all the slices, but that's not dealing with anything as complex as high performance logic.
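The row-per-core bump scheme described above can be modeled in a few lines. Everything here (grid size, which bump is shared, the tuple naming) is illustrative, not any real package's floorplan.

```python
# Toy model of the bump scheme described above: a 4x4 grid of copper
# bumps, one dedicated row of TSVs per stacked core, plus one bump all
# layers share for cache coherency. All choices here are illustrative.
GRID = 4
COHERENCY_BUMP = (0, 0)  # assumed: corner bump shared by every layer

def bumps_for_core(core):
    """Row `core` of the grid carries that stacked die's own signals."""
    dedicated = [(core, col) for col in range(GRID)
                 if (core, col) != COHERENCY_BUMP]
    return dedicated + [COHERENCY_BUMP]

# No dedicated bump is shared between two cores, so signals never interfere.
private = [set(bumps_for_core(c)) - {COHERENCY_BUMP} for c in range(GRID)]
assert all(a.isdisjoint(b) for i, a in enumerate(private)
           for b in private[i + 1:])
for core in range(GRID):
    print(core, bumps_for_core(core))
```

The disjointness check is the whole point of the pillar layout: dedicated vertical paths mean no contention, with the one shared bump as the coherency sideband.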
Both posts you quoted were mine, good job.
I asked you a rhetorical question in jest because I know that you're clueless.
Does someone know what the limiting factor is for performance on GPUs atm?
Is it the number of transistors? shader units? Can there be something said about this?
Let's say in games for example.
Guess we will just have to wait and see what happens after we get the most out of silicon.
Hope it works out well (for my job security
and video games)
Thanks for your explanations!
GPUs are massively parallel processors, and they'll continue to scale in performance the more ALUs you have. It's just a matter of efficiently delivering instructions to them and providing ample bandwidth. They're far more complex beasts today, with dedicated geometry processors, compute queues, scalar engines, and other added bits that certain APIs can make use of, but the bulk of their performance comes straight from their number crunching ability. It's one case where moar coars is always the answer.
so you post the ASML graphic about feature size being decoupled from node name, then ask leading impossible questions (assuming no TSMC execs in /g/) when somebody else dares to say the industry is a bunch of lying kikes?
tiptop baiting, friend.
now fuck off and get back to work. semi manufacturers have been doing a shit job lately, and the industry doesn't need you dicking around on 4chins.
The M2 half pitch is a part of the BEOL; it's not a front end feature. TSMC's 28nm node produced about a 33nm long gate because of their RMG approach. GloFo's offered 28nm process produced a 25nm gate. You can find die analysis photos done by Chipworks of assorted 20nm bulk parts, as well as 14nm parts from Intel and 14nm parts from Samsung. Simple things like gate pitch and fin height are publicly available information, and it's not hard to look at a photo and get really damn accurate measurements when you have a known reference.
Again, I asked you a rhetorical question because I know you're clueless.
The public marketing name of a process is not a strict technical measurement, but it is not necessarily wrong either. Your baseless assumption about the transistor profiles of future 7nm and 5nm nodes makes you a buffoon. I don't expect anyone here to know a single thing about process technology, but a moderately educated person would know not to make an ass of themselves on a topic they were totally foreign to.
Ok, do you know what this measure exactly means they seem to be using here on the y-axis:
HGEMM / W
Is it some sort of matrix multiplications per watt?
It's Nvidia's way of showing compute performance per watt; they've just chosen a metric that makes them look best. It's fast matrix-matrix multiplication, but half precision. They've introduced FP16 hardware to lower power in certain ops.
So you hit the nail on the head.
They've shown similar graphs using that metric or another. I think they first started really pushing it at CES 2015 when they introduced their big carputer initiative. They were highlighting the half precision performance of a Tegra SoC, and they told everyone it had a 1 TFLOP IGP, but hid in the footnotes that it was only when doing half precision. I believe they've talked about possibly going down to quarter precision for some ops as well; that's their vision for "mixed" precision. It makes sense for pushing perf/watt, but from an advertising perspective I think it's kind of scummy.
That's like me telling an auditorium full of Olympic sprinters that I can run faster than a Formula 1 car, then leaving the room and whispering "from a dead stop, and only while accelerating over a distance of 5 feet under ideal conditions."
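For concreteness, here's what an "HGEMM / W" style metric reduces to. The chip numbers below are invented for illustration, not actual Tegra or Pascal figures.

```python
# Sketch of what a "HGEMM / W" metric boils down to: half-precision
# matrix-matrix multiply throughput divided by power draw.
def gemm_flops(m, n, k):
    """A GEMM does m*n*k multiply-adds, conventionally counted as 2 FLOPs each."""
    return 2 * m * n * k

def gflops_per_watt(m, n, k, seconds, watts):
    return gemm_flops(m, n, k) / seconds / watts / 1e9

# Hypothetical chip: a 4096^3 half-precision GEMM in 0.14 s at 10 W.
print(round(gflops_per_watt(4096, 4096, 4096, 0.14, 10), 1))  # → 98.2
```

Note the metric says nothing about precision: the same formula counts an FP16 multiply-add the same as an FP64 one, which is exactly how half precision hardware makes the headline number look so good.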
>I just wish all these processes would have meaningful bottom-line labels, like how many fucking 6T SRAM cells you can fit in a mm^2 or something.
Even within a process node, SRAM cells can have different areas depending on need.
> Samsung 6T cells at 14nm:
> High Density cell (6T-111) : 0.064 µm2
> High Performance cell (6T-122) : 0.080 µm2
it all depends on what gates you want to switch faster at the cost of making gates longer, use two fins, etc.
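Plugging the quoted Samsung cell areas into the "cells per mm^2" bottom line an earlier post wished node names were based on:

```python
# Converting the quoted Samsung 14nm 6T cell areas into cells per mm^2.
CELLS_UM2 = {
    "HD 6T-111": 0.064,  # high density cell, um^2
    "HP 6T-122": 0.080,  # high performance cell, um^2
}

density = {name: 1e6 / area for name, area in CELLS_UM2.items()}  # 1 mm^2 = 1e6 um^2
for name, cells_per_mm2 in density.items():
    print(f"{name}: {cells_per_mm2 / 1e6:.1f}M cells/mm^2")
```

Even within one node the answer differs by ~25% between the density and performance cells, which is the point being made: there is no single honest number.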
people in the high performance network world are starting to sweat that SRAM shrinking won't be sufficient for 400Gb networking with on-die buffers, since the projected die sizes with needed buffer size and speed will be bigger than the reticle limit.
full 3D stacking or even interposers mean substantial price hikes.
> TSMC expects to start production of 7nm chips in the first half of 2018
> TSMC "expects" to start "production" of "7"nm "chips" in the first half of 2018
given that Intel let 10nm (Cannonlake) slip to 2H'17, color me EXTREMELY skeptical on this one.
sub-10nm feature transistors may be physically possible, but manufacturing costs rise dramatically as soon as you start needing shit like EUV light sources, water-cooled dielectric mirrors, and triple/quadruple patterning.
fuck knows how they'll ever get x-ray lithography to work given the physical impossibility of even dielectric mirrors.
I can only imagine what non-shrink improvements they'll try to pass off as being "equivalent" to 7nm/5nm geometries.
Has the crowd really changed that much? Or has everyone conveniently forgotten the long trail of faecal matter?
don't give them ideas
how many fabs are there? about 20 big ones? I think they are guarded better than nuclear plants
i wouldn't be surprised if they got AA missiles on site - seriously.
ITT: People duped by pop-sci into thinking durrrr anything is possible, progress is inevitable
The fact is that progress is only inevitable when it is easy - now we are literally pushing die manufacturing technology to its physical limits. 14nm barely came out, and the challenges here on out are exponential.
A lot of people here are literally believing in magic bullets like graphene or quantum particles. If these things were so fantastically easy to get working, why didn't they try to manufacture 14nm with them considering what a disaster it has been to do it with silicon?
this topic is interesting to me
maybe even with current CPUs and GPUs, computers can display any combination of pixels at 1920x1080@60 plus any sounds, and photorealistic games are possible even today with enough programming skill
there's only the one usual semiconductor industry shill here trying to whisper sweet things in our ears, who then goes on to claim that if you're not an insider too, any doubts about them are completely unfounded.
yes, improvements beyond 14/16nm will happen, but you'd have to be a fucking idiot to have unquestioning faith in the industry after the last 10 and especially last 5 years.
Yeah. Also there was no reason to feel bad for not having top of the line shit because in 6-12 months, new ones would be out and have better performance for much lower price. Now CPU speeds have stagnated, and consumer level chips are stuck with 4 cores. 6-8 core CPUs cost a fortune. GPU market is a little better but it will be a long time until you can get something better than a GTX 970 for less than 400€.
How and why?
I have worked with small systems and find I get all manner of problems popping up around 15~12nm as other types of physics become more significant.
I know one can make adjustments to get smaller even down to 5nm, heck I have seen single atom systems technically work in labs I have visited. However it gets insanely complicated and I start to ask why, as countless other problems arise. More so as there are easier ways to get performance gains, like better software and more end application optimization.
Progress does not come free, and is not always guaranteed. Why did 14nm get delayed, why do they spend tens of billions on R&D every year, and why are there people with triple PhDs working 24/7? Because it takes I N S A N E amounts of trial and error to make new technology work. Some technology is even harder, like nuclear fusion. Boatloads of money and millions of man hours, but no success.
I'm sure certain sectors will push for it hard as a means to sell people replacements for all their shit, but I think customer inertia is stronger than you give it credit for, especially in a very weak global economy.
> wanting to let russian mobs and the chinese hack not just your PC and smartphone but your car, vacuum cleaner, and electronic dragon dildo collection
Every seemingly insurmountable issue is nothing but a matter of man hours applied to a solution. Like most any other industry, with every step you take forward you end up learning something new and refining your grasp of already known concepts. The idea for FinFETs has been around since before 2000, over 16 years ago now, and when it was first envisioned it was based on SOI. We're fabbing FinFET devices now on bulk silicon, strategically employing SiGe, utilizing fully depleted channels, and a whole host of other things to increase performance while keeping costs relatively down. We can effectively lock electrons inside the channel, ensuring they don't tunnel outwards; what some people call a quantum well FET has all its properties displayed in a modern FinFET now.
You've got to keep putting one foot in front of the other, keep investing in R&D. In a few years these current 14/16nm FinFET nodes will be commodity; every cheap ARM SoC from companies like Rockchip and Allwinner will be FinFET. That's the benefit of pursuing progress. What was once a tremendous feat will become mundane.
The billions of dollars and billions of man hours shouldn't be taken for granted, and timely technological progression is never a guarantee. TSMC could just as soon incur massive debt and dissolve in 5 years time. I tend towards optimism because on the bigger picture someone will always find a way forward. Delays may abound but so long as there is profit to be made there are engineers willing to work.
>company designs a printed circuit with a sensor array on it
>one of the sensors detects certain gases
>can be used to monitor bacterial growth in spoiling food by measuring the byproducts
>because super specialized products have less appeal, they're also useful for inventory tracking
>companies put these in milk cartons to better track inventory
>a small gyro and accelerometer are used to ensure all the cartons remain right side up, aren't abused or ruptured in transport
>can sync with "smart devices" to show consumers how close to spoiled a food product is
>gyro and accel data can be used to track distance traveled, chip's unique ID is read like a beacon anytime it syncs to something
>anyone with malicious intent now has a viable vector for tracking you with a carton of milk
The NSA probably loves this shit. I guarantee someone in DARPA is working on this too.
> bickering between pessimists and optimist, The Thread.
yes, chip makers have a lot of resources and a generally great track record, but sometimes physical limits really do come into play for an industrial sector.
we're not driving 200 mpg cars, flying to Paris on Mach 4 jets, or taking a daily commute to our job in GEO on a Saturn V for real reasons, and there may (or may not) be similar impending but unforeseen limits with chip fabbing despite the huge pool of talent and money.
please name another enterprise that has gotten half a billion a year for 50+ years and not produced substantial progress?
We've known since forever that magnetically contained quasineutral plasma fusion simply can't break even without using highly neutronic reactions, so we're just trying to blanket tokamak cores with Li layers for breeding or whatever.
Are plasma physicists simply unable to give any meaningful progress without blowing their entire budgets on magnets or something?
If fusion is at all practically possible, I expect the Chinese to pull it off by simply being able to iterate designs quickly and more cheaply than western scientists.
What always shocks me is the narrow worldview of engineers. Some software or hardware engineer sitting in an office, helping to design a spying tool never thinks that, "hm, maybe I'll be victim to this myself once it becomes commonplace."
The lack of self-awareness and wide horizons in engineers is depressing.
I don't know.
Maybe it's actually hard and needs lots of time. Just the time for planning, building and testing ITER spans decades. That's fucking ridiculous. I still can't believe how slow this shit is.
Apparently it's viable, so I can't see why throwing much more money at it wouldn't be a good idea, since it could really solve the fucking energy problem.
Maybe money is the problem why this shit is so slow, but I'm not claiming to know what's going on.
Just seems like very little money compared to how important it is.
I don't know if that syncs with the actual attitudes of engineers. I'm not an engineer, but I am a junior software dev. All my friends are engineers, though. The lack of holistic thinking and the dearth of self-awareness they regularly demonstrate is hard to miss.
They're not stupid people, either. It's either a minor sort of idiot-savantism, where all their intelligences are concentrated within a single cluster of abilities, or just willful ignorance. Most people are like that, really.
Perhaps it stands out to me because my friends often behave as if they're undupable.
No, refining your grasp on already known concepts and learning something new are very different.
I think you are talking about how we often get great branching developments from other aggressive R&D. Like how Post-it notes came from advanced super glue R&D, or Teflon frying pans came from advanced refrigeration fluid R&D.
That is a great argument for why we need to push for new things and do research, but new things may or may not include smaller transistors in CPUs. So assuming it is included seems a bit premature, more so given our current understanding of physics.
All I know is my personal experience with CPUs has led me to believe that we have nearly reached the physical limitations of silicon.
>get p4 with hyper threading at 3.4GHz in 2004
>read about frequency limits being reached. see them cap out at about 4GHz
>read about multiple cores
>oh cool I wait until 4 core is cheap and upgrade
>single cores are cheap, dual cores are normal, quad cores are expensive
>wait until core 2 duo comes out.
>single cores are cheap, dual cores are normal, quad cores are expensive
>core 2 not much better than regular core
>get 2 core pentium d and overclock the shit out of it
>wait until skylake
>oh man I can't wait to see the improvements when I buy a new desk top
>single cores cheap, dual cores normal, dual core hyperthreading expensive, quad cores super expensive, more cores are jesus christ and only on xeons, can't overclock
>motherfucker. I'll just save money and get a core 2 quad core
>nope they're still high because they perform just as well as skylake, but good motherboards are super expensive.
>figure I'm going to settle for a 2 core skylake and try to take advantage of pci passthrough and super cheap RAM to run VMs.
It really feels like I had better options years ago than I do now.
WHERE THE FUCK ARE MY 64 CORE PROCESSORS?
AND QUIT WASTING DIE SPACE ON INTEGRATED GRAPHICS
What ever happened to putting a shitty integrated graphics chip on the motherboard just so you could rule out the video card on debugging a failure to POST? I just don't get it.
It's because Intel's desktop chips are simply mobile chips that failed the binning process for laptops.
They make two classes of CPU at this point: mobile chips and server chips. Anything else is binned from one of those two classes of chip, most often the former.
Well, from what I can google they cost about $2000. Really, what I want is a 6 core Core 2 Duo with hyperthreading for $120 and a 64 core version at $500. Instead we've got onboard DDR controllers, onboard GPUs, and locked multipliers.
just what I said: maybe modern or even 2000s computers can provide far more realistic gaming graphics (that is, graphics that can change depending on input), and it is software that limits graphics (polygonal 3D graphics, devs not wanting to program in assembly or even directly in 1s and 0s)
Well then you're fucked, buddy. 12 threads at the cheapest can only be had with LGA 2011 or LGA 1366 chips, and those ain't cheap for various reasons.
If you want cores on the cheap, it's either AMD or low tier Intel server chips, and even then not at the prices you want.
Besides, the onboard memory controller actually helps a great deal vs running everything over FSB as was done with pre-Nehalem chips.
They use too much power at the specified clocks and voltages for mobile. On desktop, power consumption and heat output aren't as tightly constrained as they need to be on mobile.
Not the same Anon, but the way software and hardware work together is very poor. The old "do everything, but nothing well" problem. Also related are investment costs and compatibility issues.
It is like we built a really good car, then wanted more hauling ability. So we added a roof rack that completely ruined our aerodynamics and stretched the trunk out to get more room, but we didn't rebalance the frame, so crash safety and handling suffered. What we really should have done is build a truck, rather than mutilate the car.
This is why some old video game consoles could do such amazing feats considering their sad specs: they were built to play video games. Sure, they borrowed a lot from PC development and were technically computers, but they were application optimized.
As software is a good part of a computer it has a big impact, more so given how little we have really improved it over time.
A good example is the Raspberry Pi. Hardware-wise it is no powerhouse, but the software is so refined that it can do things that are very impressive, all things considered.
You are seeing this come back in embedded markets, as it is a simple way to get significant gains, often at the expense of flexibility and compatibility. Give them long enough and, I joke, you will see dedicated picture taking devices become a new market.
It's also a matter of better tools and experience in generating content.
This picture isn't really the best example, but it's commonly believed that the DS version of Super Mario 64 has a higher polygon count than the N64 version due to the characters looking more geometrically complex.
That isn't actually true: the N64 version has characters that use more polygons. It's just that the 3D modelling tools in 1996 weren't very refined and the teams weren't very experienced in using them as compared to 2004.
>Far easier to communicate improvements with a single number than trying to explain why it meets a whole pile of metrics to the average consumer.
Average consumer can't even tell what node their CPU/GPU/smartphone SoC is made on, let alone why that would matter. Shit, they most likely can't even tell what their CPU is without looking it up, except maybe that it's an i5 made by intel.
>Well then you're fucked buddy.
Yeah I've noticed. I'm just saying that as a consumer I have my eye on performance per price when it comes to buying silicon and that ratio hasn't increased much in the last 10 years.
So do they use the same sockets for the desktop as the mobile processors? Or are they testing them before choosing which package to put the chip in? I have no idea how mobile processors are packaged. I haven't opened a notebook in 8 years.
The latter. They test the chips before they cut them free of the wafer, and the ones that fail mobile bins but are otherwise functional end up in the desktop chip package.
Also, in buying chips, you also have to take into account per-core performance has also jumped by an enormous amount vs Netburst.
I can't suggest anything for a modern setup, partially from not having purchased anything recently made, and partially from owning an 8c/16t 2.9/3.3GHz Xeon that I got as a freebie.
Yeah the step up from netburst to core was big. Core 2 was a little bit better. i3-i7 is not much. It seems like we just keep on taking smaller steps like we're getting closer to a wall.
Nehalem significantly (like 50%) improved multithreaded performance over Core 2, single threaded performance was only slightly improved (10%).
So there was a pretty notable improvement. Of course Netburst was garbage so the transition to Core was massive.
So you're saying nehalem cores worked together better? I didn't know that. So a 2 threaded program would run in (1/1.5)*(1/1.1) = 60.61% amount of time on nehalem versus core 2? While a single threaded program would run in 90.91% of the time required on core 2? All of that is assuming the same clock frequency right?
Well if that's true I won't feel so ripped off buying a skylake. I need to relearn assembly. I learned some of it in a microprocessors course but they didn't teach anything but basic instructions and I never used it for anything so I forgot it. I've never written anything that used more than one thread.
Out of curiosity how do the generations stack up when it comes to floating point and int32/64 multiplication?
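The arithmetic in the question a few lines up checks out as posed, with one caveat: if the quoted 50% multithreaded improvement already folds in the single-threaded gain, compounding the two factors double-counts it. This just reproduces the numbers as stated, at equal clocks.

```python
# Checking the Nehalem-vs-Core-2 runtime arithmetic from the post above.
mt_speedup = 1.5   # multithreaded improvement, as quoted
st_speedup = 1.1   # single-threaded improvement, as quoted

two_thread_time = (1 / mt_speedup) * (1 / st_speedup)  # fraction of Core 2 runtime
one_thread_time = 1 / st_speedup

print(f"{two_thread_time:.2%}")  # → 60.61%
print(f"{one_thread_time:.2%}")  # → 90.91%
```

So 60.61% and 90.91% are right under the poster's own assumptions; whether those two factors should actually be multiplied depends on how the benchmark figures were measured.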
Nehalem cores worked better because instead of being a pair of cores coupled through L2 cache, and a pair of CPU dies both fighting each other for FSB resources, all 4 Nehalem cores have dedicated and smaller (thus lower latency and faster) L2 caches, w/ a nice big fat block of L3 that also contains the contents of ALL of the higher tier caches. Communicating through an internal bus and the L3 cache is far, far faster vs bouncing data between 2 sets of cores while trying to keep their fuckhuge L2 caches coherent.
Then there's hyperthreading. Some highly threaded programs make good use of it, even though it doesn't provide a doubling in performance vs 4 threads.
As for float performance, it's one of the things that has consistently jumped by large gains each generation, but that's largely due to the FPU being made wider every generation. I may be wrong about this, but I'm pretty sure an AVX-512 capable FPU can do multiple smaller float operations simultaneously.
Integer hasn't gotten quite as powerful, but there's still gains to be had. I'd have to look up articles detailing the differences.
Is it possible to buy just plain FPUs or ALUs? The only stuff I can find is small integer adders. I'd like to save space in an FPGA by using one instead of some kind of shift-and-accumulate clusterfuck.
FPGAs already have tons of fixed blocks for common purposes.
even garbage-tier student kits will have a few dozen of these blocks, and the highest end kits targeting DSP uses have upwards of ten thousand.
Does Altera have those too?
Picking an FPGA seems difficult. I downloaded a couple spreadsheets of FPGA models and calculated how many logic blocks I get per dollar, and found the Cyclone V E with 18480 logic blocks gave the best blocks per dollar at 374 LABs/$. But then there are things like DSP blocks, which depending on the block can replace many more generic blocks. I also want to use the FPGA to mine bitcoins when I'm not tinkering. Any tips on picking the right one?
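The blocks-per-dollar ranking above boils down to a one-liner. Only the Cyclone V E row is implied by the post (18480 blocks at 374 LABs/$, i.e. about a $49 part); the other two rows are invented placeholders, and as noted, raw blocks/$ ignores what a DSP block is worth.

```python
# Reproducing the blocks-per-dollar comparison from the post above.
candidates = {
    # name: (logic blocks, price in $) -- non-Cyclone rows are made up
    "Cyclone V E": (18480, 18480 / 374),
    "hypothetical A": (10000, 35.0),
    "hypothetical B": (30000, 110.0),
}

best = max(candidates, key=lambda k: candidates[k][0] / candidates[k][1])
for name, (blocks, price) in candidates.items():
    print(f"{name}: {blocks / price:.0f} blocks/$")
print("best:", best)  # → best: Cyclone V E
```

A fairer metric would weight DSP blocks by however many LABs a multiply-accumulate would otherwise consume, but that weighting depends entirely on the design.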
I haven't even heard of an FPGA made in the last 20 years that didn't have at least a little mult/add acceleration.
pic related is literally the oldest shit still actively sold by Altera, about 9 years old...