Sites that still use SQL


Thread replies: 91
Thread images: 6

File: kuumotus voimistuu.jpg (15KB, 314x364px)
Hey guys! Do you know of any sites that still use an SQL-based data storage system?
>>
Don't you have a street to shit in or something
>>
>>58261242
Literally every website that stores any data at all and doesn't use NoSQL uses SQL, you fucking mongoloid.
>>
>>58261242
4chan.org
>>
>>58261242
>storing data not streaming
What could possibly go wrong?
>>
>>58261281
No, you're wrong. My website uses files for data storage. It works perfectly fine and I can even mimic complex queries.

Just learn how to code.
>>
Any WordPress site.
>>
>>58261985
>Just reimplement SQL and you don't need SQL
>>
>>58262020
No, it's just plain CSV that I scan sequentially for data. Not that hard.
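
For the record, "scan the CSV sequentially" boils down to something like this, roughly (Python stdlib; the "id" column and file layout are just assumptions, anon never posts his format):

import csv

def find_product(path, product_id):
    # full-file sequential scan: every lookup re-reads the file from the top, O(n) per query
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["id"] == product_id:
                return row
    return None

print(find_product("products.csv", "42"))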
>>
>>58262046

What if I told you that you could probably get better performance and database security, especially for complex operations, just by using SQLite?
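
Something along these lines, as a rough sketch (sqlite3 is in Python's standard library; the schema here is invented for illustration):

import sqlite3

conn = sqlite3.connect("shop.db")
conn.execute("CREATE TABLE IF NOT EXISTS product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("CREATE INDEX IF NOT EXISTS idx_product_price ON product (price)")

with conn:  # implicit transaction: commits on success, rolls back on error
    conn.executemany("INSERT INTO product (name, price) VALUES (?, ?)",
                     [("widget", 9.99), ("gadget", 24.50)])

# indexed lookup instead of re-scanning a flat file
for name, price in conn.execute("SELECT name, price FROM product WHERE price < ? ORDER BY price", (20,)):
    print(name, price)
conn.close()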
>>
>>58262178
I've tried it but SQLite is unable to handle more than 140 terabytes, so I can't.

And actually, I don't have performance issues, so I feel changing my system would be premature optimization anyway.
>>
>>58262214
Do you think SQL might help pare down all that data redundancy?
>>
>>58262214
>+140TB data
>csv
kys
>>
>>58262238
Probably not SQL as a language, but any database system certainly would. However, the main bottleneck I have for now is on the frontend. I am forced to use a JS framework that dynamically alters the DOM tree of an HTML template. This is so inefficient, I don't get it.
>>
>>58262214

>SQLite is unable to handle more than 140 terabytes, so I can't.
First and foremost, it is unlikely that you will ever reach this size of data. The developers for SQLite are actually unable to make any tests around this upper bound because there are no hard drives big enough to ever reach that limit.

Second of all, if you are storing your CSV files on an ext4 filesystem (the default filesystem used for Linux), the maximum filesize is 16 TiB. Since you'd be using a single file database either way, you'd be foolish not to use SQLite.

>I don't have performance issues, so I feel changing my system would be premature optimization anyway.
The term "premature optimization" has more to do with micro-optimizations than with whole architectural design decisions, such as the decision to use SQLite. And performance is not the only reason to use it. Consider the fact that SQLite is transactional. How does your database design handle power loss during a commit?
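
To illustrate the point about commits, a minimal sketch of what SQLite's transactionality buys you (Python's sqlite3; the two-row "account" example is made up):

import sqlite3

conn = sqlite3.connect("bank.db", isolation_level=None)  # autocommit; transactions are explicit below
conn.execute("CREATE TABLE IF NOT EXISTS account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT OR IGNORE INTO account VALUES (1, 100), (2, 0)")

try:
    conn.execute("BEGIN IMMEDIATE")
    conn.execute("UPDATE account SET balance = balance - 50 WHERE id = 1")
    conn.execute("UPDATE account SET balance = balance + 50 WHERE id = 2")
    conn.execute("COMMIT")    # both updates become durable together
except sqlite3.Error:
    conn.execute("ROLLBACK")  # or neither does; a crash mid-commit is undone by the journal on the next open
    raise

A hand-rolled CSV writer has to reinvent exactly this journaling to survive a power cut halfway through a write.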
>>
>>58262376

>I am forced to use a JS framework that dynamically alters the DOM tree of an HTML template
>This is so inefficient, I don't get it.
Literally the whole purpose of using JavaScript is to manipulate the DOM while the page is in the browser. What the hell is the problem?
>>
>>58262214
>140TB csv
>premature optimization

Code bootcamp, copy/paste kiddie spotted
>>
>>58261985
How fast are your point queries while you're doing full table scans, you absolute idiot?
I can tell you my SQL is a fuck ton faster than your file setup. Also, how do you support multiple views?
What about when you have two CSV files and want to join the data in them? How fast is that?
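
For comparison, joining two CSV files by hand means re-reading and re-hashing them on every single query, roughly like this (column names are guesses):

import csv

def join_on_product_id(products_path, prices_path):
    # naive hash join: build a dict from one file, stream the other through it
    prices = {}
    with open(prices_path, newline="") as f:
        for row in csv.DictReader(f):
            prices.setdefault(row["product_id"], []).append(row)
    with open(products_path, newline="") as f:
        for prod in csv.DictReader(f):
            for price_row in prices.get(prod["id"], []):
                yield {**prod, **price_row}

A database keeps indexes and statistics around between queries instead of paying this cost every time.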
>>
>>58261985
I'm happy for you, anon

Come back when your little project grows in scope
>>
>>58262433
>there are no hard drives big enough to ever reach that limit.
You're right, and that's why I'm currently writing a distributed filesystem which enables multiple part files to be seen as a single file by the OS. It'll prevent the problem, I guess.

Secondly, you're right about the "premature optimization", it might have been a bad call on my part. However, I made my design able to handle power loss by adding a boolean CSV field which is set to true only once the record has been completely written. Then my app scans the database on startup and removes records where this field is false.
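
As a sketch of the recovery scheme anon is describing (file name, flag position and the "1" value are all assumptions): each row ends with a committed flag written last, and startup rewrites the file without the rows where it never got set:

import csv, os

def recover(path):
    # keep only rows whose last column (the "committed" flag) was fully written as "1"
    tmp = path + ".tmp"
    with open(path, newline="") as src, open(tmp, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if row and row[-1] == "1":
                writer.writerow(row)
        dst.flush()
        os.fsync(dst.fileno())  # force the cleaned copy to disk...
    os.replace(tmp, path)       # ...then atomically swap it into place

Even then it still trusts that a torn, half-written line is detectable, which is exactly the edge case a real write-ahead journal is designed for.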
>>
>>58262615
or you could just use SQL, which handles database distribution just fine, guarantees ACID, uses way more efficient algorithms than your brain can even comprehend, etc., and spare yourself reinventing the wheel, you fucking imbecile
>>
>>58262632
Yes, but based on my experiments, the B-tree-based lookup (used by Postgres, for instance) is far behind the performance I can reach with my system. But you're right, I can still improve the ACID part.
>>
>>58262615

Anon... you are trying to solve problems that have already been solved very well by other software. Even if you are not using SQLite due to file size limitations, there are other RDBMS that handle extraordinarily large datasets rather well. Though it has a table size limit of 32 TiB, PostgreSQL has no maximum database size.
>>
>>58262711
Who am I to question your genius. I'm sure a sequential scan of a 140TB file distributed across a million parts is going to work great in the long run.
>>
>>58262214
>140 terabytes
>>58262376
>I am forced to use a JS framework
kek
>>
I do have to wonder what this lunatic is doing that needs 140 TiB of data that made him think, "yes, I am going to make my own database system using CSVs."
>>
>>58262845
If he hadn't said CSVs, I would have guessed that he worked on video streaming software. That's the only case I can think of that would need a different kind of data structure, have enormous files, and need very specialized software to handle very unique scaling issues.
>>
File: pg_vs_csv.jpg (38KB, 841x877px)
>>58262790
Here. It's scientific. The test was run against a 50GB database.
>>
File: SCIENTIFIC.png (12KB, 718x569px)
>>58262954
>>
>>58262845
even more... he needs to use a JS framework to read all that data
>>
>>58262994
How many samples did you take to make this statistically significant?
>>
>>58263070
over 9000
>>
>>58261242
>Hey guys! Do you know of any sites that still use an SQL-based data storage system?
http://passmedia.sqlpass.org/media/24hours/2012_fall/PDF/06.pdf

>>58262433
>First and foremost, it is unlikely that you will ever reach this size of data. The developers for SQLite are actually unable to make any tests around this upper bound because there are no hard drives big enough to ever reach that limit.
Because RAIDs and SANs don't exist?

>>58262744
>Though it has a table size limit of 32 TiB, PostgreSQL has no maximum database size.
There is always a maximum size. MS SQL doesn't have table size limits; its database size limit is 524,272 TB.
>>
>>58262994
How retarded are you?
The graph clearly shows what it compares, what the measured data is, and the methodology, and anon told us how big a data sample the comparison was conducted on.
>>
>>58263070
Fuck you it's scientific
>>
>>58263080
Ok, then... Happy new year, faggot.
>>
>>58263095
Was the data sorted? What did the tables look like? Were datasets equal? Did you test worst possible scenario, best possible scenario, both? What about simultaneous queries? What about two of the same queries in a row? Your "test" doesn't prove shit.

I can also make a graph for "select * from TABLE" and give MySQL 100 ms execution time and your CSV a million years and claim it's scientific based on rigorous tests conducted on a 100 petabyte sample and not supply any more information. That's pretty much what you did.
>>
>>58263158
>>58262954
Let me guess, Postgres didn't have any indexes?
>>
>>58262954

Now scale that data up. If you grow it to 500 GB, will Postgres still be slower?
>>
>140 TB
>csv files
I knew I'd have a laugh if I came into this thread. Thanks anon.
>>
>>58263158
> Was the data sorted?
Yes, it is.

> What did the tables look like?
Basically Product(id, name, description) for t1 and Pricetag(product_id, price, discount) for t2.

>Were datasets equal?
Strictly equal. However, the plain CSV is ASCII, whereas Postgres uses its own data representation.

>Did you test worst possible scenario, best possible scenario, both?
It's an average over 10000 queries. No scenarios here. Obviously Postgres sometimes beats my solution, but only in specific cases, when comparing dates for instance.

>What about simultaneous queries?
Here, I lack the hardware to have something significant, but I plan to go with AWS to do so soon.
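
For context, the Postgres side of a benchmark on that schema presumably boils down to something like this (psycopg2 assumed; the actual query is a guess, anon never posts it), and whether an index like the one below exists makes or breaks the comparison:

import psycopg2

conn = psycopg2.connect("dbname=bench")  # connection string is a placeholder
with conn, conn.cursor() as cur:
    # an index on the join key is what makes the comparison with a flat-file scan fair at all
    cur.execute("CREATE INDEX IF NOT EXISTS idx_pricetag_product ON pricetag (product_id)")
    cur.execute("""
        SELECT p.name, t.price, t.discount
        FROM product p
        JOIN pricetag t ON t.product_id = p.id
        WHERE t.price < %s
    """, (20,))
    rows = cur.fetchall()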
>>
>>58262214
>140 terabytes
>Does not use PostgreSQL or other database systems
>Uses files

You literally have the worst possible solution. You could easily optimize it thousandfold by moving to a proper database.
>>
>>58263256
>Hosting 140TB of data on Amazon

Holy shit, this has to be a troll.
>>
>>58263295
Given the performance I can obtain with my solution, I plan to conduct serious tests and then open a business to sell my database. So it's worth the cost.
>>
>>58263256
So let me get this right: it took 1 second for your computer to read a 50GB file, process it, and then a presumably larger result set? Yet you can run simultaneous queries because of hardware limitations. I have an 8x SSD RAID and it would still take 10 seconds just to read the file. What kind of bullshit are you trying to pass off on us?
>>
>>58263339
>and then
and then produce

>>58263339
>Yet you can
Yet you can't
>>
>>58263339
Just a clever in-memory cache system. Easy, since my solution is mostly based on scanning the file line by line. Not a big deal.
>>
>>58263371
>Just a clever in-memory cache system.
So you have 50GB of RAM to store the table, even more to store the result set. Enough cores to join the tables together. Yet you can't run two queries at once?

>>58263334
>, I plan to conduct serious tests and then open a business to sell my database
No one is going to buy your retarded idea of storing a CSV file in a ram disk.
>>
>>58263387
> So you have 50GB of RAM to store the table, even more to store the result set.
I obviously don't have 50GB of RAM. I just use a mechanism similar to double buffering.
> Yet you can't run two queries at once?
I didn't say that. Just that I can't run enough queries in parallel to make the results significant.
>>
>>58263421
>I obviously don't have 50GB of RAM.
Then how did you possibly read 50GB of data in less than 1 second? If you can't even afford that much RAM, you certainly don't have a storage subsystem faster than mine.

>I didn't say that.
You did literally right here:
>>58263256
>Here, I lack the hardware to have something significant, but I plan to go with AWS to do so soon.
>>58263158
>What about two of the same queries in a row?
>>
this thread is funny
>>
>>58263474
>Then how did you possibly read 50GB of data in less than 1 second?
Ah, I get your point now, sorry, I'm not a native English speaker. As I said before, I am writing a distributed filesystem. I have a few spare disks, so I can access data in parallel and overcome current bandwidth limitations. Coupled with an in-memory cache, it's really performant.

> to have something significant
The point is I can run a handful of queries, but not enough to prove my point yet. And that's why I will soon rent some more performant machines on AWS to do so.

> What about two of the same queries in a row?
The graph I posted is an average of the same query run 10000 times in a row.
>>
File: Boot Disk.png (40KB, 866x746px)
>>58263568
>As I said before, I am writing a distributed filesystem. I have a few spare disks, so I can access data in parallel and overcome current bandwidth limitations.
Again, as I said before, if you can't afford 64GB of RAM, you certainly can't afford a storage subsystem which can read 50GB of data in less than a second. I have 8x 480GB Seagate 600 Pros behind an Areca 1883ix-24 in RAID 0. What the fuck storage system are you using that can read fast enough to process your query in less than a second?
>>
>>58263598

>using windows 8 meme
>>
>>58263598
Well, I won't give you all the technical details here, but here's the thing: my filesystem is quite raw. Thus I can reach near-optimal bandwidth with SATA 3, a bit less than 500MB/s. I have 9 drives, so we're already barely at 50GB/s with parallel reading. Moreover, remember the graph shows an average value over multiple runs, and thanks to optimizations in the caching system, things get a little bit faster when the same data is accessed several times in a row. So that's it.
>>
>>58263671
that's an old screenshot from before I switched it to ESXi, and it's Server 2012 R2
>>
>>58263687
90 drives*, sorry.
>>
>>58263687
>Well, I won't give you all the technical details here,
Your disks are literally incapable of providing that level of performance.

>Thus I can reach near-optimal bandwidth with SATA 3, a bit less than 500MB/s. I have 9 drives, so we're already barely at 50GB/s with parallel reading.
Are you incapable of doing basic multiplication? 9x 500MB/sec is 4.5GB/sec, or less than a 10th of the bandwidth needed to even read your dataset in the time you claimed, let alone actually process it.
>>
>>58263721
>I have 90 SSDs at home, and enough SAS expanders to connect them all; or some large all-flash SAN
I'm guessing not. Pic or it didn't happen.
>>
>>58263721
Also, even if you did, that's still 45GB/sec, and it would assume you could process the data basically instantly to join the two tables. Again, you're full of shit.
>>
File: mainframe.jpg (77KB, 375x500px)
>>58263744
Here's my rig. I got it from my uncle who is a retired engineer from IBM.
>>
whatever dude

if you think you've solved the problem of storing data better than people who have been working on it for decades then go ahead
>>
>>58263687
Fuck these amateurs, tell me more about your high-performance CSV database management system (CSVDBMS). I myself am looking for a high-performance database since I do REAL work and I'm having trouble optimizing my XML database further (and, of course, shitty slow SQL can't keep up). Really pisses me off that these SQLkiddies with their toy projects think they know shit about databases.
>>
>>58263784
that's an MDF rack, you retard
>>
>>58261985
You know there are built-in libraries for web apps that let you manage your database without writing any SQL?

person = Customer()
person.name = form["name"]
person.age = form["age"]
person.email = form["email"]
person.add()
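
With SQLAlchemy, for instance, the same idea looks roughly like this (SQLAlchemy 1.4+ assumed; the model mirrors the pseudocode above and the library emits the INSERT for you):

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class Customer(Base):
    __tablename__ = "customer"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    age = Column(Integer)
    email = Column(String)

engine = create_engine("sqlite:///app.db")
Base.metadata.create_all(engine)

form = {"name": "anon", "age": 25, "email": "anon@example.com"}  # stand-in for request data
with Session(engine) as session:
    session.add(Customer(name=form["name"], age=form["age"], email=form["email"]))
    session.commit()  # the ORM generates and runs the INSERT here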
>>
>>58263806
I think the main key to performance, my friend, is using node.js. At first I made the mistake of using C, but that was retarded. V8 is much more performant, and the access time to your database file beats any direct system calls by far. I still don't really understand how or why, but my hypothesis is there might be some data mining algorithm involved. Keep it up my friend, we both know SQL is seriously deprecated.
>>
>>58263817
I know, but with a few tweaks and hacks, there is still a lot you can get out of those. I'm planning to sell clusters of refurbished ones with my CSV database system.
>>
You know, Anon, the maximum memory bandwidth of a Core i7-6700 CPU is 34.1 GB/s, and this is considering some very extreme conditions (everything in L1 cache all the time, more or less). How, dare I ask, are you achieving this miracle of modern engineering?
>>
>>58263198
>>58263221
>>58262994
I don't know why you people are doubting this. There's nothing magical about PostgreSQL.

PG's job is to allow flexible, general, ACID queries, updates and restructurings for data sets of unknown characteristics.

Anon's job was to do one specific, exact, read-only query on a known data set.

Obviously Anon will do better.

Even the PostgreSQL developers agree. Remember their MySQL burn of "You can make a really fast database if you don't need A, C, I or D"
>>
>>58263912

Right, it's just that his test figures require a level of bandwidth not possible on modern hardware.
>>
>>58263905
You don't know what an MDF rack or a Cisco Catalyst 6500 is, do you?
>>
>>58263912
Do you honestly think it'll do better in every single case? If he has a million entries in his CSV and a user queries the 999,999th entry, that user has to wait until the whole file is scanned, and if someone else makes the same query, same wait time again.

A B-tree has no such worst case: the lookup is logarithmic, so a million entries means on the order of twenty comparisons rather than a full scan. Plus, SQL databases cache, which avoids scanning the whole thing all over again when similar data is queried.

Especially since that anon plans on distributing that 140TB across several files/nodes in a network.
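
The gap is easy to see even with a plain sorted list standing in for the B-tree (bisect gives the same logarithmic behaviour; the million keys here are made up):

import bisect

keys = list(range(1_000_000))  # pretend these are the sorted row keys

def sequential_lookup(target):
    for i, k in enumerate(keys):          # worst case walks all million entries
        if k == target:
            return i
    return -1

def indexed_lookup(target):
    i = bisect.bisect_left(keys, target)  # ~20 comparisons for a million keys (log2 n)
    return i if i < len(keys) and keys[i] == target else -1

print(sequential_lookup(999_999), indexed_lookup(999_999))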
>>
File: link_waker.jpg (10KB, 250x250px)
I dunno who came up with the NoSQL meme, guys, but I'd fucking beat him to the point of serious injury.
It's just horrible to work with. There's a whole query language for working with databases, so why in the world would you want to reimplement the same functionality in Python scripts?
And Python (2) is a horrible language. Fucking everything up because of non-ASCII values in variables, what the fucking hell?
I got fired from my job tho.
>>
>>58261281
Even NoSQL uses SQL.

The "No" in NoSQL means "not only SQL", not "no SQL".
>>
>>58262376
I think the problem might be that you don't know what you are doing.
>>
>>58263865
Ah, I had not considered node.js. I had a similar experience when I switched from C++ to Java: The JVM's XML optimizations simply blow any hand-written C++ out of the water, especially at scale.
>>
>>58264045
But the query language part isn't standardized: every SQL database system uses the same query syntax (as far as I know), while NoSQL systems can each do whatever they please.
>>
>>58264019
I think you're confusing the SQL language with NoSQL databases, which are simply databases without a relational model.
>>
>>58264076
>every SQL database system uses the same query syntax
Who the fuck writes only ANSI SQL and ignores all the features each vendor adds? I use CONVERT all the time instead of CAST in MSSQL.
>>
>>58264076
Including SQL.
>>
>>58264045
Not in most cases. If there is any sort of SQL backend, it's technically classified as NewSQL.
>>
>>58264079
I'm not sure I'd call things like MongoDB databases. Datastore seems like a more appropriate nomenclature
>>
>>58264116
People who write applications with the intention of supporting multiple SQL implementations without having to spend time learning each of their intricacies?
>>
>>58264066
Right! You should try the -XX:-UseSerialGC JVM flag, it really gives a performance boost.
>>
>>58264157
Datashambles.
>>
>>58264175
>People who write applications with the intention of supporting multiple SQL implementations without having to spend time learning each of their intricacies?
So basically no one? How many applications are there even which support multiple SQL implementations? The only thing I can think of off the top of my head is VMware vCenter which supports MSSQL and Postgres
>>
>>58264207
There's also stuff like ODBC/JDBC.
>>
>>58264207
Lots of imageboard scripts support MySQL, SQLite, and PostgreSQL without any issue
>>
>>58264246
>Lots of imageboard scripts support MySQL, SQLite, and PostgreSQL without any issue
The only image board software I know of is Infinity and it requires MySQL
>>
>>58264229
And how does ODBC prevent anyone from using vendor-specific SQL features? Just because you use ODBC doesn't mean you can automagically run your queries on any implementation.
>>
>>58264306
Right, I've been retarded on this one.