
Is regex even useful?


Thread replies: 151
Thread images: 12

Is regex even useful?
>>
you can be the office hero a few times a year if you become a regex expert.

it's not enough of a skill to land you a job on its own though.
>>
>>61028969
it has many many uses
stripping malicious input from input fields is one example of many
>>
it's the most useful thing ever
>>
>>61029012
Yeah data sanitation is the only real use I can find. Everything else just seems like a bunch of search and replace methods
>>
File: lrg.jpg (122KB, 500x656px)
>>61028969
If you are working with text, read and enjoy
>>
>>61029036
>i never have to search for anything
?
>>
Better question. How can it be applied? What language should I use? Perl? Python? A basic command line tool?
>>
It's probably the most useful single thing I can think of in computers. Other than basic operation like typing.
>>
>>61029055
Perl has the best regex system available. look into awk and sed, see which one fits your purpose best and use it.
>>
File: Vqr4p1E.png (116KB, 1515x663px)
>>61028969
yes pajeet it will help you avoid your code like this
good luck!
>>
>>61029074
Thanks! What's the difference between just using "perl -pi -e 'regex' filename" and using sed for the same task?
>>
>>61029095
sed is called 'stream editor', more designed for taking input from a command & filtering or replacing text on the fly and printing it out, whereas awk & perl have other uses
>>
>>61029090
you've enlightened me
>>
>>61029055
that's the beautiful thing about regex, it doesn't matter

pretty much every common programming language supports regex in some way; some languages might support some features that others don't, but the basics are the same
>>
>>61029131
holy crap I'm actually learning something on /g/.
It seems like it's much more tedious in some programming languages. I'll probably stick to perl
>>
I'm so shitty at using regex with Python I literally use the os module to do it with perl:

import os
os.system(r"perl -pi -e 's/\s//g' file.txt")
#gets rid of all whitespace (newlines included) as an example
>>
>>61029055
It depends on the problem you're trying to solve. I will say that regex is like the one thing that Perl is legitimately really good for.
My only other advice is that if you're using regex to clean/format data, do so as early in your data pipeline as reasonably possible.
>>
It's a shitty and limited tool, but it's better than nothing at all.

For some reason I have only ever seen one browser extension to do regex search, and even that one doesn't handle it very well.
>>
>>61029154
just stick with whatever language you're comfortable with, it really doesn't matter

also websites like regexr.com or regex101.com can be really helpful
>>
>>61029055
Perl6 is the most advanced "scripting language" currently available, as it has the type system of Haskell, the expressiveness of Perl 5, the simplicity of Python, the metaprogramming capabilities of Common Lisp, the declarativeness of Prolog, a fucking good OOP design almost as good as Smalltalk, and the soundness of ATS. And of course, it also has the most powerful regexp engine ever integrated into a programming language.

Basically if you are not learning Perl6 right now you are a moron.
>>
>>61029176
that's gross
>>
>>61029259
Hey it's easier than fucking piping the file into python and then using five different functions from the re module
>>
>>61029275
re.sub is one function, what five functions are you talking about?
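
For comparison, a minimal sketch of the same whitespace-stripping job with a single `re.sub` call and no shelling out to perl. The helper name `strip_whitespace` is made up for illustration.

```python
import re

def strip_whitespace(text: str) -> str:
    """Remove every whitespace character, like perl's s/\\s//g."""
    return re.sub(r"\s", "", text)

print(strip_whitespace("a b\tc\n"))  # prints abc
```

To mimic `perl -pi` you would read the file, run it through the function, and write the result back.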
>>
I never grep $HOME without it.
>>
>>61029318
Underrated post
>>
>>61029324
>>61029318
include me in the screenshot xd
>>
>>61029090
Everyone knows that if/else is faster than regex or loops.
This code is optimised, stop writing unoptimised code, you're the reason why my PC lags playing video games!
>>
>>61029365
m/faggot/gi
oh look a match.
>>
>>61029372
holy shit you're retarded
>>
>>61029372
God you're dumb. Regex is just a state machine often defined with strings. Most implementations 'compile' the state machine at runtime and cache it.

But if you want you could do it at compile time for the most part.

Regex isn't slow unless the specific problem or implementation is made slow. For instance excessive lookahead is a problem.
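
A rough sketch of the "compile once, reuse" idea in Python. The pattern and the `count_words` helper are made up for illustration, and Python's `re` is a backtracking engine rather than a pure state machine, so take this as the general shape only.

```python
import re

# Compile the pattern a single time at import; every call reuses the object.
WORD = re.compile(r"\b[a-z]+\b", re.IGNORECASE)

def count_words(line: str) -> int:
    """Count alphabetic words using the precompiled pattern."""
    return len(WORD.findall(line))

print(count_words("Regex is just a state machine"))  # prints 6
```

Python's `re` module also keeps a cache of recently used patterns for the module-level functions, so explicit compilation mostly matters in hot loops.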
>>
regex is extremely slow
Unless you compile all your regexes far in advance and intend to use them millions of times a second, they're much slower than a long if-else chain with !strcmp
>>
>>61029402
>>61029436
guys i think hes being ironic
>>
>>61029438
Oh no the lag time
I should spend an extra five minutes making a two page long if else structure so this is two milliseconds faster
>>
>>61029036
You can't "sanitise" data. You have to ensure that it's not interpreted in the wrong context when it's used, e.g. a query parameter isn't interpreted as part of the query itself.
>>
>>61029490
That's literally the definition of data sanitisation
>>
File: impressionist pépé.jpg (66KB, 600x600px)
>>61029402
>>61029436
niggas got meme'd good
>>
File: 1498099087702.jpg (48KB, 492x449px)
>>61029438
code readability/maintainability > speed 99.9% of the time
>>
>>61029540
>>61029487
>t. CRUD monkey
>>
find pipe grep regex is faster than find -regex.
even find with 2 greps is faster.
>>
>>61029505
It's literally not. You're talking about stripping "bad" input, I'm talking about keeping it there and using whichever technique prevents it from being misinterpreted. In SQL this would be prepared statements, in web it might be escaping the output e.g. make sure that angle bracket isn't interpreted as the beginning of a script tag.
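
A minimal sketch of the prepared-statement point, using Python's built-in `sqlite3` with an in-memory database. The table name and the hostile string are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

hostile = "Robert'); DROP TABLE users;--"
# The ? placeholder passes the value separately from the SQL text,
# so it is stored as data and never parsed as part of the query.
conn.execute("INSERT INTO users (name) VALUES (?)", (hostile,))

stored = conn.execute("SELECT name FROM users").fetchone()[0]
print(stored == hostile)  # prints True
```

Nothing is stripped: the "bad" input is still there in full, it just can't change what the query does.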
>>
>>61029074
It doesn't.
See https://swtch.com/~rsc/regexp/regexp1.html
>>
best tool in the world for parsing html
>>
>>61029578
Wait wouldn't that be stripping the bad input by using regex to remove angle brackets in whatever SQL program you're using?
>>
>>61029438
>Strcmp
A well compiled regex can be faster than strcmp since strcmp relies on null terminated strings. A regex can use more advanced string matching techniques since it isn't constrained to null terminated strings.

And that's ignoring the vast complexity you'd face with trying to cover all the cases of a regex with strcmp.
>>
>>61028969
Why are you even asking? Do you want to do the equivalent work with if/else clauses?! It'd be the option in most languages where you can't really nicely specify a PEG or such.
>>
File: Doug.jpg (183KB, 1144x1716px)
>>61029402
>>61029436
Wow... I feel amazing!
Now I know why anons are baiting...
>>
I find regex ultra useful simply for being maybe two lines to implement a search, and the ability to have a search represented by a single string rather than something like a specifically designed function
I do a lot of data parsing and scripting implementations at my job though, so it's like almost designed for me. Most people I know find it useful a few times a year.
>>
File: heh39.png (393KB, 462x497px)
>write-only declarative language
>>
>>61029716
Can't sympathize. I'm more pleased by discussing things or explaining things to computer illiterates like you. So it's fine that you're 'baiting'.
>>
>>61029372
Regex is faster.
>>
>>61029627
Not that guy but no.
>stripping:
"The pipe | is harmful here." -> "The pipe is harmful here."
>what prepared statement does to your string content internally:
"The pipe | is harmful here." -> "The pipe \| is harmful here."
>>
>>61028969
No. There is literally no use for it. CTRL+H will suffice for every find-and-replace you will ever need to do.
>>
>>61029865
Show me a CTRL+H to find all formatted emails in my file directory
>>
>>61029055
The guy who invented Regexes deserves a Nobel.

I have used Regexes in Javascript and Python.
POSIX has C (and therefore C++) libraries for regexes as well.

How fast would an arithmetic parser based on Regexes be?
>>
>>61029883
>formatted emails
Define "formatted"? What format?
>>
>>61029904
A series of letters and numbers possibly with a period + an @ symbol + a TLD
>>
>>61029886
>The guy who invented Regexes and Larry Wall and the grep implementer deserves a punch in the face.
FTFY. I rage every time I need to fix some shitty mail address/URL filter written in JS and Python and POSIX C, and it was that guy's fault for encouraging them.
>>
>>61029236

Python has too much momentum.
>>
>>61029923
And you don't want to replace anything? Well, then, you'll only need a CTRL+F and you can just search for the @ symbol, because it'll be present in every email address.
>>
>>61029236
But is there a small handy Perl6 interpreter that can be built easily and embedded easily?
Last time I checked it was a bootstrapping clusterfuck with NQP and all that jazz.
>>
>>61029808
Yeah, I know, but on /g/ there are so many repeating threads about fundamental stuff that I don't even care anymore...
>>
>>61029987
as well as every other situation in which you need an email address. You have to filter out the other shit by making sure it's a TLD on the righthand side. this is very easy with regex
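
A rough sketch of that search in Python, following the loose definition given earlier in the thread (letters, digits and periods, then @, then a domain ending in a TLD). The pattern is deliberately simple and nowhere near the full address grammar; `find_emails` and the sample text are made up.

```python
import re

# Loose pattern: local part, @, domain labels, then a TLD of 2+ letters.
EMAIL = re.compile(r"[A-Za-z0-9.]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def find_emails(text: str) -> list:
    """Return every substring that looks like an email address."""
    return EMAIL.findall(text)

sample = "mail bob.smith@example.com or admin@site.org, not user@localhost or @handle"
print(find_emails(sample))  # prints ['bob.smith@example.com', 'admin@site.org']
```

A plain search for @ would also hit `user@localhost` and `@handle`; the TLD requirement on the right-hand side is what filters them out.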
>>
Low quality b8
>>
>>61030066
I was replying to >>61029987
This thread actually spawned interesting discussion
>>
>>61028969
yes, but it is a way more specific tool than people use it for
>>
>>61030101
Eh? What other tools should be used instead of Regex, according to you?
>>
>>61030130
>Eh? What other tools should be used instead of Regex, according to notards?
All kinds of prebuilt parsers. Like proper HTML/XML/[your_nested_syntax_here]/URI parsers.
So that I can use mail adress comments instead of having another trash mail account because some faggot uses the 5 line mail adress regex instead of the correct, bigger version.
>>
>>61029852
A true prepared statement will be evaluated independently of a query parameter, so the pipe would still be there unescaped but it could never even be assumed to be part of the query.

That example is valid for e.g. HTML escaping though
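
For the HTML side, a minimal sketch using Python's standard `html.escape`. The comment string is invented.

```python
import html

comment = '<script>alert("hi")</script>'
# Escape at output time so the browser renders the text instead of running it.
safe = html.escape(comment)
print(safe)  # prints &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;
```

The stored comment stays untouched; only the rendered copy is escaped.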
>>
>>61030205
95% chance of them using these parsers in a way that ignores your comment fields anyhow.

You should have more success in just publishing the correct regex in many places so almost everyone on the internet picks it up.
>>
>>61029225
>regex101.com
my favorite
thanks for reading my blog
>>
I use regex a lot in parsers and web crawlers. It's a nifty tool to have.
>>
>>61030331
do you maintain it or something?
also why no perl?
>>
>>61028969
A previous employee wrote a script that took 1-2 hours to complete without using regex

I rewrote it with regex and it takes 5 mins. Same number of api calls.
>>
>>61030378
what did it do.
What kind of data parsing takes 1-2 hours ever
>>
yes, i use them pretty often when doing shell scripts
>>
>>61030378
>>61030393
>some slow ad hoc parser in a shit-tier scripting language by someone with no idea how to write either vs JIT compiled regex parser
gee I wonder how that happened
>>
>>61030393
i'd guess from restarts after a fail?
>>
>>61030393
It exported products from a website using an api, then generated an XML file for Google Merchant. He had hundreds, if not thousands, of if statements to handle each variation of sizes and colors, which were extrapolated from product names.

E.g. if (name.contains("83mm")) ... else if (name.contains("84mm")) ...

Imagine thousands of else ifs. I wonder if I still have his code saved...
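
A minimal sketch of what the regex version of that chain might look like, assuming the sizes always appear as digits followed by "mm" somewhere in the product name. The product names and the `extract_size_mm` helper are hypothetical, not the actual rewrite.

```python
import re

# One pattern covers every size instead of one branch per value.
SIZE = re.compile(r"(\d+)\s*mm", re.IGNORECASE)

def extract_size_mm(name: str):
    """Return the size in millimetres found in a product name, or None."""
    m = SIZE.search(name)
    return int(m.group(1)) if m else None

print(extract_size_mm("Steel Widget 83mm Blue"))  # prints 83
```

Colors could be handled the same way with an alternation, or with a plain lookup table.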
>>
>>61030355
Perl is for pussies real men use php
>>
>>61030454
What a disgusting pajeet.
>>
>>61030454
>I wonder if I still have his code saved
google "cs grad meme"
>>
>>61030454
holy shit that's amazing
please look for it
>>
>>61030454
>E.g. if (name.contains("83mm")) ... else if (name.contains("84mm")) ...
nice CS beginners first program meme
>>
>>61029886
regex was invented by Barack Obama.
>>
>>61030487
Here is a small sample.

https://pastebin.com/Tx4UA8uP
>>
>>61030583
>small sample
>small
holy shit the meme is real
>>
>>61030583
MY FUCKING GOD
Guys how can we publicize this
>>
>>61030583
Did you fire the employee after this piece of shit?
>>
>>61030640
No, he got promoted to management.
>>
>>61030611
>>61030597
>>61030640

No, he moved on because he was too expensive and now he makes much more money writing iphone apps.
>>
>>61030583
>Indians
>>
>>61030738
He was asian.

Here's the story: He got a job, didn't know any coding. Then he learned coding on his own while working for the company and started applying his coding skills to help.

Even though he was a really bad coder, he did in fact provide great benefit to the company and got higher and higher pay as his skills increased.

I think he wrote this code 10 years ago and didn't bother to refactor it, he just patched it up. If he wrote it today, he would definitely use regex. Right now he's making 6 figures writing iphone apps.

He's never been to college.
>>
>>61030652
>>61030657
toppest of kek
>>
File: 1495734301408.jpg (1MB, 2048x2048px)
>>61030583
>https://pastebin.com/Tx4UA8uP
good lord
>>
>>61029036
Regex are very useful for editing. Just read Linux From Scratch and see how much sed is used.
>>
>>61030583
Finally found a reason to kill myself.
Thanks anon. :^)
>>
>>61028969
>>
>>61030885
>apple
lost it
>>
>>61030795
God that makes me so mad
>>
>>61028969
YES
YES
YES
A MILLION TIMES YES.

People often think of regex as funny over-complicated strings for pattern matching.

This is wrong.

Regex are PROGRAMS. That's right, when you write a regex you are actually declaring a full program with an initial state, all the actions that change that state, and its final states (if it even halts).

Not only are they programs, they are insanely fast programs. It's a billion times easier to make a machine execute a regex than a regular booleanic (if else, loop-like statements) program written in some other programming language.

Of course, they are harder to write and somewhat less expressive than "regular" booleanic programming languages. But their power is insane.

It's a pity people only use them for pattern matching as they can do so much more.
>>
>>61029225
>>61030331
Regex101 is fantastic. I consider my regex skills pretty solid but I still use it for testing regex just because of how great the ui is. Reference in the bottom-right is nice, too, esp when switching between pcre and js.
>>
>>61030583
>https://pastebin.com/Tx4UA8uP
THIS IS WHY WE NEED REGEX
>>
>>61029095
Fyi - some versions of a lot of commands that use regex (other than perl) don't support pcre (perl regexes) or require the -P option for it, otherwise using a crappy primitive version of regex. So if you try a regex and things like \d don't work, that's why.
>>
>>61029074
>see which one fits your purpose best
You need both.
>>
>>61029036
>just seems like a bunch of search and replace methods
Search, replace -- by extension, data extraction and manipulation... you underestimate how complex these things can get. A simple regex can often replace anywhere from a dozen to a thousand (>>61030583) lines of code.
>>
>>61034397
>It's a pity people only use them for pattern matching as they can do so much more.

What are some good examples of using a regex program for something that's entirely outside the realm of pattern matching?
>>
>>61029055
>What language should I use?
Awk.
>>61029236
>the type system of Haskell
>the declarativeness of Prolog
>the soundness of ATS
You're funny, anon.
>the most powerful regexp engine
Perl 6 isn't about the regular expressions, it's all about the grammars.
>>
>>61028969
very
>>
>>61028969
I'm working on a file system, many of the OpAcc tests use regex to confirm that the files that should exist do and without unintended side effects.

Not only do I use regex but they also use back references. In your FACE, op
>>
>>61030583
Did he actually write this code or did he use some sort of a Noob-Friendly Code Generator(R) tool to generate it?
>>
>>61034086
Mad? Sounds like a good story to me. Dude learned coding on the job just to help with something that presumably wasn't his job, was useful, got better at programming over the course of TEN YEARS, pay going up accordingly as you'd expect, and is now a (presumably) competent developer making a salary a competent developer with 10+ years experience makes.

If it's the fact that he didn't go to college that makes you mad, it's your fault for wasting your time and money on college in a field where it's not required.
>>
>>61034768
https://github.com/bolknote/SedChess
>>
I'm a systems engineer and programmer, I use a regex almost every day and in almost every script I write and in pretty much 100% of every service or app i create. It's one of the most used programming concepts I use.
>>
>>61035030
I'm a nigger and criminal, I use a crowbars almost every day and in almost every crime I commit and in pretty much 100% of every service or item I steal. It's one of the most used concepts I use.
>>
>>61028969
Yes, but don't write an HTML parser with it
>>
>>61035061
For those wondering... https://stackoverflow.com/q/1732348
>>
>>61034889
>Noob-Friendly Code Generator(R)
Whenever I make intentionally bad code, I tend to generate it with Java.
>>
>>61034975
I was hoping for much smaller examples, like these kinds of classic programming problems:

- FizzBuzz
- Sieve of Eratosthenes
- Determine if a string contains balanced parentheses
- Sort a list
- Fibonacci sequence
- Convert decimal to binary
- Towers of Hanoi

It would be interesting to see these kinds of programs written as a regexp.
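
One hedged data point for the balanced-parentheses item: a single true regular expression can't recognize balanced parentheses, since that language isn't regular, but the sed-style trick of applying a substitution in a loop can. A minimal sketch in Python (the `is_balanced` name is made up):

```python
import re

def is_balanced(s: str) -> bool:
    """Check parentheses by repeatedly deleting innermost "()" pairs."""
    # Drop everything that is not a parenthesis.
    s = re.sub(r"[^()]", "", s)
    while True:
        s, removed = re.subn(r"\(\)", "", s)
        if removed == 0:
            # Nothing left to delete: balanced only if the string is empty.
            return s == ""

print(is_balanced("(a(b)c)"))  # prints True
print(is_balanced("(()"))      # prints False
```

The loop lives outside the regex here, which is roughly what sed scripts do with labels and branches.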
>>
>>61029372
D's standard library makes precompiled optimized regexes via template magic.
>>
>>61030495
Stop trying to fake a legacy, barry
>>
>>61035641
>template
I hope you misspoke. Templates and people who use them should be shot on sight.
>>
>>61035140
I did think of that, but I was specifically referring to the joke that is the /g/ "browser"
>>
>>61035732
>can't into generic programming or TMP
babby detected
>>
>>61029236
>Perl6 is the most advanced "scripting language" currently available, as it has the type system of Haskell, the expressiveness of Perl 5, the simplicity of Python, the metaprogramming capabilities of Common Lisp, the declarativeness of Prolog, a fucking good OOP design almost as good as Smalltalk, and the soundness of ATS
Its user base is also the intersection of all of those languages' user bases, aka zero.
>>
>>61030205
Regular expressions are just a DSL for writing parsers for regular languages.
>>
it's a terrible meme that should have died with perl.

the fact that it's even called regular is offensive.

literally any fucking shitty PEG parser would be better and every meme scripting language has one, pegjs, treetop, etc.

the only thing regular expression is safe for is basic transformations and tokenizing.

also if your regexp has a fuckload of recursive constructs, you're an asshole.
>>
>>61028969
>notepad
>ctrl + f
>@
>>
File: ,.png (168KB, 727x682px)
>>61029090
That kode does not even work.
It will only use the else-case if there are neither numbers nor letters in the input.
>>
>>61037725
bad variable name too for the input aaannnnnd python has isalpha() and isdigit()
>>
>>61034975
>https://github.com/bolknote/SedChess

at what point does it make more sense to just use assembly
>>
>>61034653
as additional info: in vim you can set which version is used by adjusting the magic level
>>
>>61037722
>200 MB of emails
>@
>203495823576834 matches

they were refering to a street adress, retard.
>>
>>61029074
.net has the best regexps.
>>
>>61030583
So ridiculous it's highly possible he splashed a previously existing data file containing the list elements in there. Quick and dirty and done.
>>
>>61028969
Mildly useful for searching but insanely useful for replacing, especially if it supports functional argument for replacement instead of a simple string.
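
A small sketch of a function as the replacement argument, using Python's `re.sub`, which accepts a callable. The millimetre-to-inch conversion and the `to_inches` name are invented for illustration.

```python
import re

def to_inches(match) -> str:
    """Turn a matched millimetre value into inches with two decimals."""
    mm = int(match.group(1))
    return f"{mm / 25.4:.2f}in"

# The callable runs once per match and its return value is substituted in.
print(re.sub(r"(\d+)mm", to_inches, "83mm and 254mm"))  # prints 3.27in and 10.00in
```

A plain replacement string can only rearrange captured text; the callable can compute something new from it.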
>>
>>61028969
Regular expressions have probably been the most useful side skill I've ever picked up as a developer / admin.

I can't imagine working in either field without them.
>>
File: 1490478621720.jpg (40KB, 609x414px)
>>61028969
>>
>>61028969
>regex
>data sanitation
>not writing your own parsers to validate the data properly
you people are the reason I still have a job xD
>>
>>61029055
You can use regex with Notepad++ if you don't want to have to do any coding.
>>
>>61028969
It's a tool. It's like asking if print statements are useful. Use it all the time for field validation in code.
>>
>>61029540
When coding crappy programs in Java yes, but in high-speed C/C++ applications?
>>
>>61028969
nah
>>
File: 208.png (98KB, 609x414px)
>>61039880
That xkcd is outdated and has been renounced by Randall Munroe for its imperfections. Please use the digitally remastered version.
>>
>>61040902
Lol
>>
>>61040771
that's the 0.1%
>>
>>61029540
>regexps
>readability
Seriously.
>>
>>61029007
>it's not enough of a skill to land you a job on its own though.
True, but the Amazon support engineer jobs that start at 80k/year have interviews which consist of about 1/3 regex questions.
>>
>>61042121
Amazon is doing some interesting things. They place high value on regex and people with MBAs from top schools. Not many other companies value these things... really jogs the noggin.
>>
>>61040902
>Randall Munroe

i went to college with him. he's a socially awkward, subtly pretentious faggot.
>>
>>61041978
short regexes are more readable

^-?\\d*(\\.\\d+)?$


try replicating this with a loop. of course the regex will be more readable
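
A quick sketch of that pattern in Python, written as a raw string so the backslashes are single (the doubled ones above look like string-literal escaping from another language, which is an assumption). The test inputs are made up.

```python
import re

NUMBER = re.compile(r"^-?\d*(\.\d+)?$")

def looks_numeric(s: str) -> bool:
    """Optional minus sign, optional digits, optional fractional part."""
    return NUMBER.match(s) is not None

for s in ["42", "-3.14", ".5", "1.", "abc"]:
    print(repr(s), looks_numeric(s))
```

One thing the short form hides: every part is optional, so the empty string and a lone "-" also match. A loop would need a similar amount of care to get those edge cases right.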
This is a 4chan archive - all of the content originated from that site.
This means that RandomArchive shows their content, archived.